mesos-reviews mailing list archives

From Vinod Kone <vinodk...@gmail.com>
Subject Re: Review Request 54232: Shutdown tasks of completed frameworks on agent re-registration.
Date Tue, 24 Jan 2017 00:40:55 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54232/#review162736
-----------------------------------------------------------




src/master/master.cpp (lines 5591 - 5598)
<https://reviews.apache.org/r/54232/#comment234057>

    Can you add this `if` check inside the `foreach` loop?
    
    ```
      foreach (const Task& task, tasks) {
        const FrameworkID& frameworkId = task.framework_id();
        Framework* framework = getFramework(frameworkId);
     
        // Don't add the task if the framework has been shut down.
        if (isCompletedFramework(frameworkId)) {
          continue;
        }
    
        // Always re-add partition-aware tasks.
        if (partitionAwareFrameworks.contains(frameworkId)) {
          tasks_.push_back(task);
    
          if (framework != nullptr) {
            framework->unreachableTasks.erase(task.task_id());
          }
        } else if (!slaveWasRemoved) {
          // Only re-add non-partition-aware tasks if the master has
          // failed over since the agent was marked unreachable.
          tasks_.push_back(task);
        }
      }
    ```
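
For readers without the Mesos tree at hand, the filtering logic above can be sketched as a self-contained function. This is only an illustration under stated assumptions: the `Task` struct, the `tasksToReAdd` name, and the string-keyed sets are hypothetical stand-ins for the real Mesos types, and the `unreachableTasks` bookkeeping is omitted.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the Mesos `Task` protobuf; not the real type.
struct Task {
  std::string taskId;
  std::string frameworkId;
};

// Hypothetical helper: returns the tasks that would be re-added when an
// agent re-registers, following the checks in the snippet above.
std::vector<Task> tasksToReAdd(
    const std::vector<Task>& tasks,
    const std::unordered_set<std::string>& completedFrameworks,
    const std::unordered_set<std::string>& partitionAwareFrameworks,
    bool slaveWasRemoved)
{
  std::vector<Task> result;

  for (const Task& task : tasks) {
    // Don't re-add the task if its framework has completed.
    if (completedFrameworks.count(task.frameworkId) > 0) {
      continue;
    }

    if (partitionAwareFrameworks.count(task.frameworkId) > 0) {
      // Always re-add partition-aware tasks.
      result.push_back(task);
    } else if (!slaveWasRemoved) {
      // Only re-add non-partition-aware tasks if the master has
      // failed over since the agent was marked unreachable.
      result.push_back(task);
    }
  }

  return result;
}
```

Placing the completed-framework check first means a task of a completed framework is skipped regardless of whether that framework was partition-aware.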


- Vinod Kone


On Jan. 18, 2017, 7:33 p.m., Neil Conway wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54232/
> -----------------------------------------------------------
> 
> (Updated Jan. 18, 2017, 7:33 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6602
>     https://issues.apache.org/jira/browse/MESOS-6602
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Previously, if a framework completed (e.g., due to a teardown operation
> or framework shutdown), any framework tasks running on partitioned
> agents would not be shut down when the agent re-registered. For tasks
> that are not partition-aware, the task would be shut down on agent
> re-registration anyway. But for partition-aware tasks, this could lead
> to orphan tasks.
> 
> Fix this by changing the master to shut down such tasks when the agent
> re-registers.
> 
> Note that if the master fails over between the time the framework
> completes and a partitioned agent re-registers, any framework tasks
> running on the agent will NOT be shut down. This is a known bug; fixing
> it requires persisting the framework shutdown operation to the registry
> (MESOS-1719).
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp 73159328ce3fd838e02eba0e6a30cf69efc319ba 
>   src/tests/partition_tests.cpp 72013d1bfee275c6f3cb90173f0c408d55e0bc5d 
> 
> Diff: https://reviews.apache.org/r/54232/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Neil Conway
> 
>

