mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Megha Sharma <mshar...@apple.com>
Subject Re: Review Request 61473: Do not kill non partition aware tasks.
Date Mon, 28 Aug 2017 15:30:44 GMT


> On Aug. 10, 2017, 9 p.m., Jiang Yan Xu wrote:
> > src/master/http.cpp
> > Lines 342 (patched)
> > <https://reviews.apache.org/r/61473/diff/3/?file=1793873#file1793873line342>
> >
> >     One empty line above.

Fixed


> On Aug. 10, 2017, 9 p.m., Jiang Yan Xu wrote:
> > src/master/http.cpp
> > Lines 3203-3206 (original), 3215-3227 (patched)
> > <https://reviews.apache.org/r/61473/diff/3/?file=1793873#file1793873line3215>
> >
> >     Should we only loop through the once? And in order to try not to duplicate similar
lines, consider the following?
> >     
> >     ```
> >     foreachvalue (const Owned<Task>& _task, framework->unreachableTasks)
{
> >       Owned<Task> task = _task;
> >     
> >       if (framework->capabilities.partitionAware) {
> >         task = ...
> >       }
> >       
> >       frameworkTaskSummaries[frameworkId].count(*task.get());
> >       slaveTaskSummaries[task->slave_id()].count(*task.get());
> >     }
> >     ```

Fixed


> On Aug. 10, 2017, 9 p.m., Jiang Yan Xu wrote:
> > src/master/http.cpp
> > Lines 4017-4027 (original), 4038-4063 (patched)
> > <https://reviews.apache.org/r/61473/diff/3/?file=1793873#file1793873line4038>
> >
> >     Similar to above, don't duplicate lines.

Fixed


> On Aug. 10, 2017, 9 p.m., Jiang Yan Xu wrote:
> > src/master/http.cpp
> > Lines 4167-4175 (original), 4203-4221 (patched)
> > <https://reviews.apache.org/r/61473/diff/3/?file=1793873#file1793873line4203>
> >
> >     Ditto.

Fixed


> On Aug. 10, 2017, 9 p.m., Jiang Yan Xu wrote:
> > src/master/master.hpp
> > Lines 2403-2408 (patched)
> > <https://reviews.apache.org/r/61473/diff/3/?file=1793874#file1793874line2406>
> >
> >     Can we hold off creating a helper for this? IMO this helper is doing too little
and not so much of an "abstraction", i.e., what the method does it not perfectly captured
by the name of the mehod. Inlining 1) making a copy & 2) setting the state is not too
much redudancy than calling this method.

Code changed, N/A anymore


> On Aug. 10, 2017, 9 p.m., Jiang Yan Xu wrote:
> > src/master/master.hpp
> > Lines 2769-2771 (original), 2773-2775 (patched)
> > <https://reviews.apache.org/r/61473/diff/3/?file=1793874#file1793874line2776>
> >
> >     Fix the comment.

Fixed


> On Aug. 10, 2017, 9 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp
> > Lines 6398-6414 (original), 6389-6405 (patched)
> > <https://reviews.apache.org/r/61473/diff/3/?file=1793875#file1793875line6398>
> >
> >     Fix the comments.

Fixed


> On Aug. 10, 2017, 9 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp
> > Lines 6422 (patched)
> > <https://reviews.apache.org/r/61473/diff/3/?file=1793875#file1793875line6431>
> >
> >     `erase` can handle the `!contains` case.

Fixed


> On Aug. 10, 2017, 9 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp
> > Lines 6477-6480 (original), 6464-6467 (patched)
> > <https://reviews.apache.org/r/61473/diff/3/?file=1793875#file1793875line6478>
> >
> >     Fix the comment.

Fixed


> On Aug. 10, 2017, 9 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp
> > Line 7137 (original), 7112 (patched)
> > <https://reviews.apache.org/r/61473/diff/3/?file=1793875#file1793875line7138>
> >
> >     If our code stops making such distinction, I don't think the comment should
continue to make such distinction.

Fixed


> On Aug. 10, 2017, 9 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp
> > Lines 7132-7137 (patched)
> > <https://reviews.apache.org/r/61473/diff/3/?file=1793875#file1793875line7161>
> >
> >     Do this only when we actually send an update i.e., `framework->connected()`?

Code changed, not applicable any more.


> On Aug. 10, 2017, 9 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp
> > Lines 8928-8931 (original), 8907-8910 (patched)
> > <https://reviews.apache.org/r/61473/diff/3/?file=1793875#file1793875line8936>
> >
> >     So for this update event we are going to send TASK_UNREACHABLE instead TASK_LOST,
which is unintended?

Code changed, not applicable any more.


- Megha


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61473/#review182605
-----------------------------------------------------------


On Aug. 10, 2017, 4:07 p.m., Megha Sharma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61473/
> -----------------------------------------------------------
> 
> (Updated Aug. 10, 2017, 4:07 p.m.)
> 
> 
> Review request for mesos, Vinod Kone and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7215
>     https://issues.apache.org/jira/browse/MESOS-7215
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Master will not kill the tasks for non-Partition aware frameworks
> when an unreachable agent re-registers with the master.
> Master used to send a ShutdownFrameworkMessages to the agent
> to kill the tasks from non partition aware frameworks including the
> ones that are still registered which was problematic because the offer
> from this agent could still go to the same framework which could then
> launch new tasks. The agent would then receive tasks of the same
> framework and ignore them because it thinks the framework is shutting
> down. The framework is not shutting down of course, so from the master
> and the scheduler’s perspective the task is pending in STAGING forever
> until the next agent reregistration, which could happen much later.
> This commit fixes the problem by not shutting down the non-partition
> aware frameworks on such an agent.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 959091c8ec03b6ac7bcb5d21b04d2f7d5aff7d54 
>   src/master/master.hpp b802fd153a10f6012cea381f153c28cc78cae995 
>   src/master/master.cpp 7f38a5e21884546d4b4c866ca5918db779af8f99 
>   src/tests/partition_tests.cpp 62a84f797201ccd18b71490949e3130d2b9c3668 
> 
> 
> Diff: https://reviews.apache.org/r/61473/diff/3/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Megha Sharma
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message