mesos-reviews mailing list archives

From Meng Zhu <m...@mesosphere.io>
Subject Re: Review Request 66144: Enforced task launch order on the agent.
Date Tue, 03 Apr 2018 17:39:38 GMT


> On April 2, 2018, 4:57 p.m., Greg Mann wrote:
> > src/slave/slave.cpp
> > Line 2233 (original), 2274 (patched)
> > <https://reviews.apache.org/r/66144/diff/7/?file=1991208#file1991208line2274>
> >
> >     We should defer this callback.
> 
> Meng Zhu wrote:
>     That would change the old behavior, i.e. `sendExitedExecutorMessage` is sent synchronously
>     along with the other error-handling code: https://github.com/apache/mesos/blob/594ee20c2453dad836313769aef9f8655cd75cd5/src/slave/slave.cpp#L2226-L2231
>     
>     I found that making the error handling asynchronous makes it unnecessarily difficult to reason about.
>     For example, it means there is a brief moment in which the first task has failed
>     but the sequence is still alive, contradicting our comments. Tying the sequence lifecycle
>     and `exitedExecutorMessage` to task launch success atomically makes the code much easier to
>     reason about.
>     
>     I am not sure whether it would make a difference now, but we should stick to the old behavior
>     unless there is a compelling reason not to.

OK, we do need to defer, as the error handler might be invoked in a context other than
the agent actor.
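
To illustrate why deferring matters, here is a minimal, self-contained sketch. It does not use libprocess; the `Actor` class, its `defer`/`drain` methods, and `demoDeferredErrorHandler` are hypothetical names invented for this example. The idea shown is only that a deferred callback, when invoked from some other context, is enqueued and later runs in the actor's own context instead of running inline in the caller's.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an actor: callbacks run only when the
// actor drains its own queue, never inline in the caller's context.
class Actor {
public:
  // Returns a wrapper that, when invoked from any context, enqueues
  // `f` on this actor instead of executing it immediately.
  std::function<void()> defer(std::function<void()> f) {
    return [this, f]() {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push(f);
    };
  }

  // Runs every queued callback in the actor's context.
  void drain() {
    for (;;) {
      std::function<void()> f;
      {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.empty()) {
          return;
        }
        f = std::move(queue_.front());
        queue_.pop();
      }
      f(); // Executed outside the lock.
    }
  }

private:
  std::mutex mutex_;
  std::queue<std::function<void()>> queue_;
};

// Assumed scenario: an error handler is triggered from outside the actor.
std::vector<std::string> demoDeferredErrorHandler() {
  Actor agent;
  std::vector<std::string> log;

  auto onFailure = agent.defer([&log]() {
    log.push_back("handled in actor");
  });

  onFailure(); // Invoked from "another context": only enqueued.
  log.push_back("caller returned");

  agent.drain(); // The handler actually runs here.
  return log;
}
```

In this sketch the handler body runs only after the caller has returned, so any agent state it touches is accessed from the actor's context alone.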


- Meng


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66144/#review200329
-----------------------------------------------------------


On April 2, 2018, 5:36 p.m., Meng Zhu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66144/
> -----------------------------------------------------------
> 
> (Updated April 2, 2018, 5:36 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao and Greg Mann.
> 
> 
> Bugs: MESOS-8617 and MESOS-8624
>     https://issues.apache.org/jira/browse/MESOS-8617
>     https://issues.apache.org/jira/browse/MESOS-8624
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Until now, Mesos has not guaranteed in-order
> task launch on the agent. There are two asynchronous
> steps (unschedule GC and task authorization) in the
> agent task launch path. Depending on the CPU scheduling
> order, a later task launch may finish these two steps earlier
> than its predecessors and get to the launch executor stage
> earlier, resulting in out-of-order task delivery.
> 
> This patch enforces the task delivery order by sequencing
> task launch after the two asynchronous steps, specifically
> right before entering `__run()`.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 37f0361251524e63d02d251e8a03901812b8affb 
>   src/slave/slave.cpp a4bd4ccd3fc59c3c0e462d9b480f5424b3e52d7a 
> 
> 
> Diff: https://reviews.apache.org/r/66144/diff/8/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>
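
The ordering idea in the description above can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in the patch: `LaunchSequence`, `enqueue`, `ready`, and `demoOutOfOrderCompletion` are hypothetical names. Each launch takes a ticket on arrival; its asynchronous steps may finish in any order, but the final launch step is released strictly in ticket order.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical sequencer: launches are released in arrival order even
// if their asynchronous preparation steps complete out of order.
class LaunchSequence {
public:
  // Reserve a slot when the launch request arrives.
  std::size_t enqueue() { return nextTicket_++; }

  // Called when a launch's asynchronous steps have finished.
  void ready(std::size_t ticket, std::function<void()> launch) {
    pending_[ticket] = std::move(launch);

    // Release every launch whose predecessors have all been released.
    for (auto it = pending_.find(nextToRun_);
         it != pending_.end();
         it = pending_.find(nextToRun_)) {
      std::function<void()> f = std::move(it->second);
      pending_.erase(it);
      ++nextToRun_;
      f();
    }
  }

private:
  std::size_t nextTicket_ = 0;
  std::size_t nextToRun_ = 0;
  std::map<std::size_t, std::function<void()>> pending_;
};

// Three tasks arrive as 1, 2, 3 but become ready as 3, 1, 2.
std::vector<int> demoOutOfOrderCompletion() {
  LaunchSequence sequence;
  std::vector<int> launched;

  std::size_t first = sequence.enqueue();
  std::size_t second = sequence.enqueue();
  std::size_t third = sequence.enqueue();

  sequence.ready(third, [&launched]() { launched.push_back(3); });
  sequence.ready(first, [&launched]() { launched.push_back(1); });
  sequence.ready(second, [&launched]() { launched.push_back(2); });

  return launched;
}
```

Here the third task is held back until the first and second have been released, so the launches are delivered in arrival order.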

