mesos-reviews mailing list archives

From Joseph Wu <jos...@mesosphere.io>
Subject Re: Review Request 71008: Implemented transition from DRAINING to DRAINED in master.
Date Mon, 15 Jul 2019 18:01:20 GMT


> On July 15, 2019, 2:14 a.m., Benjamin Bannier wrote:
> > src/master/master.cpp
> > Lines 6255-6260 (patched)
> > <https://reviews.apache.org/r/71008/diff/4/?file=2154545#file2154545line6255>
> >
> >     It seems we only do this check to make sure we can access the config below which
> >     introduces quite some coupling. Is there a reason we couldn't grab the config outside the
> >     lambda and capture it instead (i.e., do we want to support mutable drain configs)? That would
> >     allow us to reduce coupling between `Slave::draining` and `markGone`.

This check specifically guards against an interleaving of the `RemoveSlave` and
`MarkAgentDrained` registry operations.  There are several ways to trigger `RemoveSlave`,
one of which is shutting down the agent (SIGUSR1).

So imagine the following sequence of events:
1) The agent sends the master an `UnregisterSlaveMessage`.
2) The master starts the `RemoveSlave` operation.
3) The final terminal ACK arrives at the master, which causes the master to call
`checkAndTransitionDrainingAgent` and start the `MarkAgentDrained` operation.
4) `RemoveSlave` completes.  The master forgets that agent.
5) `MarkAgentDrained` completes.  The master no longer knows about that agent and hits this
LOG line.
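
To make the interleaving concrete, here is a rough, self-contained sketch of it.  Everything
in it is hypothetical and only for illustration: `ToyMaster`, its FIFO queue of pending
operations, and the log strings are stand-ins, not the real master or registrar code.  It
only assumes that both operations can be in flight at once and that `RemoveSlave` is applied
first.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the master; NOT the real Mesos classes.
struct ToyMaster {
  std::set<std::string> agents;                // Agents the master still tracks.
  std::deque<std::function<void()>> registry;  // Pending operations, applied FIFO.
  std::vector<std::string> log;

  // Queues the removal; the agent is only forgotten once the operation completes.
  void removeSlave(const std::string& id) {
    registry.push_back([this, id]() {
      agents.erase(id);
      log.push_back("removed " + id);
    });
  }

  // Queues the drain transition; its continuation runs after the operation completes.
  void markAgentDrained(const std::string& id) {
    registry.push_back([this, id]() {
      // The guard under discussion: the agent may be gone by now.
      if (agents.count(id) == 0) {
        log.push_back("drained unknown " + id);
        return;
      }
      log.push_back("drained " + id);
    });
  }

  // Completes all pending operations in the order they were started.
  void applyAll() {
    while (!registry.empty()) {
      std::function<void()> op = std::move(registry.front());
      registry.pop_front();
      op();
    }
  }
};

// Replays steps 1-5 from the sequence above and returns what was logged.
std::vector<std::string> simulateInterleaving() {
  ToyMaster master;
  master.agents.insert("agent-1");

  master.removeSlave("agent-1");       // Steps 1-2: unregister starts the removal.
  master.markAgentDrained("agent-1");  // Step 3: final ACK starts the drain transition.
  assert(master.registry.size() == 2); // Both operations are in flight at once.

  master.applyAll();                   // Steps 4-5: removal completes first.
  return master.log;
}
```

Without the check inside the `markAgentDrained` continuation, step 5 would try to read the
drain config of an agent the master has already forgotten.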


- Joseph


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71008/#review216605
-----------------------------------------------------------


On July 11, 2019, 4:01 p.m., Joseph Wu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71008/
> -----------------------------------------------------------
> 
> (Updated July 11, 2019, 4:01 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Benjamin Mahler, Greg Mann, and Vinod Kone.
> 
> 
> Bugs: MESOS-9814
>     https://issues.apache.org/jira/browse/MESOS-9814
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This adds logic in the master to detect when a DRAINING agent can
> be transitioned into a DRAINED state.  When this happens, the new
> state is checkpointed into the registry and, if the agent is to be
> marked "gone", the master will remove the agent.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp b42ebb953e0510e83ec6bd041cbddbeb8f60067c 
>   src/master/master.hpp 23dafe746b6f9b3d70ad7220f54c4d49068b8af8 
>   src/master/master.cpp 5247377c2e7e92b9843dd4c9d28f92ba679ad742 
> 
> 
> Diff: https://reviews.apache.org/r/71008/diff/4/
> 
> 
> Testing
> -------
> 
> TODO: Need to write some unit tests.  I'll want to rebase onto the agent changes so that
> there is more detectable stuff in the tests.
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>

