mesos-reviews mailing list archives

From Greg Mann <g...@mesosphere.io>
Subject Re: Review Request 72831: Fixed a CHECK failure in master during agent removal.
Date Tue, 08 Sep 2020 23:51:44 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72831/#review221819
-----------------------------------------------------------


Ship it!

- Greg Mann


On Sept. 8, 2020, 11:48 p.m., Benjamin Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72831/
> -----------------------------------------------------------
> 
> (Updated Sept. 8, 2020, 11:48 p.m.)
> 
> 
> Review request for mesos and Greg Mann.
> 
> 
> Bugs: MESOS-9609
>     https://issues.apache.org/jira/browse/MESOS-9609
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Per MESOS-9609, it's possible for the master to encounter a CHECK
> failure during agent removal in the following situation:
> 
>   1. Given a framework with checkpoint == false, with only
>      executor(s) (no tasks) running on an agent.
>   2. When this agent disconnects from the master,
>      Master::removeFramework(Slave*, Framework*) removes the
>      tasks and executors. However, when there are no tasks, this
>      function accidentally inserts an entry into
>      Master::Slave::tasks, because operator[] creates a missing key.
>   3. Now if the framework is removed, we have an entry in
>      Slave::tasks, for which there is no corresponding framework.
>   4. When the agent is removed, the master hits a CHECK failure
>      because it can't find the framework.
> 
> This fixes the issue by avoiding the accidental insertion.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp 02723296e569fac9d553b1494a5ca7daa6ef9aa4 
> 
> 
> Diff: https://reviews.apache.org/r/72831/diff/1/
> 
> 
> Testing
> -------
> 
> See subsequent patch.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>
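
The accidental insertion described in step 2 of the quoted description can be
illustrated with a minimal standalone sketch. It is not the code from the
patch: it assumes std::unordered_map in place of the master's own map type,
and TaskMap, countTasksWithSubscript, countTasksWithFind and demonstrate are
hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for the master's bookkeeping; not the Mesos types.
using FrameworkID = std::string;
using TaskID = std::string;
using TaskMap =
  std::unordered_map<FrameworkID, std::unordered_map<TaskID, int>>;

// Problematic pattern: operator[] default-constructs an entry when the
// key is absent, so a read-only looking lookup mutates the map.
std::size_t countTasksWithSubscript(TaskMap& tasks, const FrameworkID& id)
{
  return tasks[id].size();
}

// Safer pattern: find() never inserts, so an absent framework stays absent.
std::size_t countTasksWithFind(const TaskMap& tasks, const FrameworkID& id)
{
  auto it = tasks.find(id);
  return it == tasks.end() ? 0 : it->second.size();
}

// Shows the difference between the two lookups on an empty map.
void demonstrate()
{
  TaskMap tasks;

  // Lookup via find(): the map is left untouched.
  assert(countTasksWithFind(tasks, "framework-1") == 0);
  assert(tasks.empty());

  // Lookup via operator[]: a stale, empty entry now exists.
  assert(countTasksWithSubscript(tasks, "framework-1") == 0);
  assert(tasks.size() == 1);
}
```

Calling demonstrate() from a main function exercises both lookups. The entry
left behind by the subscript version is the kind of orphaned record that a
later consistency check could trip over.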

