mesos-reviews mailing list archives

From Benjamin Mahler <>
Subject Re: Review Request 72831: Fixed a CHECK failure in master during agent removal.
Date Tue, 08 Sep 2020 23:48:59 GMT

(Updated Sept. 8, 2020, 11:48 p.m.)

Review request for mesos and Greg Mann.

Bugs: MESOS-9609

Repository: mesos


Per MESOS-9609, it's possible for the master to encounter a CHECK
failure during agent removal in the following situation:

  1. Given a framework with checkpoint == false, with only
     executor(s) (no tasks) running on an agent.
  2. When this agent disconnects from the master,
     Master::removeFramework(Slave*, Framework*) removes the
     tasks and executors. However, when there are no tasks, this
     function will accidentally insert an entry into
     Master::Slave::tasks (due to the use of operator[]).
  3. Now if the framework is removed, we have an entry in
     Slave::tasks, for which there is no corresponding framework.
  4. When the agent is removed, the master hits a CHECK failure
     because it can't find the framework for that entry.

This fixes the issue by avoiding the accidental insertion.
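
The accidental insertion described in step 2 can be reproduced in
isolation. The following is a minimal sketch, not the actual patch: it
assumes std::unordered_map behaves like the master's map with respect to
operator[], and the names TaskMap, countTasksInserting and
countTasksSafely are hypothetical and do not appear in master.cpp.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for the master's bookkeeping types.
using FrameworkID = std::string;
using TaskID = std::string;
using TaskMap =
  std::unordered_map<FrameworkID, std::unordered_map<TaskID, int>>;

// Problematic pattern: operator[] default-constructs an empty inner
// map when `id` is absent, so a read leaves a stray entry behind.
std::size_t countTasksInserting(TaskMap& tasks, const FrameworkID& id)
{
  return tasks[id].size();
}

// Safer pattern: find() never modifies the map, so a framework with
// no tasks leaves no entry behind.
std::size_t countTasksSafely(const TaskMap& tasks, const FrameworkID& id)
{
  auto it = tasks.find(id);
  return it == tasks.end() ? 0 : it->second.size();
}
```

Taking the map by const reference in the lookup path, as in the second
function, makes this kind of mistake a compile error, since operator[]
is not available on a const map.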


  src/master/master.cpp 02723296e569fac9d553b1494a5ca7daa6ef9aa4 



See subsequent patch.


Benjamin Mahler
