mesos-reviews mailing list archives

From Joseph Wu <jos...@mesosphere.io>
Subject Review Request 69961: Handle possible orphaned operations after master/agent failover.
Date Wed, 13 Feb 2019 02:14:38 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69961/
-----------------------------------------------------------

Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.


Bugs: MESOS-9542
    https://issues.apache.org/jira/browse/MESOS-9542


Repository: mesos


Description
-------

This patch handles one of two possible code paths that can introduce
orphaned operations.

When a master failover occurs, all agents and frameworks must
reregister with the master.  Agents that reregister report their
operations in an UpdateSlaveMessage.  Any operation without a known
framework is considered an orphan.  A framework becomes known when it
reregisters, or when an agent running a task under that framework
reregisters.  The race between these reregistrations will be addressed
in a separate commit.
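
As a rough illustration of the rule above (an operation whose framework is
not known to the master is treated as an orphan), here is a minimal,
self-contained sketch.  It is not the code in this patch: the
`ReportedOperation` struct, the `findOrphans` function, and the use of plain
strings for IDs are hypothetical simplifications, and the real logic in
src/master/master.cpp works on Mesos protobuf types.

```cpp
#include <cassert>  // Only needed if you add checks like the ones described below.
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for an operation that a reregistering
// agent reports to the master.  Not a Mesos type.
struct ReportedOperation {
  std::string operationId;
  std::string frameworkId;
};

// Returns the reported operations whose framework is not in the set of
// frameworks currently known to the master.  Under the rule described
// above, these are the orphans.
std::vector<ReportedOperation> findOrphans(
    const std::vector<ReportedOperation>& reported,
    const std::set<std::string>& knownFrameworks)
{
  std::vector<ReportedOperation> orphans;

  for (const ReportedOperation& operation : reported) {
    if (knownFrameworks.count(operation.frameworkId) == 0) {
      orphans.push_back(operation);
    }
  }

  return orphans;
}
```

For example, if an agent reports operations for frameworks "fwA" and "fwB"
while only "fwA" is known, the sketch returns just the "fwB" operation.  The
outcome depends on which frameworks are known at the moment the agent's
report is processed, which is why the ordering of reregistrations matters.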

Agent failover can also introduce orphans if the agent has a pending
operation during failover and is then migrated to a different master
before restarting.  This case is handled the same way as agent
reregistration after a master failover.


Diffs
-----

  src/master/master.cpp 014e0e053cdf5c53a5ef8d63300205a121bed319 


Diff: https://reviews.apache.org/r/69961/diff/1/


Testing
-------

See last patch in chain.


Thanks,

Joseph Wu

