mesos-reviews mailing list archives

From Joseph Wu <jos...@mesosphere.io>
Subject Review Request 69980: Modified when master responds to operation status updates.
Date Wed, 13 Feb 2019 23:23:11 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69980/
-----------------------------------------------------------

Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.


Bugs: MESOS-9542
    https://issues.apache.org/jira/browse/MESOS-9542


Repository: mesos


Description
-------

When handling orphaned operation status updates, the master must cover
two cases:
- The simple case: the master knows the framework has completed.
  The master can acknowledge these status updates itself.
- The harder case: the master does not know the framework at all.
  A completed framework can be rotated out of the master's memory.
  In addition, after master failover, if an agent reregisters before
  the framework, an operation can appear to be orphaned until the
  framework reregisters.

This patch adds a fixed delay between agent reregistration and the
point at which the master acknowledges operation status updates from
unknown frameworks. The delay should give frameworks ample time to
reregister.

The delay is measured from agent reregistration so that, once it has
elapsed, status updates for frameworks rotated out of the completed
frameworks buffer can be acknowledged without further waiting.
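
The decision described above can be sketched roughly as follows. This
is an illustration only, not the code in this patch: MasterView,
Action, handleOperationStatusUpdate and the ORPHAN_ACK_DELAY value are
hypothetical names and numbers chosen for the example, and the
"registered framework" branch is an assumption added for completeness.

```cpp
#include <cassert>
#include <chrono>
#include <set>
#include <string>

using Clock = std::chrono::steady_clock;

// Hypothetical delay; the real constant lives in src/master/constants.hpp
// and its value is not stated in this review request.
constexpr std::chrono::minutes ORPHAN_ACK_DELAY{10};

enum class Action { FORWARD_TO_FRAMEWORK, ACKNOWLEDGE, WAIT };

// Hypothetical, simplified view of what the master remembers.
struct MasterView {
  std::set<std::string> registered;  // Frameworks currently registered.
  std::set<std::string> completed;   // Completed frameworks still in memory.
};

// Decides what to do with an operation status update for `frameworkId`,
// given when the reporting agent last reregistered.
Action handleOperationStatusUpdate(
    const MasterView& master,
    const std::string& frameworkId,
    Clock::time_point agentReregistered,
    Clock::time_point now)
{
  assert(now >= agentReregistered);

  // Assumption: an update for a registered framework is not orphaned.
  if (master.registered.count(frameworkId) > 0) {
    return Action::FORWARD_TO_FRAMEWORK;
  }

  // Simple case: the framework is known to be completed.
  if (master.completed.count(frameworkId) > 0) {
    return Action::ACKNOWLEDGE;
  }

  // Unknown framework: it may have been rotated out of memory, or it may
  // not have reregistered yet after a master failover. Acknowledge only
  // once the fixed delay since agent reregistration has elapsed.
  if (now - agentReregistered >= ORPHAN_ACK_DELAY) {
    return Action::ACKNOWLEDGE;
  }

  return Action::WAIT;
}
```

In this sketch, a caller that receives Action::WAIT would leave the
update unacknowledged, relying on the update being seen again after
the delay has passed; how the actual patch schedules this is not
shown in this message.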


Diffs
-----

  src/master/constants.hpp b0ab9187b8c672180e2ffb8b63cb7349dbe43ac4 
  src/master/master.cpp 014e0e053cdf5c53a5ef8d63300205a121bed319 


Diff: https://reviews.apache.org/r/69980/diff/1/


Testing
-------

TODO: This case needs unit tests.


Thanks,

Joseph Wu

