mesos-reviews mailing list archives

From Neil Conway
Subject Re: Review Request 53097: Fixed bug when marking agents unreachable after master failover.
Date Fri, 21 Oct 2016 19:37:21 GMT

(Updated Oct. 21, 2016, 7:37 p.m.)

Review request for mesos and Vinod Kone.


Fix passing `TimeInfo` by value.

Bugs: MESOS-6445

Repository: mesos


If the master fails over and an agent does not re-register within the
`agent_reregister_timeout`, the master marks the agent as unreachable in
the registry and sends `slaveLost` for it. However, we neglected to
update the master's in-memory state for the newly unreachable agent;
this meant that task reconciliation would return incorrect results
(until the next master failover, if any).
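
To illustrate the failure mode described above, here is a minimal sketch.
It is not the Mesos code: `MiniMaster`, its members, the `updateMemory`
flag, and the "PENDING" / "UNREACHABLE" / "UNKNOWN" answers are all
hypothetical names chosen for illustration, and the assumption that
reconciliation consults only the in-memory set is a simplification.

```cpp
#include <set>
#include <string>

// Hypothetical, heavily simplified model of the master's bookkeeping.
// None of these names come from the Mesos source.
struct MiniMaster {
  // Agents read back from the registry after failover that have not
  // re-registered yet.
  std::set<std::string> recovered;

  // Persistent record of unreachable agents (stands in for the registry).
  std::set<std::string> registryUnreachable;

  // In-memory view of unreachable agents, consulted by reconciliation.
  std::set<std::string> memoryUnreachable;

  // Called when an agent misses the re-registration timeout.
  // `updateMemory == false` models the assumed pre-fix behavior.
  void markUnreachable(const std::string& agentId, bool updateMemory) {
    recovered.erase(agentId);
    registryUnreachable.insert(agentId);
    if (updateMemory) {
      memoryUnreachable.insert(agentId);
    }
  }

  // Illustrative reconciliation answer, based only on in-memory state.
  std::string reconcile(const std::string& agentId) const {
    if (recovered.count(agentId) > 0) {
      return "PENDING";
    }
    if (memoryUnreachable.count(agentId) > 0) {
      return "UNREACHABLE";
    }
    return "UNKNOWN";
  }
};
```

In this sketch, calling `markUnreachable(id, false)` leaves the registry
and the in-memory view out of sync, so `reconcile(id)` answers "UNKNOWN"
for an agent the registry lists as unreachable. Passing `true` keeps both
views consistent and `reconcile(id)` answers "UNREACHABLE".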

Diffs (updated)

  src/master/master.hpp 6d2db9de52d35f3288c618d05138413ce709818b 
  src/master/master.cpp 3f3ce93155069dd32731783ac4877ba6ee2519c0 
  src/tests/master_tests.cpp 033fae336d107f16f7764b94117a9396df6cd80e 



`make check`
`./src/mesos-tests --gtest_filter="MasterTest.UnreachableTaskAfterFailover" --gtest_repeat=1000`


Neil Conway
