mesos-reviews mailing list archives

From Alex Clemmer <clemmer.alexan...@gmail.com>
Subject Review Request 54909: Added member to agent to avoid spurious re-registrations.
Date Tue, 20 Dec 2016 17:54:58 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54909/
-----------------------------------------------------------

Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, John Kordich, and Joseph Wu.


Bugs: MESOS-6803
    https://issues.apache.org/jira/browse/MESOS-6803


Repository: mesos


Description
-------

Currently, when a new master is detected and no credential is provided,
the agent schedules a (re-)registration after a random initial `delay`
to avoid a thundering herd. This makes spurious double-registrations
possible: a new master can be detected after the `delay`'d registration
has been scheduled but before it executes, which schedules a second one.

To resolve this problem, we add a member, `agentRegistrationTimer`, to
the agent and call `Clock::cancel` on it when we successfully register
with the master.
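
The race and the fix can be illustrated with a small standalone sketch.
This is not the code in the diff and does not use libprocess: `FakeClock`,
`FakeAgent`, `simulate`, and the tick values are hypothetical stand-ins
chosen for illustration. Only the member name `agentRegistrationTimer`
and the idea of cancelling it on successful registration come from the
description above.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

// Toy stand-in for a timer-driven clock. Hypothetical; NOT the libprocess API.
class FakeClock {
public:
  using TimerId = std::uint64_t;

  // Schedules `fn` to run `ticks` from now and returns a handle to the timer.
  TimerId delay(int ticks, std::function<void()> fn) {
    assert(ticks >= 0);
    TimerId id = nextId_++;
    timers_.emplace(id, Entry{now_ + ticks, std::move(fn)});
    return id;
  }

  // Returns true if the timer was still pending and has been removed.
  bool cancel(TimerId id) { return timers_.erase(id) > 0; }

  // Advances time to `t`, firing every due timer in deadline order.
  void advanceTo(int t) {
    assert(t >= now_);
    while (true) {
      auto next = timers_.end();
      for (auto it = timers_.begin(); it != timers_.end(); ++it) {
        if (it->second.deadline <= t &&
            (next == timers_.end() ||
             it->second.deadline < next->second.deadline)) {
          next = it;
        }
      }
      if (next == timers_.end()) break;
      now_ = next->second.deadline;
      std::function<void()> fn = std::move(next->second.fn);
      timers_.erase(next);
      fn();
    }
    now_ = t;
  }

private:
  struct Entry {
    int deadline;
    std::function<void()> fn;
  };

  std::map<TimerId, Entry> timers_;
  int now_ = 0;
  TimerId nextId_ = 1;
};

// Hypothetical agent that keeps a handle to its pending registration timer.
class FakeAgent {
public:
  FakeAgent(FakeClock& clock, bool cancelOnRegister)
    : clock_(clock), cancelOnRegister_(cancelOnRegister) {}

  // A new master was detected: schedule a registration after a backoff.
  void detected(int backoff) {
    agentRegistrationTimer =
      clock_.delay(backoff, [this] { ++registrationAttempts; });
  }

  // The master acknowledged us: drop any registration that is still pending.
  void registered() {
    if (cancelOnRegister_ && agentRegistrationTimer != 0) {
      clock_.cancel(agentRegistrationTimer);
      agentRegistrationTimer = 0;
    }
  }

  int registrationAttempts = 0;
  FakeClock::TimerId agentRegistrationTimer = 0;  // 0 means "none pending".

private:
  FakeClock& clock_;
  bool cancelOnRegister_;
};

// Replays the race and returns how many registration attempts were made.
int simulate(bool cancelOnRegister) {
  FakeClock clock;
  FakeAgent agent(clock, cancelOnRegister);
  agent.detected(5);    // t=0: first detection, registration due at t=5.
  clock.advanceTo(2);
  agent.detected(6);    // t=2: second detection, registration due at t=8.
  clock.advanceTo(6);   // The first timer fires at t=5.
  agent.registered();   // t=6: registration succeeds.
  clock.advanceTo(10);  // The second timer fires at t=8 unless cancelled.
  return agent.registrationAttempts;
}
```

In this sketch, `simulate(false)` reproduces the double registration,
while `simulate(true)` cancels the second pending timer once the agent
is registered, so only one attempt is made.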


Diffs
-----

  src/slave/slave.hpp 03860b5d0242289034d4574bd36a85ab6fb87a79 
  src/slave/slave.cpp a7a3a394e5e4b7f40a051663cd70add3890bdf18 
  src/tests/reservation_tests.cpp ffbb50bdf16fdeb0ad0aa98afbe71c38c784cd71 

Diff: https://reviews.apache.org/r/54909/diff/


Testing
-------

Ran `make check` and `mesos-tests --gtest_repeat=1000 --gtest_break_on_failure` to catch
intermittent failures, which is how we caught the failing test in `reservation_tests.cpp`. Note
that this bug was discovered when we added a `delay` to the call to `authenticate` in
`slave::detected` (in order to get it to match the behavior of the non-authenticated call to
`doReliableRegistration`).


Thanks,

Alex Clemmer

