mesos-reviews mailing list archives

From Greg Mann <g...@mesosphere.io>
Subject Re: Review Request 54803: Fixed `SlaveTests` to pass when `HAS_AUTHENTICATION` is undefined.
Date Fri, 16 Dec 2016 19:44:04 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54803/#review159484
-----------------------------------------------------------




src/tests/slave_tests.cpp (lines 2733 - 2736)
<https://reviews.apache.org/r/54803/#comment230487>

    It looks to me like
    ```
    Clock::advance(totalTimeout);
    Clock::advance(flags.registration_backoff_factor);
    ```
    here would be sufficient. First we advance by `totalTimeout`, which allows two ping timeout
    intervals to elapse, leading to the agent being removed. If we then advance by the backoff
    factor, we can be assured that the agent will reregister even if it delays the first registration
    attempt. Does that make sense?
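
A minimal sketch of the reasoning above, assuming a toy timer queue. `FakeClock` and `registersAfterAdvance` are hypothetical names for illustration only; they are not libprocess's `Clock` or anything in `slave_tests.cpp`. The point it shows: with the clock paused, a registration scheduled via a delay never fires, but advancing by the maximum backoff covers any delay up to that bound.

```cpp
#include <chrono>
#include <functional>
#include <map>
#include <utility>

using Duration = std::chrono::milliseconds;

// Hypothetical stand-in for a paused test clock. Time only moves
// when advance() is called, as with a paused clock in a test.
class FakeClock {
public:
  // Schedule `f` to run once `d` has elapsed on this clock.
  void delay(Duration d, std::function<void()> f) {
    timers_.emplace(now_ + d, std::move(f));
  }

  // Move time forward by `d` and run every timer that is now due.
  void advance(Duration d) {
    now_ += d;
    while (!timers_.empty() && timers_.begin()->first <= now_) {
      auto f = std::move(timers_.begin()->second);
      timers_.erase(timers_.begin());
      f();
    }
  }

private:
  Duration now_{0};
  std::multimap<Duration, std::function<void()>> timers_;
};

// Models an agent that delays its first registration attempt by
// `registrationDelay`, then reports whether the attempt has fired
// after the clock is advanced by `advanceBy`.
bool registersAfterAdvance(Duration registrationDelay, Duration advanceBy) {
  FakeClock clock;
  bool registered = false;
  clock.delay(registrationDelay, [&registered] { registered = true; });
  clock.advance(advanceBy);
  return registered;
}
```

If the delay is bounded by the backoff factor (an assumption of this sketch), advancing by the full backoff factor makes the registration fire regardless of the delay actually chosen, while not advancing at all leaves it pending.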


- Greg Mann


On Dec. 16, 2016, 7:23 p.m., Alex Clemmer wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54803/
> -----------------------------------------------------------
> 
> (Updated Dec. 16, 2016, 7:23 p.m.)
> 
> 
> Review request for mesos, Adam B, Andrew Schwartzmeyer, Daniel Pravat, Greg Mann, John Kordich, Joseph Wu, and Vinod Kone.
> 
> 
> Bugs: MESOS-6803
>     https://issues.apache.org/jira/browse/MESOS-6803
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Currently, when `HAS_AUTHENTICATION` is undefined, the Agent uses
> `delay` to schedule its registration with the Master at a random time
> in the future, to avoid the thundering herd problem after a Master
> failover. The authentication codepath, in contrast, schedules the
> registration immediately.
> 
> In tests where we have called `Clock::pause` at the point where the
> slave is supposed to be registering, the authentication codepath will
> succeed, while the no-authentication codepath will hang forever.
> 
> A much more detailed analysis of this situation exists in MESOS-6803.
> 
> This commit will resolve this issue for `slave_tests.cpp` by changing
> the tests to not use `Clock::pause` when we are waiting for Agent
> registration.
> 
> 
> Diffs
> -----
> 
>   src/tests/slave_tests.cpp fc6b56c074c71b827a9ee522cd715c0d15ecc7e3 
> 
> Diff: https://reviews.apache.org/r/54803/diff/
> 
> 
> Testing
> -------
> 
> Added `delay` to the call to `authenticate` in `Slave::detected`, ran the tests to find failures in `SlaveTest.*`, fixed them, then ran the tests again.
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>

