mesos-reviews mailing list archives

From Mesos ReviewBot <revi...@mesos.apache.org>
Subject Re: Review Request 48744: Changed agent and scheduler authentication timeouts to ensure progress.
Date Thu, 16 Jun 2016 02:44:54 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48744/#review137915
-----------------------------------------------------------



Patch looks great!

Reviews applied: [48744]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose'
ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker_build.sh

- Mesos ReviewBot


On June 15, 2016, 9:20 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48744/
> -----------------------------------------------------------
> 
> (Updated June 15, 2016, 9:20 p.m.)
> 
> 
> Review request for mesos, Adam B and Vinod Kone.
> 
> 
> Bugs: MESOS-2043
>     https://issues.apache.org/jira/browse/MESOS-2043
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The master, agent, and scheduler all use the same timeout for
> authentication attempts. As a result, an attempt can time out on the
> master and on, e.g., an agent simultaneously.
> 
> If the agent then starts another authentication attempt before the
> master has finished cleaning up the previous one, the master queues
> the new attempt behind the existing one and subsequently notifies the
> agent that the former attempt timed out. The agent, on the other hand,
> has already timed out that attempt and is waiting for the new one to
> make progress.
> 
> Once the master and, e.g., an agent have entered this cycle, they will
> likely move in lockstep, and it becomes highly unlikely for the agent
> to authenticate successfully.
> 
> Here we change the timeout used in the agent and scheduler to avoid
> this lockstep behavior. We allow slightly more time on the
> agent/scheduler side before an attempt times out. We also use a value
> that ensures that the cycles of authentication attempt and timeout
> have very different periods on the master and on the agent/scheduler.
> 
> 
> Diffs
> -----
> 
>   src/sched/sched.cpp 9f561d73a2e591afdc3ba4adb35a11763dced402 
>   src/slave/slave.cpp 0af04d6fe53f92e03905fb7b3bec72b09d5e8e57 
> 
> Diff: https://reviews.apache.org/r/48744/diff/
> 
> 
> Testing
> -------
> 
> Tested on internal CI on a collection of Linux setups.
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>
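
The lockstep argument in the quoted description can be illustrated with a
small sketch. It is not code from the patch and uses no Mesos API. It assumes
a simplified model in which both sides start an attempt at t = 0 and retry
immediately after every timeout. The function name `coincidingTimeouts` and
all timeout values are hypothetical and chosen only for illustration.

```cpp
#include <cstdint>
#include <numeric>

// Hypothetical helper, not part of Mesos: counts how often the master's and
// the agent/scheduler's timeouts fire at the same instant within `horizonMs`,
// assuming both start at t = 0 and retry immediately after each timeout.
std::int64_t coincidingTimeouts(
    std::int64_t masterMs,
    std::int64_t clientMs,
    std::int64_t horizonMs)
{
  // In this model the two timeout sequences coincide exactly at positive
  // multiples of the least common multiple of the two periods.
  const std::int64_t period = std::lcm(masterMs, clientMs);
  return horizonMs / period;
}
```

Under these assumptions, with identical 5000 ms timeouts on both sides, every
one of the 12 timeouts in a 60000 ms window coincides. With illustrative
values of 5000 ms and 7000 ms, the timeouts coincide only once in the same
window (at 35000 ms), so the two sides drift apart instead of staying in
lockstep. `std::lcm` requires C++17.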

