mesos-reviews mailing list archives

From "Alexander Rukletsov" <ruklet...@gmail.com>
Subject Re: Review Request 40332: Quota: Implemented recovery in hierarchical allocator.
Date Thu, 03 Dec 2015 19:53:50 GMT


> On Nov. 23, 2015, 5:54 p.m., Joris Van Remoortere wrote:
> > src/master/allocator/mesos/hierarchical.cpp, lines 413-418
> > <https://reviews.apache.org/r/40332/diff/3/?file=1137413#file1137413line413>
> >
> >     Rather than doing the math here (which I believe we're missing corresponding entries for in `removeSlave()`?) why not test whether `slaves.size() >= expectedAgentCount()`?
> >     
> >     I think it is more clear, and less error prone.
> >     
> >     Let's log when this condition is met?
> 
> Alexander Rukletsov wrote:
>     Let me address your concerns one by one : )
>     
>     1. I don't think we need corresponding entries in `removeSlave()` because we do not track which agents reregistered, but rather a total number. If an agent reregistered and then got lost, this should not influence the allocation pause.
>     2. For the reason described in 1., `slaves.size() >= expectedAgentCount()` does not seem a good fit, because it counts removed agents in.
>     3. Logging is a good idea.

If we keep just a number of recovered agents, we cannot distinguish between “old” agents
from the registry and “new” ones that joined after recovery. Because we do not persist enough
information to base logical decisions on, any accounting algorithm here will be crude. Hence
we decided to go with Joris' suggestion: checking `slaves.size() >= expectedAgents` at least
expresses the intention.
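
To illustrate the check, here is a minimal sketch rather than the code from the diff. The class `RecoveringAllocator`, its members, and the log message are hypothetical names made up for this example; only the comparison `slaves.size() >= expectedAgents` comes from the discussion above. It assumes allocation stays paused after recovery until enough agents are currently registered.

```cpp
#include <cstddef>
#include <iostream>
#include <string>
#include <unordered_set>

// Hypothetical, simplified stand-in for the allocator's recovery state.
// It is not the actual Mesos hierarchical allocator.
class RecoveringAllocator {
public:
  // Assumption: allocation is paused only if some agents are expected.
  explicit RecoveringAllocator(std::size_t expectedAgents)
    : expectedAgents_(expectedAgents), paused_(expectedAgents > 0) {}

  void addSlave(const std::string& id) {
    slaves_.insert(id);
    maybeResume();
  }

  // No separate bookkeeping is needed here: removing an agent just
  // shrinks the set that the resume check looks at.
  void removeSlave(const std::string& id) { slaves_.erase(id); }

  bool paused() const { return paused_; }

private:
  void maybeResume() {
    // The check from the review: compare currently known agents
    // against the expected count, and log when the condition is met.
    if (paused_ && slaves_.size() >= expectedAgents_) {
      paused_ = false;
      std::cout << "Resuming allocation: " << slaves_.size()
                << " agents registered, " << expectedAgents_
                << " expected" << std::endl;
    }
  }

  std::unordered_set<std::string> slaves_;
  std::size_t expectedAgents_;
  bool paused_;
};
```

In this sketch an agent that registers and is then removed stops counting towards the threshold, and any newly joined agent counts just like a recovered one, which is the crudeness mentioned above.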


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40332/#review107607
-----------------------------------------------------------


On Nov. 30, 2015, 3:32 p.m., Alexander Rukletsov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40332/
> -----------------------------------------------------------
> 
> (Updated Nov. 30, 2015, 3:32 p.m.)
> 
> 
> Review request for mesos, Bernd Mathiske, Joerg Schad, Joris Van Remoortere, Joseph Wu, and Qian Zhang.
> 
> 
> Bugs: MESOS-3981
>     https://issues.apache.org/jira/browse/MESOS-3981
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> See summary.
> 
> 
> Diffs
> -----
> 
>   src/master/allocator/mesos/hierarchical.hpp 1cd8d16661568010901e74705375e7719cdfb8a0
>   src/master/allocator/mesos/hierarchical.cpp 31ed62efb5b1a2edb567f43d37559c5914e0665e
> 
> Diff: https://reviews.apache.org/r/40332/diff/
> 
> 
> Testing
> -------
> 
> make check (Mac OS X 10.10.4)
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>

