mesos-reviews mailing list archives

From Shuai Lin <linshuai2...@gmail.com>
Subject Re: Review Request 43321: Speeded up SchedulerTest.Decline by advancing the clock.
Date Fri, 12 Feb 2016 12:15:46 GMT


> On Feb. 9, 2016, 2:29 p.m., Guangya Liu wrote:
> > src/tests/scheduler_tests.cpp, line 839
> > <https://reviews.apache.org/r/43321/diff/1/?file=1237111#file1237111line839>
> >
> >     Do we need `Clock::settle()` here to make sure the `recoverResources` message is dispatched and processed completely?
> 
> haosdent huang wrote:
>     +1 for adding settle
> 
> Shuai Lin wrote:
>     Hello haosdent and Guangya,
>     
>     I don't think `Clock::settle()` is needed here.
>     
>     I guess your rationale is that we need to be sure the decline call is processed *before* the next round of allocation is executed, which I totally agree with. But we already have that guarantee without `Clock::settle()`. Here is my understanding:
>     
>     - We advance the clock after the dispatch event for `recoverResources` is enqueued. By advancing the clock, we can be sure that the `HierarchicalAllocatorProcess::batch()` function, which does the allocation work, is added to the event loop.
>     
>     - Since `recoverResources` is dispatched before `HierarchicalAllocatorProcess::batch()`, it is always processed first by the allocator.
>     
>     What do you think?
> 
> Guangya Liu wrote:
>     I think we cannot make sure that `recoverResources` is always handled before `HierarchicalAllocatorProcess::batch()` in some race condition cases; you may notice that most of the `advance()` calls are followed by `Clock::settle()`.
> 
> Shuai Lin wrote:
>     Hello Guangya, could you please elaborate on the "race condition cases"? IMHO libprocess guarantees that, for a given actor, the first enqueued event is also handled first, no?
> 
> Guangya Liu wrote:
>     My understanding is that the `decline` involves two libprocess calls, `decline` and `recoverResources`. There might be a problem if `HierarchicalAllocatorProcess::batch()` is called after `decline` but before `recoverResources`. Comments?
> 
> Shuai Lin wrote:
>     Let's inspect the behavior of the allocator actor:
>     
>     - When `AWAIT_READY(recoverResources)` returns, we can be sure that a dispatch event of `HierarchicalAllocatorProcess::recoverResources` for the allocator actor is already in the run queue.
>     - After that we advance the clock so that a dispatch event of `HierarchicalAllocatorProcess::batch()` is enqueued immediately (instead of waiting for a duration of `flags.allocation_interval`).
>     - Since `HierarchicalAllocatorProcess::recoverResources` is enqueued first, we can be sure it is called before `HierarchicalAllocatorProcess::batch()` is called.
> 
> Guangya Liu wrote:
>     In a multi-core environment, is it possible for `HierarchicalAllocatorProcess::recoverResources` and `HierarchicalAllocatorProcess::batch()` to be processed at almost the same time?

After a discussion on IRC with @gyliu, we agreed that the race condition is not possible (thus
`Clock::settle()` is not needed here) because libprocess guarantees that only one event is handled
at a time for any actor.

https://github.com/apache/mesos/blob/0.27.0/3rdparty/libprocess/README.md#processes-and-the-asynchronous-pimpl-pattern
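
To illustrate the ordering argument, here is a minimal sketch of a toy actor. It is not libprocess code: `ToyActor`, `dispatch()` and `handledOrder()` are hypothetical names made up for this example, and the model only assumes what was stated above, namely one FIFO mailbox per actor and one event handled at a time. Under those assumptions, an event enqueued first is always handled first, no matter how many cores are available.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical toy actor (NOT libprocess): a single FIFO mailbox that is
// drained by exactly one worker thread, so events never run concurrently.
class ToyActor {
public:
  ToyActor() : worker_([this] { run(); }) {}

  // Drains every event that is still queued, then stops the worker.
  ~ToyActor() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      done_ = true;
    }
    cv_.notify_one();
    worker_.join();
  }

  // Enqueues an event at the tail of the mailbox.
  void dispatch(std::function<void()> event) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push_back(std::move(event));
    }
    cv_.notify_one();
  }

private:
  void run() {
    for (;;) {
      std::function<void()> event;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return done_ || !queue_.empty(); });
        if (queue_.empty()) {
          return;  // Shutdown was requested and the mailbox is drained.
        }
        event = std::move(queue_.front());
        queue_.pop_front();
      }
      event();  // Handled outside the lock, one event at a time.
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::deque<std::function<void()>> queue_;
  bool done_ = false;
  std::thread worker_;  // Declared last so it starts after the other members.
};

// Mirrors the test scenario: "recoverResources" is enqueued first, then
// "batch" (which the clock advance would trigger in the real test).
std::vector<std::string> handledOrder() {
  std::vector<std::string> order;
  {
    ToyActor allocator;
    allocator.dispatch([&order] { order.push_back("recoverResources"); });
    allocator.dispatch([&order] { order.push_back("batch"); });
  }  // The destructor drains the mailbox and joins the worker.
  return order;
}
```

Calling `handledOrder()` returns the two names in the order they were dispatched, because only the single worker thread ever takes events off the mailbox.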


- Shuai


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43321/#review118391
-----------------------------------------------------------


On Feb. 12, 2016, 12:15 p.m., Shuai Lin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43321/
> -----------------------------------------------------------
> 
> (Updated Feb. 12, 2016, 12:15 p.m.)
> 
> 
> Review request for mesos and Alexander Rukletsov.
> 
> 
> Bugs: MESOS-4175
>     https://issues.apache.org/jira/browse/MESOS-4175
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Speeded up SchedulerTest.Decline by advancing the clock.
> 
> 
> Diffs
> -----
> 
>   src/tests/scheduler_tests.cpp 37f1709 
> 
> Diff: https://reviews.apache.org/r/43321/diff/
> 
> 
> Testing
> -------
> 
> sudo make check -j2 GTEST_FILTER='ContentType/SchedulerTest.Decline/*'
> 
> ```sh
> [ RUN      ] ContentType/SchedulerTest.Decline/0
> [       OK ] ContentType/SchedulerTest.Decline/0 (114 ms)
> [ RUN      ] ContentType/SchedulerTest.Decline/1
> [       OK ] ContentType/SchedulerTest.Decline/1 (98 ms)
> ```
> 
> 
> Thanks,
> 
> Shuai Lin
> 
>

