mesos-reviews mailing list archives

From Benjamin Bannier <benjamin.bann...@mesosphere.io>
Subject Re: Review Request 44989: Fixed a race in the resource offers tests.
Date Fri, 18 Mar 2016 10:56:03 GMT


> On March 18, 2016, 12:16 a.m., Neil Conway wrote:
> > src/tests/resource_offers_tests.cpp, line 63
> > <https://reviews.apache.org/r/44989/diff/1/?file=1303623#file1303623line63>
> >
> >     Style-wise, do we want all tests to resume if they initially pause it? I think we do a mix of both.
> 
> Greg Mann wrote:
>     Hmmm good question. I know that I'm responsible for some instances where `resume` is *not* called at the end of the test, but now that you mention it, it does seem like a good idea. I've added it here. We should do a sweep and make this consistent across the tests; I could have a look next week.

Note that if an `ASSERT*` fails, a `Clock::resume` at the end of the test is never called, because a failed `ASSERT*` returns from the test body immediately.
Cleanup like this therefore has to be handled elsewhere, which we already do;
see https://github.com/apache/mesos/blob/master/src/tests/mesos.hpp#L984 or https://github.com/apache/mesos/blob/master/3rdparty/libprocess/src/tests/main.cpp#L59.
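
To illustrate the point, here is a minimal, self-contained sketch. It does not use the real libprocess `Clock` or gtest; `FakeClock`, `failingTestBody` and `runTest` are hypothetical names made up for this example, and the early `return` only stands in for what a failed `ASSERT*` does to the test body.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for a pausable global test clock. This is NOT the
// libprocess `Clock` API, only a model of its paused/resumed state.
struct FakeClock
{
  static bool& state() { static bool paused = false; return paused; }
  static void pause() { state() = true; }
  static void resume() { state() = false; }
  static bool paused() { return state(); }
};

// A test body that pauses the clock and then bails out early, the way a
// failed `ASSERT_*` returns from the enclosing function. The trailing
// `resume()` is never reached.
inline void failingTestBody()
{
  FakeClock::pause();

  const bool conditionHolds = false; // Simulated assertion failure.
  if (!conditionHolds) {
    return;
  }

  FakeClock::resume();
}

// Hypothetical harness hook (an assumption for this sketch): runs one test
// body and then restores the clock no matter how the body exited. Returns
// whether the body had left the clock paused.
inline bool runTest(const std::function<void()>& body)
{
  body();

  const bool leftPaused = FakeClock::paused();
  if (leftPaused) {
    FakeClock::resume();
  }

  assert(!FakeClock::paused()); // The next test starts with a running clock.
  return leftPaused;
}
```

Calling `runTest(failingTestBody)` reports that the body left the clock paused and hands back a resumed clock. A `resume` at the end of each test is then a matter of style only, as long as some hook of this kind runs after every test.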


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44989/#review124116
-----------------------------------------------------------


On March 18, 2016, 10:48 a.m., Greg Mann wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44989/
> -----------------------------------------------------------
> 
> (Updated March 18, 2016, 10:48 a.m.)
> 
> 
> Review request for mesos, Adam B and Joerg Schad.
> 
> 
> Bugs: MESOS-4849
>     https://issues.apache.org/jira/browse/MESOS-4849
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Fixed a race in the resource offers tests.
> 
> Adding HTTP credentials to `StartSlave` in 'src/tests/mesos.cpp' has exposed a race condition in ResourceOffersTest.ResourceOfferWithMultipleSlaves. The test quickly runs `StartSlave` 10 times to create 10 agents. Under the covers, `StartSlave` writes data to disk, and it seems that with the additional data being written to disk for HTTP credentials, the filesystem operations for one `StartSlave` call were not completing before the next call.
> 
> By settling the clock in between each invocation of `StartSlave`, this patch fixes the race. The test is slowed considerably, but it is now reliable.
> 
> 
> Diffs
> -----
> 
>   src/tests/resource_offers_tests.cpp 1cf292ee7931207596f8f06677386bef5965ef15 
> 
> Diff: https://reviews.apache.org/r/44989/diff/
> 
> 
> Testing
> -------
> 
> `GTEST_FILTER="ResourceOffersTest.ResourceOfferWithMultipleSlaves" bin/mesos-tests.sh --gtest_repeat=1000 --gtest_break_on_failure=1` was used to test on both OSX and Ubuntu 14.04.
> 
> 
> Thanks,
> 
> Greg Mann
> 
>

