mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Neil Conway <neil.con...@gmail.com>
Subject Re: Review Request 54889: Handled all possible offers in test.
Date Wed, 21 Dec 2016 15:39:08 GMT


> On Dec. 20, 2016, 4:41 p.m., Neil Conway wrote:
> > Can we add an expectation for exactly when we expect to receive the offer? i.e.,
another `WillOnce(FutureSatisfy(...))` and then wait for the future at the appropriate time.
> > 
> > The thing I don't like about the `WillRepeatedly(...)` pattern is that it obscures
whether additional offers are actually expected or not. Lots of tests have copied this pattern
and use it, even when another offer is not actually expected.
> 
> Benjamin Bannier wrote:
>     This test only uses a single offer, and we do not know when or even if a second offer
would arrive since the test does not run with a paused clock, https://github.com/apache/mesos/blob/1e72605e9892eb4e518442ab9c1fe2a1a1696748/src/tests/fault_tolerance_tests.cpp#L853.
If execution on the thread running in the test body is delayed sufficiently after the clock
is resumed, additional offers might be made concurrently with that part of the test body.
>     
>     It looks to me that as a general rule of thumb, if tests do not pause the clock,
one should always ignore subsequent calls to mock interfaces in order to avoid this gmock
assertion.

Right, although there are plenty of other situations in which _arbitrary_ delays in tests
will cause failures (e.g., exceeding the registry read/write timeouts, or even causing "sleep
60" dummy tasks to exit). So I'm not sure that this problem is unique to making offers.

It would be good to explore pausing the clock by default for all tests (MESOS-4101). But in
the short term, what about making the default allocation interval much higher for tests? Say
60 seconds. Test code should generally not be depending on batch allocations anyway (because
it needlessly slows down the test), so this would make such poorly written tests easier to
find.


- Neil


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54889/#review159727
-----------------------------------------------------------


On Dec. 20, 2016, 10:15 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54889/
> -----------------------------------------------------------
> 
> (Updated Dec. 20, 2016, 10:15 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov and Neil Conway.
> 
> 
> Bugs: MESOS-6820
>     https://issues.apache.org/jira/browse/MESOS-6820
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The test FaultToleranceTest.FrameworkReregister observes a single
> offer in order to monitor progress and as a trigger to initiate other
> actions later in the test. The change installs expectations for
> possible additional offers. This allows for the thread running the
> test body to be delayed with respect to the master thread which could
> in the meantime make additional offers.
> 
> 
> Diffs
> -----
> 
>   src/tests/fault_tolerance_tests.cpp 05937a917a2c175aa53b52488febb7cfd8400a13 
> 
> Diff: https://reviews.apache.org/r/54889/diff/
> 
> 
> Testing
> -------
> 
> `make check` (OS X, clang-trunk, w/ optimizations, SSL)
> 
> Before this fix `FaultToleranceTest.FrameworkReregister` failed for me multiple times
in ~10k iterations; after this patch it didn't fail in a single run I stopped after 170k iterations.
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message