mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alexander Rukletsov <ruklet...@gmail.com>
Subject Re: Review Request 63589: Fixed bug in tests leading to orphaned containers.
Date Thu, 09 Nov 2017 17:39:53 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63589/#review190610
-----------------------------------------------------------




src/tests/slave_recovery_tests.cpp
Line 1167 (original), 1167 (patched)
<https://reviews.apache.org/r/63589/#comment268099>

    Do a `Clock::settle()` please.



src/tests/slave_recovery_tests.cpp
Lines 1190-1192 (original), 1188-1190 (patched)
<https://reviews.apache.org/r/63589/#comment268112>

    It is a good habit to resume the clock before exiting the test. Please do it before stopping
the driver. Here and everywhere where applicable



src/tests/slave_recovery_tests.cpp
Lines 1191-1194 (original), 1189-1192 (patched)
<https://reviews.apache.org/r/63589/#comment268098>

    A good habit is to resume the clock before exiting the test.



src/tests/slave_recovery_tests.cpp
Lines 1277-1279 (original), 1275-1277 (patched)
<https://reviews.apache.org/r/63589/#comment268108>

    Add `Clock::settle` in-between to make sure the message is processed. Here and below where
applies.



src/tests/slave_recovery_tests.cpp
Lines 1303-1305 (original), 1299-1301 (patched)
<https://reviews.apache.org/r/63589/#comment268102>

    And let's resume the clock before stopping the driver.



src/tests/slave_recovery_tests.cpp
Lines 1394-1399 (original), 1388-1393 (patched)
<https://reviews.apache.org/r/63589/#comment268107>

    You need to add `Clock::settle()` to make sure the comment is true.



src/tests/slave_recovery_tests.cpp
Lines 1465-1474 (original), 1454-1470 (patched)
<https://reviews.apache.org/r/63589/#comment268118>

    If I understand correctly, what you need is a confirmation, that the task has launched
and the ack has been acked. To achieve that, you don't really need `statusStarting` and `statusRunning`.



src/tests/slave_recovery_tests.cpp
Lines 1992-1994 (original), 1997-1999 (patched)
<https://reviews.apache.org/r/63589/#comment268119>

    Aaand resume the clock : )



src/tests/slave_recovery_tests.cpp
Lines 2101-2103 (original), 2119-2121 (patched)
<https://reviews.apache.org/r/63589/#comment268120>

    Please resume


- Alexander Rukletsov


On Nov. 6, 2017, 6:41 p.m., Andrei Budnik wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63589/
> -----------------------------------------------------------
> 
> (Updated Nov. 6, 2017, 6:41 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Benno Evers, and Gilbert Song.
> 
> 
> Bugs: MESOS-7506
>     https://issues.apache.org/jira/browse/MESOS-7506
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Previously, some tests tried to advance the clock until task status
> update was sent, while task's container was destroying. Container
> destruction consists of multiple steps, where some steps have a timeout
> specified, e.g. `cgroups::DESTROY_TIMEOUT`. So, there was a race
> between container destruction process and the loop that advanced the
> clock, leading to the following outcomes:
> 
>   (1) Container destroyed, before clock advancing reaches timeout.
> 
>   (2) Triggered timeout due to clock advancing, before container
>       destruction completes. That results in leaving orphaned
>       containers that will be detected by Slave destructor in
>       `tests/cluster.cpp`, so the test will fail.
> 
> This change gets rid of the loop and resumes clock after a single
> advancing of the clock.
> 
> 
> Diffs
> -----
> 
>   src/tests/slave_recovery_tests.cpp db337ba4e213820e7ad0c3f1b480388a2e456556 
>   src/tests/slave_tests.cpp 9c2db7adf45af4209ecc475d938ce4b77c5a3ab4 
> 
> 
> Diff: https://reviews.apache.org/r/63589/diff/1/
> 
> 
> Testing
> -------
> 
> 1. make check
> 2. internal ci (5x)
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message