mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alexander Rukletsov <ruklet...@gmail.com>
Subject Re: Review Request 63577: Fixed a task status update race in default executor tests.
Date Tue, 21 Nov 2017 17:35:52 GMT


> On Nov. 16, 2017, 11:45 a.m., Alexander Rukletsov wrote:
> > src/tests/default_executor_tests.cpp
> > Lines 926-928 (original), 1027-1029 (patched)
> > <https://reviews.apache.org/r/63577/diff/3/?file=1892762#file1892762line1167>
> >
> >     What happens if a scheduler gets `TASK_FAILED` status update before destruction?
Will the test fail with an "uninteresting mock function call"? Or there will be no failure
because the expectation are specific to the state and we just get a warning in the log? I
believe the latter, but could you please check this?
> 
> Qian Zhang wrote:
>     Can you please elaborate a bit about why the scheduler will get `TASK_FAILED` status
update?

We do not explicitly wait for those tasks to finish. Container can be destroyed and `TASK_FAILED`
can be sent to the scheduler shall it still exist. However, I think we will simply get a warning,
just wanted to make sure my understanding is correct.


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63577/#review191169
-----------------------------------------------------------


On Nov. 20, 2017, 2:46 a.m., Qian Zhang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63577/
> -----------------------------------------------------------
> 
> (Updated Nov. 20, 2017, 2:46 a.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov and Gaston Kleiman.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Previously in the test `DefaultExecutorTest.KillMultipleTasks` and
> `DefaultExecutorTest.KillTaskGroupOnTaskFailure`, when launching a
> task group which has multiple tasks, we expected the scheduler will
> receive all the TASK_STARTING status updates before receiving any
> TASK_RUNNING status updates. However this is not guaranteed, e.g.,
> it is possible for the scheduler to receive TASK_RUNNING for the
> first task before receiving TASK_STARTING for the second task.
> 
> So in this patch, we used `Sequence` to guarantee the order of
> TASK_STARTING and TASK_RUNNING for each task but do not care about
> the order between tasks.
> 
> The following 3 tests have their own solutions to handle this issue,
> in this patch, I updated them to use the above solution.
>   `DefaultExecutorTest.KillTask`
>   `DefaultExecutorTest.CommitSuicideOnKillTask`
>   `DefaultExecutorTest.ROOT_ContainerStatusForTask`
> 
> 
> Diffs
> -----
> 
>   src/tests/default_executor_tests.cpp 65150212ae36094325c4f7f12b6ef135cd953170 
>   src/tests/mesos.hpp 345b883a8c629bf5bed83e9236632c277f2eb0eb 
> 
> 
> Diff: https://reviews.apache.org/r/63577/diff/4/
> 
> 
> Testing
> -------
> 
> This patch touched 5 tests:
> 1. KillTask
> 2. KillMultipleTasks
> 3. KillTaskGroupOnTaskFailure
> 4. CommitSuicideOnKillTask
> 5. ROOT_ContainerStatusForTask
> 
> I ran each of them repeatedly (20 times), and all of them succeeded.
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message