mesos-reviews mailing list archives

From Qian Zhang <zhq527...@gmail.com>
Subject Re: Review Request 71343: Fixed out-of-order processing of terminal status updates in agent.
Date Fri, 23 Aug 2019 15:11:26 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71343/#review217399
-----------------------------------------------------------




src/slave/slave.cpp
Lines 6137-6138 (patched)
<https://reviews.apache.org/r/71343/#comment304683>

    Nit: I'd suggest swapping these two lines.


- Qian Zhang


On Aug. 22, 2019, 1:53 a.m., Andrei Budnik wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71343/
> -----------------------------------------------------------
> 
> (Updated Aug. 22, 2019, 1:53 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, Greg Mann, and Qian Zhang.
> 
> 
> Bugs: MESOS-9887
>     https://issues.apache.org/jira/browse/MESOS-9887
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Previously, the Mesos agent could send a TASK_FAILED status update on
> executor termination while a TASK_FINISHED status update was still
> being processed. Processing a task status update involves sending
> requests to the containerizer (e.g. `MesosContainerizer::status`),
> which might finish processing these requests out of order. Also, the
> agent does not overwrite the status of a terminal status update once
> it is stored in `terminatedTasks`. Hence, there was a race condition
> between the two terminal status updates.
> 
> Note that V1 Executors are not affected by this problem because they
> wait for an acknowledgement of the terminal status update by the agent
> before terminating.
> 
> This patch introduces a new data structure, `pendingStatusUpdates`,
> which holds a list of the status updates that are currently being
> processed. This data structure lets the agent validate the order in
> which it processes status updates.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp a17bbee13cb8291ad694f1520b613764b57b046b 
>   src/slave/slave.cpp 1d0ec9d2428c3ffa28ad3e960b74f171013cf0c2 
> 
> 
> Diff: https://reviews.apache.org/r/71343/diff/2/
> 
> 
> Testing
> -------
> 
> 1. manual testing described in MESOS-9887
> 2. internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>
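
The ordering check described in the quoted description can be sketched roughly as follows. This is a minimal illustration only, not the code from the patch: the `StatusUpdate` struct, the `PendingStatusUpdates` class and its `enqueue`/`isNext`/`remove` methods are all hypothetical names, and plain strings stand in for the Mesos task ID and UUID types. The idea shown is just that updates are recorded in arrival order before any asynchronous containerizer request is made, so a completion that comes back early can be recognized as out of order.

```cpp
#include <deque>
#include <map>
#include <string>

// Hypothetical, simplified stand-in for a status update; not a Mesos type.
struct StatusUpdate {
  std::string uuid;   // identifies this particular update
  std::string state;  // e.g. "TASK_FINISHED" or "TASK_FAILED"
};

// Hypothetical tracker of in-flight status updates, kept per task in
// arrival order. It only illustrates the ordering check.
class PendingStatusUpdates {
public:
  // Record an update when it arrives, before asynchronous work starts.
  void enqueue(const std::string& taskId, const StatusUpdate& update) {
    pending_[taskId].push_back(update);
  }

  // True if `uuid` is the oldest update still in flight for the task,
  // i.e. finishing it now would respect arrival order.
  bool isNext(const std::string& taskId, const std::string& uuid) const {
    auto it = pending_.find(taskId);
    return it != pending_.end() && !it->second.empty() &&
           it->second.front().uuid == uuid;
  }

  // Drop an update once its processing has finished.
  void remove(const std::string& taskId, const std::string& uuid) {
    auto it = pending_.find(taskId);
    if (it == pending_.end()) {
      return;
    }
    auto& queue = it->second;
    for (auto update = queue.begin(); update != queue.end(); ++update) {
      if (update->uuid == uuid) {
        queue.erase(update);
        break;
      }
    }
    if (queue.empty()) {
      pending_.erase(it);
    }
  }

private:
  std::map<std::string, std::deque<StatusUpdate>> pending_;
};
```

With this sketch, if a TASK_FINISHED update is enqueued for a task and a TASK_FAILED update follows it, `isNext` returns false for the TASK_FAILED update until the TASK_FINISHED one has been removed, even if the containerizer request for the later update happens to complete first.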

