mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Meng Zhu <m...@mesosphere.io>
Subject Re: Review Request 70283: Improved handling of resources consumed by orphan operations.
Date Sat, 23 Mar 2019 00:44:37 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70283/#review213938
-----------------------------------------------------------



As pointed out by Greg earlier, this patch violates the comment here: https://github.com/apache/mesos/blob/4580834471fb3bc0b95e2b96e04a63d34faef724/src/master/allocator/mesos/hierarchical.cpp#L769-L770
While I think the current code seems to work with this patch (except the comment mentioned
above), I think we should check with @Bbannier. In particular, the `TODO` here https://github.com/apache/mesos/blob/4580834471fb3bc0b95e2b96e04a63d34faef724/src/master/master.cpp#L8330-L8334:

```
            // TODO(bbannier): Consider introducing ways of making sure an agent
            // always knows the `FrameworkInfo` of operations triggered on its
            // resources, e.g., by adding an explicit `FrameworkInfo` to
            // operations like is already done for `RunTaskMessage`, see
            // MESOS-8582.

```
Preferably, it will be great if we can fix the above TODO and ensure frameworkInfo is always
available.


src/master/master.cpp
Line 8384 (original), 8371 (patched)
<https://reviews.apache.org/r/70283/#comment300050>

    can we separate out irrelevant changes?


- Meng Zhu


On March 22, 2019, 1:37 p.m., Gastón Kleiman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70283/
> -----------------------------------------------------------
> 
> (Updated March 22, 2019, 1:37 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Greg Mann, Joseph Wu, and Meng Zhu.
> 
> 
> Bugs: MESOS-9635
>     https://issues.apache.org/jira/browse/MESOS-9635
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This patch makes the master's `UpdateSlaveMessage` handler include
> resources consumed by orphan operations when calling
> `allocator->addResourceProvider()`.
> 
> The change prevents some races that lead to the master reoffering the
> resources consumed by the operations and makes the
> `OperationReconciliationTest.AgentPendingOperationAfterMasterFailover`
> test stable.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp 9c4a9e83da94535873d72c902835f229c4f96320 
> 
> 
> Diff: https://reviews.apache.org/r/70283/diff/1/
> 
> 
> Testing
> -------
> 
> `OperationReconciliationTest.AgentPendingOperationAfterMasterFailover` passed over 5000
iterations under stress. Other tests still pass on GNU/Linux.
> 
> 
> Thanks,
> 
> Gastón Kleiman
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message