mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Benjamin Mahler <bmah...@apache.org>
Subject Re: Review Request 70995: Fixed bugs in updating framework roles in the hierarchial allocator.
Date Wed, 03 Jul 2019 20:58:32 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70995/#review216356
-----------------------------------------------------------


Fix it, then Ship it!




Clean fix, thanks!

Looks to me like if we incrementally update `framework.roles`, then we have some clear code
to pull out into `addSubscribedRole(Framework*, role)` and `removeSubscribedRole(Framework*,
role)`, but we can do this later.


src/master/allocator/mesos/hierarchical.cpp
Lines 531 (patched)
<https://reviews.apache.org/r/70995/#comment303553>

    Maybe just:
    
    ```
      CHECK_EQ(framework.suppressedRoles, suppressedRoles);
    ```



src/master/allocator/mesos/hierarchical.cpp
Line 1397 (original), 1358 (patched)
<https://reviews.apache.org/r/70995/#comment303556>

    We should probably rename this to unsuppressRole (in a different change).



src/master/allocator/mesos/hierarchical.cpp
Lines 1361-1362 (patched)
<https://reviews.apache.org/r/70995/#comment303554>

    This is double logging and its an inconsistency between suppressRoles and unsuppressRoles?
    
    It's also saying "revived" but it's not clearing the filters? (which is part of revive)
    
    Per the last review, let's just log that we unsuppressed roles here.


- Benjamin Mahler


On July 2, 2019, 7:12 p.m., Andrei Sekretenko wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70995/
> -----------------------------------------------------------
> 
> (Updated July 2, 2019, 7:12 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Meng Zhu.
> 
> 
> Bugs: MESOS-9870
>     https://issues.apache.org/jira/browse/MESOS-9870
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This patch replaces the largest part of the logic around
> suppressing/unsuppressing framework roles in the 'updateFramework()'
> method with calling 'suppressRoles()'/'unsuppress()' methods, which
> are used for processing SUPPRESS/REVIVE calls.
> 
> This fixes bugs which are triggered by simulatneous updates of
> framework's roles and its suppressed roles (see MESOS-9870), namely:
>  - master crashes due to changing nonexistent role metrics
>  - master crash due to activating removed frameworkSorter
>  - spurious activation of a role added in suppressed state
> 
> 
> Diffs
> -----
> 
>   src/master/allocator/mesos/hierarchical.cpp 26aad6778f12b99bb87c846788d6b6d60f743d8a

> 
> 
> Diff: https://reviews.apache.org/r/70995/diff/1/
> 
> 
> Testing
> -------
> 
> make check + new tests in dependent patches
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message