mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Joseph Wu <jos...@mesosphere.io>
Subject Re: Review Request 67762: Added minimum capability check during master recovery.
Date Wed, 27 Feb 2019 02:04:34 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67762/#review213241
-----------------------------------------------------------



Commit description could use some extra context.  Not sure if you want to use another example
(instead of quota limits) to illustrate how the list of minimum capabilities should be updated.

Suggestion:
```
Upon recovery, the master will compare its own capabilities against the
list of strings read from the registry.  If the master is missing any
of the capabilities read from the registry, the master will refuse to
recover and try to provide remediation steps instead.

Modifications to the registry's list will be added in future, within
the logic for other RegistryOperations.  For example, if support for
Quota limit bursting (MESOS-8068) is added to a future master,
setting a quota limit should cause the UpdateQuota RegistryOperation
to add the appropriate minimum capability.  A RemoveQuota 
RegistryOperation would remove the minimum capability if there are no
other quota limits in use.

Also adds a dedicated test.
```


src/master/master.hpp
Lines 348 (patched)
<https://reviews.apache.org/r/67762/#comment299142>

    Nit: s/an hashset of missing minimum capabilities in string/the set of capabilities missing
from this master/



src/master/master.cpp
Lines 1680 (patched)
<https://reviews.apache.org/r/67762/#comment299144>

    A forced recovery doesn't really make sense, if we're trying to prevent undefined behavior.
 If a recovery must be forced, and lacking any other context, my recommendation would be purging
the registry and starting the cluster anew.



src/master/master.cpp
Lines 1682-1684 (patched)
<https://reviews.apache.org/r/67762/#comment299145>

    I wonder if we should start a document (even blank?) which lists each minimum capability,
some background, and how to recover from each.
    
    Using quota limits as an example, the recovery steps might be to launch the newer master,
and send operator calls modifying all quotas into a backwards compatible form.
    
    ---
    
    We can then link the document in the error message as a fallback.



src/master/master.cpp
Lines 1685-1687 (patched)
<https://reviews.apache.org/r/67762/#comment299146>

    Nit: Suggestion:
    ```
    << "Master is missing the following minimum capabilities: "
    << strings::join<hashset<string>>(", ", missingCapabilities)
    << ". See the following documentation for steps to safely "
    << "recover from this state: "
    << "http://mesos.apache.org/documentation/latest/compatibility";  // <-- Just
a possible link.
    ```


- Joseph Wu


On Feb. 20, 2019, 10:46 a.m., Meng Zhu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67762/
> -----------------------------------------------------------
> 
> (Updated Feb. 20, 2019, 10:46 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Joseph Wu.
> 
> 
> Bugs: MESOS-8880
>     https://issues.apache.org/jira/browse/MESOS-8880
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added minimum capability check during master registry
> recovery to guard against incompatible downgrade.
> See MESOS-8878.
> 
> Also added a dedicated test.
> 
> 
> Diffs
> -----
> 
>   include/mesos/mesos.proto 48f30b3f41f3440245c1912becc9c2c3d572aff9 
>   include/mesos/v1/mesos.proto e07dd9e42939fddcff5d15072a143d9c4c44dd3d 
>   src/master/master.hpp ccd117f607747d49e5259d9ba6645fed61811adf 
>   src/master/master.cpp 015da54583448a8d102d8e401e48bd228baf6dd6 
>   src/tests/master_tests.cpp 5ae8e1cea3ca87551093bd63d744ac807ac7797a 
> 
> 
> Diff: https://reviews.apache.org/r/67762/diff/4/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message