mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jie Yu" <yujie....@gmail.com>
Subject Re: Review Request 35702: Added /reserve HTTP endpoint to the master.
Date Wed, 05 Aug 2015 05:46:26 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review94154
-----------------------------------------------------------

Ship it!


LGTM overall. Please address the remaining issues and commit yourself!


src/master/http.cpp (line 475)
<https://reviews.apache.org/r/35702/#comment148677>

    We typically use leading undescore for temp variables. The tailing underscore is for class
members (following google style).
    
    In fact, I think the temp variable here is not necessary. There are only two places where
this temp variable is used. I would rather use 'values.get().at("...")', but this is up to
you.



src/master/http.cpp (line 498)
<https://reviews.apache.org/r/35702/#comment148678>

    Do you need this temp variable. Looks like you can just do
    ```
    foreach (.. value, parse.get().values) {...
    ```



src/master/http.cpp (line 511)
<https://reviews.apache.org/r/35702/#comment148679>

    Kill this line.



src/master/http.cpp (line 533)
<https://reviews.apache.org/r/35702/#comment148685>

    What does 'recovered' mean and what does remaining mean? It'll be great if you comment
on that to improve readability.
    
    For example,
    
    ```
    // The resources recovered by recinding outstanding offers.
    ...
    // The unreserved resources needed to satify the RESERVE operation.
    ```
    
    IIUC, the variable 'remaining' is only used for optimization, right? Could you please
mention that in the comments that keep this variable is for optimization (i.e., avoid rescinding
unnecessary offers).



src/master/http.cpp (line 534)
<https://reviews.apache.org/r/35702/#comment148683>

    I don't like the name 'flatten' :(
    
    Could you at least be more explicit about it (i.e., emphasize that 'remaining' only has
unreserved resources). 
    
    ```
    Resources remaining = resources.flatten('*');
    ```



src/master/http.cpp (line 553)
<https://reviews.apache.org/r/35702/#comment148686>

    you win the race because the default filter refuse_sec is 5 seconds? Worth mentioning
that in the comment.



src/master/http.cpp (line 573)
<https://reviews.apache.org/r/35702/#comment148727>

    What is 'Nothing' here?



src/master/master.cpp (line 5472)
<https://reviews.apache.org/r/35702/#comment148729>

    The name sounds weired. You are passing in an offer operation but the function name is
called 'applyResourceOperation'.
    
    I would suggest we create two 'Master::apply' overloads and don't worry about the code
duplication.
    
    ```
    void apply(framework, slave, opeartion);
    Future<Nothing> apply(slave, operation);
    ```


- Jie Yu


On July 31, 2015, 9:56 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 31, 2015, 9:56 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere,
and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches
and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation
Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from
the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must
not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources
will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to
itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many
offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(...,
None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy
the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic
and forward the request to the allocator since there may be available resources to satisfy
the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`.
This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers
as possible.
> The challenges of implementing the ideal solution in the current state is described in
the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3772e39015a22655dcad00ad844dc5ddc90db43f 
>   src/master/master.hpp ea18c4e0bb0743747401b9cd5ea14ae9b56ae3cc 
>   src/master/master.cpp 351a3c2b5f551ad065682cea601d2436258e4544 
>   src/master/validation.hpp 43b8d84556e7f0a891dddf6185bbce7ca50b360a 
>   src/master/validation.cpp ffb7bf07b8a40d6e14f922eabcf46045462498b5 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message