mesos-reviews mailing list archives

From "Alexander Rukletsov" <ruklet...@gmail.com>
Subject Re: Review Request 35702: Added /reserve HTTP endpoint to the master.
Date Mon, 13 Jul 2015 16:46:20 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91472
-----------------------------------------------------------


A high level question: do you think rescinding offers is a big deal for now?


src/master/http.cpp (line 507)
<https://reviews.apache.org/r/35702/#comment144877>

    The code up to this line is basically request validation and authorization. Though it's
not how we do it now, do you think it makes sense to split the function into smaller logical
parts?
    
    How about something like this:
    
    ```
    Future<Response> Master::Http::reserve(const Request& request) const
    {
      // Entry point: hand the request to the first stage.
      return reserveValidate(request);
    }
    
    Future<Response> Master::Http::reserveValidate(const Request& request) const
    {
      <...>
      // Validation passed: continue with authorization.
      return reserveAuthorize(request);
    }
    
    <...>
    ```



src/master/http.cpp (lines 515 - 516)
<https://reviews.apache.org/r/35702/#comment144878>

    It looks like we actually have the role, but it's buried in the resources. Do you envision
a resources collection with various roles in one request? Maybe it makes sense to add
a validation step which ensures there is just one role per request and use that role here, which
also avoids changes to the `validate()` function.
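
    A minimal sketch of such a check, not the actual Mesos code. It assumes a simplified
`Resource` struct in place of the protobuf and `std::optional` in place of stout's `Option`;
`singleRole()` and `demo()` are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the Mesos `Resource` protobuf; only the
// fields needed for this illustration are modeled.
struct Resource
{
  std::string name;
  std::string role;
};

// Returns the role shared by all resources, or std::nullopt if the
// collection is empty or mixes roles. (Assumed helper, not an existing
// Mesos function.)
std::optional<std::string> singleRole(const std::vector<Resource>& resources)
{
  if (resources.empty()) {
    return std::nullopt;
  }

  const std::string& role = resources.front().role;
  for (const Resource& resource : resources) {
    if (resource.role != role) {
      return std::nullopt;
    }
  }

  return role;
}

// Small usage example: a request mixing roles is rejected.
inline void demo()
{
  const std::vector<Resource> mixed = {{"cpus", "ads"}, {"mem", "web"}};
  assert(!singleRole(mixed).has_value());
}
```

    With something like this, the endpoint could reject a mixed-role request early and pass
the single role on to authorization.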



src/master/http.cpp (lines 523 - 524)
<https://reviews.apache.org/r/35702/#comment144880>

    Let's leave a comment here saying that we do want to defer the decision completely to the allocator,
but cannot do so currently because offers are issued and handled by the master.



src/master/http.cpp (lines 541 - 545)
<https://reviews.apache.org/r/35702/#comment144879>

    We basically defer the decision on whether the request can be granted to the allocator
(up to rescinding). Let's capture that in a comment!



src/master/master.hpp (line 962)
<https://reviews.apache.org/r/35702/#comment144851>

    I see that you extracted this function for code reuse, but let's document it. You may
want to add a comment for `applyResourceOperation()` and update the comment for `applyOfferOperation()`
as well.


- Alexander Rukletsov


On June 28, 2015, 8:36 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated June 28, 2015, 8:36 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated. I've captured the various approaches, along with their limitations and deal-breakers, here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assumes that what seem like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs. the master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after rescinding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or maybe `Conflict`, as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as few offers as possible.
> The challenges of implementing the ideal solution in the current state are described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 350383362311cfbc830965e1155a8515f0dfb332 
>   src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
>   src/master/master.cpp 0782b543b451921d2240958c7ef612a9e30972df 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>
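
The greedy rescinding step from the quoted description can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the code under review: resources are reduced to a single integer quantity, and `Offer`, `RescindPlan`, and `planRescinds()` are hypothetical names that do not exist in Mesos.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical simplification: an offer holds one scalar resource.
struct Offer
{
  std::string id;
  int cpus;
};

struct RescindPlan
{
  std::vector<std::string> rescinded;  // Offer IDs rescinded, in order.
  int recovered = 0;                   // Total quantity recovered so far.
  bool satisfied = false;              // True if recovered >= required.
};

// Rescinds offers one at a time, in the given order, until enough
// resources have been recovered or no offers remain.
RescindPlan planRescinds(int required, const std::vector<Offer>& offers)
{
  assert(required >= 0);

  RescindPlan plan;
  for (const Offer& offer : offers) {
    if (plan.recovered >= required) {
      break;  // Enough recovered; leave the remaining offers alone.
    }
    plan.rescinded.push_back(offer.id);
    plan.recovered += offer.cpus;
  }

  // If this is still false, the description forwards the request to the
  // allocator anyway, in case it has unoffered resources available.
  plan.satisfied = plan.recovered >= required;
  return plan;
}
```

Because offers are taken in whatever order they arrive, this can rescind more than strictly necessary, which matches the "not ideal" caveat in the description.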

