mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jie Yu <yujie....@gmail.com>
Subject Re: Review Request 63107: Added operation feedback for storage operations.
Date Thu, 19 Oct 2017 01:08:46 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63107/#review188618
-----------------------------------------------------------



Not a complete review yet. Will take look again once existing issues are addressed.


src/master/master.cpp
Line 4524 (original), 4526 (patched)
<https://reviews.apache.org/r/63107/#comment265621>

    We probably needs to send `OfferOperation` too here?



src/master/master.cpp
Lines 5103-5107 (original), 5105-5109 (patched)
<https://reviews.apache.org/r/63107/#comment265615>

    We shouldn't use `Resources::apply` here. `apply` to me is to apply a conversion. This
is basically consume resources! Follow your logic, why we don't do anything for LAUNCH? is
it consuming resources too?
    
    The following `slave->apply` is problematic. It'll reduce slave's `totalResources`?



src/master/master.cpp
Line 5111 (original), 5113 (patched)
<https://reviews.apache.org/r/63107/#comment265616>

    Again, I won't call it "Applying", i would call it "Processing"



src/master/master.cpp
Lines 5126 (patched)
<https://reviews.apache.org/r/63107/#comment265619>

    I think putting operations in each Framework struct makes more sense. It'll help us to
see all pending operations from a given framework in state.json or in UI in the future.
    
    I'd suggest we have a `Framework::addOfferOperation(...)` method.
    
    ```
    Framework::addOfferOperation(
        const Offer.Operation& info,
        const string& uuid)
    ```



src/master/master.cpp
Line 5157 (original), 5161 (patched)
<https://reviews.apache.org/r/63107/#comment265613>

    This will update `checkpointedResources`, even if this operation is for a resource provider
provided resources. Do we want to do that? I think `checkpointedResources` should be limited
to agent default resources (no resource provider).



src/master/master.cpp
Lines 7080 (patched)
<https://reviews.apache.org/r/63107/#comment265622>

    Do we want to set this if it's old operations like RESERVE?



src/master/master.cpp
Lines 7083 (patched)
<https://reviews.apache.org/r/63107/#comment265624>

    slave->totalResources should not change (just like launching a task does not change
slave's total).
    
    you should do a Resource::apply here for new operations with additional `converted_resources`:
    
    ```
    Resources::apply(
        const Offer.Operation& operation,
        const Option<Resources>& converted_resources = None())
    ```



src/master/master.cpp
Lines 7085 (patched)
<https://reviews.apache.org/r/63107/#comment265625>

    Let's make sure `checkpointedResources` is only for agent default resources. We only need
that because the new master old agent case (depends on agent capabilities, we'll still need
to send CheckpointResourcesMessage



src/master/master.cpp
Lines 7088-7089 (patched)
<https://reviews.apache.org/r/63107/#comment265627>

    This shouldn't be necessary.



src/resource_provider/manager.cpp
Lines 318-324 (original), 319-325 (patched)
<https://reviews.apache.org/r/63107/#comment265628>

    Hum, shouldn't we have a validation function for this? I think a single operation cannot
operate on resources from different providers.



src/resource_provider/manager.cpp
Lines 329-332 (original), 330-333 (patched)
<https://reviews.apache.org/r/63107/#comment265629>

    We probably need to maintain state for resource providers. What if a resource provider
is temporarily disconnected?
    
    Chatted with CHun on this yesterday. I think it makes sense to decouple RP regisration
from RP reporting resources.
    
    As a result, each RP will have `CONNECTED` as well as `READY` state.
    
    CONNECTED means that the RP subscribed, but hasn't reported any resources yet. Operation
should be dropped in that case.



src/resource_provider/manager.cpp
Line 330 (original), 331 (patched)
<https://reviews.apache.org/r/63107/#comment265630>

    We need to log any operation dropping in the log.



src/resource_provider/manager.cpp
Lines 344-345 (original), 345-346 (patched)
<https://reviews.apache.org/r/63107/#comment265631>

    This is the same as dropping an operation. Please log the operation.



src/resource_provider/message.hpp
Line 36 (original), 37 (patched)
<https://reviews.apache.org/r/63107/#comment265633>

    Not yours, this should probably be `UpdateState`



src/resource_provider/message.hpp
Lines 46 (patched)
<https://reviews.apache.org/r/63107/#comment265634>

    I'd call it `UpdateOfferOperationStatus`


- Jie Yu


On Oct. 18, 2017, 2:39 p.m., Jan Schlicht wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63107/
> -----------------------------------------------------------
> 
> (Updated Oct. 18, 2017, 2:39 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Jie Yu.
> 
> 
> Bugs: MESOS-8087
>     https://issues.apache.org/jira/browse/MESOS-8087
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> When a framework accepts an offer that contains resource provider
> resources with a storage operation (`CREATE_VOLUME`, `DESTROY_VOLUME`,
> `CREATE_BLOCK`, `DESTROY_BLOCK`), the result of this operation cannot
> be guessed and will only be known after the operation has been
> successfully applied by the resource provider.
> This patch introduces the necessary handling for such operations. The
> internal bookkeeping of available resources in the master and agent has
> been updated to update resources only after operation feedback has been
> received. This ensures that converted resources can only be offered
> after the operation was executed by a resource provider.
> 
> 
> Diffs
> -----
> 
>   src/master/master.hpp 0ddc98259f64b3921d08f5f4ec81543bb0826378 
>   src/master/master.cpp 3603878f02ae3dba82811a4a5770dd21ec790ef6 
>   src/resource_provider/manager.cpp 31fcb789f5ab907511e868c374c49f7457a33ed3 
>   src/resource_provider/message.hpp 3c7c3f2baeb726e04edd6ffbb9784699d7afe521 
>   src/slave/slave.hpp aea1e948209c7c8945665915bc2f6d8eb47814ef 
>   src/slave/slave.cpp 4d7dc8e9a3901b00103031e24e5d6328d0f2e2ad 
> 
> 
> Diff: https://reviews.apache.org/r/63107/diff/1/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Jan Schlicht
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message