mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jie Yu <yujie....@gmail.com>
Subject Re: Review Request 63751: Triggered 'ApplyOfferOperationMessage' for agent local resources.
Date Wed, 15 Nov 2017 23:37:47 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63751/#review191114
-----------------------------------------------------------




src/slave/slave.cpp
Lines 3743 (patched)
<https://reviews.apache.org/r/63751/#comment268701>

    For old operations, I think we should apply the conversion (i.e., change `totalResources`)
in `addOfferOperation` becuase we speculatively assume that it'll succeeds.
    
    This makes sense for old operation that has a resource provider. The current code is buggy
because we don't update agent `totalResources` for those old operations.
    
    Then, you don't have to do this assignment here anymore. We still need to calculate the
checkpointed resources so that we can call the old checkpointResources handler.
    
    One unfortunate thing is that `checkpointResources` method will also set `totalResources`.
This is in fact a bug:
    
    ```
      // This is a sanity check to verify that the new checkpointed
      // resources are compatible with the agent resources specified
      // through the '--resources' command line flag. The resources
      // should be guaranteed compatible by the master.
      Try<Resources> _totalResources = applyCheckpointedResources(
          info.resources(),
          newCheckpointedResources);
    
      CHECK_SOME(_totalResources)
        << "Failed to apply checkpointed resources "
        << newCheckpointedResources << " to agent's resources "
        << info.resources();
    
      totalResources = _totalResources.get();
    ```
    
    Setting agent resources to `_totalResources.get()` will break the agent resource accounting
(if resource provider has resources). We might need to introduce a parameter to `checkpointResources`
to not update `totalResources`.



src/tests/persistent_volume_tests.cpp
Lines 94 (patched)
<https://reviews.apache.org/r/63751/#comment268707>

    Can you add a comment about what that `bool` represents? Ditto in ReservationTest



src/tests/persistent_volume_tests.cpp
Line 355 (original)
<https://reviews.apache.org/r/63751/#comment268712>

    Hum, i don't expect you removing those checks. Just checking the offer is not sufficient
because we speculate.
    
    We should figure out a way to test that the reservation or persistent volume has actually
been checkpointed.
    
    One way is to add a helper in the fixture:
    ```
    struct OperationMessage
    {
      // The resources in the message.
      // 1) For CheckpointResourcesMessage, it's the checkpointed resources.
      // 2) For ApplyOfferOperationMessage, it's the resources in the operation.
      Resources resources;
    };
    
    Future<OperationMessage> message1 = getOperationMessage(slave.get()->pid);
    
    ...
    
    AWAIT_READY(message1);
    EXPECT_TRUE(message1.resources.contains(volume1));
    
    AWAIT_READY(message2);
    EXPECT_TRUE(message2.resources.contains(volume2));
    ```
    
    Ditto elsewhere.


- Jie Yu


On Nov. 15, 2017, 4:06 p.m., Jan Schlicht wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63751/
> -----------------------------------------------------------
> 
> (Updated Nov. 15, 2017, 4:06 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Jie Yu.
> 
> 
> Bugs: MESOS-8211
>     https://issues.apache.org/jira/browse/MESOS-8211
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Agents the have the 'RESOURCE_PROVIDER' capability set, should get sent
> 'ApplyOfferOperationMessage' instead of 'CheckpointResourcesMessage'.
> The agent will then figure out how to apply the operation. For agent
> local resources the agent will checkpoint resources.
> 
> 
> Diffs
> -----
> 
>   src/common/protobuf_utils.hpp b2aa3654db2fe7d7d9d275ded81c6d54244654ee 
>   src/common/protobuf_utils.cpp 34054846f93f19ba550afe58e2a899d111ad38dc 
>   src/master/master.cpp 2ddd67ada3731803b00883b6a1f32b20c1bb238f 
>   src/slave/slave.cpp d8edc5e6bbfa265bca4d19bbaa7db3063949dbc0 
>   src/tests/persistent_volume_tests.cpp acfeac16884b00581a3523607ff26f44f6dca53a 
>   src/tests/reservation_tests.cpp 470f7341686e69d0a71fb234a26b277c45c29780 
> 
> 
> Diff: https://reviews.apache.org/r/63751/diff/4/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Jan Schlicht
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message