mesos-reviews mailing list archives

From Chun-Hung Hsiao <chhs...@apache.org>
Subject Re: Review Request 69163: Set agent and/or resource provider ID in operation status updates.
Date Tue, 20 Nov 2018 03:24:35 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69163/#review210695
-----------------------------------------------------------




src/master/master.cpp
Lines 8192-8201 (patched)
<https://reviews.apache.org/r/69163/#comment295419>

    We can get rid of this snippet and simply use `providerId`, or else validate that `resourceProviderId`
is always the same as `providerId`.
    
    We should probably validate in the resource provider manager that both resources and operations
have their resource provider ID set properly, and send an `ERROR` event back to the resource
provider instead of crashing the agent: https://github.com/apache/mesos/blob/master/src/resource_provider/manager.cpp#L907
    I created https://issues.apache.org/jira/browse/MESOS-9407 to capture this. No need to
address it in this patch.
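
    A minimal sketch of the kind of check meant here, assuming C++17 and simplified stand-in
types. `Resource`, `providerId`, and `validateProviderIds` are hypothetical names for
illustration, not the Mesos API. The idea is that the manager gets back an error description
it could wrap in an `ERROR` event, rather than hitting a `CHECK`:

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a resource carrying an optional provider ID.
struct Resource {
  std::optional<std::string> providerId;
};

// Returns an error description if any resource lacks the expected
// resource provider ID, or `std::nullopt` if all resources are valid.
// The caller would turn a returned error into an `ERROR` event for the
// resource provider instead of aborting the process.
std::optional<std::string> validateProviderIds(
    const std::string& expected,
    const std::vector<Resource>& resources)
{
  for (const Resource& resource : resources) {
    if (!resource.providerId.has_value()) {
      return "Resource is missing a resource provider ID";
    }

    if (*resource.providerId != expected) {
      return "Resource has resource provider ID '" + *resource.providerId +
             "' but expected '" + expected + "'";
    }
  }

  return std::nullopt;
}
```

    The same shape of check would apply to operations.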



src/master/master.cpp
Lines 8199 (patched)
<https://reviews.apache.org/r/69163/#comment295427>

    How about inlining this ternary operation into the use below so we can get rid of `resourceProviderId_`?
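
    To illustrate the suggestion, here is a sketch assuming C++17 and hypothetical stand-in
types (`ResourceProviderID`, `Update`, and `makeUpdate` below are not the actual code in the
patch). Both functions behave the same; the second just avoids the named temporary:

```cpp
#include <optional>
#include <string>

// Hypothetical stand-ins, not the Mesos types.
struct ResourceProviderID {
  std::string value;
};

struct Update {
  std::optional<ResourceProviderID> resourceProviderId;
};

Update makeUpdate(const std::optional<ResourceProviderID>& resourceProviderId)
{
  return Update{resourceProviderId};
}

// Before: the ternary result is stored in a named temporary.
Update withTemporary(bool hasProvider, const ResourceProviderID& id)
{
  std::optional<ResourceProviderID> resourceProviderId_ =
    hasProvider ? std::optional<ResourceProviderID>(id) : std::nullopt;

  return makeUpdate(resourceProviderId_);
}

// After: the ternary is inlined into its single use.
Update inlined(bool hasProvider, const ResourceProviderID& id)
{
  return makeUpdate(
      hasProvider ? std::optional<ResourceProviderID>(id) : std::nullopt);
}
```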



src/resource_provider/manager.cpp
Lines 887-888 (patched)
<https://reviews.apache.org/r/69163/#comment295431>

    It seems inconsistent that we include both the agent ID and the resource provider ID when
generating the `UpdateOperationStatusMessage` in SLRP, drop them because `Call::UpdateOperationStatus`
does not have corresponding fields, and then:
    1. Recover the resource provider ID in the RP manager, and
    2. Recover the agent ID in `Slave::handleResourceProviderMessage`.
    
    I was wondering whether it would be better to add `slave_id` and `resource_provider_id` to
`OperationStatus` instead, and then generate `UpdateOperationStatusMessage` from `OperationStatus`
for backward compatibility.
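
    Roughly, the alternative could look like the sketch below. It assumes C++17 and uses
heavily simplified stand-in structs rather than the real protobuf messages. The two optional
fields on `OperationStatus` are the proposed additions, and `toMessage` is a hypothetical
helper name:

```cpp
#include <optional>
#include <string>

// Simplified stand-in for the `OperationStatus` protobuf, with the two
// proposed fields added. The real message has more fields.
struct OperationStatus {
  std::string state;
  std::optional<std::string> slaveId;             // proposed `slave_id`
  std::optional<std::string> resourceProviderId;  // proposed `resource_provider_id`
};

// Simplified stand-in for `UpdateOperationStatusMessage`.
struct UpdateOperationStatusMessage {
  OperationStatus status;
  std::optional<std::string> slaveId;
  std::optional<std::string> resourceProviderId;
};

// Derives the message-level IDs from the status, so the status is the
// single source of truth and nothing has to be recovered downstream.
UpdateOperationStatusMessage toMessage(const OperationStatus& status)
{
  UpdateOperationStatusMessage message;
  message.status = status;
  message.slaveId = status.slaveId;
  message.resourceProviderId = status.resourceProviderId;
  return message;
}
```

    With this shape, the IDs set by SLRP would survive the trip through
`Call::UpdateOperationStatus` because they travel inside the status itself.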



src/resource_provider/storage/provider.cpp
Line 3079 (original), 3080-3081 (patched)
<https://reviews.apache.org/r/69163/#comment295426>

    We don't need to set the slave ID and resource provider ID here, since dropped operations
are neither tracked nor retried.



src/slave/slave.cpp
Lines 4386 (patched)
<https://reviews.apache.org/r/69163/#comment295428>

    How about inlining the ternary operation into its use below so we don't need this `resourceProviderId_`?



src/tests/master_tests.cpp
Lines 9036 (patched)
<https://reviews.apache.org/r/69163/#comment295432>

    We might want to backport this patch to 1.7.x. Can you split the tests into a separate
patch so we can minimize the backport?



src/tests/master_tests.cpp
Lines 9038 (patched)
<https://reviews.apache.org/r/69163/#comment295433>

    Since the resource provider manager is the one adding the resource provider ID, does it
make sense to move this test to `resource_provider_manager_tests.cpp`?
    
    If we decide to add `slave_id` and `resource_provider_id` to `OperationStatus`
and have the SLRP generate them, we should move this test to `storage_local_resource_provider_tests.cpp`.



src/tests/master_tests.cpp
Lines 9066-9070 (patched)
<https://reviews.apache.org/r/69163/#comment295429>

    Nit: they fit into a line. ;)


- Chun-Hung Hsiao


On Nov. 12, 2018, 8:49 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69163/
> -----------------------------------------------------------
> 
> (Updated Nov. 12, 2018, 8:49 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, and James DeFelice.
> 
> 
> Bugs: MESOS-9293
>     https://issues.apache.org/jira/browse/MESOS-9293
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This patch sets agent and/or resource provider ID in operation status
> update messages. This is not always possible, e.g., some operations
> might fail validation so that no corresponding IDs can be extracted.
> 
> Since operations failing validation are currently directly rejected by
> the master without going through a status update manager, they are not
> retried either. If a master status update manager for operations is
> introduced at a later point it should be possible to forward
> acknowledgements for updates to the master's update manager (no agent
> ID, no resource provider ID).
> 
> 
> Diffs
> -----
> 
>   src/common/protobuf_utils.hpp 1662125ed3e47b179ee32d08c1d3af75553066ba 
>   src/common/protobuf_utils.cpp a45607eed4c4bae5010bcc3f3ffeabd6d911062a 
>   src/master/master.cpp 1e326ec42a7f79a0835529a4655e7ec272f1cf40 
>   src/resource_provider/manager.cpp 6c81c430e9e1205d71982a7fa2bcd9aa15fc01b2 
>   src/resource_provider/storage/provider.cpp c137fa4f13edc58d93c03a9dd32fdf9d38b38316 
>   src/slave/slave.cpp 74f6fb9036a9ac4f587f53ec2df04eeb4c167bfb 
>   src/tests/master_tests.cpp ac6bf379c5906cf9612284911c121c9457f648a0 
> 
> 
> Diff: https://reviews.apache.org/r/69163/diff/3/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>

