mesos-reviews mailing list archives

From Benjamin Bannier <benjamin.bann...@mesosphere.io>
Subject Re: Review Request 69163: Set agent and/or resource provider ID in operation status updates.
Date Tue, 27 Nov 2018 18:04:09 GMT


> On Nov. 20, 2018, 4:24 a.m., Chun-Hung Hsiao wrote:
> > src/master/master.cpp
> > Lines 8192-8201 (patched)
> > <https://reviews.apache.org/r/69163/diff/3/?file=2106996#file2106996line8192>
> >
> >     We can get rid of this snippet and simply use `providerId`. Or, validate that `resourceProviderId` is always the same as `providerId`.
> >     
> >     We should probably validate that both resources and operations have their resource provider id set properly in the resource provider manager, and issue an `ERROR` event back to the resource provider, instead of crashing the agent: https://github.com/apache/mesos/blob/master/src/resource_provider/manager.cpp#L907
> >     I created https://issues.apache.org/jira/browse/MESOS-9407 to capture this. No need to address it in this patch.

Dropping since this is not an issue anymore in the updated code.


> On Nov. 20, 2018, 4:24 a.m., Chun-Hung Hsiao wrote:
> > src/master/master.cpp
> > Lines 8199 (patched)
> > <https://reviews.apache.org/r/69163/diff/3/?file=2106996#file2106996line8199>
> >
> >     How about inlining this ternary operation into the use below so we can get rid of `resourceProviderId_`?

Ditto.


> On Nov. 20, 2018, 4:24 a.m., Chun-Hung Hsiao wrote:
> > src/resource_provider/manager.cpp
> > Lines 887-888 (patched)
> > <https://reviews.apache.org/r/69163/diff/3/?file=2106997#file2106997line887>
> >
> >     It seems not cohesive that we include both the agent ID and the resource provider ID when generating the `UpdateOperationStatusMessage` in SLRP, but drop them because `Call::UpdateOperationStatus` does not have corresponding fields, then:
> >     1. Recover the resource provider ID in the RP manager, and
> >     2. Recover the agent ID in `Slave::handleResourceProviderMessage`.
> >     
> >     I was wondering if it is a good idea to add `slave_id` and `resource_provider_id` in `OperationStatus` instead, then generate `UpdateOperationStatusMessage` based on `OperationStatus` for backward compatibility.

I made the suggested change in the preceding patch; dropping.
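
For readers following along, here is a minimal sketch of the idea in that suggestion: the status itself carries the optional IDs, and the message is derived from it. The structs below are hypothetical stand-ins, not the generated Mesos protobuf classes, and `createMessage` is a name made up for this illustration.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-ins for the protobuf messages discussed above.
// The real Mesos types are generated from .proto files and differ.
struct OperationStatus {
  std::string state;
  std::optional<std::string> slave_id;
  std::optional<std::string> resource_provider_id;
};

struct UpdateOperationStatusMessage {
  OperationStatus status;
  std::optional<std::string> slave_id;
  std::optional<std::string> resource_provider_id;
};

// Hypothetical helper: build the message from the status, so the IDs are
// set in one place and merely copied over for older consumers.
UpdateOperationStatusMessage createMessage(const OperationStatus& status) {
  // In this sketch a status without a state is treated as a programming error.
  assert(!status.state.empty());

  UpdateOperationStatusMessage message;
  message.status = status;
  message.slave_id = status.slave_id;
  message.resource_provider_id = status.resource_provider_id;
  return message;
}
```

A status that never had IDs attached (e.g., one for an operation that failed validation) simply yields a message with both fields unset.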


> On Nov. 20, 2018, 4:24 a.m., Chun-Hung Hsiao wrote:
> > src/resource_provider/storage/provider.cpp
> > Line 3079 (original), 3080-3081 (patched)
> > <https://reviews.apache.org/r/69163/diff/3/?file=2106998#file2106998line3080>
> >
> >     We don't need to set up the slave ID and resource provider ID since dropped operations will not be bookkept and retried.

This piece of code is now gone, but I think it makes sense to provide semantics as consistent
as possible to schedulers, so I'd prefer to set as many fields as possible, even if redundant.

Dropping.


> On Nov. 20, 2018, 4:24 a.m., Chun-Hung Hsiao wrote:
> > src/slave/slave.cpp
> > Lines 4386 (patched)
> > <https://reviews.apache.org/r/69163/diff/3/?file=2106999#file2106999line4386>
> >
> >     How about inlining the ternary operation into its use below so we don't need this `resourceProviderId_`?

Reworked this code; dropping.


> On Nov. 20, 2018, 4:24 a.m., Chun-Hung Hsiao wrote:
> > src/tests/master_tests.cpp
> > Lines 9038 (patched)
> > <https://reviews.apache.org/r/69163/diff/3/?file=2107000#file2107000line9038>
> >
> >     Since the resource provider manager is the one adding the resource provider ID, does it make sense to move this test to `resource_provider_manager_tests.cpp`?
> >     
> >     If we make the decision to add `slave_id` and `resource_provider_id` in `OperationStatus` and make the SLRP generate them, we should move this test to `storage_local_resource_provider_tests.cpp`.

I reworked the code so now RPs inject RP IDs, and agents inject agent IDs. The test here really
only tests the agent part, as we use a `MockResourceProvider`. We could probably move it to
`slave_tests.cpp`, but at the same time we also test a master API.
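
To make the split of responsibilities concrete, here is a rough sketch under stated assumptions: `OperationStatus` is a simplified stand-in for the real protobuf, and the two `inject*` functions are hypothetical names, not functions from the patch. Each layer only touches the field it owns.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Simplified, hypothetical stand-in for the real protobuf message.
struct OperationStatus {
  std::optional<std::string> slave_id;
  std::optional<std::string> resource_provider_id;
};

// Resource provider side: stamps only the resource provider ID.
OperationStatus injectResourceProviderId(
    OperationStatus status, const std::string& resourceProviderId) {
  assert(!resourceProviderId.empty());
  status.resource_provider_id = resourceProviderId;
  return status;
}

// Agent side: stamps only the agent ID and leaves everything else as is.
OperationStatus injectAgentId(
    OperationStatus status, const std::string& agentId) {
  assert(!agentId.empty());
  status.slave_id = agentId;
  return status;
}
```

With a mocked resource provider that sets no ID of its own, only the `injectAgentId` step is exercised, which corresponds to the agent part that the test above covers.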


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69163/#review210695
-----------------------------------------------------------


On Nov. 27, 2018, 7 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69163/
> -----------------------------------------------------------
> 
> (Updated Nov. 27, 2018, 7 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, and James DeFelice.
> 
> 
> Bugs: MESOS-9293
>     https://issues.apache.org/jira/browse/MESOS-9293
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This patch sets the agent and/or resource provider ID in operation status
> update messages. This is not always possible, e.g., some operations
> might fail validation so that no corresponding IDs can be extracted.
> 
> Since operations failing validation are currently directly rejected by
> the master without going through a status update manager, they are not
> retried either. If a master status update manager for operations is
> introduced at a later point it should be possible to forward
> acknowledgements for updates to the master's update manager (no agent
> ID, no resource provider ID).
> 
> 
> Diffs
> -----
> 
>   src/common/protobuf_utils.hpp 1662125ed3e47b179ee32d08c1d3af75553066ba 
>   src/common/protobuf_utils.cpp a45607eed4c4bae5010bcc3f3ffeabd6d911062a 
>   src/master/master.cpp b4b02d8b4d7d6d1aabda1f97b9bf824419f76a9e 
>   src/resource_provider/manager.cpp 6c81c430e9e1205d71982a7fa2bcd9aa15fc01b2 
>   src/resource_provider/storage/provider.cpp c137fa4f13edc58d93c03a9dd32fdf9d38b38316 
>   src/slave/slave.cpp 858b78620e1ef33f3587d0bd95a684917aaf5bbb 
>   src/tests/master_tests.cpp 651bb9ba9298fc3d7179ed86487aa55be519ce12 
>   src/tests/mesos.hpp 576f4bde88c069ee2fa0dd33912a034437338e7e 
> 
> 
> Diff: https://reviews.apache.org/r/69163/diff/4/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>

