mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chun-Hung Hsiao <chhs...@apache.org>
Subject Re: Review Request 69811: Preliminary SLRP refactoring for RPC retry.
Date Wed, 23 Jan 2019 22:23:05 GMT


> On Jan. 23, 2019, 2:34 p.m., Benjamin Bannier wrote:
> > This looks in general like a great simplification, but it does at least superficually
look like this is more that a (non-functional) refactoring; we now do pull some internal state
before making some calls, but the actual call to the plugin is deferred there. I.e., before
we had
> > ````
> > handle = getService; THEN
> >   request = ..internal state..
> >   send(handle, request)
> > ````
> > while now we could have
> > ````
> > request = .. internal state..
> > handle = call; THEN
> >   send(handle, request) 
> > ````
> > where the state we depend on for the request could change (e.g., from concurrently
executing calls) between us being able to obtain a handle with `call` and actually sending
the request with `_call`.
> > 
> > Is there anything guaranteeing that this is safe (e.g., some general idempotency
guarantee or sim)?

Very good point.

I through through this, and believe it is safe. The CSI RPC can be classified into the following
3 categories:
1. Plugin information collection: `GetPluginCapabilities`, `GetPluginInfo`, `ControllerGetCapabilities`,
`NodeGetCapabilities`, `NodeGetId`, `ListVolumes`. These calls do not depend on SLRP states,
so it is safe to construct the request proto messages beforehand.
2. Volume lifecycle transition: `DeleteVolume`, `ControllerPublishVolume`, `ControllerUnpublishVolume`,
`NodeStageVolume`, `NodeUnstageVolume`, `NodePublishVolume`, `NodeUnpublishVolume`. For these
calls, SLRP maintains intermediate states (e.g., `CONTROLLER_PUBLISH`) for each volume, and
neither the volume properties are changed nor a concurrent call can be made (since these calls
are triggered by `LAUNCH`, `LAUNCH_GROUP`, or `DESTROY_DISK`, which would lock the resources)
after transistioning into these intermediate states, so it is safe to construct the request
proto messages first and then make the call asynchronously.
3. Profile-related operations: `CreateVolume`, `ValidateVolumeCapabilities`, `GetCapacity`.
These calls are to be worried about. However we maintain an invariant that there is no out-standing
`CREATE_DISK` or `DESTROY_DISK` or any storage pool reconciliation while the profiles are
being updated. So when these calls are being made, profiles should remain the same.
In summary, no concurrent operation should interfere the call being made.

I'll add comments to explain these details.


> On Jan. 23, 2019, 2:34 p.m., Benjamin Bannier wrote:
> > src/resource_provider/storage/provider.cpp
> > Lines 395 (patched)
> > <https://reviews.apache.org/r/69811/diff/1/?file=2121399#file2121399line395>
> >
> >     We don't seem to need to disable this for `PROBE`. If we need it let's at least
add comment for context.

The comment is at L1932. Let me move them here.


- Chun-Hung


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69811/#review212233
-----------------------------------------------------------


On Jan. 23, 2019, 7:07 a.m., Chun-Hung Hsiao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69811/
> -----------------------------------------------------------
> 
> (Updated Jan. 23, 2019, 7:07 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Jie Yu, and Jan Schlicht.
> 
> 
> Bugs: MESOS-9517
>     https://issues.apache.org/jira/browse/MESOS-9517
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This patch refactors the `StorageLocalResourceProvider::call` function
> to obtain the latest service future through `getService` before making
> the actual RPC call. The subsequent patch would utilize this to support
> RPC retry across plugin restarts.
> 
> 
> Diffs
> -----
> 
>   src/resource_provider/storage/provider.cpp d6e20a549ede189c757ae3ae922ab7cb86d2be2c

> 
> 
> Diff: https://reviews.apache.org/r/69811/diff/1/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message