mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Benjamin Bannier <benjamin.bann...@mesosphere.io>
Subject Re: Review Request 69812: Implemented the RPC retry logic for SLRP.
Date Fri, 25 Jan 2019 15:05:26 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69812/#review212322
-----------------------------------------------------------




src/csi/client.hpp
Lines 38-39 (original), 38-39 (patched)
<https://reviews.apache.org/r/69812/#comment298071>

    Ah, MESOS-8925 again showing its ugly face :(
    
    Just another datapoint, not necessarily a reason to wory about it right now and here.
    
    Here and below.



src/resource_provider/storage/provider.hpp
Lines 34-44 (patched)
<https://reviews.apache.org/r/69812/#comment298072>

    Let's just make these top-level constants in `provider.cpp`, e.g.,
    ```
    constexpr Duration DEFAULT_CSI_RETRY_BACKOFF_FACTOR =  Seconds(10);
    constexpr Duration DEFAULT_CSI_RETRY_INTERVAL_MAX = Minutes(10);
    ```
    
    Then we can always introduce properly named member variables (possibly initialized from
defaults) later.



src/resource_provider/storage/provider.cpp
Line 1881 (original), 1893 (patched)
<https://reviews.apache.org/r/69812/#comment298074>

    This comment reads slightly confusing now as it doesn't refer to the `loop`'s `iterate`
part, but just `getService`.
    
    How about
    ```
    Perform the call with the latest service future.
    ```



src/resource_provider/storage/provider.cpp
Lines 1918 (patched)
<https://reviews.apache.org/r/69812/#comment298076>

    `cstdlib` for `RAND_MAX`.



src/resource_provider/storage/provider.cpp
Lines 1920-1922 (patched)
<https://reviews.apache.org/r/69812/#comment298075>

    Does this actually increase the backoff every time? I guess the intention is that we pass
a `mutable` backoff by value, and then increase it every time. I have the feeling this only
works on the copy and doesn't propagate across iterations.
    
    I would have expected some function-local `shared_ptr` to a backoff which we'd pass by
value into the `loop` body as `mutable` and can use to pass the value across iterations. Not
sure there is a dedicated `loop` solution for this.



src/resource_provider/storage/provider.cpp
Lines 1933 (patched)
<https://reviews.apache.org/r/69812/#comment298070>

    I don't follow the reasoning around using `default` here, and it also makes it intransparent
which cases are not retried. I'd enumberate all 15 not retried cases (in which we also wouldn't
have to worry about unintentionally not retried future status codes).



src/resource_provider/storage/provider.cpp
Line 1901 (original), 1959 (patched)
<https://reviews.apache.org/r/69812/#comment298077>

    Do we want to include abandoned futures here?



src/resource_provider/storage/provider.cpp
Line 2674 (original), 2739 (patched)
<https://reviews.apache.org/r/69812/#comment298078>

    Let's add a trailing comment here,
    ```
    // Retry.
    ```
    
    I guess the semantics are that we retry if we are handling a call triggered by an offer
operation. Let's document that somewhere, e.g., in the commit message.



src/resource_provider/storage/provider.cpp
Line 2766 (original), 2831 (patched)
<https://reviews.apache.org/r/69812/#comment298079>

    Ditto.


- Benjamin Bannier


On Jan. 23, 2019, 8:10 a.m., Chun-Hung Hsiao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69812/
> -----------------------------------------------------------
> 
> (Updated Jan. 23, 2019, 8:10 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, James DeFelice, Jie Yu, and Jan Schlicht.
> 
> 
> Bugs: MESOS-9517
>     https://issues.apache.org/jira/browse/MESOS-9517
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> When the CSI plugin returns a retryable error (i.e., `DEADLINE_EXCEEDED`
> or `UNAVAILABLE`) for `CreateVolume` or `DeleteVolume` CSI calls, SLRP
> will now retry indefinitely with a random exponential backoff.
> 
> 
> Diffs
> -----
> 
>   src/csi/client.hpp 5d40d54c2abbd03993ce8835d37db23e209c7554 
>   src/csi/client.cpp 61ed410985099828a2f58b1527ab57daa4b379df 
>   src/resource_provider/storage/provider.hpp 331f7b785b14b814c2889488effd53f3a48a1b95

>   src/resource_provider/storage/provider.cpp d6e20a549ede189c757ae3ae922ab7cb86d2be2c

> 
> 
> Diff: https://reviews.apache.org/r/69812/diff/2/
> 
> 
> Testing
> -------
> 
> make check
> 
> A unit test will be added later in the chain.
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message