mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Benjamin Bannier <benjamin.bann...@mesosphere.io>
Subject Re: Review Request 70168: Added `ServiceManager` to manage CSI plugin container lifecycles.
Date Tue, 19 Mar 2019 13:12:12 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70168/#review213774
-----------------------------------------------------------



Since this RR adds code which was/is removed in https://reviews.apache.org/r/70169/ I believe
they should be a single refactoring patch (at least that's how I reviewed them both). Ideally
we would have one patch moving code out of the SLRP, and one or more other patches to do cleanups,
if only to make it easier to confirm that we do not introduce new bugs. This RR does way too
many things to rule that out with much confidence.


src/csi/metrics.cpp
Lines 44 (patched)
<https://reviews.apache.org/r/70168/#comment299798>

    Let's call out here that we fall through intentionally.



src/csi/service_manager.cpp
Lines 113 (patched)
<https://reviews.apache.org/r/70168/#comment299771>

    Separate code blocks by empty lines from surrounding text.



src/csi/service_manager.cpp
Lines 123-125 (patched)
<https://reviews.apache.org/r/70168/#comment299800>

    This docstring doesn't reflect the actual return type.
    
    I find boolean values are always tricky as they can require a lot of non-local context
to interpret. How about we just use the previous return type `[(ContainerID, Option(ContainerStatus)]`?



src/csi/service_manager.cpp
Lines 133 (patched)
<https://reviews.apache.org/r/70168/#comment299804>

    The _if necessary_ part isn't clear to me, and I am not sure a user of this function needs
to know. Maybe just something along the lines of _waits for the endpoint socket to be ready_
and that some timeout is involved. We should also document that `endpoint` is a unix URI.
    
    ```
    // Returns a `Future` ...
    ```



src/csi/service_manager.cpp
Lines 137 (patched)
<https://reviews.apache.org/r/70168/#comment299805>

    ```
    // Returns a `Future` of the URI(?) of the ...
    ```



src/csi/service_manager.cpp
Lines 139 (patched)
<https://reviews.apache.org/r/70168/#comment299809>

    Do we need to mention the daemon here? Maybe just
    ```
    If the container is not already running, this method will start a new container.
    ```



src/csi/service_manager.cpp
Lines 155-156 (patched)
<https://reviews.apache.org/r/70168/#comment299817>

    Since `Owned` can be shared, let's make this class explicitly non-copyable to prevent
lifetime issues, e.g.,
    ```
    // Since this class contains `Owned` members which should not but can
    // be copied, explicitly make this class non-copyable.
    //
    // TODO(chhsia0): Remove this once MESOS-5122 is fixed.
    ServiceManagerProcess(const ServiceManagerProcess&) = delete;
    ServiceManagerProcess& operator=(const ServiceManagerProcess&) = delete;
    ```



src/csi/service_manager.cpp
Lines 184-185 (patched)
<https://reviews.apache.org/r/70168/#comment299818>

    In Mesos we typically expand types unless they are already mentioned somewhere in the
statement or are nested types (e.g., iterators).
    ````
    foreach (const CSIPluginContainerInfo::Service& service, services) {
      foreach (const CSIPluginContainerInfo& container, info.containers()) {
        ...
      }
    }
    ````



src/csi/service_manager.cpp
Lines 186-191 (patched)
<https://reviews.apache.org/r/70168/#comment299819>

    Let's either add a comment here why it is always okay to only ever store the first found
service, or maybe even better separate input validation from setting up our own state.



src/csi/service_manager.cpp
Lines 235-236 (patched)
<https://reviews.apache.org/r/70168/#comment299823>

    Let's keep the previous code from SLRP process `recover` and the previous `getContainers`
return type.



src/csi/service_manager.cpp
Lines 260 (patched)
<https://reviews.apache.org/r/70168/#comment299824>

    If we rewrap the comment we can `s/despite if/even though/`. Ideally we'd do that in a
trivial cleanup patch.



src/csi/service_manager.cpp
Lines 264 (patched)
<https://reviews.apache.org/r/70168/#comment299825>

    This shows how a unnamed boolean is much harder to understand than variable that carries
clear semantics like the previous `isRunningContainer`.



src/csi/service_manager.cpp
Lines 316 (patched)
<https://reviews.apache.org/r/70168/#comment299772>

    Let's just use the `static` function like we had previously,
    ```
    static ContainerID getContainerId(
        const CSIPluginInfo& info,
        const CSIPluginContainerInfo& container);
    ```



src/csi/service_manager.cpp
Lines 332-338 (patched)
<https://reviews.apache.org/r/70168/#comment299773>

    Maybe the following instead?
    ````
    auto it = std::find_if(
        info.containers().begin(),
        info.containers().end(),
        [this, containerId](const CSIPluginContainerInfo& info) {
          return getContainerId(info) == containerId;
        });
        
    return it != info.containers().end()
             ? *it
             : Option<CSIPluginContainerInfo>::none();
    ````



src/csi/service_manager.cpp
Lines 354 (patched)
<https://reviews.apache.org/r/70168/#comment299799>

    Let's not hide that we capture `this` here as such lambdas can be tricky.



src/csi/service_manager.cpp
Lines 375 (patched)
<https://reviews.apache.org/r/70168/#comment299801>

    Let's add a comment here that and how we namespace our containers.



src/csi/service_manager.cpp
Lines 380-386 (patched)
<https://reviews.apache.org/r/70168/#comment299802>

    Once we change the return type of this lambda we can get rid of this comment and all the
coupling it needs to call out.



src/csi/service_manager.cpp
Lines 389 (patched)
<https://reviews.apache.org/r/70168/#comment299803>

    As `result` has a different type than the function return value we could `move` here.



src/csi/service_manager.cpp
Lines 460 (patched)
<https://reviews.apache.org/r/70168/#comment299806>

    Seems like we don't need to defer to self?



src/csi/service_manager.cpp
Lines 485 (patched)
<https://reviews.apache.org/r/70168/#comment299807>

    `s/[=]/[this]/`



src/csi/service_manager.cpp
Lines 496-506 (patched)
<https://reviews.apache.org/r/70168/#comment299808>

    I find the data flow in this way of chaining confusing (e.g., not immediately clear who
generated the argument to this continuation).
    
    Can we first handle the `then` case and only then the `onAny`?



src/csi/service_manager.cpp
Lines 514-517 (patched)
<https://reviews.apache.org/r/70168/#comment299810>

    This feels weird. Why can't we just do
    ```
    if (endpoints.contains(containerId)) {
      return endpoints.at(containerId)->future();
    }
    
    CHECK(!daemons.contains(containerId));
    ```



src/csi/service_manager.cpp
Lines 519-520 (patched)
<https://reviews.apache.org/r/70168/#comment299811>

    How about avoiding the hard assertion here and instead returning a `Failure`?
    
    Since this was in the existing code, so maybe let's do any cleanup in a separate patch.



src/csi/service_manager.cpp
Lines 551 (patched)
<https://reviews.apache.org/r/70168/#comment299812>

    We don't really use such abbreviated names, maybe `s/endpointVar/endpoint_/` or even `endpoint`
if we inline the temporary?
    ```
    Environment::Variable* endpoint =
      commandInfo.mutable_environment()->add_variables();
    endpoint->set_name("CSI_ENDPOINT");
    endpoint->set_value("unix://" + endpointPath.get());
    ```



src/csi/service_manager.cpp
Lines 614 (patched)
<https://reviews.apache.org/r/70168/#comment299814>

    Should we have a metric for the start action as well?



src/csi/service_manager.cpp
Lines 620-621 (patched)
<https://reviews.apache.org/r/70168/#comment299813>

    From this log line it isn't really clear that this is the teardown action for above `Connecting
to endpoint ...`. Maybe instead `Disconnected from endpoint ...`?



src/csi/service_manager.cpp
Lines 644-645 (patched)
<https://reviews.apache.org/r/70168/#comment299815>

    Not added here, but this assumes than an `Owned` can be shared. Let's clean that up in
a follow-up.



src/csi/service_manager.cpp
Lines 646 (patched)
<https://reviews.apache.org/r/70168/#comment299816>

    Looks good, not a simple code move though (also, `recover` additionally handles abandoned
futures) ;)
    
    Might want to make the capture list explicit (`[this, containerId]`).


- Benjamin Bannier


On March 12, 2019, 7:59 p.m., Chun-Hung Hsiao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70168/
> -----------------------------------------------------------
> 
> (Updated March 12, 2019, 7:59 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Jie Yu, and Jan Schlicht.
> 
> 
> Bugs: MESOS-9632
>     https://issues.apache.org/jira/browse/MESOS-9632
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The `ServiceManager` is agnostic to CSI versions, so can be used to
> manage plugin containers for both CSI v0 and v1 plugins. Most of its
> logic are adapted from the SLRP code.
> 
> We also separate the CSI plugin metrics from SLRP metrics object so the
> metrics can be accessed by `ServiceManager`. However, we do not make
> `ServiceManager` own the CSI plugin metrics object because eventually we
> would like to decouple metrics lifecycles from SLRP lifecycles.
> 
> 
> Diffs
> -----
> 
>   src/CMakeLists.txt 3397c3b1d4e8a7900b2e5f870679cc7aa30b4be2 
>   src/Makefile.am d451d7cabe3bf5d4f5039cfac5de1b03ef891d07 
>   src/csi/metrics.hpp PRE-CREATION 
>   src/csi/metrics.cpp PRE-CREATION 
>   src/csi/service_manager.hpp PRE-CREATION 
>   src/csi/service_manager.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/70168/diff/6/
> 
> 
> Testing
> -------
> 
> Testing done later in chain.
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message