mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chun-Hung Hsiao <chhs...@apache.org>
Subject Re: Review Request 70168: Added `ServiceManager` to manage CSI plugin container lifecycles.
Date Wed, 20 Mar 2019 03:43:26 GMT


> On March 19, 2019, 1:12 p.m., Benjamin Bannier wrote:
> > src/csi/service_manager.cpp
> > Lines 123-125 (patched)
> > <https://reviews.apache.org/r/70168/diff/6/?file=2132460#file2132460line123>
> >
> >     This docstring doesn't reflect the actual return type.
> >     
> >     I find boolean values are always tricky as they can require a lot of non-local
context to interpret. How about we just use the previous return type `[(ContainerID, Option(ContainerStatus)]`?

My original intention is to avoid leaking the `isRunningContainer` workaround into the caller.
But since we have to expose whether the container is running or not, and since the `isRunningContainer`
is only used once in recovery, the leakage doesn't seem that bad. I'll change it back to what
you suggested.


> On March 19, 2019, 1:12 p.m., Benjamin Bannier wrote:
> > src/csi/service_manager.cpp
> > Lines 184-185 (patched)
> > <https://reviews.apache.org/r/70168/diff/6/?file=2132460#file2132460line184>
> >
> >     In Mesos we typically expand types unless they are already mentioned somewhere
in the statement or are nested types (e.g., iterators).
> >     ````
> >     foreach (const CSIPluginContainerInfo::Service& service, services) {
> >       foreach (const CSIPluginContainerInfo& container, info.containers()) {
> >         ...
> >       }
> >     }
> >     ````

This is actually the debate around `auto` ;) But I don't have a strong opinion on this and
the types seems less likely to change, I'll take your suggestion.


> On March 19, 2019, 1:12 p.m., Benjamin Bannier wrote:
> > src/csi/service_manager.cpp
> > Lines 186-191 (patched)
> > <https://reviews.apache.org/r/70168/diff/6/?file=2132460#file2132460line186>
> >
> >     Let's either add a comment here why it is always okay to only ever store the
first found service, or maybe even better separate input validation from setting up our own
state.

This is the semantics defined in `mesos.proto`. I'll add a comment for now and we can think
if we want to change the semantic.


> On March 19, 2019, 1:12 p.m., Benjamin Bannier wrote:
> > src/csi/service_manager.cpp
> > Lines 316 (patched)
> > <https://reviews.apache.org/r/70168/diff/6/?file=2132460#file2132460line316>
> >
> >     Let's just use the `static` function like we had previously,
> >     ```
> >     static ContainerID getContainerId(
> >         const CSIPluginInfo& info,
> >         const CSIPluginContainerInfo& container);
> >     ```

We'll have to pass in `containerPrefix` as well. Do you have a reason that makes you prefer
a static function, given that this is already in the implementation file so won't be leaked?


> On March 19, 2019, 1:12 p.m., Benjamin Bannier wrote:
> > src/csi/service_manager.cpp
> > Lines 332-338 (patched)
> > <https://reviews.apache.org/r/70168/diff/6/?file=2132460#file2132460line332>
> >
> >     Maybe the following instead?
> >     ````
> >     auto it = std::find_if(
> >         info.containers().begin(),
> >         info.containers().end(),
> >         [this, containerId](const CSIPluginContainerInfo& info) {
> >           return getContainerId(info) == containerId;
> >         });
> >         
> >     return it != info.containers().end()
> >              ? *it
> >              : Option<CSIPluginContainerInfo>::none();
> >     ````

I actually thought about this before. But I'm not sure if this is more readable than the current
one. WDYT?


> On March 19, 2019, 1:12 p.m., Benjamin Bannier wrote:
> > src/csi/service_manager.cpp
> > Lines 389 (patched)
> > <https://reviews.apache.org/r/70168/diff/6/?file=2132460#file2132460line389>
> >
> >     As `result` has a different type than the function return value we could `move`
here.

Hmm I mistakenly thought RVO would take care of this. Thanks!


> On March 19, 2019, 1:12 p.m., Benjamin Bannier wrote:
> > src/csi/service_manager.cpp
> > Lines 614 (patched)
> > <https://reviews.apache.org/r/70168/diff/6/?file=2132460#file2132460line614>
> >
> >     Should we have a metric for the start action as well?

It seems not adding much value to me. The start action count itself doesn't provide much value
unless comparing it with the termination count, which could be used to deduce the number of
running containers. If that's the purpose, then maybe it's more direct to just add a metrics
for running containers.

However, the number of running containers is tightly coupled with whether the RP would launch
a single plugin container (unified CSI deployment) or two containers (split-component deployment).
And that means potentially the monitoring configuration using this metric would be couple
with the RP configs, which seems not good. The termination count itself seems sufficient to
detect plugin failures.

We can discuss if this is necessary in another ticket.


- Chun-Hung


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70168/#review213774
-----------------------------------------------------------


On March 12, 2019, 6:59 p.m., Chun-Hung Hsiao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70168/
> -----------------------------------------------------------
> 
> (Updated March 12, 2019, 6:59 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Jie Yu, and Jan Schlicht.
> 
> 
> Bugs: MESOS-9632
>     https://issues.apache.org/jira/browse/MESOS-9632
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The `ServiceManager` is agnostic to CSI versions, so can be used to
> manage plugin containers for both CSI v0 and v1 plugins. Most of its
> logic are adapted from the SLRP code.
> 
> We also separate the CSI plugin metrics from SLRP metrics object so the
> metrics can be accessed by `ServiceManager`. However, we do not make
> `ServiceManager` own the CSI plugin metrics object because eventually we
> would like to decouple metrics lifecycles from SLRP lifecycles.
> 
> 
> Diffs
> -----
> 
>   src/CMakeLists.txt 3397c3b1d4e8a7900b2e5f870679cc7aa30b4be2 
>   src/Makefile.am d451d7cabe3bf5d4f5039cfac5de1b03ef891d07 
>   src/csi/metrics.hpp PRE-CREATION 
>   src/csi/metrics.cpp PRE-CREATION 
>   src/csi/service_manager.hpp PRE-CREATION 
>   src/csi/service_manager.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/70168/diff/6/
> 
> 
> Testing
> -------
> 
> Testing done later in chain.
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message