mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Qian Zhang <zhq527...@gmail.com>
Subject Re: Review Request 72716: Added implementation of the CSI server.
Date Mon, 10 Aug 2020 02:33:40 GMT


> On July 29, 2020, 11:27 p.m., Qian Zhang wrote:
> > src/slave/csi_server.cpp
> > Lines 233-235 (patched)
> > <https://reviews.apache.org/r/72716/diff/2/?file=2236468#file2236468line233>
> >
> >     I am just curious what would happen if any of the initialization logic fail,
how will the failure be propogated back?
> 
> Greg Mann wrote:
>     I updated the server so that now `start()` returns a future associated with the initialization.
> 
> Qian Zhang wrote:
>     I see. And I guess `CSIServer::start()` will be called in `Slave::registered` and
`Slave::reregistered`, right? I am just wondering how we are going to handle the returned
future there. Are we going to register an `onAny` callback and log an error message if it
is a failed future?
> 
> Greg Mann wrote:
>     Yea I think we have to decide how to handle failures of CSI server initialization.
I might propose a timeout in the agent, after which we log an error? And we could provide
a task status message perhaps when task launches fail because the CSI server failed to initialize?
>     
>     In any case, I think the interface offered by the current patch set will be sufficient
to let us handle the failed initialization case, WDYT?
> 
> Qian Zhang wrote:
>     I took a look at the code of local resource provider daemon and I found it just log
an error message in its `start` method:
>     https://github.com/apache/mesos/blob/1.10.0/src/slave/slave.cpp#L1740:L1742
>     https://github.com/apache/mesos/blob/1.10.0/src/resource_provider/daemon.cpp#L188:L191
>     
>     Do you think if we can do the similar?
> 
> Greg Mann wrote:
>     I was worried about maintaining ordering of calls made before startup, but as we
discussed offline, I think I was being too paranoid :)
>     
>     I'm switching the patch to the kind of approach used in the daemon.
> 
> Greg Mann wrote:
>     lol sorry I replied on the wrong issue :)
>     
>     I think we can log an error message as well, but it does also seem appropriate to
me to return a future associated with the startup. We can decide what to do with that future
in the agent, but might as well return it?
> 
> Greg Mann wrote:
>     I decided not to log an error in the server, I think the proper place to log would
be at the callsite where we handle the future returned by `start()`, WDYT?
> 
> Qian Zhang wrote:
>     So do you mean we should log an error in `Slave::registered` and `Slave::reregistered`?
> 
> Greg Mann wrote:
>     It depends on how exactly we initialize the CSI server in the agent; I'm not sure
that it would necessarily be in those functions. In any case, since `start()` returns a Future
I don't think we need to log anything here in the server, unless we think there are multiple
classes of failures, some of which we would want to crash for and some of which we would want
to just log?

IMHO we need to call `CSIServer::start` in `Slave::registered` and `Slave::reregistered` (i.e.
right after the agent's state is changed to `RUNNING`), do we have any other options?


- Qian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72716/#review221405
-----------------------------------------------------------


On Aug. 7, 2020, 3 p.m., Greg Mann wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72716/
> -----------------------------------------------------------
> 
> (Updated Aug. 7, 2020, 3 p.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Qian Zhang.
> 
> 
> Bugs: MESOS-10163
>     https://issues.apache.org/jira/browse/MESOS-10163
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added implementation of the CSI server.
> 
> 
> Diffs
> -----
> 
>   src/CMakeLists.txt 4e15e3d99aa2cce2403fe07e762fef2fb4a27dea 
>   src/Makefile.am 447db323875e4cad46000977f4a61600baff8f89 
>   src/slave/csi_server.hpp 17882e1be5a6c38ca34d7b50d4a6041530e8908c 
>   src/slave/csi_server.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/72716/diff/5/
> 
> 
> Testing
> -------
> 
> Details at the end of this chain.
> 
> 
> Thanks,
> 
> Greg Mann
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message