> On July 29, 2020, 11:27 p.m., Qian Zhang wrote:
> > src/slave/csi_server.cpp
> > Lines 233-235 (patched)
> > <https://reviews.apache.org/r/72716/diff/2/?file=2236468#file2236468line233>
> >
> > I am just curious what would happen if any of the initialization logic fails.
How will the failure be propagated back?
>
> Greg Mann wrote:
> I updated the server so that now `start()` returns a future associated with the initialization.
>
> Qian Zhang wrote:
> I see. And I guess `CSIServer::start()` will be called in `Slave::registered` and
`Slave::reregistered`, right? I am just wondering how we are going to handle the returned
future there. Are we going to register an `onAny` callback and log an error message if it
is a failed future?
>
> Greg Mann wrote:
> Yea I think we have to decide how to handle failures of CSI server initialization.
I might propose a timeout in the agent, after which we log an error? And we could provide
a task status message perhaps when task launches fail because the CSI server failed to initialize?
>
> In any case, I think the interface offered by the current patch set will be sufficient
to let us handle the failed initialization case, WDYT?
I took a look at the code of the local resource provider daemon and found that it just logs an error
message in its `start` method:
https://github.com/apache/mesos/blob/1.10.0/src/slave/slave.cpp#L1740:L1742
https://github.com/apache/mesos/blob/1.10.0/src/resource_provider/daemon.cpp#L188:L191
Do you think we can do something similar?
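
To make the suggestion concrete, here is a rough sketch of the "just log it" handling. It is a simplified stand-in that uses only the standard library rather than libprocess: `StartResult` and `logStartFailure` are hypothetical names made up for illustration, and the message text is an assumption, not what the agent or the daemon actually prints.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a completed `Future<Nothing>`:
// `failed` is false for a ready future, true for a failed one.
struct StartResult
{
  bool failed;
  std::string message;
};

// Logs an error if startup failed and reports whether it did,
// without aborting the caller.
bool logStartFailure(const StartResult& result, std::ostream& log)
{
  if (!result.failed) {
    return false;
  }

  log << "Failed to start CSI server: " << result.message << std::endl;
  return true;
}
```

With the real interface this would presumably live in an `onAny` callback on the future returned by `CSIServer::start()`, but that wiring is not shown here.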
> On July 29, 2020, 11:27 p.m., Qian Zhang wrote:
> > src/slave/csi_server.cpp
> > Lines 244-245 (patched)
> > <https://reviews.apache.org/r/72716/diff/2/?file=2236468#file2236468line244>
> >
> > Do we have to use `started` and `initializationCallbacks`? Can we do something similar
to https://github.com/apache/mesos/blob/1.10.0/src/csi/v1_volume_manager.cpp#L1336 ?
>
> Greg Mann wrote:
> The reason it's more complicated here is because we may add more "initialization
logic" after server construction if publish/unpublish calls are made before the server is
started. So we need an approach which will allow us to add more function calls which are executed
during startup. I explored another approach while coding and this is what I ended up settling
on, but I'm happy to explore other options if we can think of something better.
>
> Qian Zhang wrote:
> I see that currently you put the "initialization logic" (i.e. generate auth token and
initialize plugins) in the constructor of `CSIServerProcess`. Can we instead do that in `CSIServerProcess::start()`
and do the following in `CSIServer::start()`?
> ```
> Future<Nothing> CSIServer::start()
> {
>   started = process::dispatch(process.get(), &CSIServerProcess::start);
>   return started;
> }
> ```
>
> And then in `CSIServer::publishVolume` and `CSIServer::unpublishVolume` we could
do the following:
> ```
> Future<string> CSIServer::publishVolume(
>     const Volume::Source::CSIVolume& volume)
> {
>   return started
>     .then(process::defer(
>         process.get(),
>         &CSIServerProcess::publishVolume,
>         volume));
> }
> ```
> So any publish and unpublish volume calls can only be executed after the CSI server is
started. HDYT?
>
> Greg Mann wrote:
> The reason I didn't follow this approach is that it doesn't guarantee that the order
of publish/unpublish calls would be maintained when initializing, but maybe that's OK?
>
> I think that in our current implementation of `Future` in libprocess the order would
be maintained, but this isn't guaranteed by the interface. Am I being too paranoid here? :-)
> it doesn't guarantee that the order of publish/unpublish calls would be maintained when
initializing
Could you please elaborate a bit on why this is a problem? In the volume manager, I see we already
have a sequence per volume to make sure all CSI gRPC calls on the same volume are processed
in sequential order.
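
As a rough illustration of what I mean by a per-volume sequence, here is a simplified, synchronous model that uses only the standard library. It is not libprocess and not the actual volume manager code: `PerVolumeSequence`, `add`, and `run` are hypothetical names, and the real sequence chains asynchronous calls rather than draining a queue.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical, synchronous stand-in for a per-volume call sequence.
class PerVolumeSequence
{
public:
  // Queues `call` behind any earlier calls for the same volume.
  void add(const std::string& volumeId, std::function<void()> call)
  {
    assert(call != nullptr);
    queues[volumeId].push_back(std::move(call));
  }

  // Runs the queued calls of one volume in the order they were added.
  void run(const std::string& volumeId)
  {
    std::deque<std::function<void()>>& queue = queues[volumeId];

    while (!queue.empty()) {
      std::function<void()> call = std::move(queue.front());
      queue.pop_front();
      call();
    }
  }

private:
  std::map<std::string, std::deque<std::function<void()>>> queues;
};
```

In this model, a publish and a later unpublish added for the same volume always run in that order, no matter how calls for other volumes are interleaved. It says nothing about the order in which the calls reach the queue in the first place.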
- Qian
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72716/#review221405
-----------------------------------------------------------
On Aug. 4, 2020, 2:58 a.m., Greg Mann wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72716/
> -----------------------------------------------------------
>
> (Updated Aug. 4, 2020, 2:58 a.m.)
>
>
> Review request for mesos, Andrei Budnik and Qian Zhang.
>
>
> Bugs: MESOS-10163
> https://issues.apache.org/jira/browse/MESOS-10163
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Added implementation of the CSI server.
>
>
> Diffs
> -----
>
> src/CMakeLists.txt 4e15e3d99aa2cce2403fe07e762fef2fb4a27dea
> src/Makefile.am 447db323875e4cad46000977f4a61600baff8f89
> src/slave/csi_server.cpp PRE-CREATION
>
>
> Diff: https://reviews.apache.org/r/72716/diff/4/
>
>
> Testing
> -------
>
> Details at the end of this chain.
>
>
> Thanks,
>
> Greg Mann
>
>