> On July 29, 2020, 3:27 p.m., Qian Zhang wrote: > > src/slave/csi_server.cpp > > Lines 194 (patched) > > > > > > I think we should set this based on the CSI services in the related `CSIPluginInfo` rather than hard code here, e.g. in future we may support both node and controller services for Portworx. > > > > Or maybe we should not have this parameter at all? I mean `ServiceManager` should be able to figure out the support services of the plugin based on the related `CSIPluginInfo`, right? > > Greg Mann wrote: > Good point. Yea we may want to remove the service information from the plugin info, I'm not sure. I'll leave it there for now and just initialize the volume manager with the correct services here. > > Qian Zhang wrote: > Could you plesae elaborate a bit on why we may want to remove the service information from the plugin info? > > And in the latest patch (revision 3), I see you have called `extractServices` to get services when instantiating `ServiceManager` but still hard code `{csi::NODE_SERVICE}` for volume manager, can we just do the same for volume manager? > Could you plesae elaborate a bit on why we may want to remove the service information from the plugin info? I really don't know if there is a good reason to do that, I just mentioned it in passing. I think we can leave the interface as it is for now. I will definitely update the volume manager as you suggested, thanks for catching that! > On July 29, 2020, 3:27 p.m., Qian Zhang wrote: > > src/slave/csi_server.cpp > > Lines 233-235 (patched) > > > > > > I am just curious what would happen if any of the initialization logic fail, how will the failure be propogated back? > > Greg Mann wrote: > I updated the server so that now `start()` returns a future associated with the initialization. > > Qian Zhang wrote: > I see. And I guess `CSIServer::start()` will be called in `Slave::registered` and `Slave::reregistered`, right? I am just wondering how we are going to handle the returned future there. Are we going to register an `onAny` callback and log an error message if it is a failed future? Yea I think we have to decide how to handle failures of CSI server initialization. I might propose a timeout in the agent, after which we log an error? And we could provide a task status message perhaps when task launches fail because the CSI server failed to initialize? In any case, I think the interface offered by the current patch set will be sufficient to let us handle the failed initialization case, WDYT? > On July 29, 2020, 3:27 p.m., Qian Zhang wrote: > > src/slave/csi_server.cpp > > Lines 244-245 (patched) > > > > > > Do we have to use `started` and `initializationCallbacks`? Can we do the similar with https://github.com/apache/mesos/blob/1.10.0/src/csi/v1_volume_manager.cpp#L1336 ? > > Greg Mann wrote: > The reason it's more complicated here is because we may add more "initialization logic" after server construction if publish/unpublish calls are made before the server is started. So we need an approach which will allow us to add more function calls which are executed during startup. I explored another approach while coding but this is what I ended up settling on, but I'm happy to explore other options if we can think of something better. > > Qian Zhang wrote: > I see currently you put the "initialization logic" (i.e. generate auth token and intialize plugins) in the constructor of `CSIServerProcess`. Can we instead do that in `CSIServerProcess::start()` and do the following in `CSIServer::start()`. > ``` > Future CSIServer::start() > { > started = process::dispatch(process.get(), &CSIServerProcess::start); > return started; > } > ``` > > And then in `CSIServer::publishVolume` and `CSIServer::unpublishVolume` we could do the following: > ``` > Future CSIServer::publishVolume( > const Volume::Source::CSIVolume& volume) > { > return started > .then(process::defer( > process.get(), > &CSIServerProcess::publishVolume, > volume)); > } > ``` > So any publish and unpublish volume calls can only be executed after CSI server is started. HDYT? The reason I didn't follow this approach is that it doesn't guarantee that the order of publish/unpublish calls would be maintained when initializing, but maybe that's OK? I think that in our current implementation of `Future` in libprocess the order would be maintained, but this isn't guaranteed by the interface. Am I being too paranoid here? :-) - Greg ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/72716/#review221405 ----------------------------------------------------------- On July 31, 2020, 7 p.m., Greg Mann wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/72716/ > ----------------------------------------------------------- > > (Updated July 31, 2020, 7 p.m.) > > > Review request for mesos, Andrei Budnik and Qian Zhang. > > > Bugs: MESOS-10163 > https://issues.apache.org/jira/browse/MESOS-10163 > > > Repository: mesos > > > Description > ------- > > Added implementation of the CSI server. > > > Diffs > ----- > > src/CMakeLists.txt bcb5128f0e61af0d541502e4ed833da0487b7792 > src/Makefile.am a89919dd7d5ccbc4c5fa79d9a83616608f84ef49 > src/slave/csi_server.cpp PRE-CREATION > > > Diff: https://reviews.apache.org/r/72716/diff/3/ > > > Testing > ------- > > Details at the end of this chain. > > > Thanks, > > Greg Mann > >