mesos-reviews mailing list archives

From Greg Mann <g...@mesosphere.io>
Subject Re: Review Request 40266: Libprocess Reinit: Cleanup SocketManager alongside ProcessManager.
Date Fri, 05 Aug 2016 21:43:22 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40266/#review144989
-----------------------------------------------------------




3rdparty/libprocess/src/process.cpp (lines 2407 - 2413)
<https://reviews.apache.org/r/40266/#comment211141>

    While running my SSL scheduler test, I discovered that new processes can be spawned
between the destruction of `gc` and the stopping of the event loop; see
the gist [here](https://gist.github.com/greggomann/4e1d6a4101d4a3c52a5d9ea2571a043b). Just
before the backtrace, you can see some debug output I added to indicate when `gc` is deleted
and set to `nullptr`.
    
    In this case, it looks like the scheduler process was attempting to reopen a `Connection`:
the GC's `manage()` method is dispatched to manage the new `ConnectionProcess`, and when the
dispatch calls the GC process's `self()` and attempts to construct a new `PID` from the
now-null `gc` pointer, we get a segfault.
    
    To avoid this, perhaps we should add a check in `spawn` that refuses to spawn new processes
while libprocess is being finalized/reinitialized? It seems to me that some processes may
need to spawn during termination, so maybe enforcing that constraint only after
`terminate_all()` would make sense?
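
    A rough sketch of the kind of guard I have in mind, using a toy stand-in rather than the real libprocess code. `ToyProcessManager`, its `finalizing` flag, and the `bool` return from `spawn` are all hypothetical names made up for illustration; the real `spawn` has a different signature, and the place where `gc` is deleted is only marked by a comment.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for the process manager; NOT the libprocess class.
class ToyProcessManager {
public:
  // Returns true if the process was registered, false if spawning was
  // refused because finalization has already passed the terminate step.
  bool spawn(const std::string& name) {
    std::lock_guard<std::mutex> lock(mutex);
    if (finalizing) {
      return false;
    }
    running.push_back(name);
    return true;
  }

  // Toy version of terminating every running process.
  void terminateAll() {
    std::lock_guard<std::mutex> lock(mutex);
    running.clear();
  }

  // Spawns are still accepted while terminateAll() runs; the guard is only
  // raised afterwards, before helpers such as the gc would be destroyed.
  void finalize() {
    terminateAll();
    {
      std::lock_guard<std::mutex> lock(mutex);
      finalizing = true;
    }
    // Assumption: deleting `gc` and stopping the event loop would happen
    // here, at a point where spawn() can no longer register new processes.
  }

  std::size_t count() {
    std::lock_guard<std::mutex> lock(mutex);
    return running.size();
  }

private:
  std::mutex mutex;
  bool finalizing = false;
  std::vector<std::string> running;
};
```

    In this toy version, a `spawn` call made before `finalize()` succeeds and one made afterwards is refused. A real change would also have to deal with anything spawned in the window between `terminate_all()` and raising the flag.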


- Greg Mann


On July 29, 2016, 11:53 p.m., Joseph Wu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40266/
> -----------------------------------------------------------
> 
> (Updated July 29, 2016, 11:53 p.m.)
> 
> 
> Review request for mesos, Greg Mann, Artem Harutyunyan, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-3910
>     https://issues.apache.org/jira/browse/MESOS-3910
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The `SocketManager` and `ProcessManager` are highly inter-dependent, 
> which requires some untangling in `process::finalize`.
> 
> * Logic originally found in `~ProcessManager` has been split into 
>   `ProcessManager::finalize` due to what happens during cleanup.
> * The future from `__s__->accept()` must be explicitly discarded as 
>   libevent does not detect a locally closed socket.
> * Terminating `HttpProxy`s must close the associated socket.
> 
> 
> Diffs
> -----
> 
>   3rdparty/libprocess/src/process.cpp 7f331b812de2f0437838f48e0959441c8e04c358 
> 
> Diff: https://reviews.apache.org/r/40266/diff/
> 
> 
> Testing
> -------
> 
> `make check` (libev)
> `make check` (--enable-libevent --enable-ssl)
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>

