mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Greg Mann" <g...@mesosphere.io>
Subject Re: Review Request 37821: Join threads in libprocess when shutting down.
Date Thu, 10 Sep 2015 21:25:01 GMT


> On Sept. 10, 2015, 8:20 a.m., Michael Park wrote:
> > I've made a few nit comments below but I have some higher-level questions.
> > 
> >   (1) In this patch, when the destructor of `ProcessManager` is invoked we immediately
start to ignore messages. It's not obvious to me that this is necessary or desired. Could
we just enqueue the `TerminateEvent` by calling `process::terminate(process, false)`, let
it race against other messages that may be coming in, and ignore anything that arrives after
`TerminateEvent`?
> >   (2) What are the implications of shutting down the event loop via `EventLoop::stop()`?
This would be useful to know to learn about the required order between process termination,
event loop shutdown, and thread pool joins.

Thanks for the review mpark!!

Regarding (2), `EventLoop::stop()` calls methods from libevent/libev that terminate the event
loop after all active callbacks have completed. This means that we can have messages coming
in from a socket after this call is made.

Regarding (1), I was worried about messages coming in on sockets preventing the queues from
draining, but as I think Joris mentioned in our previous discussion, if messages are coming
in fast enough to prevent draining, then they're coming in too fast for just about anything
to happen, so it's not worth worrying about. Further, it's possible that an existing process
needs to receive a socket message in order to terminate, so they should be allowed. I think
that we can do away with `draining_queue`.

Once the event loop is stopped, we won't have any new socket messages coming in, but the question
is if we still need a boolean check in `ProcessManager::handle()` to make sure that any final
callbacks on the event loop don't attempt to queue up events. I think this would only be necessary
if it's possible for a socket message to spawn a new process directly, and I'm not yet sure
if that is the case? If the socket message attempts to queue up an event on an existing process,
the run queue will be empty, so it won't be possible, but if it adds a process to the run
queue, we have a problem. I've removed the boolean check from `ProcessManager::handle()` for
now.


> On Sept. 10, 2015, 8:20 a.m., Michael Park wrote:
> > 3rdparty/libprocess/src/libev.cpp, line 30
> > <https://reviews.apache.org/r/37821/diff/6/?file=1064284#file1064284line30>
> >
> >     Is there a reason why this was moved up? Seems like `async_shutdown` could've
been added below this, at its original location?

This was moved up because, strictly speaking, the `async_watcher` declaration doesn't match
the comment below, which states that we will "Define the initial values for all of the declarations
made in libev.hpp", so it seemed more appropriate to place before the comment. If this order
is awkward the watcher declarations could be moved.


> On Sept. 10, 2015, 8:20 a.m., Michael Park wrote:
> > 3rdparty/libprocess/src/process.cpp, lines 2143-2145
> > <https://reviews.apache.org/r/37821/diff/6/?file=1064286#file1064286line2143>
> >
> >     Could you explain why we want to start ignoring messages here?
> >     
> >     Below, we do `process::terminate(process, false);` which purposely does not
inject the `TerminateEvent` in order to allow the actor to process its queue.
> >     
> >     This is great, but wouldn't we be dropping messages that were sent between 
`draining_queue.store(true);` and when the `TerminateEvent` actually gets enqueued at the
back? Is the decision that this doesn't matter?

See my first comment above.


- Greg


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37821/#review98329
-----------------------------------------------------------


On Sept. 8, 2015, 5:54 p.m., Greg Mann wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/37821/
> -----------------------------------------------------------
> 
> (Updated Sept. 8, 2015, 5:54 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Artem Harutyunyan, Joris Van Remoortere,
Joseph Wu, and Michael Park.
> 
> 
> Bugs: MESOS-3158
>     https://issues.apache.org/jira/browse/MESOS-3158
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Join threads in libprocess when shutting down.
> 
> 
> Diffs
> -----
> 
>   3rdparty/libprocess/src/event_loop.hpp 36a4cd2b1ff59f6922173ad17115bf80cc3c8f30 
>   3rdparty/libprocess/src/libev.cpp 97a2694f9b10bc61841443b21f4f96055493e840 
>   3rdparty/libprocess/src/libevent.cpp ee7906470069b0391dde7cd685b1d4eb3a158c03 
>   3rdparty/libprocess/src/process.cpp 0e5394acff16376809918d583d7aee582cc6da54 
> 
> Diff: https://reviews.apache.org/r/37821/diff/
> 
> 
> Testing
> -------
> 
> After configuring with both "../configure" and "../configure --enable-libevent --enable-ssl":
> 
> make check
> 
> 
> Also, to check for race conditions related to the initialization/shutdown of libprocess,
try something like:
> 
> for n in {1..1000}; do echo $n; 3rdparty/libprocess/tests --gtest_filter=ProcessTest.Spawn;
done
> 
> 
> Thanks,
> 
> Greg Mann
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message