mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jie Yu <yujie....@gmail.com>
Subject Re: Review Request 51406: Updated launch helper to optionally spawn an 'init' process on linux.
Date Sat, 24 Sep 2016 19:58:07 GMT


> On Sept. 24, 2016, 1:33 a.m., Jie Yu wrote:
> > src/slave/containerizer/mesos/launch.cpp, line 140
> > <https://reviews.apache.org/r/51406/diff/7/?file=1509426#file1509426line140>
> >
> >     Looks like sprintf is not async safe in glibc:
> >     http://stackoverflow.com/questions/14573000/print-int-from-signal-handler-using-write-or-async-safe-functions
> >     
> >     cerr << below is not async safe either.
> >     
> >     I think it's highly unlikely that we'll hit the async safe problem here because
most of the time, our code will be in syscall. So I suggest we simply this function and add
a TODO about async safety (we try to be async safe, but if it's too hard, use a simple way):
> >     
> >     ```
> >     Try<Nothing> write = os::write(
> >         containerStatusFd.get(),
> >         std::to_string(status));
> >     
> >     if (write.isError()) {
> >       os::write(
> >           STDERR_FILENO,
> >           "Failed to write container status '" + statusString +
> >           "': " + ::strerror(errno));
> >     }
> >     ```
> >     
> >     I'd not do the `_exit` here because this helper is meant to be writting the
status. Leave the `_exit` to the caller.
> 
> Kevin Klues wrote:
>     Yeah, that's my bad. I didn't look closely enough at `sprintf()` and assumed it was
async signal safe since it takes a pointer to the destination where to write the result. I
forgot that it does buffered I/O under the hood though. Doh!
>     
>     The `cerr <<` was just an oversight. Obviously this is not async signal safe!
>     
>     Regarding the exit... If we don't want to do an exit here, then the function should
return a `bool` so the caller can decide what to do.

Looks like all callers do not care. I'll keep it as void for now.


- Jie


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51406/#review150266
-----------------------------------------------------------


On Sept. 23, 2016, 8:59 p.m., Kevin Klues wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51406/
> -----------------------------------------------------------
> 
> (Updated Sept. 23, 2016, 8:59 p.m.)
> 
> 
> Review request for mesos and Jie Yu.
> 
> 
> Bugs: MESOS-6088
>     https://issues.apache.org/jira/browse/MESOS-6088
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Previously the 'mesos-containerizer launch' binary would simply exec
> into the actual command we wanted to launch after doing some set of
> preperatory work. The problem with this approach, however, is that
> this gives us no opportunity to checkpoint the wait status of the
> command so the agent can recover it in cases where it is offline at
> the time the command completes.
> 
> To compensate for this, we now allow 'mesos-containerizer launch' to
> take an optional '--runtime_directory' argument, which indicates where
> to checkpint the status of the launched command (as returned by
> 'waitpid()'). In order to do this checkpointing, however, we cannot
> simply exec into the command anymore. Instead we now fork-exec the
> command and reap it once it completes. We then checkpoint its status
> and return it as our own. We call the original process the 'init'
> process of the container and the fork-exec'd process its child.  As a
> side effect of doing things this way, we also have to be careful to
> forward all signals sent to the init process down to the child.
> 
> In a subsequent commit we will update the mesos containerizer to use
> this new functionality.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/mesos/launch.hpp 859990cb85e9e8c06400397256cfc512f0811800 
>   src/slave/containerizer/mesos/launch.cpp 777121cffaa5fc76adfefc1e617409d997c7aac8 
> 
> Diff: https://reviews.apache.org/r/51406/diff/
> 
> 
> Testing
> -------
> 
> $ GTEST_FILTER="" make -j check
> $ src/mesos-tests
> $ sudo src/mesos-tests
> 
> 
> Thanks,
> 
> Kevin Klues
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message