mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Kevin Klues <klue...@gmail.com>
Subject Re: Review Request 52446: Updated 'destroy()' to checkpoint termination state of nested container.
Date Fri, 30 Sep 2016 23:08:32 GMT


> On Sept. 30, 2016, 9:52 p.m., Jie Yu wrote:
> > src/slave/containerizer/mesos/containerizer.cpp, lines 711-713
> > <https://reviews.apache.org/r/52446/diff/1/?file=1517218#file1517218line711>
> >
> >     Can you introduce a helper in paths.hpp|cpp:
> >     
> >     ```
> >     paths::getContainerTerminationPath(...);
> >     ```

Sure. None of the other files have this currently though (i.e. pid, status).
Maybe we should commit this patch without this helper and then do a separate patch that adds
this helper for all checkpointed files.

I will hold off on this one until you confirm.


> On Sept. 30, 2016, 9:52 p.m., Jie Yu wrote:
> > src/slave/containerizer/mesos/containerizer.cpp, lines 1699-1705
> > <https://reviews.apache.org/r/52446/diff/1/?file=1517218#file1517218line1699>
> >
> >     Let's do that only for nested container.

It should still work as is (since top-level containers will simply never have this file),
but it's probably better to be explicit. I'll change it.


> On Sept. 30, 2016, 9:52 p.m., Jie Yu wrote:
> > src/slave/containerizer/mesos/containerizer.cpp, lines 2197-2203
> > <https://reviews.apache.org/r/52446/diff/1/?file=1517218#file1517218line2197>
> >
> >     NO need for this? I think checkpoint supports a Protobuf Message directly.

Cool. I didn't realize this.


> On Sept. 30, 2016, 9:52 p.m., Jie Yu wrote:
> > src/slave/containerizer/mesos/containerizer.cpp, line 2211
> > <https://reviews.apache.org/r/52446/diff/1/?file=1517218#file1517218line2211>
> >
> >     Totally orthogonol issue: I am wondering if we should prevent people from creating
nested container under a legacy container?

It would definitely be a bug if they did because no runtime directory will exist for legacy
containers. It would be a simple check in launch to verify this.


> On Sept. 30, 2016, 9:52 p.m., Jie Yu wrote:
> > src/tests/containerizer/nested_container_tests.cpp, line 32
> > <https://reviews.apache.org/r/52446/diff/1/?file=1517221#file1517221line32>
> >
> >     Please avoid this. Use 'using' explicitly

Yeah, this must have been copied in. I will be more careful next time.


> On Sept. 30, 2016, 9:52 p.m., Jie Yu wrote:
> > src/tests/containerizer/nested_container_tests.cpp, line 59
> > <https://reviews.apache.org/r/52446/diff/1/?file=1517221#file1517221line59>
> >
> >     s/false/true/

Are we making this tru for all tests going forward? I see a mix of false/true in existing
tests.


> On Sept. 30, 2016, 9:52 p.m., Jie Yu wrote:
> > src/tests/containerizer/nested_container_tests.cpp, lines 95-97
> > <https://reviews.apache.org/r/52446/diff/1/?file=1517221#file1517221line95>
> >
> >     No need for this.

Copy and paste leftovers...


> On Sept. 30, 2016, 9:52 p.m., Jie Yu wrote:
> > src/tests/containerizer/nested_container_tests.cpp, line 116
> > <https://reviews.apache.org/r/52446/diff/1/?file=1517221#file1517221line116>
> >
> >     This looks redundant? The fact the first 'wait' returns means that container
has been destroyed.

Yeah, it is. I only wrote it to be explicit (it actually always ends up logging that the container
is already destroyed). Maybe it's more confusing this way though.


- Kevin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52446/#review151092
-----------------------------------------------------------


On Sept. 30, 2016, 11:08 p.m., Kevin Klues wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52446/
> -----------------------------------------------------------
> 
> (Updated Sept. 30, 2016, 11:08 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, and Vinod Kone.
> 
> 
> Bugs: MESOS-6287
>     https://issues.apache.org/jira/browse/MESOS-6287
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Previously, when a nested container was being destroyed, it's runtime
> directory was being deleted (just the same as a top-level container).
> However, this meant that calling 'wait()' on a previously terminated
> nested container would return 'None()' since its status had already
> been reaped. The problem with this, however, is that this will cause
> an entire pod to be terminated since it thinks that the container it
> is calling wait on cannot be found.
> 
> To fix this, we leave the runtime directory of nested containers
> around until their top-level containers are destroyed. Additionally,
> we checkpiont the entire termination state of the nested container
> into its runtime directory, so that subsequent calls to 'wait()' can
> retrieve the full termination state for the lifetime of the top-level
> container.
> 
> 
> Diffs
> -----
> 
>   src/Makefile.am f093000e0282a8d5ac17e7ba33711690ccdfe68a 
>   src/slave/containerizer/mesos/containerizer.cpp 522d2c37229b07b66a0824c3e246c32f8d803b10

>   src/slave/containerizer/mesos/paths.hpp 1051c219c55253d03199045b6d2f43377ae93e53 
>   src/slave/containerizer/mesos/paths.cpp 6c6b4dcc39fbc00485552caab88457918e622e08 
>   src/tests/containerizer/nested_container_tests.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/52446/diff/
> 
> 
> Testing
> -------
> 
> GTEST_FILTER="" make -j check
> sudo src/mesos-tests
> 
> 
> Thanks,
> 
> Kevin Klues
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message