mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Qian Zhang <zhq527...@gmail.com>
Subject Re: Review Request 71965: Cgroups isolator: added support for nested cgroups during launch.
Date Wed, 22 Jan 2020 09:21:14 GMT


> On Jan. 13, 2020, 11:21 a.m., Qian Zhang wrote:
> > src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp
> > Line 459 (original), 480 (patched)
> > <https://reviews.apache.org/r/71965/diff/1/?file=2199231#file2199231line480>
> >
> >     I think now we need to do this for nested containers whose `share_cgroups` is
false, because they may need to create cgroups under their own nested cgroups.
> 
> Andrei Budnik wrote:
>     Do we allow creating cgroups under nested cgroups of those nested containers? Since
we do not use a `mesos` separator, the cgroups isolator may pick up those underlying cgroups
during recovery, which is not what expected.
> 
> Qian Zhang wrote:
>     Actually we are not able to disallow that, it is up to the task itself, if the task
wants to create a child cgroup under its own cgroup, then I think it can just do it. But yes
I agree that may bring trouble to recovery, maybe we should use a `mesos` separator so that
the cgroups created by the tasks will not affect our recovery, how do you think?
> 
> Andrei Budnik wrote:
>     After enabling `mesos` separator between cgroups, launching a nested container with
`share_cgroup=false` fails with "No space left on device" error. The root cause of this failure
is described in the comment to our function `cloneCpusetCpusMems` and in this SO article:
https://stackoverflow.com/questions/28348627/echo-tasks-gives-no-space-left-on-device-when-trying-to-use-cpuset
>     
>     
>     `CgroupsIsolatorProcess::prepare()` calls `cgroups::create()` which calls `cloneCpusetCpusMems()`
function.
>     
>     
>     ```
>     // Copies the value of 'cpuset.cpus' and 'cpuset.mems' from a parent
>     // cgroup to a child cgroup so the child cgroup can actually run tasks
>     // (otherwise it gets the error 'Device or resource busy').
>     ...
>     static Try<Nothing> cloneCpusetCpusMems(
>     ```
>     
>     
>     When we call `cgroups::create()` for a nested cgroup, `cloneCpusetCpusMems()` tries
to copy those “cpuset” files from a parent container that has a suffix `<path to parent
container>/mesos` to a child container with suffix `<path to parent container>/mesos/childContainerId`.
Since those "cpuset" files are not filled/initialized for the parent that ends with `/mesos`
suffix, `cloneCpusetCpusMems()` copies those empty files to the child cgroup of `/mesos/childContainerId`.
Therefore, the cgroups isolator fails to assign a PID to the cgroup for the child container
during `isolate()` for the `cpuset` cgroup (for the reason described in the beggining of this
comment).
>     
>     I'm not sure about the best way to fix/workaround this issue. I decided to get rid
of a separator between cgroups. AFAIK, k8s does not use any separator between containers within
the POD.
>     
>     Do you happen to know any better approach to the problem described above?

Good to know!

It seems a bug of `cgroups::create()` to me, I mean if `cgroups::create()` is used to create
a cgroup recursively (like: `mesos/childContainerId`), it should clone `cpuset.cpus` and `cpuset.mems`
recursively too (both `mesos` and `childContainerId`) rather than just clone them for the
leaf cgroup (`childContainerId`), right?

If we do not want to change the code of `cgroups::create()`, then a workaround could be calling
`cgroups::create()` non-recursively, i.e., call this function twice, one for `mesos` and another
for `mesos/childContainerId`. But I would suggest to fix `cgroups::create()` if possible.


- Qian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71965/#review219227
-----------------------------------------------------------


On Jan. 10, 2020, 5:03 a.m., Andrei Budnik wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71965/
> -----------------------------------------------------------
> 
> (Updated Jan. 10, 2020, 5:03 a.m.)
> 
> 
> Review request for mesos, Greg Mann and Qian Zhang.
> 
> 
> Bugs: MESOS-10076
>     https://issues.apache.org/jira/browse/MESOS-10076
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This patch adds support for nested cgroups for nested containers.
> Nested cgroups are created only for a nested container with explicitly
> enabled `share_cgroups` flag. The cgroups isolator stores info about
> nested cgroups in the `infos` class variable, which is used to
> determine whether a nested container has its nested cgroup.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp 4bd3d6dad37dee031660c15e957cc36f63e21fcb

>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp b12b73d8e0161d448075378765e77867521de04e

> 
> 
> Diff: https://reviews.apache.org/r/71965/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message