cf-natali opened a new pull request #355: Handle EBUSY when destroying a cgroup.
URL: https://github.com/apache/mesos/pull/355
It's a workaround for kernel bugs
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/kernel/cgroup/cgroup.c?id=9c974c77246460fa6a92c18554c3311c8c83c160
and
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/kernel/cgroup/cgroup.c?id=c03cd7738a83b13739f00546166969342c8ff014
Fixes MESOS-10107.
@abudnik
> Does the workaround work reliably after changing the initial delay and retry count
> to the values taken from libcontainerd (10ms and 5)?
Yes, however I chose 1ms and 10 for two reasons:
- this possibly yields lower latency
- more importantly, while doing an strace I can see that it can sometimes take up to
100-200ms for rmdir to succeed:
```
[pid 1965] 13:22:36.021260 rmdir("/sys/fs/cgroup/freezer/mesos/b99efad6-b9eb-43bd-8242-29a2b321dd07")
= -1 EBUSY (Périphérique ou ressource occupé) <0.000017>
[pid 1965] 13:22:36.022604 rmdir("/sys/fs/cgroup/freezer/mesos/b99efad6-b9eb-43bd-8242-29a2b321dd07")
= -1 EBUSY (Périphérique ou ressource occupé) <0.000018>
[pid 1965] 13:22:36.024807 rmdir("/sys/fs/cgroup/freezer/mesos/b99efad6-b9eb-43bd-8242-29a2b321dd07")
= -1 EBUSY (Périphérique ou ressource occupé) <0.000080>
[pid 1965] 13:22:36.029116 rmdir("/sys/fs/cgroup/freezer/mesos/b99efad6-b9eb-43bd-8242-29a2b321dd07")
= -1 EBUSY (Périphérique ou ressource occupé) <0.000466>
[pid 1965] 13:22:36.037990 rmdir("/sys/fs/cgroup/freezer/mesos/b99efad6-b9eb-43bd-8242-29a2b321dd07")
= -1 EBUSY (Périphérique ou ressource occupé) <0.000190>
[pid 1965] 13:22:36.054528 rmdir("/sys/fs/cgroup/freezer/mesos/b99efad6-b9eb-43bd-8242-29a2b321dd07")
= -1 EBUSY (Périphérique ou ressource occupé) <0.000038>
[pid 1965] 13:22:36.086874 rmdir("/sys/fs/cgroup/freezer/mesos/b99efad6-b9eb-43bd-8242-29a2b321dd07")
= -1 EBUSY (Périphérique ou ressource occupé) <0.000029>
[pid 3225] 13:22:36.127365 +++ killed by SIGKILL +++
[pid 1965] 13:22:36.151151 rmdir("/sys/fs/cgroup/freezer/mesos/b99efad6-b9eb-43bd-8242-29a2b321dd07")
= 0 <0.000114>
```
And 10ms with 5 retries only gives about 320ms (10 * 2**5) in total, so I'd rather have a bit more margin.
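To make the margin comparison concrete, here is a minimal sketch of the arithmetic. It assumes the delay doubles after every failed attempt, which matches the spacing of the `rmdir` calls in the strace above. `totalBackoffMs` is a hypothetical helper for illustration only, not something in Mesos or libcontainerd.

```cpp
#include <cstdint>

// Hypothetical helper: total time spent sleeping, in milliseconds, when the
// first wait is `initialMs` and each subsequent wait is twice the previous
// one, for `retries` waits in total.
inline std::uint64_t totalBackoffMs(std::uint64_t initialMs, unsigned retries)
{
  std::uint64_t total = 0;
  std::uint64_t delay = initialMs;

  for (unsigned i = 0; i < retries; ++i) {
    total += delay;
    delay *= 2; // Exponential backoff: double the wait each time.
  }

  return total;
}
```

Under that assumption, `totalBackoffMs(10, 5)` is 310 (10 + 20 + 40 + 80 + 160), just under the 10 * 2**5 bound quoted above, while `totalBackoffMs(1, 10)` is 1023, so 1ms with 10 retries leaves roughly a second of headroom.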
> Should we retry only if `::rmdir()` returns EBUSY errno error?
Definitely - I wanted to do that but I'm not sure what the best way to do it is: is there
a way to access `errno` from `Try<Nothing> rmdir`, or can I just assume that the global
`errno` is preserved and access it directly?
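For reference, one possible shape for an EBUSY-only retry is sketched below. It sidesteps the question by calling `::rmdir()` directly and reading `errno` right after the failed call, rather than going through `Try<Nothing> rmdir`. Whether `errno` survives the existing wrapper is not something this sketch answers. The names `retryOnBusy` and `rmdirRetryOnBusy` are hypothetical, and the defaults of 1ms and 10 retries are simply the values discussed above.

```cpp
#include <cerrno>
#include <chrono>
#include <string>
#include <thread>

#include <unistd.h>

// Hypothetical helper, not a Mesos API. `op` must behave like a POSIX call:
// return 0 on success, or -1 with `errno` set on failure.
// Returns 0 on success, otherwise the errno value of the last failed attempt.
template <typename Op>
int retryOnBusy(Op op, std::chrono::milliseconds delay, unsigned retries)
{
  for (unsigned attempt = 0;; ++attempt) {
    if (op() == 0) {
      return 0;
    }

    // Capture errno immediately, before any other call can overwrite it.
    const int error = errno;

    // Only EBUSY is treated as transient; stop once retries are exhausted.
    if (error != EBUSY || attempt >= retries) {
      return error;
    }

    std::this_thread::sleep_for(delay);
    delay *= 2; // Exponential backoff.
  }
}

// Removes `path`, retrying only while rmdir(2) reports EBUSY.
inline int rmdirRetryOnBusy(
    const std::string& path,
    std::chrono::milliseconds delay = std::chrono::milliseconds(1),
    unsigned retries = 10)
{
  return retryOnBusy(
      [&path]() { return ::rmdir(path.c_str()); }, delay, retries);
}
```

Any error other than EBUSY (for example ENOENT) is returned after the first attempt, so unrelated failures do not pay the backoff cost.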
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services