From reviews-return-91552-apmail-mesos-reviews-archive=mesos.apache.org@mesos.apache.org Tue Apr 7 18:21:04 2020 Return-Path: X-Original-To: apmail-mesos-reviews-archive@minotaur.apache.org Delivered-To: apmail-mesos-reviews-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [207.244.88.153]) by minotaur.apache.org (Postfix) with SMTP id 254A719B39 for ; Tue, 7 Apr 2020 18:21:04 +0000 (UTC) Received: (qmail 29514 invoked by uid 500); 7 Apr 2020 18:21:03 -0000 Delivered-To: apmail-mesos-reviews-archive@mesos.apache.org Received: (qmail 29489 invoked by uid 500); 7 Apr 2020 18:21:03 -0000 Mailing-List: contact reviews-help@mesos.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: reviews@mesos.apache.org Delivered-To: mailing list reviews@mesos.apache.org Received: (qmail 29477 invoked by uid 99); 7 Apr 2020 18:21:03 -0000 Received: from ec2-52-202-80-70.compute-1.amazonaws.com (HELO gitbox.apache.org) (52.202.80.70) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 07 Apr 2020 18:21:03 +0000 From: GitBox To: reviews@mesos.apache.org Subject: [GitHub] [mesos] cf-natali commented on a change in pull request #355: Handle EBUSY when destroying a cgroup. Message-ID: <158628366344.5554.6921993188300531318.gitbox@gitbox.apache.org> References: In-Reply-To: Date: Tue, 07 Apr 2020 18:21:03 -0000 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit cf-natali commented on a change in pull request #355: Handle EBUSY when destroying a cgroup. URL: https://github.com/apache/mesos/pull/355#discussion_r405018860 ########## File path: src/linux/cgroups.cpp ########## @@ -696,13 +696,24 @@ Try remove(const string& hierarchy, const string& cgroup) string path = path::join(hierarchy, cgroup); // Do NOT recursively remove cgroups. - Try rmdir = os::rmdir(path, false); - if (rmdir.isError()) { - return Error( + // TODO The retry was added as a fix for kernel bug + // https://lkml.org/lkml/2020/1/15/1349 + // where calling rmdir on a seemingly empty cgroup can fail + // with EBUSY while some tasks are exiting. + auto delay = Milliseconds(1); + for (auto retry = 10; ;) { + Try rmdir = os::rmdir(path, false); + if (!rmdir.isError()) { + return rmdir; + } else if (retry > 0) { + os::sleep(delay); Review comment: Hm, I'm still confused: in your version, you capture `retry` and `delay` in the lambdas by value, so their modifications are lost: ``` return loop( // NOTE: copy `delay` by value in the first lambda function. [=]() mutable { auto timeout = process::after(delay); delay *= 2; return timeout; }, // NOTE: copy `retry` by value in the second lambda function. [=](const Nothing&) mutable -> Future> { [...] } else if ((errno == EBUSY) && (retry > 0)) { --retry; return process::Continue(); ``` I think that with this, `delay` and `retry` will *not* be updated from one iteration to the next: so basically we will keep retrying the `rmdir` every ms forever. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: users@infra.apache.org With regards, Apache Git Services