Subject: Re: Review Request 63555: Publish resource provider resources before container launch or update.
From: Chun-Hung Hsiao
To: Jie Yu, Joseph Wu, Gilbert Song, Jan Schlicht
Cc: Chun-Hung Hsiao, mesos
Date: Tue, 14 Nov 2017 18:46:59 -0000

> On Nov. 9, 2017, 12:52 p.m., Jan Schlicht wrote:
> > src/slave/slave.cpp
> > Lines 6816-6822 (patched)
> >
> >     So this will try to publish all RP resources of all executors of all frameworks? Or am I missing something here? I'd expect that only the RP resources of the task/executor that's about to get started should be published. Hence `resourceProviderManager->publish(info.id(), executor->allocatedResources())` should be enough.
>
> Chun-Hung Hsiao wrote:
>     For resources with unique identifiers (such as storage volumes in SLRP), it is enough to publish the resources about to be used by the executor. However, for resources without any identifier, such as the default resources of an agent (CPUs, memory, disk), there is no way for resource providers to know what the allocations are, since we don't unpublish. Imagine that the resource provider keeps receiving "publish 2 CPUs": should it keep publishing 2 new CPUs every time? This is the motivation for giving `PUBLISH` "ensure-all" semantics. Given these semantics, we need to publish all RP resources for all executors.
>
>     An alternative is to make the resource provider manager aware of executors, so that it can keep track of the resources used by each executor and compute what the total resources should be. But then 1) this is agent-only, so I think it is not appropriate if we want to have the same manager code in both agent and master; and 2) the agent needs to notify the manager when an executor finishes.
>
> Jan Schlicht wrote:
>     Thanks for the explanation! I don't understand, though, why a resource provider should be asked to "publish 2 CPUs". In my understanding, "publish" is only meant for resource provider resources, thus agent resources should be part of these operations. Or are there use cases where a resource provider might need to know the agent resources of a task?

Sorry for not being clear. I'm taking the default resources (CPUs) as an example since we have some vision of managing the default resources with a default resource provider in the future. My point is that if we want to support RPs that provide resources with only types and quantities but no IDs, then unless we do some refactoring to support `UNPUBLISH`, we need to tell the RP the total amount of resources in use every time we ask it to publish.


- Chun-Hung


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63555/#review190572
-----------------------------------------------------------


On Nov. 4, 2017, 1:55 a.m., Chun-Hung Hsiao wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63555/
> -----------------------------------------------------------
>
> (Updated Nov. 4, 2017, 1:55 a.m.)
>
>
> Review request for mesos, Gilbert Song, Jie Yu, Joseph Wu, and Jan Schlicht.
>
>
> Bugs: MESOS-7550
>     https://issues.apache.org/jira/browse/MESOS-7550
>
>
> Repository: mesos
>
>
> Description
> -------
>
> `Slave::publishAllocatedResources()` computes the total allocated
> resources for all currently running executor containers, takes an
> `extra` argument for the resources that will be used by the executor
> that is about to launch, sums them up, and asks the resource provider
> manager to publish the result.
>
>
> Diffs
> -----
>
>   src/slave/slave.hpp df1b0205124555dcb6a0efa5c237f5e77fa2bdf7
>   src/slave/slave.cpp 337083dbe60bba2d3773b785bdceeaf0b8fcd070
>
>
> Diff: https://reviews.apache.org/r/63555/diff/1/
>
>
> Testing
> -------
>
> make check
>
>
> Thanks,
>
> Chun-Hung Hsiao
>
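
To make the "ensure-all" semantics discussed in this thread concrete, here is a minimal C++ sketch under stated assumptions. `ResourceQuantities`, `addQuantities`, `totalToPublish`, and `EnsureAllProvider` are hypothetical names made up for illustration and are not part of Mesos; the real `Slave::publishAllocatedResources()` works on Mesos `Resources` objects and goes through the resource provider manager. The sketch only tries to show why sending the full total on every call keeps `PUBLISH` idempotent for resources that carry a type and a quantity but no ID.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for Mesos' `Resources`: resource name -> quantity.
// Only models resources that have a type and an amount but no identifier.
using ResourceQuantities = std::map<std::string, double>;

// Adds every quantity in `rhs` to the matching entry in `lhs`.
void addQuantities(ResourceQuantities& lhs, const ResourceQuantities& rhs) {
  for (const auto& entry : rhs) {
    assert(entry.second >= 0.0);  // Quantities are assumed non-negative.
    lhs[entry.first] += entry.second;
  }
}

// Sums the resources of all running executors plus the `extra` resources
// of the executor that is about to launch (mirrors the review description).
ResourceQuantities totalToPublish(
    const std::vector<ResourceQuantities>& running,
    const ResourceQuantities& extra) {
  ResourceQuantities total;
  addQuantities(total, extra);
  for (const ResourceQuantities& allocated : running) {
    addQuantities(total, allocated);
  }
  return total;
}

// Hypothetical provider with "ensure-all" semantics: each publish call
// states the complete desired set, so the provider replaces its state
// instead of accumulating on top of earlier calls.
struct EnsureAllProvider {
  ResourceQuantities published;

  void publish(const ResourceQuantities& total) { published = total; }
};
```

For example, with running executors holding 2 CPUs and 1 CPU, and a launching executor asking for 2 CPUs, every call publishes a total of 5 CPUs, so repeating the call does not change the provider's state. An incremental "publish 2 CPUs" call would not have that property without a matching `UNPUBLISH`.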