Subject: Re: Review Request 61473: Do not kill non partition aware tasks.
From: Jiang Yan Xu
To: Vinod Kone, Jiang Yan Xu, James Peach
Cc: Megha Sharma, mesos
Date: Sat, 30 Sep 2017 00:37:36 -0000
Message-ID: <20170930003736.860.60881@reviews-vm2.apache.org>
In-Reply-To: <20170924224632.35376.85191@reviews-vm2.apache.org>
X-ReviewRequest-URL: https://reviews.apache.org/r/61473/
X-ReviewRequest-Repository: mesos

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61473/#review186737
-----------------------------------------------------------

As I commented on the JIRA, we should probably bring the work for MESOS-6406 into this JIRA because this JIRA wouldn't be complete without it. It certainly should be a different patch, though.

For this patch we should also update the comments above the `PARTITION_AWARE` API to reflect the change.
https://github.com/apache/mesos/blob/6eefc685ccf304d0fb8ed4ff9bc314197d77f078/include/mesos/mesos.proto#L336

Also please search the docs for necessary changes.


src/tests/partition_tests.cpp
Line 526 (original), 526 (patched)

    s/still/are still/


src/tests/partition_tests.cpp
Line 708 (original), 708 (patched)

    s/the not PARTITION_AWARE framework/the non-PARTITION_AWARE framework/


src/tests/partition_tests.cpp
Lines 728-729 (original), 728-729 (patched)

    Update comments?


src/tests/partition_tests.cpp
Line 2045 (patched)

    Perhaps add a comment: `// The agent may resend status updates.`


- Jiang Yan Xu


On Sept. 24, 2017, 3:46 p.m., Megha Sharma wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61473/
> -----------------------------------------------------------
>
> (Updated Sept. 24, 2017, 3:46 p.m.)
>
>
> Review request for mesos, James Peach, Vinod Kone, and Jiang Yan Xu.
>
>
> Bugs: MESOS-7215
>     https://issues.apache.org/jira/browse/MESOS-7215
>
>
> Repository: mesos
>
>
> Description
> -------
>
> The master will no longer kill the tasks of non-partition-aware
> frameworks when an unreachable agent re-registers with the master.
> The master used to send a ShutdownFrameworkMessage to the agent
> to kill the tasks of non-partition-aware frameworks, including
> frameworks that are still registered. This was problematic because
> offers from this agent could still go to the same framework, which
> could then launch new tasks. The agent would then receive tasks of the
> same framework and ignore them because it thinks the framework is
> shutting down. The framework is not shutting down, of course, so from
> the master's and the scheduler’s perspective the task stays in STAGING
> until the next agent re-registration, which could happen much later.
> This commit fixes the problem by not shutting down the
> non-partition-aware frameworks on such an agent.
>
>
> Diffs
> -----
>
>   src/master/http.cpp 28d0393fb5962df4d731521265efd81a54e1e655
>   src/master/master.hpp 05f88111afb4fa0e2baf57106e1479914c16a113
>   src/master/master.cpp 6d84a26bff970b842b58dfb69dbf232ba5c16a20
>   src/tests/partition_tests.cpp 0886f4890ac3fec6f38146946892769a99c3e68f
>
>
> Diff: https://reviews.apache.org/r/61473/diff/7/
>
>
> Testing
> -------
>
> make check
>
>
> Thanks,
>
> Megha Sharma
>
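
The behavior change in the quoted description can be sketched roughly as follows. This is a minimal Python model, not Mesos code (Mesos itself is C++): the `Framework` class and the `frameworks_to_shut_down` function are hypothetical names introduced for illustration. It models only the partition-awareness decision described in the review request and ignores any other reason the master might shut a framework down on an agent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Framework:
    """Hypothetical stand-in for a framework that has tasks on the agent."""
    framework_id: str
    partition_aware: bool  # True if the framework has the PARTITION_AWARE capability.


def frameworks_to_shut_down(frameworks, patched=True):
    """Return the IDs of frameworks for which the master would send a
    ShutdownFrameworkMessage when a previously unreachable agent re-registers.

    Only the decision based on partition awareness is modeled here.
    """
    if patched:
        # After the patch: non-partition-aware frameworks are left alone,
        # so the agent keeps accepting new tasks from them.
        return []
    # Before the patch: every non-partition-aware framework was shut down
    # on the agent, even if it was still registered with the master.
    return [f.framework_id for f in frameworks if not f.partition_aware]
```

With one non-partition-aware framework and one partition-aware framework on the agent, the pre-patch path selects only the former, and the patched path selects neither. That difference is what keeps a still-registered framework from having its newly launched tasks ignored by the agent.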