From reviews-return-53254-apmail-mesos-reviews-archive=mesos.apache.org@mesos.apache.org Wed Jan 4 16:10:03 2017 Return-Path: X-Original-To: apmail-mesos-reviews-archive@minotaur.apache.org Delivered-To: apmail-mesos-reviews-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id EDB92197D8 for ; Wed, 4 Jan 2017 16:10:03 +0000 (UTC) Received: (qmail 76147 invoked by uid 500); 4 Jan 2017 16:10:03 -0000 Delivered-To: apmail-mesos-reviews-archive@mesos.apache.org Received: (qmail 76122 invoked by uid 500); 4 Jan 2017 16:10:03 -0000 Mailing-List: contact reviews-help@mesos.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: reviews@mesos.apache.org Delivered-To: mailing list reviews@mesos.apache.org Received: (qmail 76104 invoked by uid 99); 4 Jan 2017 16:10:03 -0000 Received: from reviews-vm.apache.org (HELO reviews.apache.org) (140.211.11.40) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 04 Jan 2017 16:10:03 +0000 Received: from reviews.apache.org (localhost [127.0.0.1]) by reviews.apache.org (Postfix) with ESMTP id D78B2310EF0; Wed, 4 Jan 2017 16:10:02 +0000 (UTC) Content-Type: multipart/alternative; boundary="===============0510213245040824683==" MIME-Version: 1.0 Subject: Re: Review Request 54996: Fix SIGBUS crash on ARM64/AArch64. From: Aaron Wood To: Jie Yu Cc: Mesos ReviewBot , Aaron Wood , mesos , James Peach , Till Toenshoff Date: Wed, 04 Jan 2017 16:10:02 -0000 Message-ID: <20170104161002.13478.43829@reviews.apache.org> X-ReviewBoard-URL: https://reviews.apache.org/ Auto-Submitted: auto-generated Sender: Aaron Wood X-ReviewGroup: mesos X-Auto-Response-Suppress: DR, RN, OOF, AutoReply X-ReviewRequest-URL: https://reviews.apache.org/r/54996/ X-Sender: Aaron Wood References: <20170104054101.13569.39311@reviews.apache.org> In-Reply-To: <20170104054101.13569.39311@reviews.apache.org> Reply-To: Aaron Wood X-ReviewRequest-Repository: mesos --===============0510213245040824683== MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit > On Jan. 4, 2017, 5:41 a.m., James Peach wrote: > > 3rdparty/stout/include/stout/os/linux.hpp, line 120 > > > > > > Consider hoisting this into the `Stack` class: > > > > ``` > > void * Stack::start() { > > return (uint8_t *)address + size; > > } > > ``` Don't we want to avoid C-style casts? - Aaron ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/54996/#review160464 ----------------------------------------------------------- On Jan. 4, 2017, 12:26 a.m., Aaron Wood wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/54996/ > ----------------------------------------------------------- > > (Updated Jan. 4, 2017, 12:26 a.m.) > > > Review request for mesos and Jie Yu. > > > Bugs: MESOS-6835 > https://issues.apache.org/jira/browse/MESOS-6835 > > > Repository: mesos > > > Description > ------- > > Currently in the Linux launcher when the stack is allocated and prepared for a call to clone() it is not properly aligned. This is not an issue for x86 or x64 but for ARM64/AArch64 it is because of the requirement of having the stack aligned to a 16 byte boundary. While x86 and x64 also expect the stack to have a 16 byte aligned stack, it is not enforced. An explanation of the stack and requirements for ARM64 can be found here http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf (specifically section 5.2.2.1 that says SP mod 16 = 0. The stack must be quad-word aligned.) > > Additionally, the way that the stack is currently allocated and passed to clone() accidentally chops off one entry, making a stack overflow using those missing 8 bytes a possibility. Fixing this while aligning the memory will fix both the issue of the stack overflow issue as well as the SIGBUS crash. We should also net better performance from having the stack aligned. > > > Diffs > ----- > > 3rdparty/stout/include/stout/os/linux.hpp 530f1a55b > src/linux/ns.hpp 77789717e > > Diff: https://reviews.apache.org/r/54996/diff/ > > > Testing > ------- > > Built Mesos from source and am currently running it in a test cluster. Launched both Docker and Mesos tasks via Marathon without any resulting crash (initial crash only happened with Mesos containerizer + linux_launcher, not with the posix_launcher). > > > Thanks, > > Aaron Wood > > --===============0510213245040824683==--