mesos-reviews mailing list archives

From Aaron Wood <aaron.w...@verizon.com>
Subject Re: Review Request 54996: Fix SIGBUS crash on ARM64/AArch64.
Date Wed, 04 Jan 2017 00:26:48 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54996/
-----------------------------------------------------------

(Updated Jan. 4, 2017, 12:26 a.m.)


Review request for mesos and Jie Yu.


Changes
-------

Created a stack class so that the logic for creating an aligned stack can be shared. Used this
new class in one other place in the codebase where a stack was being allocated.


Bugs: MESOS-6835
    https://issues.apache.org/jira/browse/MESOS-6835


Repository: mesos


Description
-------

Currently, when the Linux launcher allocates and prepares the stack for a call to clone(), the
stack is not properly aligned. This is not an issue on x86 or x64, but it is on ARM64/AArch64,
which requires the stack to be aligned to a 16 byte boundary. x86 and x64 also expect a 16 byte
aligned stack, but the alignment is not enforced. An explanation of the stack and its
requirements for ARM64 can be found here: http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf
(specifically section 5.2.2.1, which says "SP mod 16 = 0. The stack must be quad-word aligned.")

Additionally, the way the stack is currently allocated and passed to clone() accidentally
chops off one entry, so a stack overflow into those missing 8 bytes is possible. Fixing this
while aligning the memory addresses both the stack overflow and the SIGBUS crash. We should
also see better performance from having the stack aligned.
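
As an illustration only, the two fixes described above could be sketched roughly as follows.
The names AlignedStack and kStackAlignment are hypothetical and are not the class added in
this review, and using posix_memalign() to obtain the memory is an assumption. The sketch
returns the address one past the end of the buffer as the stack top (so no entry is lost)
and keeps both the base and the size multiples of 16 (so the top satisfies SP mod 16 = 0).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical constant: the 16 byte alignment that AArch64 requires for SP.
constexpr std::size_t kStackAlignment = 16;

// Hypothetical helper class, not the one from the review: owns a heap
// allocated stack whose base and top are both 16 byte aligned.
class AlignedStack
{
public:
  explicit AlignedStack(std::size_t size)
    : size_(roundUp(size)), base_(nullptr)
  {
    // posix_memalign() returns memory whose address is a multiple of the
    // requested alignment; on failure the stack is left invalid.
    if (::posix_memalign(&base_, kStackAlignment, size_) != 0) {
      base_ = nullptr;
    }
  }

  ~AlignedStack() { ::free(base_); }

  AlignedStack(const AlignedStack&) = delete;
  AlignedStack& operator=(const AlignedStack&) = delete;

  bool valid() const { return base_ != nullptr; }

  std::size_t size() const { return size_; }

  // The stack grows downward on x86 and ARM, so clone() expects the highest
  // address. Returning one past the end (base + size) keeps every byte of
  // the allocation usable; pointing at the last element instead would give
  // up one entry.
  void* top() const { return static_cast<char*>(base_) + size_; }

private:
  // Round the requested size up to a multiple of the alignment so that
  // base + size is aligned whenever base is.
  static std::size_t roundUp(std::size_t n)
  {
    return (n + kStackAlignment - 1) & ~(kStackAlignment - 1);
  }

  std::size_t size_;
  void* base_;
};
```

A caller would construct the stack, check valid(), and pass stack.top() as the child stack
argument of clone(). The object has to outlive the child's use of that memory.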


Diffs (updated)
-----

  3rdparty/stout/include/stout/os/linux.hpp 530f1a55b 
  src/linux/ns.hpp 77789717e 

Diff: https://reviews.apache.org/r/54996/diff/


Testing
-------

Built Mesos from source and am currently running it in a test cluster. Launched both Docker
and Mesos tasks via Marathon without any resulting crash (the initial crash only happened with
the Mesos containerizer + linux_launcher, not with the posix_launcher).


Thanks,

Aaron Wood

