mesos-reviews mailing list archives

From Jie Yu <yujie....@gmail.com>
Subject Re: Review Request 54996: Fix SIGBUS crash on ARM64/AArch64.
Date Wed, 04 Jan 2017 19:28:55 GMT


> On Jan. 4, 2017, 5:58 p.m., Jie Yu wrote:
> > 3rdparty/stout/include/stout/os/linux.hpp, lines 57-60
> > <https://reviews.apache.org/r/54996/diff/2/?file=1596476#file1596476line57>
> >
> >     I would actually suggest keeping this `create` method, but moving the allocation logic into `create`.
> >     
> >     ```
> >     static Try<Stack> create(size_t size)
> >     {
> >       Stack stack(size);
> >       
> >       if (posix_memalign(...) != 0) {
> >         return ErrnoError("Failed to allocate stack");
> >       }
> >       
> >       return stack;
> >     }
> >     ```
> >     
> >     We can get rid of the `allocate` function. Once created, it's by default allocated.
> 
> Aaron Wood wrote:
>     `address` and `size` would need to be static for this. That opens up another issue, since all stacks would share the same data. Would it be better to get rid of the private constructor, delete the copy constructor and assignment operator, and just have `allocate()` and `deallocate()`?

Why? You can use `stack.address` and `stack.size` inside `create`. The constructor is private,
meaning that only `create` can construct the `Stack`. Removing the copy constructor and
assignment operator sounds good. Can you try that and see if it compiles?
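
A rough, self-contained sketch of the shape being discussed, not the actual stout code. Assumptions: `std::optional` stands in for `Try<Stack>`, the alignment is hard-coded to 16 bytes, and the move constructor and freeing destructor are additions of this sketch to keep it leak-free once copying is deleted.

```cpp
#include <stdlib.h>

#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>

class Stack
{
public:
  // Stand-in for `Try<Stack>`: an empty optional means allocation failed.
  static std::optional<Stack> create(size_t size)
  {
    // `create` is a static member, so it may call the private constructor
    // and touch the members of the local instance directly.
    Stack stack(size);

    // posix_memalign returns 0 on success and an error number otherwise.
    if (::posix_memalign(&stack.address, 16, size) != 0) {
      return std::nullopt;
    }

    return std::optional<Stack>(std::move(stack));
  }

  // Copying is disabled so two objects never own the same memory.
  Stack(const Stack&) = delete;
  Stack& operator=(const Stack&) = delete;

  // Assumption of this sketch: ownership moves, the source is emptied.
  Stack(Stack&& that) noexcept : address(that.address), size(that.size)
  {
    that.address = nullptr;
    that.size = 0;
  }

  // Assumption of this sketch: the destructor releases the memory.
  // free(nullptr) is a no-op, so moved-from objects are safe.
  ~Stack() { ::free(address); }

  void* address;
  size_t size;

private:
  explicit Stack(size_t size_) : address(nullptr), size(size_) {}
};
```

Because `address` and `size` are ordinary members of the local `stack` object, nothing needs to be static and no two stacks share data.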


- Jie


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54996/#review160513
-----------------------------------------------------------


On Jan. 4, 2017, 12:26 a.m., Aaron Wood wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54996/
> -----------------------------------------------------------
> 
> (Updated Jan. 4, 2017, 12:26 a.m.)
> 
> 
> Review request for mesos and Jie Yu.
> 
> 
> Bugs: MESOS-6835
>     https://issues.apache.org/jira/browse/MESOS-6835
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Currently in the Linux launcher, when the stack is allocated and prepared for a call to clone(), it is not properly aligned. This is not an issue on x86 or x64, but it is on ARM64/AArch64, which requires the stack to be aligned to a 16-byte boundary. x86 and x64 also expect a 16-byte-aligned stack, but do not enforce it. An explanation of the stack and its requirements for ARM64 can be found here: http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf (specifically section 5.2.2.1, which says SP mod 16 = 0; the stack must be quad-word aligned).
> 
> Additionally, the way the stack is currently allocated and passed to clone() accidentally chops off one entry, making a stack overflow into those missing 8 bytes a possibility. Fixing this while aligning the memory will fix both the stack overflow issue and the SIGBUS crash. We should also net better performance from having the stack aligned.
> 
> 
> Diffs
> -----
> 
>   3rdparty/stout/include/stout/os/linux.hpp 530f1a55b 
>   src/linux/ns.hpp 77789717e 
> 
> Diff: https://reviews.apache.org/r/54996/diff/
> 
> 
> Testing
> -------
> 
> Built Mesos from source and am currently running it in a test cluster. Launched both Docker and Mesos tasks via Marathon without any resulting crash (the initial crash only happened with the Mesos containerizer + linux_launcher, not with the posix_launcher).
> 
> 
> Thanks,
> 
> Aaron Wood
> 
>
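
For the alignment requirement in the quoted description, here is a small illustration of computing a 16-byte-aligned stack top to hand to clone(). This is only a sketch: `alignedStackTop` is a hypothetical helper, not something in the patch, and it assumes a downward-growing stack, as on x86-64 and AArch64.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: given the base and size of an allocated region,
// return the highest address that is still inside the region (or one past
// its end) and is a multiple of 16, i.e. satisfies SP mod 16 = 0.
inline void* alignedStackTop(void* base, size_t size)
{
  // The stack grows downward, so the child starts at the end of the region.
  uintptr_t top = reinterpret_cast<uintptr_t>(base) + size;

  // Clear the low four bits to round down to a 16-byte boundary.
  return reinterpret_cast<void*>(top & ~static_cast<uintptr_t>(15));
}
```

If the base is already 16-byte aligned (for example from posix_memalign) and the size is a multiple of 16, the result is exactly `base + size`, so no entry is lost at the top of the stack.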

