mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From James Peach <jpe...@apache.org>
Subject Re: Review Request 54996: Fix SIGBUS crash on ARM64/AArch64.
Date Wed, 04 Jan 2017 05:41:01 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54996/#review160464
-----------------------------------------------------------




3rdparty/stout/include/stout/os/linux.hpp (line 60)
<https://reviews.apache.org/r/54996/#comment231621>

    I don't think that you need this static constructor. You can do everything you need in
`allocate`, which is them symmetric with `deallocate`.



3rdparty/stout/include/stout/os/linux.hpp (line 63)
<https://reviews.apache.org/r/54996/#comment231622>

    I'd recommend something like this:
    ```
    Try<Nothing> allocate(size_t nbytes) {
        int error = ::posix_memalign(&address, os::getpagesize(), nbytes);
        if (error) {
            return ErrnoError(error);
        }
        
        return Nothing();
    }
    ```



3rdparty/stout/include/stout/os/linux.hpp (line 75)
<https://reviews.apache.org/r/54996/#comment231625>

    You can simplify this to just:
    ```
    void deallocate() {
      ::free(address);
      
      address = nullptr;
      size = 0;
    }
    ```



3rdparty/stout/include/stout/os/linux.hpp (line 79)
<https://reviews.apache.org/r/54996/#comment231624>

    You can avoid the `reinterpret_cast` by typing this as `void *`. If you add the `start()`
method suggested below, you can also make it private.



3rdparty/stout/include/stout/os/linux.hpp (line 82)
<https://reviews.apache.org/r/54996/#comment231620>

    You should explictly delete the copy constructors here to ensure the stack can't be leaked
if the code changes:
    
    ```
    Stack(const Stack&) = delete;
    Stack& operator=(const Stack&) = delete;
    ```



3rdparty/stout/include/stout/os/linux.hpp (line 105)
<https://reviews.apache.org/r/54996/#comment231626>

    Since this magic number appears twice I'd be inclined to hoist it into `Stack`:
    ```
    class Stack {
      ...
      static constexpr size_t DefaultSize = 8 * 1024 * 1024;
      ...
    };
    ```



3rdparty/stout/include/stout/os/linux.hpp (line 118)
<https://reviews.apache.org/r/54996/#comment231623>

    Consider hoisting this into the `Stack` class:
    
    ```
    void * Stack::start() {
      return (uint8_t *)address + size;
    }
    ```


- James Peach


On Jan. 4, 2017, 12:26 a.m., Aaron Wood wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54996/
> -----------------------------------------------------------
> 
> (Updated Jan. 4, 2017, 12:26 a.m.)
> 
> 
> Review request for mesos and Jie Yu.
> 
> 
> Bugs: MESOS-6835
>     https://issues.apache.org/jira/browse/MESOS-6835
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Currently in the Linux launcher when the stack is allocated and prepared for a call to
clone() it is not properly aligned. This is not an issue for x86 or x64 but for ARM64/AArch64
it is because of the requirement of having the stack aligned to a 16 byte boundary. While
x86 and x64 also expect the stack to have a 16 byte aligned stack, it is not enforced. An
explanation of the stack and requirements for ARM64 can be found here http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf
(specifically section 5.2.2.1 that says SP mod 16 = 0. The stack must be quad-word aligned.)
> 
> Additionally, the way that the stack is currently allocated and passed to clone() accidentally
chops off one entry, making a stack overflow using those missing 8 bytes a possibility. Fixing
this while aligning the memory will fix both the issue of the stack overflow issue as well
as the SIGBUS crash. We should also net better performance from having the stack aligned.
> 
> 
> Diffs
> -----
> 
>   3rdparty/stout/include/stout/os/linux.hpp 530f1a55b 
>   src/linux/ns.hpp 77789717e 
> 
> Diff: https://reviews.apache.org/r/54996/diff/
> 
> 
> Testing
> -------
> 
> Built Mesos from source and am currently running it in a test cluster. Launched both
Docker and Mesos tasks via Marathon without any resulting crash (initial crash only happened
with Mesos containerizer + linux_launcher, not with the posix_launcher).
> 
> 
> Thanks,
> 
> Aaron Wood
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message