mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ben Mahler" <benjamin.mah...@gmail.com>
Subject Re: Review Request 38746: Added TaskStatus::Reason to containerizer Termination message.
Date Fri, 09 Oct 2015 22:54:40 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38746/#review102111
-----------------------------------------------------------


Overall looks good, some of the comments I made can be addressed in separate reviews.


include/mesos/containerizer/containerizer.proto (lines 88 - 92)
<https://reviews.apache.org/r/38746/#comment159628>

    Why do you say "executor" here but termination is about "container" above?
    
    Also, can you:
    
    s/task_state/state/
    s/task_status_reasons/reasons/
    
    s/'reason'/'reasons'/
    s/of a status update for those pending/unterminated tasks/of status updates for non-terminal
tasks/



include/mesos/mesos.proto (line 1100)
<https://reviews.apache.org/r/38746/#comment159647>

    0 should have been UNKNOWN, ugh :(



include/mesos/mesos.proto (lines 1119 - 1122)
<https://reviews.apache.org/r/38746/#comment159626>

    Can you keep these alphabetical?
    
    It's a bit unfortunate that we don't have a common "REASON_CONTAINER_LIMITATION" prefix
for the limitations. Lesson learned for the next time we decide to name enums to use use prefixing
to group related enums...
    
    Do you want to add a generic limitation? I believe it's ok to rename these from a conversation
with vinod, given they were intended for metrics rather than control flow in schedulers:
    
    REASON_CONTAINER_LIMITATION
    REASON_CONTAINER_LIMITATION_DISK
    REASON_CONTAINER_LIMITATION_MEMORY
    
    Note the presence of a general limitation that custom isolators can use.



include/mesos/slave/isolator.proto (line 42)
<https://reviews.apache.org/r/38746/#comment159631>

    How about: s/for those tasks that are in non-terminal status when the container is terminated/for
any remaining non-terminal tasks/ ?



include/mesos/slave/isolator.proto (line 44)
<https://reviews.apache.org/r/38746/#comment159630>

    Why not s/task_status_reason/reason/ ?



src/slave/containerizer/mesos/containerizer.cpp (lines 1235 - 1237)
<https://reviews.apache.org/r/38746/#comment159645>

    Can you use the arrow operator here?
    
    status->isSome
    status->get



src/slave/containerizer/mesos/containerizer.cpp (line 1243)
<https://reviews.apache.org/r/38746/#comment159643>

    ! empty() ?



src/slave/containerizer/mesos/containerizer.cpp (line 1257)
<https://reviews.apache.org/r/38746/#comment159644>

    Why did you need the trim here?



src/slave/slave.hpp (lines 306 - 307)
<https://reviews.apache.org/r/38746/#comment159632>

    Could you wrap on the next line, as is done with getExecutorInfo?



src/slave/slave.hpp (lines 629 - 636)
<https://reviews.apache.org/r/38746/#comment159634>

    Hm.. I think people mind get a bit confused when they encounter this, how about we add
a comment to guide them and rename it like the following:
    
    ```
    // When the agent initiates a destroyal of the container,
    // we expect a termination to occur. The 'pendingTermation'
    // indicates why the agent initiated the destruction and
    // will influence the information sent in the status updates
    // for any remaining non-terminal tasks.
    Option<Termination> pendingTermination;
    ```



src/slave/slave.cpp (line 1661)
<https://reviews.apache.org/r/38746/#comment159648>

    Shall we rename this to 'update' to be a bit clearer?



src/slave/slave.cpp (line 2656)
<https://reviews.apache.org/r/38746/#comment159649>

    Ditto here.



src/slave/slave.cpp (lines 2668 - 2678)
<https://reviews.apache.org/r/38746/#comment159661>

    Weird that some of the existing code paths don't set the executor state to TERMINATING,
do we need a function to make the destroyal code consistent?
    
    Shall we follow up in a separate patch?



src/slave/slave.cpp (line 2716)
<https://reviews.apache.org/r/38746/#comment159650>

    Is it easy to add the value of the timeout here? E.g.
    
    ```
    Executor did not re-register within 10secs
    ```



src/slave/slave.cpp (line 2931)
<https://reviews.apache.org/r/38746/#comment159651>

    Ditto here.



src/slave/slave.cpp (line 2955)
<https://reviews.apache.org/r/38746/#comment159652>

    Please use the arrow operator going forward :)
    
    future->isFailed
    future->failure



src/slave/slave.cpp (line 3248)
<https://reviews.apache.org/r/38746/#comment159635>

    Mind avoiding the ternary just for clarity? Up to you.



src/slave/slave.cpp (line 3372)
<https://reviews.apache.org/r/38746/#comment159653>

    Ditto here, this is 'launched'?



src/slave/slave.cpp (line 3398)
<https://reviews.apache.org/r/38746/#comment159662>

    Odd that we don't go set the state to TERMINATING here as well, for example.



src/slave/slave.cpp (lines 3954 - 3955)
<https://reviews.apache.org/r/38746/#comment159654>

    Ditto here about adding the value if possible



src/slave/slave.cpp (line 4398)
<https://reviews.apache.org/r/38746/#comment159667>

    Odd that this isn't 'corrections', looks like we don't need the 'corrections' variable
below, not sure why we bother with the extra variable here. But this is for another patch..



src/slave/slave.cpp (lines 4709 - 4740)
<https://reviews.apache.org/r/38746/#comment159687>

    Can you use the arrow operator here on 'termination'?



src/slave/slave.cpp (line 4720)
<https://reviews.apache.org/r/38746/#comment159688>

    How about:
    
    ```
    (!termination->reasons().empty)
    ```
    
    instead of the size check



src/tests/slave_recovery_tests.cpp (lines 594 - 596)
<https://reviews.apache.org/r/38746/#comment159638>

    Here and the one below in this file, could you also add assertions for the expected reason?
    
    It would be nice to do a pass adding in more reason assertions across the tests to validate
this change, either in this patch or in a follow up patch (the latter ideally).


- Ben Mahler


On Oct. 9, 2015, 1:28 a.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38746/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2015, 1:28 a.m.)
> 
> 
> Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.
> 
> 
> Bugs: MESOS-2035
>     https://issues.apache.org/jira/browse/MESOS-2035
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added TaskStatus::Reason to containerizer Termination message.
> 
> The following doc summarized the problem and proposed a solution:
> https://docs.google.com/document/d/1klGDAu5yBVf-CGWLqvELLIfxLfRaisGkhi6Gn7952-4/edit?usp=sharing
> 
> 
> Diffs
> -----
> 
>   include/mesos/containerizer/containerizer.proto f16ccc89f83da28c413ccfa0687a06b7515a605c

>   include/mesos/mesos.proto 4a16be1f570769f3ce42a50a9da9f4fb1c227999 
>   include/mesos/slave/isolator.proto 9d38a25470a79af1eda9122c037c82b8cbbad6ed 
>   src/common/protobuf_utils.hpp 8793851fb927ab1326da6b6a424b3c6a75eb5001 
>   src/common/protobuf_utils.cpp c1e8e011e6d0b8e1cd8a836e3168685ec401b21b 
>   src/slave/containerizer/docker.cpp 6c975f904178e01797b67628a2d471ec7b3b1fbf 
>   src/slave/containerizer/external_containerizer.cpp 211649201777f0d2ce802a865090129eacdd53be

>   src/slave/containerizer/isolators/cgroups/mem.cpp 89c86beb9227eb8a6e70a413e7b3934add652981

>   src/slave/containerizer/isolators/posix/disk.cpp c324c79f8d598095d07fbcb26e806a0978c2a520

>   src/slave/containerizer/mesos/containerizer.hpp 4c1419290645ad4c44360a81618a6cea7ad190df

>   src/slave/containerizer/mesos/containerizer.cpp b904b2d88e9b62fa4ba312c4569a4d89b0dc6052

>   src/slave/slave.hpp 18be4f8188ad34ef4d0aa4b5eba241053d071476 
>   src/slave/slave.cpp d1c9977feeb30ad43586a4560eed155865d27a6c 
>   src/tests/containerizer.cpp 1f743155526192569dd61a47ab67d8e58aab205a 
>   src/tests/containerizer/docker_containerizer_tests.cpp 8771ef661039310d79845513ea4602e15b2ad13d

>   src/tests/containerizer/mesos_containerizer_tests.cpp 5bc7d408bda0c249e1b66747d8bd87e688362e6c

>   src/tests/slave_recovery_tests.cpp fd285fbba0ec08fe2afbdd8da0b0b451126da0eb 
>   src/tests/slave_tests.cpp dccdbb0ec5352df5da63d5ef7261bfc7fd599acb 
> 
> Diff: https://reviews.apache.org/r/38746/diff/
> 
> 
> Testing
> -------
> 
> sudo make check
> 
> tested with Docker as well.
> 
> 
> Thanks,
> 
> Jie Yu
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message