mesos-reviews mailing list archives

From James Peach <jpe...@apache.org>
Subject Re: Review Request 65954: Add a gauge for how long agent recovery takes.
Date Fri, 09 Mar 2018 18:53:48 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65954/#review198952
-----------------------------------------------------------




src/slave/metrics.hpp
Lines 45 (patched)
<https://reviews.apache.org/r/65954/#comment279228>

    This doesn't need to be atomic. The reader will just read either the old or the new value,
and it doesn't matter which one it gets.



src/slave/metrics.cpp
Lines 257 (patched)
<https://reviews.apache.org/r/65954/#comment279226>

    My suggestion for the metric name is:
    ```
    slave/recovery_time_secs
    ```



src/slave/metrics.cpp
Lines 259 (patched)
<https://reviews.apache.org/r/65954/#comment279227>

    I don't know that I like the idea of a metric that is absent and then present. I'd prefer
that we just publish a `0.0` until recovery is complete.
    
    Suggest we keep the recovery timestamp in the `Slave` and just publish that.
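
A minimal sketch of what that could look like, using `std::chrono` rather than the real Mesos types. `AgentSketch`, `recovered` and `recoveryTimeSecs` are hypothetical names for illustration only and are not part of the patch; the point is just that the value starts at zero, so the gauge is always present and reads `0.0` until recovery finishes.

```cpp
#include <chrono>

// Hypothetical stand-in for the agent; the real `Slave` class is much larger
// and uses stout/libprocess types rather than std::chrono.
class AgentSketch
{
public:
  using Clock = std::chrono::steady_clock;

  explicit AgentSketch(Clock::time_point start) : startTime(start) {}

  // Called once, when recovery has completed.
  void recovered(Clock::time_point now) { recoveryTime = now - startTime; }

  // Gauge callback: the metric always exists and reads 0.0 until
  // `recovered` has been called, then reports the elapsed seconds.
  double recoveryTimeSecs() const
  {
    return std::chrono::duration<double>(recoveryTime).count();
  }

private:
  Clock::time_point startTime;
  Clock::duration recoveryTime = Clock::duration::zero();
};
```

With this shape the gauge can be registered at startup and never needs to be added later.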



src/slave/slave.cpp
Lines 7322 (patched)
<https://reviews.apache.org/r/65954/#comment279229>

    Since the gauge is being published in seconds, you need to use `Duration::secs` to convert.
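
To illustrate why the explicit conversion matters, here is a small sketch with `std::chrono` standing in for stout's `Duration`. `toSecs` is a hypothetical helper, not a Mesos function: a raw tick count of a nanosecond duration is off by a factor of 10^9 from what a `_secs` gauge should report.

```cpp
#include <chrono>

// Hypothetical helper playing the role of `Duration::secs`: converts an
// elapsed time to fractional seconds instead of exposing raw ticks.
inline double toSecs(std::chrono::nanoseconds elapsed)
{
  return std::chrono::duration<double>(elapsed).count();
}

// The mistake to avoid: publishing the raw nanosecond count as "seconds".
inline double rawTicks(std::chrono::nanoseconds elapsed)
{
  return static_cast<double>(elapsed.count());
}
```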


- James Peach


On March 7, 2018, 11:20 p.m., Zhitao Li wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65954/
> -----------------------------------------------------------
> 
> (Updated March 7, 2018, 11:20 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Greg Mann, Jason Lai, and James Peach.
> 
> 
> Bugs: MESOS-8609
>     https://issues.apache.org/jira/browse/MESOS-8609
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The new metric `slave/recover_secs` can be used to tell us how long
> the Mesos agent needed to finish its recovery cycle. This is an important
> metric on agent machines which have a lot of completed executor
> sandboxes.
> 
> Note that the metric 1) will only be available after recovery succeeds
> and 2) will never change its value across the agent process lifecycle afterwards.
> 
> 
> Diffs
> -----
> 
>   src/slave/metrics.hpp 3fc933ca65690d6fad63156398ad9c2c53789296 
>   src/slave/metrics.cpp 0eb2b59ed67e14e73b29d7592c239441df0008d5 
>   src/slave/slave.cpp e2facb3c15a2f907f6497c58a36842ed707f2c70 
> 
> 
> Diff: https://reviews.apache.org/r/65954/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Zhitao Li
> 
>

