mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Joseph Wu <jos...@mesosphere.io>
Subject Re: Review Request 70783: Added debug logging for metrics which are slow to become ready.
Date Wed, 12 Jun 2019 20:14:01 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70783/#review215855
-----------------------------------------------------------


Ship it!




Refactor looks good.  One minor tweak to consider below:


3rdparty/libprocess/src/metrics/metrics.cpp
Lines 221-237 (patched)
<https://reviews.apache.org/r/70783/#comment302792>

    Consider merging these two into a single block:
    ```
    if (timeout.isNone() || timeout.get() > SLOW_METRICS_TIMEOUT) {
      ...
    }
    ```


- Joseph Wu


On June 10, 2019, 2:49 a.m., Greg Mann wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70783/
> -----------------------------------------------------------
> 
> (Updated June 10, 2019, 2:49 a.m.)
> 
> 
> Review request for mesos, Benno Evers, Benjamin Mahler, Joseph Wu, and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This patch adds a hard-coded timeout to the metrics snapshot
> handler which will log the names of any metrics which have not
> become ready after 10 seconds. A slight refactor of the
> snapshot code is done as well.
> 
> 
> Diffs
> -----
> 
>   3rdparty/libprocess/include/process/metrics/metrics.hpp 75711edbaf46797e5eb54ba720ea11cf3de81522

>   3rdparty/libprocess/src/metrics/metrics.cpp 623d44adbe838f995ddbe89ee26f5bcc9c600be5

>   3rdparty/libprocess/src/tests/metrics_tests.cpp 881275693e67f3c9fb670c7e70cb5014090ed7a5

> 
> 
> Diff: https://reviews.apache.org/r/70783/diff/4/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> Also ran some metrics benchmark tests both before and after the change.
> 
> Before patch:
> 
> [==========] Running 4 tests from 1 test case.
> [----------] Global test environment set-up.
> [----------] 4 tests from AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test
> [ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/0
> Test setup: 1 agents with a total of 100 frameworks
> unversioned /metrics/snapshot' response took 144.821514ms
> v1 'master::call::GetMetrics' application/x-protobuf response took 147.056543ms
> v1 'master::call::GetMetrics' application/json response took 175.9909ms
> [       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/0
(935 ms)
> [ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/1
> Test setup: 1 agents with a total of 1000 frameworks
> unversioned /metrics/snapshot' response took 1.083320021secs
> v1 'master::call::GetMetrics' application/x-protobuf response took 1.105169806secs
> v1 'master::call::GetMetrics' application/json response took 1.238749012secs
> [       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/1
(7044 ms)
> [ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/2
> Test setup: 1 agents with a total of 10000 frameworks
> unversioned /metrics/snapshot' response took 9.400786071secs
> v1 'master::call::GetMetrics' application/x-protobuf response took 9.709327709secs
> v1 'master::call::GetMetrics' application/json response took 11.207594934secs
> [       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/2
(62656 ms)
> [ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/3
> Test setup: 1 agents with a total of 20000 frameworks
> unversioned /metrics/snapshot' response took 17.926824609secs
> v1 'master::call::GetMetrics' application/x-protobuf response took 19.467704754secs
> v1 'master::call::GetMetrics' application/json response took 21.32824915secs
> [       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/3
(132619 ms)
> [----------] 4 tests from AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test
(203255 ms total)
> 
> [----------] Global test environment tear-down
> [==========] 4 tests from 1 test case ran. (203343 ms total)
> [  PASSED  ] 4 tests.
> 
> 
> After patch:
> 
> [==========] Running 4 tests from 1 test case.
> [----------] Global test environment set-up.
> [----------] 4 tests from AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test
> [ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/0
> Test setup: 1 agents with a total of 100 frameworks
> unversioned /metrics/snapshot' response took 143.560831ms
> v1 'master::call::GetMetrics' application/x-protobuf response took 154.460223ms
> v1 'master::call::GetMetrics' application/json response took 186.55918ms
> [       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/0
(987 ms)
> [ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/1
> Test setup: 1 agents with a total of 1000 frameworks
> unversioned /metrics/snapshot' response took 1.082617553secs
> v1 'master::call::GetMetrics' application/x-protobuf response took 1.097321636secs
> v1 'master::call::GetMetrics' application/json response took 1.273446816secs
> [       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/1
(6996 ms)
> [ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/2
> Test setup: 1 agents with a total of 10000 frameworks
> unversioned /metrics/snapshot' response took 9.327752261secs
> v1 'master::call::GetMetrics' application/x-protobuf response took 9.944210962secs
> v1 'master::call::GetMetrics' application/json response took 10.800611643secs
> [       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/2
(61873 ms)
> [ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/3
> Test setup: 1 agents with a total of 20000 frameworks
> unversioned /metrics/snapshot' response took 17.728278409secs
> v1 'master::call::GetMetrics' application/x-protobuf response took 19.944552117secs
> v1 'master::call::GetMetrics' application/json response took 22.376012355secs
> [       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/3
(132274 ms)
> [----------] 4 tests from AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test
(202132 ms total)
> 
> [----------] Global test environment tear-down
> [==========] 4 tests from 1 test case ran. (202272 ms total)
> [  PASSED  ] 4 tests.
> 
> 
> Thanks,
> 
> Greg Mann
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message