mesos-reviews mailing list archives

From Greg Mann <g...@mesosphere.io>
Subject Re: Review Request 67871: Optimized the generation of metrics snapshots.
Date Mon, 16 Jul 2018 16:35:06 GMT


> On July 14, 2018, 7:54 p.m., Benjamin Mahler wrote:
> > 3rdparty/libprocess/src/metrics/metrics.cpp
> > Lines 164-172 (original), 168-177 (patched)
> > <https://reviews.apache.org/r/67871/diff/3/?file=2059029#file2059029line171>
> >
> >     Whoops, this expression contains both a move and a use of `futures`, and the evaluation order is unspecified:
> >     
> >     https://en.cppreference.com/w/cpp/language/eval_order
> >     
> >     If the compiler decides to evaluate defer before await, await will see an empty vector and we'll likely see the timeout CHECK fail if no timeout was passed.
> >     
> >     I've been tripped up by this a bunch of times with moves, and one of those times was exactly in this spot!
> >     
> >     https://issues.apache.org/jira/browse/MESOS-8970

lol, thanks for catching this Ben!! Much appreciated :) I've pushed an update to ensure our
desired order of evaluation.
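
For anyone skimming the thread, here is a minimal sketch of the general pattern, not the actual patch. `countPending`, `takeOwnership`, `combine`, and `snapshotSafely` are hypothetical stand-ins (for `await`, the continuation built with `defer`, and the enclosing expression); the point is only that hoisting the read into its own statement sequences it before the move.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for `await`: only reads the vector.
std::size_t countPending(const std::vector<int>& futures)
{
  return futures.size();
}

// Hypothetical stand-in for the continuation: takes ownership of the vector.
std::size_t takeOwnership(std::vector<int>&& futures)
{
  std::vector<int> owned = std::move(futures);
  return owned.size();
}

// Hypothetical stand-in for the expression that needs both results.
std::size_t combine(std::size_t pending, std::size_t owned)
{
  return pending + owned;
}

// Fragile form (not used here):
//   combine(countPending(futures), takeOwnership(std::move(futures)));
// The two arguments may be evaluated in either order, so `countPending`
// could observe a moved-from vector.
std::size_t snapshotSafely(std::vector<int> futures)
{
  // The read finishes in its own full statement, before any move happens.
  const std::size_t pending = countPending(futures);

  // Only now is ownership handed over.
  return combine(pending, takeOwnership(std::move(futures)));
}
```

Calling `snapshotSafely({1, 2, 3})` reads the three elements first and then moves them, regardless of how the compiler orders argument evaluation.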


- Greg


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67871/#review206090
-----------------------------------------------------------


On July 16, 2018, 4:34 p.m., Greg Mann wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67871/
> -----------------------------------------------------------
> 
> (Updated July 16, 2018, 4:34 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Gastón Kleiman, and James Peach.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Profiling of metrics generation revealed a large amount of time spent
> in map operations. This patch does three things to mitigate this:
> 
>  * Stores the metrics as an ordered map so that we only pay the price
>    of sorting when the metric is first added.
>  * Makes use of vectors instead of maps for intermediate objects,
>    which eliminates the need for another intermediate object.
>  * Hints when inserting into the returned map, reducing the cost of
>    insertion into that ordered container.
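
The third point can be illustrated with a small sketch, assuming the intermediate results are already sorted by key. `buildSnapshot` and its parameter are hypothetical names for illustration, not the code in the patch; `std::map::emplace_hint` is the standard library call.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Builds an ordered map from key/value pairs that are already sorted by key.
// `buildSnapshot` is a hypothetical helper, not part of libprocess.
std::map<std::string, double> buildSnapshot(
    const std::vector<std::pair<std::string, double>>& sorted)
{
  std::map<std::string, double> snapshot;

  for (const auto& entry : sorted) {
    // Every new key is greater than the keys already present, so `end()`
    // is the right hint and each insertion is amortized constant time
    // instead of logarithmic.
    snapshot.emplace_hint(snapshot.end(), entry.first, entry.second);
  }

  return snapshot;
}
```

If the input turns out not to be sorted, the hint is simply ignored and the result is still correct; only the speedup is lost.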
> 
> 
> Diffs
> -----
> 
>   3rdparty/libprocess/include/process/metrics/metrics.hpp f9b72029b2c85826c91b1d7656b0af94dc87010c
>   3rdparty/libprocess/src/metrics/metrics.cpp 4883c9acaa0cc568e27944661a8208f7b2a356a1
> 
> 
> Diff: https://reviews.apache.org/r/67871/diff/4/
> 
> 
> Testing
> -------
> 
> WITH per-framework metrics, BEFORE optimizations:
> ```
> [ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/0
> Test setup: 1 agents with a total of 105 frameworks
> unversioned /metrics/snapshot' response took 157.1449ms
> v1 'master::call::GetMetrics' application/x-protobuf response took 152.599692ms
> v1 'master::call::GetMetrics' application/json response took 198.918334ms
> [       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/0 (835 ms)
> [ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/1
> Test setup: 1 agents with a total of 1005 frameworks
> unversioned /metrics/snapshot' response took 1.319444199secs
> v1 'master::call::GetMetrics' application/x-protobuf response took 1.257644596secs
> v1 'master::call::GetMetrics' application/json response took 1.527225235secs
> [       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/1 (6553 ms)
> [ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/2
> Test setup: 1 agents with a total of 10005 frameworks
> unversioned /metrics/snapshot' response took 15.479365874secs
> v1 'master::call::GetMetrics' application/x-protobuf response took 14.542866983secs
> v1 'master::call::GetMetrics' application/json response took 18.05492789secs
> [       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/2 (75455 ms)
> [ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/3
> Test setup: 1 agents with a total of 20005 frameworks
> unversioned /metrics/snapshot' response took 31.908301664secs
> v1 'master::call::GetMetrics' application/x-protobuf response took 32.128689785secs
> v1 'master::call::GetMetrics' application/json response took 33.669376185secs
> [       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/3 (150440 ms)
> ```
> 
> WITH per-framework metrics, AFTER optimizations:
> ```
> [ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/0
> Test setup: 1 agents with a total of 105 frameworks
> unversioned /metrics/snapshot' response took 104.577895ms
> v1 'master::call::GetMetrics' application/x-protobuf response took 74.262533ms
> v1 'master::call::GetMetrics' application/json response took 100.218618ms
> [       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/0 (562 ms)
> [ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/1
> Test setup: 1 agents with a total of 1005 frameworks
> unversioned /metrics/snapshot' response took 921.175877ms
> v1 'master::call::GetMetrics' application/x-protobuf response took 780.277639ms
> v1 'master::call::GetMetrics' application/json response took 1.168651111secs
> [       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/1 (5424 ms)
> [ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/2
> Test setup: 1 agents with a total of 10005 frameworks
> unversioned /metrics/snapshot' response took 10.2413387secs
> v1 'master::call::GetMetrics' application/x-protobuf response took 9.407992945secs
> v1 'master::call::GetMetrics' application/json response took 10.582584848secs
> [       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/2 (57206 ms)
> [ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/3
> Test setup: 1 agents with a total of 20005 frameworks
> unversioned /metrics/snapshot' response took 19.930542079secs
> v1 'master::call::GetMetrics' application/x-protobuf response took 20.318739763secs
> v1 'master::call::GetMetrics' application/json response took 22.853630899secs
> [       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/3 (116363 ms)
> ```
> 
> 
> Thanks,
> 
> Greg Mann
> 
>

