mesos-reviews mailing list archives

From Jiang Yan Xu <...@jxu.me>
Subject Review Request 63174: Added a benchmark for agent reregistration during master failover.
Date Thu, 19 Oct 2017 23:28:41 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63174/
-----------------------------------------------------------

Review request for mesos, Benjamin Mahler, Dmitry Zhuk, and Ilya Pronin.


Bugs: MESOS-8098
    https://issues.apache.org/jira/browse/MESOS-8098


Repository: mesos


Description
-------

The current benchmark is very simple: it involves no frameworks and no agent retries.
More benchmarks can be added later, so I am creating a new file for them.


Diffs
-----

  src/Makefile.am 936bc49ddfca03b9278ab11b6d317f3ff635cb00 
  src/tests/CMakeLists.txt 386e0473c93d0a993248c7818067071d0c761c76 
  src/tests/master_benchmarks.cpp PRE-CREATION 


Diff: https://reviews.apache.org/r/63174/diff/1/


Testing
-------

Benchmark based off https://github.com/apache/mesos/commit/41193181d6b75eeecae2729bf98007d9318e351a
(close to current HEAD).

```
[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0
Reregistered 2000 agents with a total of 500000 running tasks and 500000 completed tasks in 45.075488ms
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0 (48126 ms)
[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1
Reregistered 2000 agents with a total of 1000000 running tasks and 0 completed tasks in 14.172361ms
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1 (45979 ms)
[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2
Reregistered 20000 agents with a total of 1000000 running tasks and 0 completed tasks in 413.508328ms
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2 (49487 ms)
[----------] 3 tests from AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test (143596 ms total)

...

[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0
Reregistered 2000 agents with a total of 500000 running tasks and 500000 completed tasks in 32.787363ms
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0 (48266 ms)
[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1
Reregistered 2000 agents with a total of 1000000 running tasks and 0 completed tasks in 19.735003ms
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1 (46169 ms)
[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2
Reregistered 20000 agents with a total of 1000000 running tasks and 0 completed tasks in 321.267267ms
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2 (51550 ms)
[----------] 3 tests from AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test (145987 ms total)
```

Benchmark based off https://github.com/apache/mesos/commit/d9c90bf1d9c8b3a7dcc47be0cb773efff57cfb9d
(before https://issues.apache.org/jira/browse/MESOS-7713 was merged)
```
[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0
Reregistered 2000 agents with a total of 500000 running tasks and 500000 completed tasks in 85.800335ms
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0 (59247 ms)
[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1
Reregistered 2000 agents with a total of 1000000 running tasks and 0 completed tasks in 35.342066ms
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1 (93662 ms)
[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2
Reregistered 20000 agents with a total of 1000000 running tasks and 0 completed tasks in 798.738642ms
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2 (116078 ms)
[----------] 3 tests from AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test (268987 ms total)

...

[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0
Reregistered 2000 agents with a total of 500000 running tasks and 500000 completed tasks in 66.270249ms
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0 (59925 ms)
[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1
Reregistered 2000 agents with a total of 1000000 running tasks and 0 completed tasks in 50.146349ms
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1 (88631 ms)
[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2
Reregistered 20000 agents with a total of 1000000 running tasks and 0 completed tasks in 807.621964ms
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2 (109941 ms)
[----------] 3 tests from AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test (258497 ms total)
```

The recent patches cut the reregistration time by nearly 50%. Both sets of runs were built with `--enable-optimize`.
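
For anyone who wants to check the "nearly 50%" figure against the output above, here is a minimal sketch of the arithmetic. It is not part of the patch. The helper names (`mean_delays`, `reductions`) are hypothetical, and averaging the two runs per configuration is my own assumption about how to summarize them. The sample data is copied from the 2000-agent runs with 500000 running and 500000 completed tasks.

```python
import re
from collections import defaultdict
from statistics import mean

# Matches the benchmark's summary line. The \s+ before the duration also
# accepts a line break, since the e-mail wrapped some of these lines.
PATTERN = re.compile(
    r"Reregistered (\d+) agents with a total of (\d+) running tasks "
    r"and (\d+) completed tasks in\s+([\d.]+)ms"
)


def mean_delays(log):
    """Map (agents, running, completed) to the mean reregistration delay in ms."""
    samples = defaultdict(list)
    for agents, running, completed, ms in PATTERN.findall(log):
        samples[(int(agents), int(running), int(completed))].append(float(ms))
    return {config: mean(values) for config, values in samples.items()}


def reductions(before_log, after_log):
    """Fractional reduction in mean delay for each configuration in both logs."""
    before = mean_delays(before_log)
    after = mean_delays(after_log)
    return {c: 1 - after[c] / before[c] for c in before if c in after}


# Summary lines copied from the runs above (before and after MESOS-7713).
BEFORE = """
Reregistered 2000 agents with a total of 500000 running tasks and 500000 completed tasks in
85.800335ms
Reregistered 2000 agents with a total of 500000 running tasks and 500000 completed tasks in
66.270249ms
"""
AFTER = """
Reregistered 2000 agents with a total of 500000 running tasks and 500000 completed tasks in
45.075488ms
Reregistered 2000 agents with a total of 500000 running tasks and 500000 completed tasks in
32.787363ms
"""

if __name__ == "__main__":
    for config, reduction in reductions(BEFORE, AFTER).items():
        print(config, f"{reduction:.1%}")
```

Feeding the full "before" and "after" output into `reductions` gives one figure per configuration. With only two runs each, the numbers are indicative rather than statistically solid.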

I can also get some flame graphs.


Thanks,

Jiang Yan Xu

