mesos-reviews mailing list archives

From "Benjamin Bannier" <benjamin.bann...@mesosphere.io>
Subject Re: Review Request 40849: Fix flaky MemoryPressureMesosTests
Date Wed, 09 Dec 2015 10:48:02 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40849/#review109494
-----------------------------------------------------------

Ship It!

- Benjamin Bannier


On Dec. 3, 2015, 7:01 p.m., Joseph Wu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40849/
> -----------------------------------------------------------
> 
> (Updated Dec. 3, 2015, 7:01 p.m.)
> 
> 
> Review request for mesos, Bernd Mathiske, Greg Mann, Artem Harutyunyan, Jan Schlicht, and Till Toenshoff.
> 
> 
> Bugs: MESOS-3586
>     https://issues.apache.org/jira/browse/MESOS-3586
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The existing tests check that "low" pressure events occur at least as often as "medium" pressure events (this is the documented behavior). However, neither the order in which events arrive nor the order in which we process them is guaranteed. When we collect the pressure events via counters, some events may still be in flight and therefore not yet reflected in the counts.
> 
> This patch modifies MemoryPressureMesosTests so that the tests stop the memory-pressure-triggering task and then wait for all pending events to be processed before checking the counters for correctness.
> 
> 
> Diffs
> -----
> 
>   src/tests/containerizer/memory_pressure_tests.cpp e18b971c4df26c9b9c103ca73bdad4fd400d6c02
> 
> Diff: https://reviews.apache.org/r/40849/diff/
> 
> 
> Testing
> -------
> 
> On Debian 8, Ubuntu 14, CentOS 7 (Thanks Jan!):
> `make check`
> `sudo bin/mesos-tests.sh --gtest_filter="*MemoryPressureMesosTest*" --gtest_repeat=-1 --gtest_break_on_failure`
> 
> ^ Ran the above for a couple of minutes, or ~100 iterations.  It previously failed about 1 in 5 times.
> 
> ---
> Note on CentOS 6 (Thanks Greg!):
> ```
> [ RUN      ] MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
> ../../src/tests/mesos.cpp:849: Failure
> Value of: _baseHierarchy.get()
>   Actual: "/cgroup"
> Expected: baseHierarchy
> Which is: "/tmp/mesos_test_cgroup"
> -------------------------------------------------------------
> Multiple cgroups base hierarchies detected:
>   '/tmp/mesos_test_cgroup'
>   '/cgroup'
> Mesos does not support multiple cgroups base hierarchies.
> Please unmount the corresponding (or all) subsystems.
> -------------------------------------------------------------
> ../../src/tests/mesos.cpp:932: Failure
> (cgroups::destroy(hierarchy, cgroup)).failure(): Failed to remove cgroup '/tmp/mesos_test_cgroup/perf_event/mesos_test': Device or resource busy
> [  FAILED  ] MemoryPressureMesosTest.CGROUPS_ROOT_Statistics (12 ms)
> ```
> 
> ---
> Note on Ubuntu 14:
> There is some other flakiness (in Agent recovery).  This will be tracked in https://issues.apache.org/jira/browse/MESOS-4047
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>
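
The race described in the quoted description can be illustrated with a small, single-threaded model. This is only a sketch under stated assumptions, not the code from the patch: `PressureListener`, `emitSpike`, `process`, and `settle` are hypothetical names, and a plain queue stands in for the asynchronous event delivery in the real test. It shows why reading the counters while events are still in flight can report more "medium" than "low" events, and why draining the pending events first makes the check reliable.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical model of the two pressure levels the test counts.
enum class Level { LOW, MEDIUM };

// Hypothetical stand-in for the test's counters plus the events in flight.
struct PressureListener {
  std::deque<Level> inFlight;  // emitted but not yet counted
  std::size_t low = 0;
  std::size_t medium = 0;

  // One pressure spike raises both levels. Delivery order is not
  // guaranteed, so this model deliberately enqueues MEDIUM before LOW.
  void emitSpike() {
    inFlight.push_back(Level::MEDIUM);
    inFlight.push_back(Level::LOW);
  }

  // Counts at most `n` pending events, oldest first.
  void process(std::size_t n) {
    for (std::size_t i = 0; i < n && !inFlight.empty(); ++i) {
      Level level = inFlight.front();
      inFlight.pop_front();
      if (level == Level::LOW) {
        ++low;
      } else {
        ++medium;
      }
    }
  }

  // Counts every pending event. Only meaningful once the producer
  // (the memory-pressure-triggering task) has been stopped.
  void settle() {
    process(inFlight.size());
    assert(inFlight.empty());
  }

  // The invariant the tests check: "low" occurs at least as often as "medium".
  bool lowAtLeastMedium() const { return low >= medium; }
};
```

In this model, emitting two spikes and processing only three events leaves `low == 1` and `medium == 2`, so the invariant appears violated. Calling `settle()` after the producer has stopped brings both counters to 2, and the check passes.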

