mesos-reviews mailing list archives

From "Ben Mahler" <benjamin.mah...@gmail.com>
Subject Review Request 41178: Fixed a message dropping bug in the health checker.
Date Thu, 10 Dec 2015 02:01:07 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41178/
-----------------------------------------------------------

Review request for mesos, Artem Harutyunyan and Timothy Chen.


Bugs: MESOS-1613 and MESOS-4106
    https://issues.apache.org/jira/browse/MESOS-1613
    https://issues.apache.org/jira/browse/MESOS-4106


Repository: mesos


Description
-------

Much like in the command executor, we need to sleep after sending
the final message in the health checker. Otherwise, the process may
exit before libprocess has finished sending the message over the
local network, and the message is dropped.

This led to the following issues:
https://issues.apache.org/jira/browse/MESOS-1613
https://issues.apache.org/jira/browse/MESOS-4106
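
The race described above can be illustrated with a minimal, self-contained
sketch. It does not use libprocess and is not the code from this patch:
`AsyncSender`, `deliveredAtExit`, and the latency and grace values are all
hypothetical names chosen for illustration. A background thread stands in
for an asynchronous send, and the helper reports whether the message had
gone out at the point where the process would have exited.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in for an asynchronous send: the message is handed to
// a background thread and only counts as delivered after `latency` elapses.
class AsyncSender {
public:
  void send(std::chrono::milliseconds latency) {
    assert(!worker_.joinable());  // This sketch supports a single send.
    worker_ = std::thread([this, latency] {
      std::this_thread::sleep_for(latency);
      delivered_.store(true);
    });
  }

  bool delivered() const { return delivered_.load(); }

  // Joining here only keeps the sketch well-behaved; a real process that
  // exits early would simply lose the in-flight message.
  ~AsyncSender() {
    if (worker_.joinable()) {
      worker_.join();
    }
  }

private:
  std::atomic<bool> delivered_{false};
  std::thread worker_;
};

// Sends one message, waits for `grace`, then reports whether the message
// had been delivered at the moment the caller would exit.
bool deliveredAtExit(std::chrono::milliseconds latency,
                     std::chrono::milliseconds grace) {
  AsyncSender sender;
  sender.send(latency);
  std::this_thread::sleep_for(grace);
  return sender.delivered();
}
```

With no grace period the message is still in flight when the caller would
exit; with a grace period longer than the send latency it has been
delivered. A fixed sleep only narrows the window, so the duration has to be
chosen generously relative to the expected send latency.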


Diffs
-----

  src/health-check/main.cpp 83ee38cd853325b3adc7cb6bc2d1d67b343037f5 
  src/tests/health_check_tests.cpp b1454b085b36bb7c4d8ef012c764cd8466b4fb02 

Diff: https://reviews.apache.org/r/41178/diff/


Testing
-------

Running the `HealthCheckTest.DISABLED_ConsecutiveFailures` test repeatedly on a machine
loaded with many `openssl speed` commands in the background reproduces the flakiness. With
this patch applied, the test is no longer flaky in this setup.


Thanks,

Ben Mahler

