mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Joseph Wu <>
Subject Re: Review Request 51299: Fixed memory leak in master during framework teardown.
Date Mon, 03 Oct 2016 20:54:35 GMT

This is an automatically generated e-mail. To reply, visit:

(Updated Oct. 3, 2016, 1:54 p.m.)

Review request for mesos, Anand Mazumdar, Greg Mann, and Artem Harutyunyan.


Reword some comments about the prior behavior (of not discarding) to mention how the future
is discarded in this case.

Repository: mesos


The `process::http::Pipe` class leaks its underlying `shared_ptr`
due to how the master (accidentally) causes the `shared_ptr` to hold
a self-reference.

When the master receives a `SUBSCRIBE` call from an V1 HTTP API
framework, the master creates a `process::http::Pipe` to manage the
reading and writing in separate locations in the code.  For 
synchronization, the read/write ends of the HTTP connection share
some state (via `shared_ptr`).

The master introduces a self-reference via this line in
    .onAny(defer(self(), &Self::exited, framework->id(), http));

`http.closed()` returns a future managed by the read-end of the `Pipe`.
`http` holds a copy of the write-end of the `Pipe`, which in turn has
a copy of the `shared_ptr`.  Because `http` is passed into the future's 
continuation, a copy of `http` will live inside the read-end's future 
until the either (a) the continuation is executed or (b) the future 
is destructed.

Due to how we manage the future, neither (a) nor (b) are performed:
(a) When the read-end of the `Pipe` is closed, we only complete the
    future if the write-end of the `Pipe` is still open.  During
    framework teardown, the write-end is closed first.
(b) The future in question lives inside a `Promise`, which lives
    inside the `shared_ptr`.  Because the self-reference exists, the
    `shared_ptr` is never destructed; which means the `Promise` and
    future are never destructed either.


This patch fixes the leak by making sure the continuation is always
executed (a) or discarded.  It does so by discarding the related 
future when the write-end of the `Pipe` is already closed.

Diffs (updated)

  3rdparty/libprocess/include/process/http.hpp 404196bb198c1ff958b55d72fb29c5fe92dba429 
  3rdparty/libprocess/src/http.cpp 298bd460afdeccd18e201d8a505b961fd4cf3d3e 
  3rdparty/libprocess/src/tests/http_tests.cpp 2538f56c88f1c6074a0d41182a2242a6fdd105f4 



Found this leak in collaboration with Greg :)

Ran valgrind before applying this patch:
LD_RUN_PATH=/path/to/mesos/build/src/.libs LD_LIBRARY_PATH=/path/to/mesos/build/src/.libs
valgrind --leak-check=full src/.libs/mesos-tests --gtest_filter="*SchedulerTest.Teardown*"

Observed the following leak (among many false positives):
1,881 (216 direct, 1,665 indirect) bytes in 1 blocks are definitely lost in loss record 2,405
of 2,507
   at 0x4C2A105: operator new(unsigned long) (in /usr/lib64/valgrind/
   by 0xF259A9: process::http::Pipe::Pipe() (http.hpp:414)
   by 0x5D629A1: mesos::internal::master::Master::Http::scheduler(process::http::Request const&,
Option<std::string> const&) const (http.cpp:796)
   by 0x5DB5806: operator() (master.cpp:858)

Applied the patch and re-ran valgrind.  Confirmed that leak is gone.


Joseph Wu

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message