mesos-reviews mailing list archives

From Gastón Kleiman <>
Subject Re: Review Request 69977: Improved agent operation recovery process.
Date Wed, 20 Feb 2019 22:58:32 GMT

This is an automatically generated e-mail. To reply, visit:

(Updated Feb. 20, 2019, 2:58 p.m.)

Review request for mesos, Chun-Hung Hsiao and Greg Mann.


Fixed a typo and whitespace.

Bugs: MESOS-8054

Repository: mesos


This patch makes the agent walk the operation status update stream
directories to generate the list of streams to recover, instead of
deriving that list from the checkpointed `ResourceState` message.

This prevents the agent from asking the operation status update manager
to recover streams that haven't been created yet.
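
As a rough illustration of the directory walk described above, the sketch
below lists the streams that actually exist on disk. It is not the code in
`src/slave/slave.cpp`: the helper name `listStreamsOnDisk`, the use of
`std::filesystem`, and the assumed layout (one subdirectory per stream,
named after the stream ID) are all hypothetical.

```cpp
#include <filesystem>
#include <set>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: returns the names of the subdirectories of
// `streamsRoot`, treating each name as the ID of a stream that has
// already been created on disk.
std::set<std::string> listStreamsOnDisk(const fs::path& streamsRoot)
{
  std::set<std::string> streams;

  std::error_code ec;
  if (!fs::is_directory(streamsRoot, ec)) {
    // No streams have been checkpointed yet, so nothing to recover.
    return streams;
  }

  for (const fs::directory_entry& entry :
       fs::directory_iterator(streamsRoot, ec)) {
    // Assumed layout: one directory per stream; plain files are skipped.
    if (entry.is_directory()) {
      streams.insert(entry.path().filename().string());
    }
  }

  return streams;
}
```

Because the list is built from what is present on disk, a stream that was
never created cannot end up in it.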

The patch also makes the agent garbage collect operation status update
streams if no corresponding operation is present in the checkpointed
state. This can happen after the agent fails over while processing the
acknowledgement of a terminal operation status update.
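
The recover-or-garbage-collect decision described above can be sketched as
a simple partition. This is a minimal sketch, not the agent's actual code:
`RecoveryPlan` and `planRecovery` are hypothetical names, and it assumes
that streams and checkpointed operations are keyed by the same string ID.

```cpp
#include <set>
#include <string>

// Hypothetical result type for this sketch.
struct RecoveryPlan
{
  std::set<std::string> toRecover;        // Stream has a checkpointed operation.
  std::set<std::string> toGarbageCollect; // Stream has no matching operation.
};

// Splits the streams found on disk into those to recover and those to
// garbage collect. Operations without a stream on disk are ignored here,
// since there is no stream to recover for them.
RecoveryPlan planRecovery(
    const std::set<std::string>& streamsOnDisk,
    const std::set<std::string>& checkpointedOperations)
{
  RecoveryPlan plan;

  for (const std::string& id : streamsOnDisk) {
    if (checkpointedOperations.count(id) > 0) {
      plan.toRecover.insert(id);
    } else {
      // Orphaned stream, e.g. left behind by a failover during the
      // acknowledgement of a terminal operation status update.
      plan.toGarbageCollect.insert(id);
    }
  }

  return plan;
}
```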

Diffs (updated)

  src/slave/slave.cpp e3c2c005d865b5c333e92e50e49ef398fe06ad79 




Tested manually; existing tests still pass.


Gastón Kleiman
