mesos-reviews mailing list archives

From Chun-Hung Hsiao <chhs...@apache.org>
Subject Re: Review Request 69010: Synced SLRP checkpoints to the filesystem.
Date Wed, 31 Oct 2018 18:21:46 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69010/
-----------------------------------------------------------

(Updated Oct. 31, 2018, 6:21 p.m.)


Review request for mesos, Benjamin Bannier, Jie Yu, and Jan Schlicht.


Changes
-------

Added more comments.


Bugs: MESOS-9281
    https://issues.apache.org/jira/browse/MESOS-9281


Repository: mesos


Description
-------

Currently, if the system crashes, SLRP checkpoints might not have been
synced to the filesystem, so an old or empty checkpoint may be read upon
recovery. Moreover, if a CSI call was issued right before the crash, the
recovered state may be inconsistent with the actual state reported by
the plugin. For example, the plugin might have created a volume that
the checkpointed state has no record of.

To avoid this inconsistency, we always call fsync() when checkpointing
SLRP states.
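
For readers unfamiliar with the pattern, the following is a minimal
sketch of a synced checkpoint on a POSIX system. It is not the code in
this patch: the helper name checkpointSynced(), the ".tmp" suffix, and
the write-to-temporary-then-rename step are assumptions made for
illustration. The description above only states that fsync() is called
when checkpointing.

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not a Mesos API): durably writes `data` to
// `dir`/`name`. Returns true only if the data and the directory entry
// were both flushed to stable storage.
bool checkpointSynced(
    const std::string& dir,
    const std::string& name,
    const std::string& data)
{
  const std::string target = dir + "/" + name;
  const std::string temp = target + ".tmp"; // Assumed naming scheme.

  int fd = ::open(
      temp.c_str(), O_WRONLY | O_CREAT | O_TRUNC | O_CLOEXEC, 0644);
  if (fd < 0) {
    return false;
  }

  // Write the whole buffer, retrying on short writes and EINTR.
  std::size_t written = 0;
  while (written < data.size()) {
    ssize_t n = ::write(fd, data.data() + written, data.size() - written);
    if (n < 0) {
      if (errno == EINTR) {
        continue;
      }
      ::close(fd);
      ::unlink(temp.c_str());
      return false;
    }
    written += static_cast<std::size_t>(n);
  }

  // Flush the file contents before the file becomes visible under its
  // final name, so a crash cannot expose an empty or partial checkpoint.
  if (::fsync(fd) != 0) {
    ::close(fd);
    ::unlink(temp.c_str());
    return false;
  }

  if (::close(fd) != 0) {
    ::unlink(temp.c_str());
    return false;
  }

  // Replace the previous checkpoint in a single step.
  if (std::rename(temp.c_str(), target.c_str()) != 0) {
    ::unlink(temp.c_str());
    return false;
  }

  // Flush the directory so that the rename itself survives a crash.
  int dirfd = ::open(dir.c_str(), O_RDONLY | O_DIRECTORY | O_CLOEXEC);
  if (dirfd < 0) {
    return false;
  }

  const bool synced = ::fsync(dirfd) == 0;
  ::close(dirfd);
  return synced;
}
```

In this sketch, a caller would finish checkpointing before issuing the
next CSI call, so that the state read upon recovery is not older than
the last call the plugin has seen. Whether the actual patch orders the
operations this way should be checked against the diff.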


Diffs (updated)
-----

  src/resource_provider/storage/provider.cpp db783b53558811081fb2671e005e8bbbd9edbede 
  src/slave/state.hpp 003211e4670c1092acb1634220d76bafd39e3a20 


Diff: https://reviews.apache.org/r/69010/diff/8/

Changes: https://reviews.apache.org/r/69010/diff/7-8/


Testing
-------

make check


Thanks,

Chun-Hung Hsiao

