mesos-reviews mailing list archives

From Chun-Hung Hsiao <chhs...@apache.org>
Subject Re: Review Request 69010: Synced SLRP checkpoints to the filesystem.
Date Tue, 16 Oct 2018 02:48:51 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69010/
-----------------------------------------------------------

(Updated Oct. 16, 2018, 2:48 a.m.)


Review request for mesos, Benjamin Bannier, Jie Yu, and Jan Schlicht.


Changes
-------

Addressed some of Benjamin's comments.


Bugs: MESOS-9281
    https://issues.apache.org/jira/browse/MESOS-9281


Repository: mesos


Description
-------

Currently if a system crashes, SLRP checkpoints might not be synced to
the filesystem, so it is possible that an old or empty checkpoint will
be read upon recovery. Moreover, if a CSI call has been issued right
before the crash, the recovered state may be inconsistent with the
actual state reported by the plugin. For example, the plugin might have
created a volume but the checkpointed state does not know about it.

To avoid this inconsistency, we always call fsync() when checkpointing
SLRP states.


Diffs (updated)
-----

  src/resource_provider/storage/provider.cpp db783b53558811081fb2671e005e8bbbd9edbede 
  src/slave/state.hpp 003211e4670c1092acb1634220d76bafd39e3a20 


Diff: https://reviews.apache.org/r/69010/diff/3/

Changes: https://reviews.apache.org/r/69010/diff/2-3/


Testing
-------

make check


Thanks,

Chun-Hung Hsiao

