> On Oct. 15, 2018, 2:21 p.m., Benjamin Bannier wrote:
> > src/slave/state.hpp
> > Line 192 (original), 196 (patched)
> > <https://reviews.apache.org/r/69010/diff/1/?file=2096858#file2096858line196>
> >
> > I agree with James here. It seems totally fine to me to _always `sync`_ here.
> > Could we do that? Alternatively we could introduce a dedicated function with weaker guarantees
> > (e.g., `try_checkpoint`), but I don't see many good reasons for that, yet.
>
> Chun-Hung Hsiao wrote:
> Changing the behavior of checkpointing and backporting it without a thorough performance
> evaluation doesn't sound like a good idea to me. Also note that we disabled `O_SYNC` for better
> performance before.
Some numbers for fsync found on the Internet: https://gist.github.com/prashanthpai/e246be62656f25d7e31b
- Chun-Hung
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69010/#review209542
-----------------------------------------------------------
On Oct. 16, 2018, 2:48 a.m., Chun-Hung Hsiao wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69010/
> -----------------------------------------------------------
>
> (Updated Oct. 16, 2018, 2:48 a.m.)
>
>
> Review request for mesos, Benjamin Bannier, Jie Yu, and Jan Schlicht.
>
>
> Bugs: MESOS-9281
> https://issues.apache.org/jira/browse/MESOS-9281
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Currently if a system crashes, SLRP checkpoints might not be synced to
> the filesystem, so it is possible that an old or empty checkpoint will
> be read upon recovery. Moreover, if a CSI call has been issued right
> before the crash, the recovered state may be inconsistent with the
> actual state reported by the plugin. For example, the plugin might have
> created a volume but the checkpointed state does not know about it.
>
> To avoid this inconsistency, we always call fsync() when checkpointing
> SLRP states.
>
>
> Diffs
> -----
>
> src/resource_provider/storage/provider.cpp db783b53558811081fb2671e005e8bbbd9edbede
> src/slave/state.hpp 003211e4670c1092acb1634220d76bafd39e3a20
>
>
> Diff: https://reviews.apache.org/r/69010/diff/3/
>
>
> Testing
> -------
>
> make check
>
>
> Thanks,
>
> Chun-Hung Hsiao
>
>