mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chun-Hung Hsiao <chhs...@apache.org>
Subject Re: Review Request 69010: Synced SLRP checkpoints to the filesystem.
Date Sat, 13 Oct 2018 07:40:17 GMT


> On Oct. 13, 2018, 2:49 a.m., James DeFelice wrote:
> > src/slave/state.hpp
> > Line 192 (original), 196 (patched)
> > <https://reviews.apache.org/r/69010/diff/1/?file=2096858#file2096858line196>
> >
> >     Why is the default `false` here? If someone is calling the `checkpoint` func
because they want atomic semantics ... wouldn't they *normally* want `sync==true` (and shouldn't
the *exception* to the rule be `sync==false`)?

This is to maintain the same semantics for original uses of this function. IIRC most of the
components are able to handle stale or empty checkpoints. Also defaulting the flag to true
will bring in some performance impact.


- Chun-Hung


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69010/#review209512
-----------------------------------------------------------


On Oct. 12, 2018, 11:19 p.m., Chun-Hung Hsiao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69010/
> -----------------------------------------------------------
> 
> (Updated Oct. 12, 2018, 11:19 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Jie Yu, and Jan Schlicht.
> 
> 
> Bugs: MESOS-9281
>     https://issues.apache.org/jira/browse/MESOS-9281
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Currently if a system crashes, SLRP checkpoints might not be synced to
> the filesystem, so it is possible that an old or empty checkpoint will
> be read upon recovery. Moreover, if a CSI call has been issued right
> before the crash, the recovered state may be inconsistent with the
> actual state reported by the plugin. For example, the plugin might have
> created a volume but the checkpointed state does not know about it.
> 
> To avoid this inconsistency, we always call fsync()  when checkpointing
> SLRP states.
> 
> 
> Diffs
> -----
> 
>   src/resource_provider/storage/provider.cpp db783b53558811081fb2671e005e8bbbd9edbede

>   src/slave/state.hpp 003211e4670c1092acb1634220d76bafd39e3a20 
> 
> 
> Diff: https://reviews.apache.org/r/69010/diff/1/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message