mesos-reviews mailing list archives

From Andrei Budnik <abud...@mesosphere.com>
Subject Re: Review Request 69972: Skipped the container which has no checkpointed volumes during recovery.
Date Wed, 13 Feb 2019 13:28:21 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69972/#review212795
-----------------------------------------------------------



Thanks for the patch!
I think we should add a test for this. Otherwise, it would be very risky to _refactor_
this part of the code in the future.
If you don't have time to add a test now, please feel free to file a ticket for it.


src/slave/containerizer/mesos/isolators/docker/volume/isolator.cpp
Line 300 (patched)
<https://reviews.apache.org/r/69972/#comment298698>

    Given that `state::checkpoint` is **atomic**, we cannot end up in a state where the
file is empty because the agent did not finish writing to it.
    
    However, an empty file might occur after a hard reboot of the agent's host. This happens
because Linux writes dirty page cache data back to disk only periodically (by default, dirty
pages may stay in memory for up to about 30 seconds). There is a chance that the file has
been created, but its data has not yet been synced to disk.
    
    As Gilbert and I agreed, we should ignore empty files **only** in the case of orphan
containers.


- Andrei Budnik


On Feb. 13, 2019, 8:26 a.m., Qian Zhang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69972/
> -----------------------------------------------------------
> 
> (Updated Feb. 13, 2019, 8:26 a.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Gilbert Song.
> 
> 
> Bugs: MESOS-9507
>     https://issues.apache.org/jira/browse/MESOS-9507
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> There are two cases we need to handle:
>   1. The checkpointed docker volumes file does not exist.
>   2. The checkpointed docker volumes file is empty.
> In both cases, during recovery the `docker/volume` isolator should
> remove the container's checkpoint directory and then skip the
> container.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/mesos/isolators/docker/volume/isolator.cpp a72fc84da6fb0f24d363dd4c635500510da675d8

> 
> 
> Diff: https://reviews.apache.org/r/69972/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>

