mesos-reviews mailing list archives

From Jason Lai <ja...@jasonlai.net>
Subject Re: Review Request 67264: Unmounted any dangling persistent volume in gc paths.
Date Fri, 25 May 2018 22:25:20 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67264/#review203912
-----------------------------------------------------------




src/slave/gc.cpp
Lines 221 (patched)
<https://reviews.apache.org/r/67264/#comment286292>

    You may want to iterate over the mount entries in reverse order. Otherwise, you would likely
run into cases where you try to unmount a parent directory before its descendant mount
points get unmounted.
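
    To illustrate the ordering, here is a minimal sketch, not the code in the patch. It assumes
the mount targets were collected in the order they appear in the `mountinfo` table, so a parent
mount is listed before the mounts nested under it. `unmountOrder` is a hypothetical helper name.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, for illustration only: returns the order in which
// to unmount the given targets. Assumes `targets` is in mount-table order,
// i.e. a parent mount appears before any mount nested underneath it.
std::vector<std::string> unmountOrder(const std::vector<std::string>& targets)
{
  // Walking the table backwards visits nested mounts before their parents.
  return std::vector<std::string>(targets.rbegin(), targets.rend());
}
```

    With the targets `/sandbox/volume` and `/sandbox/volume/nested` (made-up paths), the nested
mount comes first in the result, so the parent is only unmounted once nothing is mounted below it.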



src/slave/gc.cpp
Lines 228 (patched)
<https://reviews.apache.org/r/67264/#comment286290>

    For checking whether a path is a descendant of a directory, it's not enough to just use
`strings::startsWith`, as you run into the case where `strings::startsWith("/mnt/something-else",
"/mnt/something")` returns `true`.
    
    It would be safer to check the following:
    1) Check if `entry.target` == `info->path`;
    2) Check if `strings::startsWith(entry.target, path::join(info->path, ""))` (`info->path`
suffixed with a `"/"`).
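
    The two checks above could be combined roughly as follows. This is a sketch using only the
standard library rather than stout, and `isSelfOrDescendant` is a hypothetical name. It assumes
both paths are absolute and already normalized (no `..` components, no duplicate slashes).

```cpp
#include <string>

// Hypothetical helper, for illustration only: returns true if `target` is
// `dir` itself or a path underneath `dir`.
bool isSelfOrDescendant(const std::string& target, const std::string& dir)
{
  // Check 1: the mount target is the directory itself.
  if (target == dir) {
    return true;
  }

  // Check 2: prefix match against `dir` with a trailing separator, so the
  // match can only end at a path component boundary.
  std::string prefix = dir;
  if (prefix.empty() || prefix.back() != '/') {
    prefix += '/';
  }

  return target.compare(0, prefix.size(), prefix) == 0;
}
```

    With this, `/mnt/something-else` is no longer treated as being inside `/mnt/something`,
while `/mnt/something` and `/mnt/something/volume` both are.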


- Jason Lai


On May 24, 2018, 7:48 p.m., Zhitao Li wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67264/
> -----------------------------------------------------------
> 
> (Updated May 24, 2018, 7:48 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Jason Lai, and Jie Yu.
> 
> 
> Bugs: MESOS-8830
>     https://issues.apache.org/jira/browse/MESOS-8830
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> In various corner cases, the agent may not get a chance to properly
> unmount persistent volumes mounted inside an executor's sandbox. When
> GC later gets to these sandbox directories, permanent data loss can
> happen (see MESOS-8830).
> 
> This patch adds protection by unmounting any persistent volumes found
> inside a path about to be garbage collected, and skips the path if
> the unmount fails.
> 
> NOTE: this means the agent will not garbage collect any path if it
> cannot read its own `mountinfo` table.
> 
> 
> Diffs
> -----
> 
>   src/local/local.cpp afff54653e8e659d947ddbee6dc38ba2715f2a78 
>   src/slave/gc.hpp df40165bb8a23f065156bf6c5f354b143d88c088 
>   src/slave/gc.cpp 390b35e6d17d6614a73c9548decbf10739560106 
>   src/slave/gc_process.hpp 20374ad91820341282fdf18ecade60a020e26cea 
>   src/slave/main.cpp 646125344d590b28256d8ee684d7e51a90e82f23 
>   src/slave/paths.hpp 015896453410a33923eed07b3e676be19af62a48 
>   src/slave/paths.cpp ed0b1276908f4990ce7a24c96aea20e8c79d3126 
>   src/tests/cluster.cpp b56212f6529a4d307e65797ad9bb34f2104fc832 
>   src/tests/gc_tests.cpp 619ed22edd9b3909ea24cdcbf62c354420a8d031 
>   src/tests/mesos.hpp 733344a2f07ebd9d841a55fb9bbfda2e3c1a1eb2 
>   src/tests/mesos.cpp d3c87c295429481c59d5a49398e289a4b84e4496 
>   src/tests/slave_tests.cpp 65d860594572b58a50a89358e31e97fd2a10bf08 
> 
> 
> Diff: https://reviews.apache.org/r/67264/diff/2/
> 
> 
> Testing
> -------
> 
> Tested with the following procedure:
> 1. Start a test master and agent;
> 2. Create a persistent volume on the agent through the operator API;
> 3. Use `mesos-execute` to run a task;
> 4. Stop the agent;
> 5. Manually bind mount the persistent volume path into a `volume` directory inside the executor sandbox (to simulate a dangling mount in MESOS-8830);
> 6. Restart the agent with `--gc_disk_headroom=1.0 --gc_delay=1secs` to force it to gc the path immediately.
> 
> With this fix, we observed that the dangling mount is automatically cleaned up, and the agent produces the log line:
> ```
> W0523 06:00:04.001075 82745 gc.cpp:229] Unmounting dangling mount point '/home/zhitao/mesos-workdir/slaves/b3eb3aff-d19d-45ff-8113-f0316462d3fa-S0/frameworks/b3eb3aff-d19d-45ff-8113-f0316462d3fa-0000/executors/test_id/runs/1cd3bd06-2632-4541-a708-80c7cd51c74b/volume' of persistent volume '/home/zhitao/mesos-workdir/volumes/roles/role/id1' inside garbage collected path '/home/zhitao/mesos-workdir/slaves/b3eb3aff-d19d-45ff-8113-f0316462d3fa-S0'
> ```
> 
> 
> Thanks,
> 
> Zhitao Li
> 
>

