mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mesos Reviewbot <revi...@mesos.apache.org>
Subject Re: Review Request 67264: Unmounted any dangling persistent volume in gc paths.
Date Fri, 25 May 2018 02:12:01 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67264/#review203838
-----------------------------------------------------------



Patch looks great!

Reviews applied: [67264]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose
--disable-libtool-wrappers' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On May 24, 2018, 7:48 p.m., Zhitao Li wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67264/
> -----------------------------------------------------------
> 
> (Updated May 24, 2018, 7:48 p.m.)
> 
> 
> Review request for mesos, Jason Lai and Jie Yu.
> 
> 
> Bugs: MESOS-8830
>     https://issues.apache.org/jira/browse/MESOS-8830
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> In various corner cases, agent may not get chance to properly unmount
> persistent volumes mounted inside an executor's sandbox. When GC later
> gets to these sandbox directories, permanent data loss can happen (see
> MESOS-8830).
> 
> This patch added some protection to unmount possible persistent
> volumes inside a path to gc, and skipped the path if unmount failed.
> 
> NOTE: this means agent will not garbage collect any path if it cannot
> read its own `mountinfo` table.
> 
> 
> Diffs
> -----
> 
>   src/local/local.cpp afff54653e8e659d947ddbee6dc38ba2715f2a78 
>   src/slave/gc.hpp df40165bb8a23f065156bf6c5f354b143d88c088 
>   src/slave/gc.cpp 390b35e6d17d6614a73c9548decbf10739560106 
>   src/slave/gc_process.hpp 20374ad91820341282fdf18ecade60a020e26cea 
>   src/slave/main.cpp 646125344d590b28256d8ee684d7e51a90e82f23 
>   src/slave/paths.hpp 015896453410a33923eed07b3e676be19af62a48 
>   src/slave/paths.cpp ed0b1276908f4990ce7a24c96aea20e8c79d3126 
>   src/tests/cluster.cpp b56212f6529a4d307e65797ad9bb34f2104fc832 
>   src/tests/gc_tests.cpp 619ed22edd9b3909ea24cdcbf62c354420a8d031 
>   src/tests/mesos.hpp 733344a2f07ebd9d841a55fb9bbfda2e3c1a1eb2 
>   src/tests/mesos.cpp d3c87c295429481c59d5a49398e289a4b84e4496 
>   src/tests/slave_tests.cpp 65d860594572b58a50a89358e31e97fd2a10bf08 
> 
> 
> Diff: https://reviews.apache.org/r/67264/diff/2/
> 
> 
> Testing
> -------
> 
> Tested with following procedures:
> 1. Start a test master and agent;
> 2. Created a persistent volume on agent through operator API;
> 3. Use `mesos-execute` to run a task;
> 4. Stop the agent;
> 5. Manually bind mount persistent volume path into a `volume` directory inside the executor
sandbox (to simulate a dangling mount in MESOS-8830);
> 6. Restart agent with `--gc_disk_headroom=1.0 --gc_delay=1secs` to force it gc the path
immediately.
> 
> With this fix, we observed that the dangling mount is automatically cleaned up, and agent
produces log line:
> ```
> W0523 06:00:04.001075 82745 gc.cpp:229] Unmounting dangling mount point '/home/zhitao/mesos-workdir/slaves/b3eb3aff-d19d-45ff-8113-f0316462d3fa-S0/frameworks/b3eb3aff-d19d-45ff-8113-f0316462d3fa-0000/executors/test_id/runs/1cd3bd06-2632-4541-a708-80c7cd51c74b/volume'
of persistent volume '/home/zhitao/mesos-workdir/volumes/roles/role/id1' inside garbage collected
path '/home/zhitao/mesos-workdir/slaves/b3eb3aff-d19d-45ff-8113-f0316462d3fa-S0'
> ```
> 
> 
> Thanks,
> 
> Zhitao Li
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message