mesos-reviews mailing list archives

From Joseph Wu <jos...@mesosphere.io>
Subject Re: Review Request 65409: Fixed `SlaveRecoveryTest.ReconcileTasksMissingFromSlave`.
Date Fri, 09 Feb 2018 20:02:40 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65409/#review197185
-----------------------------------------------------------


Ship it!

- Joseph Wu


On Feb. 8, 2018, 11:53 a.m., Andrew Schwartzmeyer wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65409/
> -----------------------------------------------------------
> 
> (Updated Feb. 8, 2018, 11:53 a.m.)
> 
> 
> Review request for mesos, Akash Gupta, Jie Yu, and Joseph Wu.
> 
> 
> Bugs: MESOS-6713
>     https://issues.apache.org/jira/browse/MESOS-6713
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Because it is not possible to delete a file (or a folder recursively)
> with open handles on Windows, we have to explicitly `reset()` the agent
> before removing the framework meta directory. Otherwise, the task status
> update manager will be destructed too late, and so an open handle for
> `task.updates` will cause the `os::rmdir` to fail.
> 
> This is safe because we previously destructed the agent anyway, just
> later in the test when it was reassigned.
> 
> 
> Diffs
> -----
> 
>   src/tests/slave_recovery_tests.cpp 77aa60c953bd0769eaba05f001755e4cec9ba028 
> 
> 
> Diff: https://reviews.apache.org/r/65409/diff/3/
> 
> 
> Testing
> -------
> 
> make check on CentOS 7, all passed
> ctest on Windows, all passed including new SlaveRecoveryTests
> 
> Note that while this chain enables recovery of Docker tasks on Windows, it explicitly
> does not fix MESOS-8519 (recovery of job object tasks).
> 
> ```
> I0131 11:52:01.545505  8316 docker.cpp:898] Recovering Docker containers
> I0131 11:52:01.546005   660 containerizer.cpp:674] Recovering containerizer
> I0131 11:52:01.546505   660 containerizer.cpp:725] Skipping recovery of executor 'iis.feae9d12-06ba-11e8-8f77-02421c3bc93c' of framework eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000 because it was not launched from mesos containerizer
> I0131 11:52:01.557006 11272 provisioner.cpp:493] Provisioner recovery complete
> I0131 11:52:02.521003  8720 docker.cpp:1008] Recovering container 'f7978e90-32f5-458d-ad4e-3ffa25a7b190' for executor 'iis.feae9d12-06ba-11e8-8f77-02421c3bc93c' of framework eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000
> I0131 11:52:02.530527  8316 slave.cpp:6695] Sending reconnect request to executor 'iis.feae9d12-06ba-11e8-8f77-02421c3bc93c' of framework eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000 at executor(1)@10.123.7.41:63903
> I0131 11:52:02.549062  8720 slave.cpp:4519] Received re-registration message from executor 'iis.feae9d12-06ba-11e8-8f77-02421c3bc93c' of framework eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000
> I0131 11:52:04.548064 10556 slave.cpp:4737] Cleaning up un-reregistered executors
> I0131 11:52:04.548064 10556 slave.cpp:6824] Finished recovery
> I0131 11:52:04.566066   660 task_status_update_manager.cpp:181] Pausing sending task status updates
> I0131 11:52:04.567059 14636 slave.cpp:1146] New master detected at master@10.123.6.78:5050
> I0131 11:52:04.567059 14636 slave.cpp:1190] No credentials provided. Attempting to register without authentication
> I0131 11:52:04.568047 14636 slave.cpp:1201] Detecting new master
> I0131 11:52:04.604035  8720 slave.cpp:1471] Re-registered with master master@10.123.6.78:5050
> I0131 11:52:04.605060   660 task_status_update_manager.cpp:188] Resuming sending task status updates
> I0131 11:52:04.606036  8720 slave.cpp:1516] Forwarding agent update {"operations":{},"resource_version_uuid":{"value":"mzwol7M6SrGxOml4zYlA8Q=="},"slave_id":{"value":"7dc02270-a4e1-4f59-9ad7-56bad5182ea4-S0"},"update_oversubscribed_resources":true}
> I0131 11:52:04.612036  8720 slave.cpp:3625] Updating info for framework eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000 with pid updated to scheduler-aaa62980-8b1b-4775-b8bb-c6890b41941e@10.123.6.78:45907
> I0131 11:52:04.636543 13468 task_status_update_manager.cpp:188] Resuming sending task status updates
> ```
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>
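
The ordering described in the quoted description (release the object that holds the open file, then remove the directory) can be sketched in standard C++ as below. This is only an illustration under stated assumptions, not the code from the review: `UpdateLog` and `resetThenRemove` are hypothetical names, `std::unique_ptr` stands in for the test's owned agent object, and `std::filesystem::remove_all` stands in for `os::rmdir`.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <memory>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical stand-in for a component that keeps a file open for its
// whole lifetime (comparable to the open `task.updates` handle).
struct UpdateLog {
  explicit UpdateLog(const fs::path& path) : stream(path, std::ios::app) {}

  bool isOpen() const { return stream.is_open(); }

  // Closed automatically when the UpdateLog is destroyed.
  std::ofstream stream;
};

// Hypothetical helper: destroy the owner first so its file handle is
// closed, then remove the directory tree. Returns true if the directory
// no longer exists afterwards.
bool resetThenRemove(std::unique_ptr<UpdateLog>& owner, const fs::path& dir) {
  // Explicit reset: runs the destructor now rather than at end of scope
  // or on a later reassignment.
  owner.reset();

  std::error_code removeError;
  fs::remove_all(dir, removeError);

  std::error_code existsError;
  return !removeError && !fs::exists(dir, existsError);
}
```

A caller would create a directory, construct a `std::unique_ptr<UpdateLog>` for a file inside it, and then call `resetThenRemove(owner, dir)`. On POSIX systems the removal typically succeeds even without the reset, which is consistent with the description treating this as a Windows-specific failure.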

