mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mesos Reviewbot Windows <revi...@mesos.apache.org>
Subject Re: Review Request 66232: Removed unnecessary/invalid checks from the default executor.
Date Fri, 23 Mar 2018 03:24:35 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66232/#review199842
-----------------------------------------------------------



FAIL: Some of the unit tests failed. Please check the relevant logs.

Reviews applied: `['65692', '65693', '66232']`

Failed command: `Start-MesosCITesting`

All the build artifacts available at: http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/66232

Relevant logs:

- [mesos-tests-stdout.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/66232/logs/mesos-tests-stdout.log):

```
[       OK ] Endpoint/SlaveEndpointTest.NoAuthorizer/2 (120 ms)
[----------] 9 tests from Endpoint/SlaveEndpointTest (1105 ms total)

[----------] 2 tests from ContainerizerType/DefaultContainerDNSFlagTest
[ RUN      ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/0
[       OK ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/0 (32 ms)
[ RUN      ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/1
[       OK ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/1 (39 ms)
[----------] 2 tests from ContainerizerType/DefaultContainerDNSFlagTest (73 ms total)

[----------] 1 test from IsolationFlag/CpuIsolatorTest
[ RUN      ] IsolationFlag/CpuIsolatorTest.ROOT_UserCpuUsage/0
[       OK ] IsolationFlag/CpuIsolatorTest.ROOT_UserCpuUsage/0 (906 ms)
[----------] 1 test from IsolationFlag/CpuIsolatorTest (933 ms total)

[----------] 1 test from IsolationFlag/MemoryIsolatorTest
[ RUN      ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0
[       OK ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0 (839 ms)
[----------] 1 test from IsolationFlag/MemoryIsolatorTest (870 ms total)

[----------] Global test environment tear-down
[==========] 943 tests from 94 test cases ran. (447506 ms total)
[  PASSED  ] 942 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] CommandExecutorCheckTest.CommandCheckTimeout

 1 FAILED TEST
  YOU HAVE 215 DISABLED TESTS

```

- [mesos-tests-stderr.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/66232/logs/mesos-tests-stderr.log):

```
I0323 03:24:22.550164  9020 master.cpp:10255] Updating the state of task bfa57eef-68c4-480c-ab9e-8713821be243
of framework c5a83c33-0bff-4e06-ba68-0b94e8c26134-0000 (latest state: TASK_KILLED, status
update state: TASK_KILLED)
I0323 03:24:22.550164  5672 slave.cpp:3873] Shutting down framework c5a83c33-0bff-4e06-ba68-0b94e8c26134-00I0323
03:24:22.362180 10428 exec.cpp:162] Version: 1.6.0
I0323 03:24:22.390198  7244 exec.cpp:236] Executor registered on agent c5a83c33-0bff-4e06-ba68-0b94e8c26134-S0
I0323 03:24:22.395151 11256 executor.cpp:176] Received SUBSCRIBED event
I0323 03:24:22.399148 11256 executor.cpp:180] Subscribed executor on winbldsrv-01.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net
I0323 03:24:22.399148 11256 executor.cpp:176] Received LAUNCH event
I0323 03:24:22.405328 11256 executor.cpp:648] Starting task bfa57eef-68c4-480c-ab9e-8713821be243
I0323 03:24:22.488149 11256 executor.cpp:483] Running 'D:\DCOS\mesos\src\mesos-containerizer.exe
launch <POSSIBLY-SENSITIVE-DATA>'
I0323 03:24:22.521162 11256 executor.cpp:661] Forked command at 8924
I0323 03:24:22.552166  9360 exec.cpp:445] Executor asked to shutdown
I0323 03:24:22.553153  8356 executor.cpp:176] Received SHUTDOWN event
I0323 03:24:22.553153  8356 executor.cpp:758] Shutting down
I0323 03:24:22.553153  8356 executor.cpp:868] Sending SIGTERM to process tree at pid 800
I0323 03:24:22.551165  5672 slave.cpp:6566] Shutting down executor 'bfa57eef-68c4-480c-ab9e-8713821be243'
of framework c5a83c33-0bff-4e06-ba68-0b94e8c26134-0000 at executor(1)@10.3.1.8:61938
I0323 03:24:22.551165  5672 slave.cpp:919] Agent terminating
W0323 03:24:22.552166  5672 slave.cpp:3869] Ignoring shutdown framework c5a83c33-0bff-4e06-ba68-0b94e8c26134-0000
because it is terminating
I0323 03:24:22.553153  9020 master.cpp:10354] Removing task bfa57eef-68c4-480c-ab9e-8713821be243
with resources cpus(allocated: *):4; mem(allocated: *):2048; disk(allocated: *):1024; ports(allocated:
*):[31000-32000] of framework c5a83c33-0bff-4e06-ba68-0b94e8c26134-0000 on agent c5a83c33-0bff-4e06-ba68-0b94e8c26134-S0
at slave(412)@10.3.1.8:61917 (winbldsrv-01.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I0323 03:24:22.555380  5672 containerizer.cpp:2338] Destroying container 938fb382-e91e-4721-976a-261b48c06e63
in RUNNING state
I0323 03:24:22.556154  5672 containerizer.cpp:2952] Transitioning the state of container 938fb382-e91e-4721-976a-261b48c06e63
from RUNNING to DESTROYING
I0323 03:24:22.557157  9020 master.cpp:1295] Agent c5a83c33-0bff-4e06-ba68-0b94e8c26134-S0
at slave(412)@10.3.1.8:61917 (winbldsrv-01.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
disconnected
I0323 03:24:22.557157  9020 master.cpp:3283] Disconnecting agent c5a83c33-0bff-4e06-ba68-0b94e8c26134-S0
at slave(412)@10.3.1.8:61917 (winbldsrv-01.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I0323 03:24:22.557157  9020 master.cpp:3302] Deactivating agent c5a83c33-0bff-4e06-ba68-0b94e8c26134-S0
at slave(412)@10.3.1.8:61917 (winbldsrv-01.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I0323 03:24:22.558156  5672 launcher.cpp:156] Asked to destroy container 938fb382-e91e-4721-976a-261b48c06e63
I0323 03:24:22.558156 10564 hierarchical.cpp:344] Removed framework c5a83c33-0bff-4e06-ba68-0b94e8c26134-0000
I0323 03:24:22.558156 10564 hierarchical.cpp:766] Agent c5a83c33-0bff-4e06-ba68-0b94e8c26134-S0
deactivated
I0323 03:24:22.579192  8352 containerizer.cpp:2791] Container 938fb382-e91e-4721-976a-261b48c06e63
has exited
I0323 03:24:22.615169 10160 master.cpp:1137] Master terminating
I0323 03:24:22.617171  6252 hierarchical.cpp:609] Removed agent c5a83c33-0bff-4e06-ba68-0b94e8c26134-S0
I0323 03:24:23.087208  8196 process.cpp:929] Stopped the socket accept loop
```

- Mesos Reviewbot Windows


On March 23, 2018, 2:10 a.m., Gaston Kleiman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66232/
> -----------------------------------------------------------
> 
> (Updated March 23, 2018, 2:10 a.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Joseph Wu, Qian Zhang, and Vinod Kone.
> 
> 
> Bugs: MESOS-8530
>     https://issues.apache.org/jira/browse/MESOS-8530
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The default executor uses checks to ensure that it is subscribed to the
> agent before killing a task/container; if it is not, it will skip kill
> calls or crash.
> 
> When killing a task/container, the default executor performs the
> following actions:
> 
> 1) Calls `KILL_NESTED_CONTAINER`.
> 2) Enqueues a task status update (for tasks only).
> 
> None of these actions require an active subscription.
> 
> When the checks were added:
> 
> 1) The default executor supported launching only one task group, so it
> was correct to assume that killing a task would eventually trigger the
> destruction of all the child containers.
> 2) It was assumed that the executor will only be unsubscribed for brief
> periods of time, mostly during agent recovery, when kill calls are
> likely to fail.
> 3) There was no kill/escalation retry logic.
> 
> The checks were added as a way of working around the lack of retry logic
> for kill requests, relying on the fact that crashing the executor leads
> to the destruction of all the child containers.
> 
> This chain adds retry logic for failed kill call escalation, making the
> workaround unnecessary.
> 
> Now that the default executor supports running multiple task groups, the
> checks are not just unnecessary, but also invalid and dangerous. If a
> check fails, all the containers started by the executor will be killed,
> regardless of which task group they belong to. This is bad and could
> lead to data loss.
> 
> This patch removes these unnecessary and sometimes invalid checks from
> the default executor.
> 
> 
> Diffs
> -----
> 
>   src/launcher/default_executor.cpp 906836f3b8e0af79d7c61f90fd8a95f193b26e84 
> 
> 
> Diff: https://reviews.apache.org/r/66232/diff/1/
> 
> 
> Testing
> -------
> 
> `sudo bin/mesos-tests.sh` on GNU/Linux
> 
> 
> Thanks,
> 
> Gaston Kleiman
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message