mesos-reviews mailing list archives

From Mesos Reviewbot Windows <revi...@mesos.apache.org>
Subject Re: Review Request 64940: Prevented a crash when an agent with terminal tasks is lost.
Date Mon, 08 Jan 2018 23:23:56 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64940/#review194999
-----------------------------------------------------------



FAIL: Some Mesos tests failed.

Reviews applied: `['64940']`

Failed command: `D:\DCOS\mesos\src\mesos-tests.exe --verbose`

All the build artifacts are available at: http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/64940

Relevant logs:

- [mesos-tests-stdout.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/64940/logs/mesos-tests-stdout.log):

```
[       OK ] Endpoint/SlaveEndpointTest.NoAuthorizer/2 (102 ms)
[----------] 9 tests from Endpoint/SlaveEndpointTest (971 ms total)

[----------] 2 tests from ContainerizerType/DefaultContainerDNSFlagTest
[ RUN      ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/0
[       OK ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/0 (33 ms)
[ RUN      ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/1
[       OK ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/1 (38 ms)
[----------] 2 tests from ContainerizerType/DefaultContainerDNSFlagTest (73 ms total)

[----------] 1 test from IsolationFlag/CpuIsolatorTest
[ RUN      ] IsolationFlag/CpuIsolatorTest.ROOT_UserCpuUsage/0
[       OK ] IsolationFlag/CpuIsolatorTest.ROOT_UserCpuUsage/0 (2429 ms)
[----------] 1 test from IsolationFlag/CpuIsolatorTest (2454 ms total)

[----------] 1 test from IsolationFlag/MemoryIsolatorTest
[ RUN      ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0
[       OK ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0 (2391 ms)
[----------] 1 test from IsolationFlag/MemoryIsolatorTest (2416 ms total)

[----------] Global test environment tear-down
[==========] 847 tests from 85 test cases ran. (321089 ms total)
[  PASSED  ] 846 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] PartitionTest.AgentWithCompletedTaskPartitioned

 1 FAILED TEST
  YOU HAVE 213 DISABLED TESTS

```

- [mesos-tests-stderr.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/64940/logs/mesos-tests-stderr.log):

```
I0108 23:23:35.775553  9436 executor.cpp:171] Received SUBSCRIBED event
I0108 23:23:35.779811  9436 executor.cpp:175] Subscribed executor on build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net
I0108 23:23:35.780552  9436 executor.cpp:171] Received LAUNCH event
I0108 23:23:35.783553  9436 executor.cpp:638] Starting task 3f37b88a-3621-4109-8069-194fca55ecec
I0108 23:23:35.858552  9436 executor.cpp:478] Running 'D:\DCOS\mesos\src\mesos-containerizer.exe launch <POSSIBLY-SENSITIVE-DATA>'
I0108 23:23:36.412520  9436 executor.cpp:651] Forked command at 2800
I0108 23:23:36.441519  9252 exec.cpp:445] Executor asked to shutdown
I0108 23:23:36.441519  9436 executor.cpp:171] Received SHUTDOWN event
I0108 23:23:36.441519  9436 executor.cpp:748] Shutting down
I0108 23:23:36.441519  9436 executor.cpp:863] Sending SIGTERM to process tree at pid 27f76041c-8c90-4335-b40b-e60c5cfb6b86-0000
(default) at scheduler-219dcd29-9600-4631-bbbb-39cb16c974e3@10.3.1.11:55127
I0108 23:23:36.439517  9172 hierarchical.cpp:405] Deactivated framework 7f76041c-8c90-4335-b40b-e60c5cfb6b86-0000
I0108 23:23:36.439517  6764 master.cpp:10196] Updating the state of task 3f37b88a-3621-4109-8069-194fca55ecec of framework 7f76041c-8c90-4335-b40b-e60c5cfb6b86-0000 (latest state: TASK_KILLED, status update state: TASK_KILLED)
I0108 23:23:36.439517  9432 slave.cpp:3396] Shutting down framework 7f76041c-8c90-4335-b40b-e60c5cfb6b86-0000
I0108 23:23:36.439517  9432 slave.cpp:6074] Shutting down executor '3f37b88a-3621-4109-8069-194fca55ecec' of framework 7f76041c-8c90-4335-b40b-e60c5cfb6b86-0000 at executor(1)@10.3.1.11:55149
I0108 23:23:36.440518  9432 slave.cpp:931] Agent terminating
W0108 23:23:36.440518  9432 slave.cpp:3392] Ignoring shutdown framework 7f76041c-8c90-4335-b40b-e60c5cfb6b86-0000 because it is terminating
I0108 23:23:36.441519  6764 master.cpp:10300] Removing task 3f37b88a-3621-4109-8069-194fca55ecec with resources cpus(allocated: *):4; mem(allocated: *):2048; disk(allocated: *):1024; ports(allocated: *):[31000-32000] of framework 7f76041c-8c90-4335-b40b-e60c5cfb6b86-0000 on agent 7f76041c-8c90-4335-b40b-e60c5cfb6b86-S0 at slave(329)@10.3.1.11:55127 (build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I0108 23:23:36.443517  9432 containerizer.cpp:2352] Destroying container da7f8cdc-2135-4ca1-b3cf-e18e8fdef330 in RUNNING state
I0108 23:23:36.443517  9432 containerizer.cpp:2966] Transitioning the state of container da7f8cdc-2135-4ca1-b3cf-e18e8fdef330 from RUNNING to DESTROYING
I0108 23:23:36.444519  6764 master.cpp:1305] Agent 7f76041c-8c90-4335-b40b-e60c5cfb6b86-S0 at slave(329)@10.3.1.11:55127 (build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net) disconnected
I0108 23:23:36.444519  6764 master.cpp:3365] Disconnecting agent 7f76041c-8c90-4335-b40b-e60c5cfb6b86-S0 at slave(329)@10.3.1.11:55127 (build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I0108 23:23:36.444519  9432 launcher.cpp:156] Asked to destroy container da7f8cdc-2135-4ca1-b3cf-e18e8fdef330
I0108 23:23:36.445523  9172 hierarchical.cpp:344] Removed framework 7f76041c-8c90-4335-b40b-e60c5cfb6b86-0000
I0108 23:23:36.445523  6764 master.cpp:3384] Deactivating agent 7f76041c-8c90-4335-b40b-e60c5cfb6b86-S0 at slave(329)@10.3.1.11:55127 (build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I0108 23:23:36.445523  6080 hierarchical.cpp:766] Agent 7f76041c-8c90-4335-b40b-e60c5cfb6b86-S0 deactivated
I0108 23:23:36.500775  6764 containerizer.cpp:2805] Container da7f8cdc-2135-4ca1-b3cf-e18e8fdef330 has exited
I0108 23:23:36.529848  8624 master.cpp:1147] Master terminating
I0108 23:23:36.531819  1488 hierarchical.cpp:609] Removed agent 7f76041c-8c90-4335-b40b-e60c5cfb6b86-S0
I0108 23:23:36.946514  2940 process.cpp:887] Failed to accept socket: future discarded
```

- Mesos Reviewbot Windows


On Jan. 5, 2018, 7:37 p.m., James Peach wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64940/
> -----------------------------------------------------------
> 
> (Updated Jan. 5, 2018, 7:37 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Gaston Kleiman, Jie Yu, Vinod Kone, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-8337
>     https://issues.apache.org/jira/browse/MESOS-8337
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> If an agent is lost, we try to remove all the tasks that might
> have been lost. However, if a task is already terminal, it hasn't
> really been lost so we should not be tracking it in the framework's
> unreachable tasks list.
> 
> 
> Diffs
> -----
> 
>   src/master/master.hpp 130f6e28cc62a8912aac66ecfbf014fe1ee444e3 
>   src/master/master.cpp 28d8be3a4769b418b61cff0b95845e4232135bc7 
>   src/tests/partition_tests.cpp 3813139f576ea01db0197f0fe8a73597db1bb69a 
> 
> 
> Diff: https://reviews.apache.org/r/64940/diff/5/
> 
> 
> Testing
> -------
> 
> make check (Fedora 27)
> 
> 
> Thanks,
> 
> James Peach
> 
>
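
The behaviour described in the quoted review request (only tasks that are not yet terminal should be recorded as unreachable when an agent is lost) can be illustrated with a minimal sketch. This is not the code from the diff: `Task`, `Framework`, `TaskState`, `isTerminal` and `removeTasksOfLostAgent` are hypothetical, simplified stand-ins, and the real logic in src/master/master.cpp may differ.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for illustration only; these are not
// the real Mesos types declared in src/master/master.hpp.
enum class TaskState { RUNNING, FINISHED, FAILED, KILLED };

struct Task {
  std::string id;
  TaskState state;
};

struct Framework {
  std::vector<Task> tasks;                    // Tasks tracked on the agent.
  std::vector<std::string> unreachableTasks;  // IDs of tasks considered lost.
};

// Returns true for states a task cannot leave once it has reached them
// (assumed set of terminal states for this sketch).
bool isTerminal(TaskState state) {
  return state == TaskState::FINISHED ||
         state == TaskState::FAILED ||
         state == TaskState::KILLED;
}

// Called when the agent is lost: every task is removed from the framework,
// but only non-terminal tasks are recorded as unreachable. A task that had
// already finished, failed, or been killed was never really lost.
void removeTasksOfLostAgent(Framework& framework) {
  for (const Task& task : framework.tasks) {
    if (!isTerminal(task.state)) {
      framework.unreachableTasks.push_back(task.id);
    }
  }
  framework.tasks.clear();
}
```

Under these assumptions, a framework holding one RUNNING task and one KILLED task ends up with a single entry in `unreachableTasks` after the agent is lost, and the KILLED task is simply dropped.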

