mesos-reviews mailing list archives

From Mesos Reviewbot Windows <revi...@mesos.apache.org>
Subject Re: Review Request 66799: Fixed flakyness in 'MasterAPITest.MasterFailover'.
Date Wed, 25 Apr 2018 20:13:03 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66799/#review201958
-----------------------------------------------------------



FAIL: Some of the unit tests failed. Please check the relevant logs.

Reviews applied: `['66799']`

Failed command: `Start-MesosCITesting`

All the build artifacts available at: http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/66799

Relevant logs:

- [mesos-tests-stdout.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/66799/logs/mesos-tests-stdout.log):

```
[ RUN      ] ContentType/SchedulerTest.OperationFeedbackValidationNoResourceProviderCapability/1
[       OK ] ContentType/SchedulerTest.OperationFeedbackValidationNoResourceProviderCapability/1 (14447 ms)
[ RUN      ] ContentType/SchedulerTest.OperationFeedbackValidationSchedulerDriverFramework/0
[       OK ] ContentType/SchedulerTest.OperationFeedbackValidationSchedulerDriverFramework/0 (14384 ms)
[ RUN      ] ContentType/SchedulerTest.OperationFeedbackValidationSchedulerDriverFramework/1
[       OK ] ContentType/SchedulerTest.OperationFeedbackValidationSchedulerDriverFramework/1 (14465 ms)
[ RUN      ] ContentType/SchedulerTest.ShutdownExecutor/0
[       OK ] ContentType/SchedulerTest.ShutdownExecutor/0 (14847 ms)
[ RUN      ] ContentType/SchedulerTest.ShutdownExecutor/1
[       OK ] ContentType/SchedulerTest.ShutdownExecutor/1 (14380 ms)
[ RUN      ] ContentType/SchedulerTest.Decline/0
[       OK ] ContentType/SchedulerTest.Decline/0 (14613 ms)
[ RUN      ] ContentType/SchedulerTest.Decline/1
[       OK ] ContentType/SchedulerTest.Decline/1 (14387 ms)
[ RUN      ] ContentType/SchedulerTest.Revive/0
[       OK ] ContentType/SchedulerTest.Revive/0 (14445 ms)
[ RUN      ] ContentType/SchedulerTest.Revive/1
[       OK ] ContentType/SchedulerTest.Revive/1 (14468 ms)
[ RUN      ] ContentType/SchedulerTest.Suppress/0
[       OK ] ContentType/SchedulerTest.Suppress/0 (14470 ms)
[ RUN      ] ContentType/SchedulerTest.Suppress/1
[       OK ] ContentType/SchedulerTest.Suppress/1 (14547 ms)
[ RUN      ] ContentType/SchedulerTest.NoOffersWithAllRolesSuppressed/0
[       OK ] ContentType/SchedulerTest.NoOffersWithAllRolesSuppressed/0 (19723 ms)
[ RUN      ] ContentType/SchedulerTest.NoOffersWithAllRolesSuppressed/1
[       OK ] ContentType/SchedulerTest.NoOffersWithAllRolesSuppressed/1 (19353 ms)
[ RUN      ] ContentType/SchedulerTest.NoOffersOnReregistrationWithAllRolesSuppressed/0
[       OK ] ContentType/SchedulerTest.NoOffersOnReregistrationWithAllRolesSuppressed/0 (19847 ms)
[ RUN      ] ContentType/SchedulerTest.NoOffersOnReregistrationWithAllRolesSuppressed/1
```

- [mesos-tests-stderr.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/66799/logs/mesos-tests-stderr.log):

```
I0425 20:12:45.532569  3116 hierarchical.cpp:405] Deactivated framework a99d1a52-29ea-44f3-8cba-ee650a7f51bf-0000
W0425 20:12:45.533589 10092 master.hpp:2342] Unable to send event to framework a99d1a52-29ea-44f3-8cba-ee650a7f51bf-0000 (default): connection closed
I0425 20:12:45.533589 10092 master.cpp:11069] Removing offer a99d1a52-29ea-44f3-8cba-ee650a7f51bf-O1
I0425 20:12:45.533589 10092 master.cpp:3236] Disconnecting framework a99d1a52-29ea-44f3-8cba-ee650a7f51bf-0000 (default)
I0425 20:12:45.535567 10092 master.cpp:1426] Giving framework a99d1a52-29ea-44f3-8cba-ee650a7f51bf-0000 (default) 0ns to failover
I0425 20:12:45.536577 16584 master.cpp:8935] Framework failover timeout, removing framework a99d1a52-29ea-44f3-8cba-ee650a7f51bf-0000 (default)
I0425 20:12:45.536577 16584 master.cpp:9829] Removing framework a99d1a52-29ea-44f3-8cba-ee650a7f51bf-0000 (default)
I0425 20:12:45.537609 16440 hierarchical.cpp:344] Removed framework a99d1a52-29ea-44f3-8cba-ee650a7f51bf-0000
I0425 20:12:45.539587 16652 slave.cpp:919] Agent terminating
I0425 20:12:45.540585  4816 master.cpp:1296] Agent a99d1a52-29ea-44f3-8cba-ee650a7f51bf-S0 at slave(419)@172.27.128.1:52517 (winbldsrv-02) disconnected
I0425 20:12:45.540585  4816 master.cpp:3296] Disconnecting agent a99d1a52-29ea-44f3-8cba-ee650a7f51bf-S0 at slave(419)@172.27.128.1:52517 (winbldsrv-02)
I0425 20:12:45.540585  4816 master.cpp:3315] Deactivating agent a99d1a52-29ea-44f3-8cba-ee650a7f51bf-S0 at slave(419)@172.27.128.1:52517 (winbldsrv-02)
I0425 20:12:45.541589 12936 hierarchical.cpp:766] Agent a99d1a52-29ea-44f3-8cba-ee650a7f51bf-S0 deactivated
I0425 20:12:45.568608 16652 master.cpp:1138] Master terminating
I0425 20:12:45.570572 16268 hierarchical.cpp:609] Removed agent a99d1a52-29ea-44f3-8cba-ee650a7f51bf-S0
I0425 20:12:45.608578 16652 cluster.cpp:172] Creating default 'local' authorizer
I0425 20:12:50.454778  4632 master.cpp:463] Master 50a48ecc-260c-4227-8e71-f6a2d78250fc (winbldsrv-02) started on 172.27.128.1:52517
I0425 20:12:50.454778  4632 master.cpp:466] Flags at startup: --acls="" --agent_ping_timeout="15secs"
--agent_reregister_timeout="10mins" --allocation_interval="1secs" --allocator="HierarchicalDRF"
--authenticate_agents="true" --authenticate_frameworks="true" --authenticate_http_frameworks="true"
--authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authenticators="crammd5"
--authorizers="local" --credentials="C:\Users\mesos\AppData\Local\Temp\0JklLJ\credentials"
--filter_gpu_resources="true" --framework_sorter="drf" --help="false" --hostname_lookup="true"
--http_authenticators="basic" --http_framework_authenticators="basic" --initialize_driver_logging="true"
--log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5"
--max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" --max_unreachable_tasks_per_framework="1000"
--memory_profiling="false" --port="5050" --quiet="false" --recovery_agent_removal_limit="100%"
--registry="in_memory" --registry_fetch_timeout="1mins" --registry_gc_interval="15mins"
--registry_max_agent_age="2weeks" --registry_max_agent_count="102400" --registry_store_timeout="100secs"
--registry_strict="false" --require_agent_domain="false" --root_submissions="true" --user_sorter="drf"
--version="false" --webui_dir="/webui" --work_dir="C:\Users\mesos\AppData\Local\Temp\0JklLJ\master"
--zk_session_timeout="10secs"
I0425 20:12:50.457751  4632 master.cpp:515] Master only allowing authenticated frameworks to register
I0425 20:12:50.457751  4632 master.cpp:521] Master only allowing authenticated agents to register
I0425 20:12:50.458745  4632 master.cpp:527] Master only allowing authenticated HTTP frameworks to register
I0425 20:12:50.458745  4632 credentials.hpp:37] Loading credentials for authentication from 'C:\Users\mesos\AppData\Local\Temp\0JklLJ\credentials'
I0425 20:12:50.459760  4632 master.cpp:571] Using default 'crammd5' authenticator
I0425 20:12:50.460759  4632 http.cpp:959] Creating default 'basic' HTTP authenticator for realm 'mesos-master-readonly'
I0425 20:12:50.461760  4632 http.cpp:959] Creating default 'basic' HTTP authenticator for realm 'mesos-master-readwrite'
I0425 20:12:50.461760  4632 http.cpp:959] Creating default 'basic' HTTP authenticator for realm 'mesos-master-scheduler'
I0425 20:12:50.462759  4632 master.cpp:652] Authorization enabled
I0425 20:12:50.474777  3116 master.cpp:2127] Elected as the leading master!
I0425 20:12:50.474777  3116 master.cpp:1683] Recovering from registrar
I0425 20:12:50.4757
```

- Mesos Reviewbot Windows


On April 25, 2018, 4:46 p.m., Benno Evers wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66799/
> -----------------------------------------------------------
> 
> (Updated April 25, 2018, 4:46 p.m.)
> 
> 
> Review request for mesos and Greg Mann.
> 
> 
> Bugs: MESOS-8687
>     https://issues.apache.org/jira/browse/MESOS-8687
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This test used to sporadically segfault, as described in MESOS-8687.
> The suspected cause is that, in a master actor, the `httpSequence`
> field was lazily initialized in `ProcessBase::consume()` and afterwards
> a call to `ProcessBase::_consume()` was dispatched, where it was
> assumed that `httpSequence` is already initialized.
> 
> However, during this test the master actor would be destroyed and a
> new actor would be spawned with the same PID. The dispatched method
> would be called on this new actor and find `httpSequence` to be not
> initialized, leading to a crash.
> 
> This patch introduces a call to `Clock::settle()` after the master
> is shut down to ensure the outstanding `_consume()` gets discarded
> before starting the new master actor.
> 
> 
> Diffs
> -----
> 
>   src/tests/api_tests.cpp dd8e221d8fd1b2a241505345337897e4ee4a6347 
> 
> 
> Diff: https://reviews.apache.org/r/66799/diff/1/
> 
> 
> Testing
> -------
> 
> `./src/mesos-tests --gtest_filter="*MasterAPITest*MasterFailover*" --gtest_repeat=100 --gtest_break_on_failure`
> 
> 
> Thanks,
> 
> Benno Evers
> 
>
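
The race described in the quoted review request can be illustrated with a toy model. The sketch below is not libprocess code: `Actor`, `Runtime`, `failover`, and the PID string are hypothetical names invented for illustration, and `Runtime.settle()` only loosely imitates what the description says `Clock::settle()` achieves (letting the outstanding `_consume()` be discarded before the new master actor starts). It assumes, as the description suggests, that queued calls are routed by PID alone.

```python
from collections import deque


class Actor:
    """Toy stand-in for an actor; NOT the real libprocess ProcessBase."""

    def __init__(self):
        # Lazily initialized, mirroring the `httpSequence` field.
        self.http_sequence = None

    def consume(self, runtime, pid):
        # Stage 1: initialize on first use, then queue stage 2 by PID.
        if self.http_sequence is None:
            self.http_sequence = []
        runtime.dispatch(pid, "_consume")

    def _consume(self):
        # Stage 2: assumes consume() already ran on this same instance.
        if self.http_sequence is None:
            raise RuntimeError("http_sequence is not initialized")
        self.http_sequence.append("request")


class Runtime:
    """Hypothetical event loop that routes queued calls by PID only."""

    def __init__(self):
        self.actors = {}
        self.queue = deque()

    def spawn(self, pid):
        self.actors[pid] = Actor()
        return self.actors[pid]

    def terminate(self, pid):
        del self.actors[pid]

    def dispatch(self, pid, method):
        self.queue.append((pid, method))

    def settle(self):
        # Drain the queue; calls addressed to a missing PID are discarded.
        while self.queue:
            pid, method = self.queue.popleft()
            actor = self.actors.get(pid)
            if actor is not None:
                getattr(actor, method)()


def failover(settle_before_restart):
    runtime = Runtime()
    pid = "master@example:5050"  # made-up PID, reused by both actors

    old_master = runtime.spawn(pid)
    old_master.consume(runtime, pid)  # queues `_consume` for this PID
    runtime.terminate(pid)

    if settle_before_restart:
        runtime.settle()  # the stale `_consume` is dropped here

    runtime.spawn(pid)  # new actor, same PID, http_sequence unset
    try:
        runtime.settle()  # without the earlier settle, the stale call lands here
    except RuntimeError:
        return "crash"
    return "ok"


print(failover(settle_before_restart=False))
print(failover(settle_before_restart=True))
```

In this model the first call reports a crash because the stale call reaches the freshly spawned actor, while settling before the restart avoids it. Whether the real scheduling in libprocess matches this ordering is exactly what the review request lists as the suspected cause, not a confirmed one.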

