mesos-reviews mailing list archives

From Gastón Kleiman <gas...@mesosphere.com>
Subject Re: Review Request 55901: [WIP] Added support for command health checks to the default executor.
Date Wed, 25 Jan 2017 17:24:39 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55901/#review162972
-----------------------------------------------------------



It looks like there's a deadlock in the cleanup path of the agent/containerizer that only manifests on OS X. Here's the output of a failed test run on OS X:

```
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from HealthCheckTest
[ RUN      ] HealthCheckTest.DefaultExecutorCmdHealthCheck
I0125 14:43:57.005580 3095696320 cluster.cpp:160] Creating default 'local' authorizer
I0125 14:43:57.036818 31969280 master.cpp:383] Master 41942215-2f76-4ffa-9bfc-383fc4bc2b04 (172.18.9.98) started on 172.18.9.98:49526
I0125 14:43:57.036867 31969280 master.cpp:385] Flags at startup: --acls="" --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate_agents="true" --authenticate_frameworks="true" --authenticate_http_frameworks="true" --authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authenticators="crammd5" --authorizers="local" --credentials="/private/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/Fi6dxT/credentials" --framework_sorter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_framework_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" --max_unreachable_tasks_per_framework="1000" --quiet="false" --recovery_agent_removal_limit="100%" --registry="in_memory" --registry_fetch_
 timeout="1mins" --registry_gc_interval="15mins" --registry_max_agent_age="2weeks" --registry_max_agent_count="102400" --registry_store_timeout="100secs" --registry_strict="false" --root_submissions="true" --user_sorter="drf" --version="false" --webui_dir="/Users/gaston/mesos/master/share/mesos/webui" --work_dir="/private/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/Fi6dxT/master" --zk_session_timeout="10secs"
I0125 14:43:57.038107 31969280 master.cpp:435] Master only allowing authenticated frameworks to register
I0125 14:43:57.038125 31969280 master.cpp:449] Master only allowing authenticated agents to register
I0125 14:43:57.038132 31969280 master.cpp:462] Master only allowing authenticated HTTP frameworks to register
I0125 14:43:57.038138 31969280 credentials.hpp:37] Loading credentials for authentication from '/private/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/Fi6dxT/credentials'
I0125 14:43:57.038478 31969280 master.cpp:507] Using default 'crammd5' authenticator
I0125 14:43:57.038516 31969280 authenticator.cpp:519] Initializing server SASL
I0125 14:43:57.056694 31969280 auxprop.cpp:73] Initialized in-memory auxiliary property plugin
I0125 14:43:57.056784 31969280 http.cpp:922] Using default 'basic' HTTP authenticator for realm 'mesos-master-readonly'
I0125 14:43:57.056943 31969280 http.cpp:922] Using default 'basic' HTTP authenticator for realm 'mesos-master-readwrite'
I0125 14:43:57.057091 31969280 http.cpp:922] Using default 'basic' HTTP authenticator for realm 'mesos-master-scheduler'
I0125 14:43:57.057209 31969280 master.cpp:587] Authorization enabled
I0125 14:43:57.057353 29286400 hierarchical.cpp:151] Initialized hierarchical allocator process
I0125 14:43:57.057349 31432704 whitelist_watcher.cpp:77] No whitelist given
I0125 14:43:57.060212 33042432 master.cpp:2121] Elected as the leading master!
I0125 14:43:57.060230 33042432 master.cpp:1643] Recovering from registrar
I0125 14:43:57.060317 31432704 registrar.cpp:329] Recovering registrar
I0125 14:43:57.064963 31432704 registrar.cpp:362] Successfully fetched the registry (0B) in 4.62208ms
I0125 14:43:57.065105 31432704 registrar.cpp:461] Applied 1 operations in 36us; attempting to update the registry
I0125 14:43:57.069882 31969280 registrar.cpp:506] Successfully updated the registry in 4.75008ms
I0125 14:43:57.069962 31969280 registrar.cpp:392] Successfully recovered registrar
I0125 14:43:57.070274 30896128 master.cpp:1759] Recovered 0 agents from the registry (131B); allowing 10mins for agents to re-register
I0125 14:43:57.070297 29286400 hierarchical.cpp:178] Skipping recovery of hierarchical allocator: nothing to recover
I0125 14:43:57.071858 3095696320 containerizer.cpp:220] Using isolation: posix/cpu,posix/mem,filesystem/posix
I0125 14:43:57.074162 3095696320 cluster.cpp:446] Creating default 'local' authorizer
I0125 14:43:57.074981 33042432 slave.cpp:209] Mesos agent started on (1)@172.18.9.98:49526
I0125 14:43:57.075031 33042432 slave.cpp:210] Flags at startup: --acls="" --appc_simple_discovery_uri_prefix="http://" --appc_store_dir="/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/mesos/store/appc" --authenticate_http_readonly="true" --authenticate_http_readwrite="false" --authenticatee="crammd5" --authentication_backoff_factor="1secs" --authorizer="local" --container_disk_watch_interval="15secs" --containerizers="mesos" --credential="/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_oFfwrZ/credential" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" --docker_store_dir="/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/mesos/store/docker" --docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" --enforce_container_disk_quota="false
 " --executor_registration_timeout="1mins" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_oFfwrZ/fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_command_executor="false" --http_credentials="/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_oFfwrZ/http_credentials" --http_heartbeat_interval="30secs" --image_provisioner_backend="copy" --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem" --launcher="posix" --launcher_dir="/Users/gaston/mesosphere/mesos/build/src" --logbufsecs="0" --logging_level="INFO" --max_completed_executors_per_framework="150" --oversubscribed_resources_interval="15secs" --qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect" --recovery_
 timeout="15mins" --registration_backoff_factor="10ms" --resources="cpus:2;gpus:0;mem:1024;disk:1024;ports:[31000-32000]" --runtime_dir="/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_oFfwrZ" --sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true" --version="false" --work_dir="/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg"
I0125 14:43:57.075314 33042432 credentials.hpp:86] Loading credential for authentication from '/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_oFfwrZ/credential'
I0125 14:43:57.075451 33042432 slave.cpp:352] Agent using credential for: test-principal
I0125 14:43:57.075469 33042432 credentials.hpp:37] Loading credentials for authentication from '/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_oFfwrZ/http_credentials'
I0125 14:43:57.075603 33042432 http.cpp:922] Using default 'basic' HTTP authenticator for realm 'mesos-agent-readonly'
I0125 14:43:57.076303 33042432 slave.cpp:539] Agent resources: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
I0125 14:43:57.076386 33042432 slave.cpp:547] Agent attributes: [  ]
I0125 14:43:57.076395 33042432 slave.cpp:552] Agent hostname: 172.18.9.98
I0125 14:43:57.076504 32505856 status_update_manager.cpp:177] Pausing sending status updates
I0125 14:43:57.076628 3095696320 sched.cpp:232] Version: 1.2.0
I0125 14:43:57.076966 29286400 sched.cpp:336] New master detected at master@172.18.9.98:49526
I0125 14:43:57.077011 29286400 sched.cpp:407] Authenticating with master master@172.18.9.98:49526
I0125 14:43:57.077033 29286400 sched.cpp:414] Using default CRAM-MD5 authenticatee
I0125 14:43:57.077142 29822976 authenticatee.cpp:97] Initializing client SASL
I0125 14:43:57.077421 29286400 state.cpp:60] Recovering state from '/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg/meta'
I0125 14:43:57.077636 33042432 status_update_manager.cpp:203] Recovering status update manager
I0125 14:43:57.077785 31969280 containerizer.cpp:599] Recovering containerizer
I0125 14:43:57.078785 31432704 provisioner.cpp:251] Provisioner recovery complete
I0125 14:43:57.078979 31432704 slave.cpp:5422] Finished recovery
I0125 14:43:57.079376 31432704 slave.cpp:5596] Querying resource estimator for oversubscribable resources
I0125 14:43:57.079521 33042432 status_update_manager.cpp:177] Pausing sending status updates
I0125 14:43:57.079525 31432704 slave.cpp:929] New master detected at master@172.18.9.98:49526
I0125 14:43:57.079567 31432704 slave.cpp:964] Detecting new master
I0125 14:43:57.079654 31432704 slave.cpp:5610] Received oversubscribable resources {} from the resource estimator
I0125 14:43:57.079880 29822976 authenticatee.cpp:121] Creating new client SASL connection
I0125 14:43:57.080003 29286400 master.cpp:7001] Authenticating scheduler-8fcc1ae4-1a01-4d5b-a2c8-444d1295751b@172.18.9.98:49526
I0125 14:43:57.080070 30359552 authenticator.cpp:414] Starting authentication session for crammd5-authenticatee(1)@172.18.9.98:49526
I0125 14:43:57.080181 31432704 authenticator.cpp:98] Creating new server SASL connection
I0125 14:43:57.080265 31969280 authenticatee.cpp:213] Received SASL authentication mechanisms: CRAM-MD5
I0125 14:43:57.080309 31969280 authenticatee.cpp:239] Attempting to authenticate with mechanism 'CRAM-MD5'
I0125 14:43:57.080360 33042432 authenticator.cpp:204] Received SASL authentication start
I0125 14:43:57.080399 33042432 authenticator.cpp:326] Authentication requires more steps
I0125 14:43:57.080458 33042432 authenticatee.cpp:259] Received SASL authentication step
I0125 14:43:57.080544 31432704 authenticator.cpp:232] Received SASL authentication step
I0125 14:43:57.080564 31432704 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'Gastons-MacBook-Pro-2.local' server FQDN: 'Gastons-MacBook-Pro-2.local' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false
I0125 14:43:57.080580 31432704 auxprop.cpp:181] Looking up auxiliary property '*userPassword'
I0125 14:43:57.080605 31432704 auxprop.cpp:181] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
I0125 14:43:57.080621 31432704 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'Gastons-MacBook-Pro-2.local' server FQDN: 'Gastons-MacBook-Pro-2.local' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true
I0125 14:43:57.080633 31432704 auxprop.cpp:131] Skipping auxiliary property '*userPassword' since SASL_AUXPROP_AUTHZID == true
I0125 14:43:57.080637 31432704 auxprop.cpp:131] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
I0125 14:43:57.080646 31432704 authenticator.cpp:318] Authentication success
I0125 14:43:57.080687 29286400 authenticatee.cpp:299] Authentication success
I0125 14:43:57.080730 30359552 master.cpp:7031] Successfully authenticated principal 'test-principal' at scheduler-8fcc1ae4-1a01-4d5b-a2c8-444d1295751b@172.18.9.98:49526
I0125 14:43:57.080780 32505856 authenticator.cpp:432] Authentication session cleanup for crammd5-authenticatee(1)@172.18.9.98:49526
I0125 14:43:57.080895 31969280 sched.cpp:513] Successfully authenticated with master master@172.18.9.98:49526
I0125 14:43:57.080906 31969280 sched.cpp:836] Sending SUBSCRIBE call to master@172.18.9.98:49526
I0125 14:43:57.080955 31969280 sched.cpp:869] Will retry registration in 788.765853ms if necessary
I0125 14:43:57.081032 30359552 master.cpp:2734] Received SUBSCRIBE call for framework 'default' at scheduler-8fcc1ae4-1a01-4d5b-a2c8-444d1295751b@172.18.9.98:49526
I0125 14:43:57.081043 30359552 master.cpp:2157] Authorizing framework principal 'test-principal' to receive offers for role '*'
I0125 14:43:57.081260 29286400 master.cpp:2810] Subscribing framework default with checkpointing disabled and capabilities [  ]
I0125 14:43:57.088062 31969280 slave.cpp:991] Authenticating with master master@172.18.9.98:49526
I0125 14:43:57.096890 31969280 slave.cpp:1002] Using default CRAM-MD5 authenticatee
I0125 14:43:57.097005 29822976 authenticatee.cpp:121] Creating new client SASL connection
I0125 14:43:57.097234 29286400 master.cpp:7001] Authenticating slave(1)@172.18.9.98:49526
I0125 14:43:57.097245 33042432 hierarchical.cpp:271] Added framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.097241 31969280 sched.cpp:759] Framework registered with 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.097296 30896128 authenticator.cpp:414] Starting authentication session for crammd5-authenticatee(2)@172.18.9.98:49526
I0125 14:43:57.097301 33042432 hierarchical.cpp:1677] No allocations performed
I0125 14:43:57.097332 33042432 hierarchical.cpp:1772] No inverse offers to send out!
I0125 14:43:57.097352 33042432 hierarchical.cpp:1279] Performed allocation for 0 agents in 94us
I0125 14:43:57.097369 31969280 sched.cpp:773] Scheduler::registered took 108us
I0125 14:43:57.097446 29286400 authenticator.cpp:98] Creating new server SASL connection
I0125 14:43:57.097564 31432704 authenticatee.cpp:213] Received SASL authentication mechanisms: CRAM-MD5
I0125 14:43:57.097605 31432704 authenticatee.cpp:239] Attempting to authenticate with mechanism 'CRAM-MD5'
I0125 14:43:57.097694 31969280 authenticator.cpp:204] Received SASL authentication start
I0125 14:43:57.097761 31969280 authenticator.cpp:326] Authentication requires more steps
I0125 14:43:57.097836 30359552 authenticatee.cpp:259] Received SASL authentication step
I0125 14:43:57.097904 29822976 authenticator.cpp:232] Received SASL authentication step
I0125 14:43:57.097934 29822976 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'Gastons-MacBook-Pro-2.local' server FQDN: 'Gastons-MacBook-Pro-2.local' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false
I0125 14:43:57.097944 29822976 auxprop.cpp:181] Looking up auxiliary property '*userPassword'
I0125 14:43:57.097965 29822976 auxprop.cpp:181] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
I0125 14:43:57.097980 29822976 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'Gastons-MacBook-Pro-2.local' server FQDN: 'Gastons-MacBook-Pro-2.local' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true
I0125 14:43:57.097990 29822976 auxprop.cpp:131] Skipping auxiliary property '*userPassword' since SASL_AUXPROP_AUTHZID == true
I0125 14:43:57.097997 29822976 auxprop.cpp:131] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
I0125 14:43:57.098006 29822976 authenticator.cpp:318] Authentication success
I0125 14:43:57.098071 30359552 authenticatee.cpp:299] Authentication success
I0125 14:43:57.098121 29286400 master.cpp:7031] Successfully authenticated principal 'test-principal' at slave(1)@172.18.9.98:49526
I0125 14:43:57.098161 31969280 authenticator.cpp:432] Authentication session cleanup for crammd5-authenticatee(2)@172.18.9.98:49526
I0125 14:43:57.098255 31432704 slave.cpp:1086] Successfully authenticated with master master@172.18.9.98:49526
I0125 14:43:57.098340 31432704 slave.cpp:1508] Will retry registration in 15.661984ms if necessary
I0125 14:43:57.098412 29822976 master.cpp:5299] Registering agent at slave(1)@172.18.9.98:49526 (172.18.9.98) with id 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0
I0125 14:43:57.098567 33042432 registrar.cpp:461] Applied 1 operations in 35us; attempting to update the registry
I0125 14:43:57.103466 33042432 registrar.cpp:506] Successfully updated the registry in 4.877056ms
I0125 14:43:57.103818 33042432 slave.cpp:4286] Received ping from slave-observer(1)@172.18.9.98:49526
I0125 14:43:57.103868 30896128 master.cpp:5370] Registered agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
I0125 14:43:57.103914 33042432 slave.cpp:1132] Registered with master master@172.18.9.98:49526; given agent ID 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0
I0125 14:43:57.103934 33042432 fetcher.cpp:90] Clearing fetcher cache
I0125 14:43:57.104035 31969280 hierarchical.cpp:478] Added agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 (172.18.9.98) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (allocated: {})
I0125 14:43:57.104058 30359552 status_update_manager.cpp:184] Resuming sending status updates
I0125 14:43:57.104224 33042432 slave.cpp:1160] Checkpointing SlaveInfo to '/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg/meta/slaves/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0/slave.info'
I0125 14:43:57.104449 31969280 hierarchical.cpp:1772] No inverse offers to send out!
I0125 14:43:57.104480 31969280 hierarchical.cpp:1302] Performed allocation for agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 in 439us
I0125 14:43:57.104595 33042432 slave.cpp:1198] Forwarding total oversubscribed resources {}
I0125 14:43:57.104617 31432704 master.cpp:6830] Sending 1 offers to framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (default) at scheduler-8fcc1ae4-1a01-4d5b-a2c8-444d1295751b@172.18.9.98:49526
I0125 14:43:57.104727 31432704 master.cpp:5863] Received update of agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98) with total oversubscribed resources {}
I0125 14:43:57.104867 29286400 hierarchical.cpp:548] Agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 (172.18.9.98) updated with oversubscribed resources {} (total: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000])
I0125 14:43:57.104959 29286400 hierarchical.cpp:1677] No allocations performed
I0125 14:43:57.104972 29286400 hierarchical.cpp:1772] No inverse offers to send out!
I0125 14:43:57.104961 32505856 sched.cpp:933] Scheduler::resourceOffers took 173us
I0125 14:43:57.104995 29286400 hierarchical.cpp:1302] Performed allocation for agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 in 74us
I0125 14:43:57.106262 29286400 master.cpp:3728] Processing ACCEPT call for offers: [ 41942215-2f76-4ffa-9bfc-383fc4bc2b04-O0 ] on agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98) for framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (default) at scheduler-8fcc1ae4-1a01-4d5b-a2c8-444d1295751b@172.18.9.98:49526
I0125 14:43:57.106317 29286400 master.cpp:3316] Authorizing framework principal 'test-principal' to launch task 1
I0125 14:43:57.107606 32505856 master.cpp:8882] Adding task 1 with resources cpus(*):1.9; mem(*):992; disk(*):992; ports(*):[31000-32000] on agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98)
I0125 14:43:57.107693 32505856 master.cpp:4583] Launching task group { 1 } of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (default) at scheduler-8fcc1ae4-1a01-4d5b-a2c8-444d1295751b@172.18.9.98:49526 with resources cpus(*):1.9; mem(*):992; disk(*):992; ports(*):[31000-32000] on agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98)
I0125 14:43:57.107838 30359552 slave.cpp:1576] Got assigned task group containing tasks [ 1 ] for framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.108158 30359552 slave.cpp:1736] Launching task group containing tasks [ 1 ] for framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.112789 30359552 paths.cpp:547] Trying to chown '/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg/slaves/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0/frameworks/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000/executors/default/runs/ed1df37e-c3cd-4cb7-9be0-3f57df46a746' to user 'gaston'
I0125 14:43:57.113065 30359552 slave.cpp:6331] Launching executor 'default' of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 with resources cpus(*):0.1; mem(*):32; disk(*):32 in work directory '/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg/slaves/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0/frameworks/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000/executors/default/runs/ed1df37e-c3cd-4cb7-9be0-3f57df46a746'
I0125 14:43:57.113381 30896128 containerizer.cpp:992] Starting container ed1df37e-c3cd-4cb7-9be0-3f57df46a746 for executor 'default' of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.113417 30359552 slave.cpp:2058] Queued task group containing tasks [ 1 ] for executor 'default' of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.113462 30359552 slave.cpp:882] Successfully attached file '/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg/slaves/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0/frameworks/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000/executors/default/runs/ed1df37e-c3cd-4cb7-9be0-3f57df46a746'
I0125 14:43:57.115821 31969280 containerizer.cpp:1541] Launching 'mesos-containerizer' with flags '--help="false" --launch_info="{"command":{"arguments":["mesos-default-executor","--launcher_dir=\/Users\/gaston\/mesosphere\/mesos\/build\/src"],"shell":false,"value":"\/Users\/gaston\/mesosphere\/mesos\/build\/src\/mesos-default-executor"},"environment":{"variables":[{"name":"LIBPROCESS_PORT","value":"0"},{"name":"MESOS_AGENT_ENDPOINT","value":"172.18.9.98:49526"},{"name":"MESOS_CHECKPOINT","value":"0"},{"name":"MESOS_DIRECTORY","value":"\/var\/folders\/jr\/17y2w4ld019bsn9vhx0c13f80000gn\/T\/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg\/slaves\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0\/frameworks\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000\/executors\/default\/runs\/ed1df37e-c3cd-4cb7-9be0-3f57df46a746"},{"name":"MESOS_EXECUTOR_ID","value":"default"},{"name":"MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD","value":"5secs"},{"name":"MESOS_FRAMEWORK_ID","value":"41942215-2f76-4ffa-9bfc-383fc
 4bc2b04-0000"},{"name":"MESOS_HTTP_COMMAND_EXECUTOR","value":"0"},{"name":"MESOS_SLAVE_ID","value":"41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0"},{"name":"MESOS_SLAVE_PID","value":"slave(1)@172.18.9.98:49526"},{"name":"MESOS_SANDBOX","value":"\/var\/folders\/jr\/17y2w4ld019bsn9vhx0c13f80000gn\/T\/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg\/slaves\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0\/frameworks\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000\/executors\/default\/runs\/ed1df37e-c3cd-4cb7-9be0-3f57df46a746"}]},"err":{"path":"\/var\/folders\/jr\/17y2w4ld019bsn9vhx0c13f80000gn\/T\/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg\/slaves\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0\/frameworks\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000\/executors\/default\/runs\/ed1df37e-c3cd-4cb7-9be0-3f57df46a746\/stderr","type":"PATH"},"out":{"path":"\/var\/folders\/jr\/17y2w4ld019bsn9vhx0c13f80000gn\/T\/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg\/slaves\/41942215-2f76-4ffa-9bfc-383fc4b
 c2b04-S0\/frameworks\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000\/executors\/default\/runs\/ed1df37e-c3cd-4cb7-9be0-3f57df46a746\/stdout","type":"PATH"},"user":"gaston","working_directory":"\/var\/folders\/jr\/17y2w4ld019bsn9vhx0c13f80000gn\/T\/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg\/slaves\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0\/frameworks\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000\/executors\/default\/runs\/ed1df37e-c3cd-4cb7-9be0-3f57df46a746"}" --pipe_read="6" --pipe_write="8" --runtime_directory="/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_oFfwrZ/containers/ed1df37e-c3cd-4cb7-9be0-3f57df46a746"'
I0125 14:43:57.117321 31969280 launcher.cpp:135] Forked child with pid '85059' for container 'ed1df37e-c3cd-4cb7-9be0-3f57df46a746'
I0125 14:43:57.119642 30896128 fetcher.cpp:349] Starting to fetch URIs for container: ed1df37e-c3cd-4cb7-9be0-3f57df46a746, directory: /var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg/slaves/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0/frameworks/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000/executors/default/runs/ed1df37e-c3cd-4cb7-9be0-3f57df46a746
I0125 14:43:57.222311 29822976 process.cpp:3697] Handling HTTP event for process 'slave(1)' with path: '/slave(1)/api/v1/executor'
I0125 14:43:57.223127 29822976 http.cpp:303] HTTP POST for /slave(1)/api/v1/executor from 172.18.9.98:49528
I0125 14:43:57.223256 29822976 slave.cpp:3109] Received Subscribe request for HTTP executor 'default' of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.224218 30896128 slave.cpp:2300] Sending queued task group task group containing tasks [ 1 ] to executor 'default' of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (via HTTP)
I0125 14:43:57.237958 30896128 process.cpp:3697] Handling HTTP event for process 'slave(1)' with path: '/slave(1)/api/v1'
I0125 14:43:57.238396 30896128 http.cpp:303] HTTP POST for /slave(1)/api/v1 from 172.18.9.98:49530
I0125 14:43:57.238607 30896128 http.cpp:464] Processing call LAUNCH_NESTED_CONTAINER
I0125 14:43:57.239876 29822976 containerizer.cpp:1808] Starting nested container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1
I0125 14:43:57.240108 29822976 containerizer.cpp:1832] Trying to chown '/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg/slaves/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0/frameworks/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000/executors/default/runs/ed1df37e-c3cd-4cb7-9be0-3f57df46a746/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1' to user 'gaston'
I0125 14:43:57.241133 32505856 containerizer.cpp:1541] Launching 'mesos-containerizer' with flags '--help="false" --launch_info="{"command":{"shell":true,"value":"sleep 120"},"environment":{"variables":[{"name":"MESOS_SANDBOX","value":"\/var\/folders\/jr\/17y2w4ld019bsn9vhx0c13f80000gn\/T\/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg\/slaves\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0\/frameworks\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000\/executors\/default\/runs\/ed1df37e-c3cd-4cb7-9be0-3f57df46a746\/containers\/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1"}]},"err":{"path":"\/var\/folders\/jr\/17y2w4ld019bsn9vhx0c13f80000gn\/T\/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg\/slaves\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0\/frameworks\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000\/executors\/default\/runs\/ed1df37e-c3cd-4cb7-9be0-3f57df46a746\/containers\/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1\/stderr","type":"PATH"},"out":{"path":"\/var\/folders\/jr\/17y2w4ld019bsn9vhx0c13f80000gn\/
 T\/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg\/slaves\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0\/frameworks\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000\/executors\/default\/runs\/ed1df37e-c3cd-4cb7-9be0-3f57df46a746\/containers\/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1\/stdout","type":"PATH"},"user":"gaston","working_directory":"\/var\/folders\/jr\/17y2w4ld019bsn9vhx0c13f80000gn\/T\/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg\/slaves\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0\/frameworks\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000\/executors\/default\/runs\/ed1df37e-c3cd-4cb7-9be0-3f57df46a746\/containers\/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1"}" --pipe_read="10" --pipe_write="11" --runtime_directory="/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_oFfwrZ/containers/ed1df37e-c3cd-4cb7-9be0-3f57df46a746/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1"'
I0125 14:43:57.251711 32505856 launcher.cpp:135] Forked child with pid '85088' for container 'ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1'
I0125 14:43:57.253597 33042432 fetcher.cpp:349] Starting to fetch URIs for container: ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1, directory: /var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg/slaves/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0/frameworks/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000/executors/default/runs/ed1df37e-c3cd-4cb7-9be0-3f57df46a746/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1
E0125 14:43:57.257318 33579008 process.cpp:2419] Failed to shutdown socket with fd 9: Socket is not connected
I0125 14:43:57.257848 29822976 process.cpp:3697] Handling HTTP event for process 'slave(1)' with path: '/slave(1)/api/v1/executor'
I0125 14:43:57.258191 29286400 process.cpp:3697] Handling HTTP event for process 'slave(1)' with path: '/slave(1)/api/v1'
I0125 14:43:57.258460 29286400 http.cpp:303] HTTP POST for /slave(1)/api/v1/executor from 172.18.9.98:49529
I0125 14:43:57.258599 29286400 slave.cpp:3756] Handling status update TASK_RUNNING (UUID: c8583885-558e-44a3-9e0b-c39874132e27) for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.258846 29286400 http.cpp:303] HTTP POST for /slave(1)/api/v1 from 172.18.9.98:49531
I0125 14:43:57.259006 29286400 process.cpp:3697] Handling HTTP event for process 'slave(1)' with path: '/slave(1)/api/v1'
I0125 14:43:57.259308 29286400 http.cpp:464] Processing call LAUNCH_NESTED_CONTAINER_SESSION
I0125 14:43:57.259663 30359552 status_update_manager.cpp:323] Received status update TASK_RUNNING (UUID: c8583885-558e-44a3-9e0b-c39874132e27) for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.259696 30359552 status_update_manager.cpp:500] Creating StatusUpdate stream for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.259893 29822976 containerizer.cpp:1808] Starting nested container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check
I0125 14:43:57.259938 30359552 status_update_manager.cpp:377] Forwarding update TASK_RUNNING (UUID: c8583885-558e-44a3-9e0b-c39874132e27) for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 to the agent
I0125 14:43:57.259951 29286400 http.cpp:303] HTTP POST for /slave(1)/api/v1 from 172.18.9.98:49532
I0125 14:43:57.260105 29822976 containerizer.cpp:1832] Trying to chown '/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg/slaves/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0/frameworks/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000/executors/default/runs/ed1df37e-c3cd-4cb7-9be0-3f57df46a746/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check' to user 'gaston'
I0125 14:43:57.260159 29286400 slave.cpp:4196] Forwarding the update TASK_RUNNING (UUID: c8583885-558e-44a3-9e0b-c39874132e27) for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 to master@172.18.9.98:49526
I0125 14:43:57.260329 29286400 http.cpp:464] Processing call WAIT_NESTED_CONTAINER
I0125 14:43:57.260387 31432704 master.cpp:6008] Status update TASK_RUNNING (UUID: c8583885-558e-44a3-9e0b-c39874132e27) for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 from agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98)
I0125 14:43:57.260434 31432704 master.cpp:6076] Forwarding status update TASK_RUNNING (UUID: c8583885-558e-44a3-9e0b-c39874132e27) for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.260524 29286400 slave.cpp:4090] Status update manager successfully handled status update TASK_RUNNING (UUID: c8583885-558e-44a3-9e0b-c39874132e27) for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.260597 31432704 master.cpp:8138] Updating the state of task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (latest state: TASK_RUNNING, status update state: TASK_RUNNING)
I0125 14:43:57.260751 30896128 sched.cpp:1041] Scheduler::statusUpdate took 137us
I0125 14:43:57.260968 29286400 master.cpp:5015] Processing ACKNOWLEDGE call c8583885-558e-44a3-9e0b-c39874132e27 for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (default) at scheduler-8fcc1ae4-1a01-4d5b-a2c8-444d1295751b@172.18.9.98:49526 on agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0
I0125 14:43:57.261234 32505856 status_update_manager.cpp:395] Received status update acknowledgement (UUID: c8583885-558e-44a3-9e0b-c39874132e27) for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.261389 31969280 slave.cpp:3045] Status update manager successfully handled status update acknowledgement (UUID: c8583885-558e-44a3-9e0b-c39874132e27) for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.266468 30896128 switchboard.cpp:571] Launching 'mesos-io-switchboard' with flags '--heartbeat_interval="30secs" --help="false" --socket_address="/tmp/mesos-io-switchboard-a719e861-62a0-42dd-9a18-8af6f13b856f" --stderr_from_fd="15" --stderr_to_fd="2" --stdin_to_fd="12" --stdout_from_fd="13" --stdout_to_fd="1" --tty="false" --wait_for_connection="true"' for container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check
I0125 14:43:57.267304 30896128 switchboard.cpp:601] Created I/O switchboard server (pid: 85095) listening on socket file '/tmp/mesos-io-switchboard-a719e861-62a0-42dd-9a18-8af6f13b856f' for container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check
I0125 14:43:57.268826 29822976 containerizer.cpp:1541] Launching 'mesos-containerizer' with flags '--help="false" --launch_info="{"command":{"shell":true,"value":"exit $STATUS"},"environment":{"variables":[{"name":"STATUS","value":"0"}]},"err":{"fd":16,"type":"FD"},"in":{"fd":9,"type":"FD"},"out":{"fd":14,"type":"FD"},"user":"gaston"}" --pipe_read="12" --pipe_write="13" --runtime_directory="/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_oFfwrZ/containers/ed1df37e-c3cd-4cb7-9be0-3f57df46a746/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check"'
I0125 14:43:57.269855 29822976 launcher.cpp:135] Forked child with pid '85099' for container 'ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check'
I0125 14:43:57.271919 33042432 fetcher.cpp:349] Starting to fetch URIs for container: ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check, directory: /var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg/slaves/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0/frameworks/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000/executors/default/runs/ed1df37e-c3cd-4cb7-9be0-3f57df46a746/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check
I0125 14:43:57.330857 32505856 http.cpp:2647] Received EOF attach response for ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check
I0125 14:43:57.331059 29286400 containerizer.cpp:2119] Destroying container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check in RUNNING state
W0125 14:43:57.331065 32505856 http.cpp:2662] Launch nested container session connection for container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check closed
I0125 14:43:57.331297 29286400 launcher.cpp:151] Asked to destroy container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check
I0125 14:43:57.332167 30896128 process.cpp:3697] Handling HTTP event for process 'slave(1)' with path: '/slave(1)/api/v1'
I0125 14:43:57.332547 31969280 http.cpp:303] HTTP POST for /slave(1)/api/v1 from 172.18.9.98:49534
I0125 14:43:57.332726 31969280 http.cpp:464] Processing call WAIT_NESTED_CONTAINER
I0125 14:43:57.366807 33042432 switchboard.cpp:873] I/O switchboard server process for container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check has terminated (status=0)
I0125 14:43:57.367132 31432704 containerizer.cpp:2482] Container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check has exited
I0125 14:43:57.367938 31969280 provisioner.cpp:322] Ignoring destroy request for unknown container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check
I0125 14:43:57.368036 33042432 containerizer.cpp:2398] Checkpointing termination state to nested container's runtime directory '/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_oFfwrZ/containers/ed1df37e-c3cd-4cb7-9be0-3f57df46a746/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check/termination'
I0125 14:43:57.387104 29822976 process.cpp:3697] Handling HTTP event for process 'slave(1)' with path: '/slave(1)/api/v1'
I0125 14:43:57.387472 29822976 http.cpp:303] HTTP POST for /slave(1)/api/v1 from 172.18.9.98:49535
I0125 14:43:57.387668 29822976 http.cpp:464] Processing call LAUNCH_NESTED_CONTAINER_SESSION
I0125 14:43:57.387924 33042432 containerizer.cpp:1808] Starting nested container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check
I0125 14:43:57.387951 29822976 process.cpp:3697] Handling HTTP event for process 'slave(1)' with path: '/slave(1)/api/v1/executor'
I0125 14:43:57.388016 33042432 containerizer.cpp:1832] Trying to chown '/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg/slaves/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0/frameworks/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000/executors/default/runs/ed1df37e-c3cd-4cb7-9be0-3f57df46a746/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check' to user 'gaston'
I0125 14:43:57.388273 31432704 http.cpp:303] HTTP POST for /slave(1)/api/v1/executor from 172.18.9.98:49529
I0125 14:43:57.388356 31432704 slave.cpp:3756] Handling status update TASK_RUNNING (UUID: 0a69f89b-35c9-4943-a6b4-2f9646b11338) for task 1 in health state healthy of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.389214 29822976 status_update_manager.cpp:323] Received status update TASK_RUNNING (UUID: 0a69f89b-35c9-4943-a6b4-2f9646b11338) for task 1 in health state healthy of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.389328 29822976 status_update_manager.cpp:377] Forwarding update TASK_RUNNING (UUID: 0a69f89b-35c9-4943-a6b4-2f9646b11338) for task 1 in health state healthy of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 to the agent
I0125 14:43:57.389411 30896128 slave.cpp:4196] Forwarding the update TASK_RUNNING (UUID: 0a69f89b-35c9-4943-a6b4-2f9646b11338) for task 1 in health state healthy of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 to master@172.18.9.98:49526
I0125 14:43:57.389511 30896128 slave.cpp:4090] Status update manager successfully handled status update TASK_RUNNING (UUID: 0a69f89b-35c9-4943-a6b4-2f9646b11338) for task 1 in health state healthy of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.389554 32505856 master.cpp:6008] Status update TASK_RUNNING (UUID: 0a69f89b-35c9-4943-a6b4-2f9646b11338) for task 1 in health state healthy of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 from agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98)
I0125 14:43:57.389578 32505856 master.cpp:6076] Forwarding status update TASK_RUNNING (UUID: 0a69f89b-35c9-4943-a6b4-2f9646b11338) for task 1 in health state healthy of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.389693 32505856 master.cpp:8138] Updating the state of task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (latest state: TASK_RUNNING, status update state: TASK_RUNNING)
I0125 14:43:57.389803 31432704 sched.cpp:1041] Scheduler::statusUpdate took 76us
I0125 14:43:57.390027 3095696320 sched.cpp:2021] Asked to stop the driver
I0125 14:43:57.390022 29822976 master.cpp:5015] Processing ACKNOWLEDGE call 0a69f89b-35c9-4943-a6b4-2f9646b11338 for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (default) at scheduler-8fcc1ae4-1a01-4d5b-a2c8-444d1295751b@172.18.9.98:49526 on agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0
I0125 14:43:57.390105 31969280 sched.cpp:1203] Stopping framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.390228 31432704 master.cpp:7543] Processing TEARDOWN call for framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (default) at scheduler-8fcc1ae4-1a01-4d5b-a2c8-444d1295751b@172.18.9.98:49526
I0125 14:43:57.390235 30359552 status_update_manager.cpp:395] Received status update acknowledgement (UUID: 0a69f89b-35c9-4943-a6b4-2f9646b11338) for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.390249 31432704 master.cpp:7555] Removing framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (default) at scheduler-8fcc1ae4-1a01-4d5b-a2c8-444d1295751b@172.18.9.98:49526
I0125 14:43:57.390262 31432704 master.cpp:3078] Deactivating framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (default) at scheduler-8fcc1ae4-1a01-4d5b-a2c8-444d1295751b@172.18.9.98:49526
I0125 14:43:57.390390 32505856 hierarchical.cpp:386] Deactivated framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.390399 31969280 slave.cpp:3045] Status update manager successfully handled status update acknowledgement (UUID: 0a69f89b-35c9-4943-a6b4-2f9646b11338) for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.390424 31432704 master.cpp:8138] Updating the state of task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (latest state: TASK_KILLED, status update state: TASK_KILLED)
I0125 14:43:57.390444 31969280 slave.cpp:2598] Asked to shut down framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 by master@172.18.9.98:49526
I0125 14:43:57.390458 31969280 slave.cpp:2623] Shutting down framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.390477 31969280 slave.cpp:5005] Shutting down executor 'default' of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (via HTTP)
I0125 14:43:57.390702 31432704 master.cpp:8232] Removing task 1 with resources cpus(*):1.9; mem(*):992; disk(*):992; ports(*):[31000-32000] of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 on agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98)
I0125 14:43:57.390861 31432704 master.cpp:8261] Removing executor 'default' with resources cpus(*):0.1; mem(*):32; disk(*):32 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 on agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98)
I0125 14:43:57.390902 33042432 hierarchical.cpp:1011] Recovered cpus(*):1.9; mem(*):992; disk(*):992; ports(*):[31000-32000] (total: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: cpus(*):0.1; mem(*):32; disk(*):32) on agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 from framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.391154 32505856 containerizer.cpp:2119] Destroying container ed1df37e-c3cd-4cb7-9be0-3f57df46a746 in RUNNING state
I0125 14:43:57.391171 32505856 containerizer.cpp:2119] Destroying container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1 in RUNNING state
I0125 14:43:57.391186 32505856 containerizer.cpp:2119] Destroying container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check in PREPARING state
I0125 14:43:57.391238 29822976 hierarchical.cpp:1011] Recovered cpus(*):0.1; mem(*):32; disk(*):32 (total: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: {}) on agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 from framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.391365 29822976 hierarchical.cpp:337] Removed framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.391458 32505856 containerizer.cpp:2190] Waiting for the isolators to complete preparing before destroying container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check
I0125 14:43:57.394384 31432704 process.cpp:3697] Handling HTTP event for process 'slave(1)' with path: '/slave(1)/api/v1'
I0125 14:43:57.394676 30896128 http.cpp:303] HTTP POST for /slave(1)/api/v1 from 172.18.9.98:49536
I0125 14:43:57.394672 29286400 switchboard.cpp:571] Launching 'mesos-io-switchboard' with flags '--heartbeat_interval="30secs" --help="false" --socket_address="/tmp/mesos-io-switchboard-9127715f-8aed-454e-8ec3-458fbec784fa" --stderr_from_fd="15" --stderr_to_fd="2" --stdin_to_fd="12" --stdout_from_fd="13" --stdout_to_fd="1" --tty="false" --wait_for_connection="true"' for container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check
I0125 14:43:57.394807 30896128 http.cpp:464] Processing call KILL_NESTED_CONTAINER
I0125 14:43:57.395575 29286400 switchboard.cpp:601] Created I/O switchboard server (pid: 85132) listening on socket file '/tmp/mesos-io-switchboard-9127715f-8aed-454e-8ec3-458fbec784fa' for container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check
W0125 14:43:57.396780 31432704 http.cpp:2226] Failed to launch nested container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check: Container is being destroyed during preparing
I0125 14:43:57.396934 30896128 process.cpp:3754] Failed to process request for '/slave(1)/api/v1': Container is being destroyed during preparing
I0125 14:43:57.396984 31432704 process.cpp:1456] Returning '500 Internal Server Error' for '/slave(1)/api/v1' (Container is being destroyed during preparing)
I0125 14:43:58.061504 30359552 hierarchical.cpp:1677] No allocations performed
I0125 14:43:58.061565 30359552 hierarchical.cpp:1279] Performed allocation for 1 agents in 189us
I0125 14:43:59.067081 30896128 hierarchical.cpp:1677] No allocations performed
I0125 14:43:59.067144 30896128 hierarchical.cpp:1279] Performed allocation for 1 agents in 206us
I0125 14:44:00.068758 33042432 hierarchical.cpp:1677] No allocations performed
I0125 14:44:00.068822 33042432 hierarchical.cpp:1279] Performed allocation for 1 agents in 182us
I0125 14:44:01.070752 31969280 hierarchical.cpp:1677] No allocations performed
I0125 14:44:01.070814 31969280 hierarchical.cpp:1279] Performed allocation for 1 agents in 171us
I0125 14:44:02.071462 30896128 hierarchical.cpp:1677] No allocations performed
I0125 14:44:02.071526 30896128 hierarchical.cpp:1279] Performed allocation for 1 agents in 175us
I0125 14:44:02.393194 32505856 slave.cpp:5078] Killing executor 'default' of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (via HTTP)
E0125 14:44:02.394971 33579008 process.cpp:2419] Failed to shutdown socket with fd 11: Socket is not connected
E0125 14:44:02.395092 33579008 process.cpp:2419] Failed to shutdown socket with fd 18: Socket is not connected
E0125 14:44:02.395231 33579008 process.cpp:2419] Failed to shutdown socket with fd 6: Socket is not connected
E0125 14:44:02.395437 33579008 process.cpp:2419] Failed to shutdown socket with fd 8: Socket is not connected
I0125 14:44:02.397722 33579008 switchboard.cpp:782] Sending SIGTERM to I/O switchboard server (pid: 85132) since container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check is being destroyed
I0125 14:44:02.431910 33042432 containerizer.cpp:2482] Container ed1df37e-c3cd-4cb7-9be0-3f57df46a746 has exited
I0125 14:44:03.074764 29286400 hierarchical.cpp:1677] No allocations performed
I0125 14:44:03.074825 29286400 hierarchical.cpp:1279] Performed allocation for 1 agents in 223us
I0125 14:44:04.080370 29822976 hierarchical.cpp:1677] No allocations performed
I0125 14:44:04.080440 29822976 hierarchical.cpp:1279] Performed allocation for 1 agents in 190us
I0125 14:44:05.080876 30896128 hierarchical.cpp:1677] No allocations performed
I0125 14:44:05.080950 30896128 hierarchical.cpp:1279] Performed allocation for 1 agents in 197us
I0125 14:44:06.082023 30896128 hierarchical.cpp:1677] No allocations performed
I0125 14:44:06.082082 30896128 hierarchical.cpp:1279] Performed allocation for 1 agents in 186us
I0125 14:44:07.086395 29286400 hierarchical.cpp:1677] No allocations performed
I0125 14:44:07.086458 29286400 hierarchical.cpp:1279] Performed allocation for 1 agents in 226us
I0125 14:44:08.089059 33042432 hierarchical.cpp:1677] No allocations performed
I0125 14:44:08.089119 33042432 hierarchical.cpp:1279] Performed allocation for 1 agents in 168us
I0125 14:44:09.090726 31969280 hierarchical.cpp:1677] No allocations performed
I0125 14:44:09.090795 31969280 hierarchical.cpp:1279] Performed allocation for 1 agents in 205us
I0125 14:44:10.096534 33042432 hierarchical.cpp:1677] No allocations performed
I0125 14:44:10.096603 33042432 hierarchical.cpp:1279] Performed allocation for 1 agents in 243us
I0125 14:44:11.098918 30359552 hierarchical.cpp:1677] No allocations performed
I0125 14:44:11.098978 30359552 hierarchical.cpp:1279] Performed allocation for 1 agents in 175us
I0125 14:44:12.080423 31432704 slave.cpp:5596] Querying resource estimator for oversubscribable resources
I0125 14:44:12.080592 30359552 slave.cpp:5610] Received oversubscribable resources {} from the resource estimator
I0125 14:44:12.099279 31969280 hierarchical.cpp:1677] No allocations performed
I0125 14:44:12.099328 31969280 hierarchical.cpp:1279] Performed allocation for 1 agents in 180us
I0125 14:44:12.104923 31969280 slave.cpp:4286] Received ping from slave-observer(1)@172.18.9.98:49526
../../src/tests/cluster.cpp:570: Failure
Failed to wait 15secs for wait
I0125 14:44:12.395627 3095696320 slave.cpp:801] Agent terminating
I0125 14:44:12.395679 3095696320 slave.cpp:2598] Asked to shut down framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 by @0.0.0.0:0
W0125 14:44:12.395700 3095696320 slave.cpp:2619] Ignoring shutdown framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 because it is terminating
I0125 14:44:12.395988 31969280 master.cpp:1261] Agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98) disconnected
I0125 14:44:12.396028 31969280 master.cpp:3115] Disconnecting agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98)
I0125 14:44:12.396126 31969280 master.cpp:3134] Deactivating agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98)
I0125 14:44:12.396370 32505856 hierarchical.cpp:577] Agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 deactivated
I0125 14:44:12.401453 29822976 master.cpp:1100] Master terminating
I0125 14:44:12.401763 31432704 hierarchical.cpp:510] Removed agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0
[  FAILED  ] HealthCheckTest.DefaultExecutorCmdHealthCheck (15404 ms)
[----------] 1 test from HealthCheckTest (15407 ms total)

[----------] Global test environment tear-down
../../src/tests/environment.cpp:843: Failure
Failed
Tests completed with child processes remaining:
-+- 85038 /Users/gaston/mesosphere/mesos/build/src/.libs/mesos-tests --gtest_filter=HealthCheckTest.DefaultExecutorCmd* --break-on-error --verbose
 |-+- 85088 /Users/gaston/mesosphere/mesos/build/src/.libs/mesos-containerizer launch
 | --- 85130 sleep 120
 --- 85132 /Users/gaston/mesosphere/mesos/build/src/.libs/mesos-io-switchboard --heartbeat_interval=30secs --help=false --socket_address=/tmp/mesos-io-switchboard-9127715f-8aed-454e-8ec3-458fbec784fa --stderr_from_fd=15 --stderr_to_fd=2 --stdin_to_fd=12 --stdout_from_fd=13 --stdout_to_fd=1 --tty=false --wait_for_connection=true
[==========] 1 test from 1 test case ran. (15418 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] HealthCheckTest.DefaultExecutorCmdHealthCheck

 1 FAILED TEST
```

- Gastón Kleiman


On Jan. 25, 2017, 12:33 a.m., Gastón Kleiman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55901/
> -----------------------------------------------------------
> 
> (Updated Jan. 25, 2017, 12:33 a.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Anand Mazumdar, haosdent huang, and Vinod Kone.
> 
> 
> Bugs: MESOS-6280
>     https://issues.apache.org/jira/browse/MESOS-6280
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> [WIP] Added support for command health checks to the default executor.
> 
> This is still a work in progress, but I am posting it in order to get some feedback on the general approach.
> 
> Open questions/issues:
> 
>   - The code that handles communication with the Agent is in the `DefaultExecutor`, so the health checker calls `evolve` when serializing the API calls. Should it use the v1 protos instead, and skip the `evolve` call?
>   - CMD health checks for command tasks inherit the env from the executor but allow overriding env variables. I don't think that inheriting the executor env makes sense for cmd health checks launched by the `DefaultExecutor`: each task can have its own env, so the health check command should inherit the task env and merge it with the one in the check's `CommandInfo`.
>     I added some code that takes the task env from `TaskInfo` and merges it with the check's env, but I don't think that this is an ideal approach, since it will miss env variables set/modified by containerizers/hooks/isolators. Do you think that it would make sense for all Debug Containers to inherit the environment of the parent container?
>   - There's a TODO for stopping/resuming health checking when the `DefaultExecutor` is disconnected from the agent. After talking to AlexR, I believe that we should do this for all types of checks, not just for `CMD`.
>   - The test that I introduced passes on Linux but not on macOS; I still have to do some more debugging.
> 
> 
> Diffs
> -----
> 
>   src/checks/health_checker.hpp 6e558f2061a9e31157c47d31cb64b3a8568aace3 
>   src/checks/health_checker.cpp 50aa2858e807b27bbab58a3618f5200cfe4eca9e 
>   src/launcher/default_executor.cpp a03794934adb93868734f8cf00b337a1bff9b5ab 
>   src/tests/health_check_tests.cpp debbd3c09b7555145aaf3f62a24d795d1423a269 
> 
> Diff: https://reviews.apache.org/r/55901/diff/
> 
> 
> Testing
> -------
> 
> Introduced a new test: `HealthCheckTest.DefaultExecutorCmdHealthCheck`. It passes on Linux, but not on macOS.
> 
> 
> Thanks,
> 
> Gastón Kleiman
> 
>

