Posted to issues@mesos.apache.org by "Anand Mazumdar (JIRA)" <ji...@apache.org> on 2017/01/20 21:02:26 UTC

[jira] [Commented] (MESOS-6948) AgentAPITest.LaunchNestedContainerSession is flaky

    [ https://issues.apache.org/jira/browse/MESOS-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15832393#comment-15832393 ] 

Anand Mazumdar commented on MESOS-6948:
---------------------------------------

This showed up with an identical stack trace on ASF CI.
{code}
[ RUN      ] ContentType/AgentAPITest.LaunchNestedContainerSession/1
I0120 20:51:26.939275 26158 cluster.cpp:160] Creating default 'local' authorizer
I0120 20:51:26.940529 26164 master.cpp:383] Master 2b435d8e-3792-458e-96ff-0ecff4b4aa54 (9a52de5d9dcd) started on 172.17.0.3:38943
I0120 20:51:26.940562 26164 master.cpp:385] Flags at startup: --acls="" --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate_agents="true" --authenticate_frameworks="true" --authenticate_http_frameworks="true" --authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authenticators="crammd5" --authorizers="local" --credentials="/tmp/rLq2TT/credentials" --framework_sorter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_framework_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" --quiet="false" --recovery_agent_removal_limit="100%" --registry="in_memory" --registry_fetch_timeout="1mins" --registry_gc_interval="15mins" --registry_max_agent_age="2weeks" --registry_max_agent_count="102400" --registry_store_timeout="100secs" --registry_strict="false" --root_submissions="true" --user_sorter="drf" --version="false" --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/rLq2TT/master" --zk_session_timeout="10secs"
I0120 20:51:26.940747 26164 master.cpp:435] Master only allowing authenticated frameworks to register
I0120 20:51:26.940754 26164 master.cpp:449] Master only allowing authenticated agents to register
I0120 20:51:26.940757 26164 master.cpp:462] Master only allowing authenticated HTTP frameworks to register
I0120 20:51:26.940762 26164 credentials.hpp:37] Loading credentials for authentication from '/tmp/rLq2TT/credentials'
I0120 20:51:26.940891 26164 master.cpp:507] Using default 'crammd5' authenticator
I0120 20:51:26.940939 26164 http.cpp:922] Using default 'basic' HTTP authenticator for realm 'mesos-master-readonly'
I0120 20:51:26.940978 26164 http.cpp:922] Using default 'basic' HTTP authenticator for realm 'mesos-master-readwrite'
I0120 20:51:26.941035 26164 http.cpp:922] Using default 'basic' HTTP authenticator for realm 'mesos-master-scheduler'
I0120 20:51:26.941082 26164 master.cpp:587] Authorization enabled
I0120 20:51:26.941186 26167 hierarchical.cpp:151] Initialized hierarchical allocator process
I0120 20:51:26.941215 26172 whitelist_watcher.cpp:77] No whitelist given
I0120 20:51:26.941815 26164 master.cpp:2119] Elected as the leading master!
I0120 20:51:26.941829 26164 master.cpp:1641] Recovering from registrar
I0120 20:51:26.941910 26170 registrar.cpp:329] Recovering registrar
I0120 20:51:26.942194 26173 registrar.cpp:362] Successfully fetched the registry (0B) in 175872ns
I0120 20:51:26.942227 26173 registrar.cpp:461] Applied 1 operations in 5497ns; attempting to update the registry
I0120 20:51:26.942468 26172 registrar.cpp:506] Successfully updated the registry in 222976ns
I0120 20:51:26.942519 26172 registrar.cpp:392] Successfully recovered registrar
I0120 20:51:26.942750 26164 hierarchical.cpp:178] Skipping recovery of hierarchical allocator: nothing to recover
I0120 20:51:26.942751 26165 master.cpp:1757] Recovered 0 agents from the registry (129B); allowing 10mins for agents to re-register
I0120 20:51:26.943758 26158 containerizer.cpp:220] Using isolation: posix/cpu,posix/mem,filesystem/posix,network/cni
W0120 20:51:26.944063 26158 backend.cpp:76] Failed to create 'aufs' backend: AufsBackend requires root privileges, but is running as user mesos
W0120 20:51:26.944141 26158 backend.cpp:76] Failed to create 'bind' backend: BindBackend requires root privileges
I0120 20:51:26.945397 26158 cluster.cpp:446] Creating default 'local' authorizer
I0120 20:51:26.945864 26163 slave.cpp:209] Mesos agent started on (579)@172.17.0.3:38943
I0120 20:51:26.945879 26163 slave.cpp:210] Flags at startup: --acls="" --appc_simple_discovery_uri_prefix="http://" --appc_store_dir="/tmp/mesos/store/appc" --authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authenticatee="crammd5" --authentication_backoff_factor="1secs" --authorizer="local" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos" --container_disk_watch_interval="15secs" --containerizers="mesos" --credential="/tmp/ContentType_AgentAPITest_LaunchNestedContainerSession_1_tIgxW4/credential" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker" --docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" --enforce_container_disk_quota="false" --executor_registration_timeout="1mins" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/ContentType_AgentAPITest_LaunchNestedContainerSession_1_tIgxW4/fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_command_executor="false" --http_credentials="/tmp/ContentType_AgentAPITest_LaunchNestedContainerSession_1_tIgxW4/http_credentials" --http_heartbeat_interval="30secs" --image_provisioner_backend="copy" --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem" --launcher="posix" --launcher_dir="/mesos/build/src" --logbufsecs="0" --logging_level="INFO" --max_completed_executors_per_framework="150" --oversubscribed_resources_interval="15secs" --perf_duration="10secs" --perf_interval="1mins" --qos_correction_interval_min="0ns"
--quiet="false" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="10ms" --resources="cpus:2;gpus:0;mem:1024;disk:1024;ports:[31000-32000]" --revocable_cpu_low_priority="true" --runtime_dir="/tmp/ContentType_AgentAPITest_LaunchNestedContainerSession_1_tIgxW4" --sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true" --systemd_enable_support="true" --systemd_runtime_directory="/run/systemd/system" --version="false" --work_dir="/tmp/ContentType_AgentAPITest_LaunchNestedContainerSession_1_fDrhqu"
I0120 20:51:26.946244 26163 credentials.hpp:86] Loading credential for authentication from '/tmp/ContentType_AgentAPITest_LaunchNestedContainerSession_1_tIgxW4/credential'
I0120 20:51:26.946334 26163 slave.cpp:352] Agent using credential for: test-principal
I0120 20:51:26.946365 26158 sched.cpp:232] Version: 1.2.0
I0120 20:51:26.946367 26163 credentials.hpp:37] Loading credentials for authentication from '/tmp/ContentType_AgentAPITest_LaunchNestedContainerSession_1_tIgxW4/http_credentials'
I0120 20:51:26.946568 26163 http.cpp:922] Using default 'basic' HTTP authenticator for realm 'mesos-agent-readonly'
I0120 20:51:26.946611 26163 http.cpp:922] Using default 'basic' HTTP authenticator for realm 'mesos-agent-readwrite'
I0120 20:51:26.946804 26168 sched.cpp:336] New master detected at master@172.17.0.3:38943
I0120 20:51:26.946853 26168 sched.cpp:407] Authenticating with master master@172.17.0.3:38943
I0120 20:51:26.946868 26168 sched.cpp:414] Using default CRAM-MD5 authenticatee
I0120 20:51:26.946948 26170 authenticatee.cpp:121] Creating new client SASL connection
I0120 20:51:26.947038 26163 slave.cpp:539] Agent resources: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
I0120 20:51:26.947078 26163 slave.cpp:547] Agent attributes: [  ]
I0120 20:51:26.947083 26163 slave.cpp:552] Agent hostname: 9a52de5d9dcd
I0120 20:51:26.947129 26170 master.cpp:6842] Authenticating scheduler-4d2c5f17-3a15-4724-bffb-0599159fc79f@172.17.0.3:38943
I0120 20:51:26.947187 26171 status_update_manager.cpp:177] Pausing sending status updates
I0120 20:51:26.947300 26166 authenticator.cpp:414] Starting authentication session for crammd5-authenticatee(1119)@172.17.0.3:38943
I0120 20:51:26.947371 26168 authenticator.cpp:98] Creating new server SASL connection
I0120 20:51:26.947409 26170 state.cpp:60] Recovering state from '/tmp/ContentType_AgentAPITest_LaunchNestedContainerSession_1_fDrhqu/meta'
I0120 20:51:26.947633 26173 authenticatee.cpp:213] Received SASL authentication mechanisms: CRAM-MD5
I0120 20:51:26.947662 26173 authenticatee.cpp:239] Attempting to authenticate with mechanism 'CRAM-MD5'
I0120 20:51:26.947671 26168 status_update_manager.cpp:203] Recovering status update manager
I0120 20:51:26.947749 26167 authenticator.cpp:204] Received SASL authentication start
I0120 20:51:26.947787 26171 containerizer.cpp:599] Recovering containerizer
I0120 20:51:26.947809 26167 authenticator.cpp:326] Authentication requires more steps
I0120 20:51:26.947860 26167 authenticatee.cpp:259] Received SASL authentication step
I0120 20:51:26.947950 26174 authenticator.cpp:232] Received SASL authentication step
I0120 20:51:26.948001 26174 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: '9a52de5d9dcd' server FQDN: '9a52de5d9dcd' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false 
I0120 20:51:26.948014 26174 auxprop.cpp:181] Looking up auxiliary property '*userPassword'
I0120 20:51:26.948031 26174 auxprop.cpp:181] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
I0120 20:51:26.948072 26174 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: '9a52de5d9dcd' server FQDN: '9a52de5d9dcd' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true 
I0120 20:51:26.948087 26174 auxprop.cpp:131] Skipping auxiliary property '*userPassword' since SASL_AUXPROP_AUTHZID == true
I0120 20:51:26.948092 26174 auxprop.cpp:131] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
I0120 20:51:26.948106 26174 authenticator.cpp:318] Authentication success
I0120 20:51:26.948158 26167 authenticatee.cpp:299] Authentication success
I0120 20:51:26.948194 26164 master.cpp:6872] Successfully authenticated principal 'test-principal' at scheduler-4d2c5f17-3a15-4724-bffb-0599159fc79f@172.17.0.3:38943
I0120 20:51:26.948288 26161 authenticator.cpp:432] Authentication session cleanup for crammd5-authenticatee(1119)@172.17.0.3:38943
I0120 20:51:26.948405 26164 sched.cpp:513] Successfully authenticated with master master@172.17.0.3:38943
I0120 20:51:26.948421 26164 sched.cpp:836] Sending SUBSCRIBE call to master@172.17.0.3:38943
I0120 20:51:26.948474 26164 sched.cpp:869] Will retry registration in 683.987022ms if necessary
I0120 20:51:26.948523 26161 master.cpp:2707] Received SUBSCRIBE call for framework 'default' at scheduler-4d2c5f17-3a15-4724-bffb-0599159fc79f@172.17.0.3:38943
I0120 20:51:26.948544 26161 master.cpp:2155] Authorizing framework principal 'test-principal' to receive offers for role '*'
I0120 20:51:26.948689 26172 master.cpp:2783] Subscribing framework default with checkpointing disabled and capabilities [  ]
I0120 20:51:26.948720 26167 provisioner.cpp:251] Provisioner recovery complete
I0120 20:51:26.948846 26172 sched.cpp:759] Framework registered with 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000
I0120 20:51:26.948894 26172 sched.cpp:773] Scheduler::registered took 27801ns
I0120 20:51:26.949010 26169 slave.cpp:5422] Finished recovery
I0120 20:51:26.949050 26163 hierarchical.cpp:271] Added framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000
I0120 20:51:26.949074 26163 hierarchical.cpp:1677] No allocations performed
I0120 20:51:26.949084 26163 hierarchical.cpp:1772] No inverse offers to send out!
I0120 20:51:26.949098 26163 hierarchical.cpp:1279] Performed allocation for 0 agents in 33483ns
I0120 20:51:26.949394 26169 slave.cpp:5596] Querying resource estimator for oversubscribable resources
I0120 20:51:26.949549 26168 slave.cpp:929] New master detected at master@172.17.0.3:38943
I0120 20:51:26.949561 26171 status_update_manager.cpp:177] Pausing sending status updates
I0120 20:51:26.949607 26168 slave.cpp:964] Detecting new master
I0120 20:51:26.949692 26168 slave.cpp:5610] Received oversubscribable resources {} from the resource estimator
I0120 20:51:26.953790 26163 slave.cpp:991] Authenticating with master master@172.17.0.3:38943
I0120 20:51:26.953835 26163 slave.cpp:1002] Using default CRAM-MD5 authenticatee
I0120 20:51:26.953943 26167 authenticatee.cpp:121] Creating new client SASL connection
I0120 20:51:26.954089 26163 master.cpp:6842] Authenticating slave(579)@172.17.0.3:38943
I0120 20:51:26.954202 26160 authenticator.cpp:414] Starting authentication session for crammd5-authenticatee(1120)@172.17.0.3:38943
I0120 20:51:26.954330 26167 authenticator.cpp:98] Creating new server SASL connection
I0120 20:51:26.954530 26170 authenticatee.cpp:213] Received SASL authentication mechanisms: CRAM-MD5
I0120 20:51:26.954546 26170 authenticatee.cpp:239] Attempting to authenticate with mechanism 'CRAM-MD5'
I0120 20:51:26.954576 26170 authenticator.cpp:204] Received SASL authentication start
I0120 20:51:26.954615 26170 authenticator.cpp:326] Authentication requires more steps
I0120 20:51:26.954668 26164 authenticatee.cpp:259] Received SASL authentication step
I0120 20:51:26.954777 26165 authenticator.cpp:232] Received SASL authentication step
I0120 20:51:26.954812 26165 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: '9a52de5d9dcd' server FQDN: '9a52de5d9dcd' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false 
I0120 20:51:26.954830 26165 auxprop.cpp:181] Looking up auxiliary property '*userPassword'
I0120 20:51:26.954854 26165 auxprop.cpp:181] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
I0120 20:51:26.954865 26165 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: '9a52de5d9dcd' server FQDN: '9a52de5d9dcd' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true 
I0120 20:51:26.954874 26165 auxprop.cpp:131] Skipping auxiliary property '*userPassword' since SASL_AUXPROP_AUTHZID == true
I0120 20:51:26.954883 26165 auxprop.cpp:131] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
I0120 20:51:26.954893 26165 authenticator.cpp:318] Authentication success
I0120 20:51:26.954946 26170 authenticatee.cpp:299] Authentication success
I0120 20:51:26.954972 26161 master.cpp:6872] Successfully authenticated principal 'test-principal' at slave(579)@172.17.0.3:38943
I0120 20:51:26.954990 26169 authenticator.cpp:432] Authentication session cleanup for crammd5-authenticatee(1120)@172.17.0.3:38943
I0120 20:51:26.955232 26172 slave.cpp:1086] Successfully authenticated with master master@172.17.0.3:38943
I0120 20:51:26.955298 26172 slave.cpp:1508] Will retry registration in 14.969835ms if necessary
I0120 20:51:26.955387 26168 master.cpp:5232] Registering agent at slave(579)@172.17.0.3:38943 (9a52de5d9dcd) with id 2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0
I0120 20:51:26.955548 26165 registrar.cpp:461] Applied 1 operations in 19695ns; attempting to update the registry
I0120 20:51:26.955932 26169 registrar.cpp:506] Successfully updated the registry in 356096ns
I0120 20:51:26.956336 26159 slave.cpp:4286] Received ping from slave-observer(524)@172.17.0.3:38943
I0120 20:51:26.956328 26166 master.cpp:5303] Registered agent 2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0 at slave(579)@172.17.0.3:38943 (9a52de5d9dcd) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
I0120 20:51:26.956439 26159 slave.cpp:1132] Registered with master master@172.17.0.3:38943; given agent ID 2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0
I0120 20:51:26.956434 26164 hierarchical.cpp:478] Added agent 2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0 (9a52de5d9dcd) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (allocated: {})
I0120 20:51:26.956465 26159 fetcher.cpp:90] Clearing fetcher cache
I0120 20:51:26.956569 26161 status_update_manager.cpp:184] Resuming sending status updates
I0120 20:51:26.956748 26164 hierarchical.cpp:1772] No inverse offers to send out!
I0120 20:51:26.956774 26164 hierarchical.cpp:1302] Performed allocation for agent 2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0 in 299624ns
I0120 20:51:26.956792 26159 slave.cpp:1160] Checkpointing SlaveInfo to '/tmp/ContentType_AgentAPITest_LaunchNestedContainerSession_1_fDrhqu/meta/slaves/2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0/slave.info'
I0120 20:51:26.956919 26165 master.cpp:6671] Sending 1 offers to framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000 (default) at scheduler-4d2c5f17-3a15-4724-bffb-0599159fc79f@172.17.0.3:38943
I0120 20:51:26.957151 26159 slave.cpp:1198] Forwarding total oversubscribed resources {}
I0120 20:51:26.957171 26170 sched.cpp:933] Scheduler::resourceOffers took 57116ns
I0120 20:51:26.957227 26159 master.cpp:5710] Received update of agent 2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0 at slave(579)@172.17.0.3:38943 (9a52de5d9dcd) with total oversubscribed resources {}
I0120 20:51:26.957355 26161 hierarchical.cpp:548] Agent 2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0 (9a52de5d9dcd) updated with oversubscribed resources {} (total: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000])
I0120 20:51:26.957440 26161 hierarchical.cpp:1677] No allocations performed
I0120 20:51:26.957458 26161 hierarchical.cpp:1772] No inverse offers to send out!
I0120 20:51:26.957480 26161 hierarchical.cpp:1302] Performed allocation for agent 2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0 in 61947ns
I0120 20:51:26.957870 26174 master.cpp:3661] Processing ACCEPT call for offers: [ 2b435d8e-3792-458e-96ff-0ecff4b4aa54-O0 ] on agent 2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0 at slave(579)@172.17.0.3:38943 (9a52de5d9dcd) for framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000 (default) at scheduler-4d2c5f17-3a15-4724-bffb-0599159fc79f@172.17.0.3:38943
I0120 20:51:26.957916 26174 master.cpp:3249] Authorizing framework principal 'test-principal' to launch task 7e4c1b3d-2297-4d19-8cba-f55ae009201d
I0120 20:51:26.958585 26170 master.cpp:8584] Adding task 7e4c1b3d-2297-4d19-8cba-f55ae009201d with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on agent 2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0 at slave(579)@172.17.0.3:38943 (9a52de5d9dcd)
I0120 20:51:26.958679 26170 master.cpp:4311] Launching task 7e4c1b3d-2297-4d19-8cba-f55ae009201d of framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000 (default) at scheduler-4d2c5f17-3a15-4724-bffb-0599159fc79f@172.17.0.3:38943 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on agent 2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0 at slave(579)@172.17.0.3:38943 (9a52de5d9dcd)
I0120 20:51:26.959043 26162 slave.cpp:1576] Got assigned task '7e4c1b3d-2297-4d19-8cba-f55ae009201d' for framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000
I0120 20:51:26.959344 26162 slave.cpp:1736] Launching task '7e4c1b3d-2297-4d19-8cba-f55ae009201d' for framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000
I0120 20:51:26.959897 26162 paths.cpp:547] Trying to chown '/tmp/ContentType_AgentAPITest_LaunchNestedContainerSession_1_fDrhqu/slaves/2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0/frameworks/2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000/executors/7e4c1b3d-2297-4d19-8cba-f55ae009201d/runs/51822148-9c1d-40e6-b255-d788c3dca267' to user 'mesos'
I0120 20:51:26.960147 26162 slave.cpp:6331] Launching executor '7e4c1b3d-2297-4d19-8cba-f55ae009201d' of framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000 with resources cpus(*):0.1; mem(*):32 in work directory '/tmp/ContentType_AgentAPITest_LaunchNestedContainerSession_1_fDrhqu/slaves/2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0/frameworks/2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000/executors/7e4c1b3d-2297-4d19-8cba-f55ae009201d/runs/51822148-9c1d-40e6-b255-d788c3dca267'
I0120 20:51:26.960433 26173 containerizer.cpp:992] Starting container 51822148-9c1d-40e6-b255-d788c3dca267 for executor '7e4c1b3d-2297-4d19-8cba-f55ae009201d' of framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000
I0120 20:51:26.960469 26162 slave.cpp:2058] Queued task '7e4c1b3d-2297-4d19-8cba-f55ae009201d' for executor '7e4c1b3d-2297-4d19-8cba-f55ae009201d' of framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000
I0120 20:51:26.960530 26162 slave.cpp:882] Successfully attached file '/tmp/ContentType_AgentAPITest_LaunchNestedContainerSession_1_fDrhqu/slaves/2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0/frameworks/2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000/executors/7e4c1b3d-2297-4d19-8cba-f55ae009201d/runs/51822148-9c1d-40e6-b255-d788c3dca267'
I0120 20:51:26.962117 26167 containerizer.cpp:1541] Launching 'mesos-containerizer' with flags '--help="false" --launch_info="{"command":{"arguments":["mesos-executor","--launcher_dir=\/mesos\/build\/src"],"shell":false,"value":"\/mesos\/build\/src\/mesos-executor"},"environment":{"variables":[{"name":"LIBPROCESS_PORT","value":"0"},{"name":"MESOS_AGENT_ENDPOINT","value":"172.17.0.3:38943"},{"name":"MESOS_CHECKPOINT","value":"0"},{"name":"MESOS_DIRECTORY","value":"\/tmp\/ContentType_AgentAPITest_LaunchNestedContainerSession_1_fDrhqu\/slaves\/2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0\/frameworks\/2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000\/executors\/7e4c1b3d-2297-4d19-8cba-f55ae009201d\/runs\/51822148-9c1d-40e6-b255-d788c3dca267"},{"name":"MESOS_EXECUTOR_ID","value":"7e4c1b3d-2297-4d19-8cba-f55ae009201d"},{"name":"MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD","value":"5secs"},{"name":"MESOS_FRAMEWORK_ID","value":"2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000"},{"name":"MESOS_HTTP_COMMAND_EXECUTOR","value":"0"},{"name":"MESOS_SLAVE_ID","value":"2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0"},{"name":"MESOS_SLAVE_PID","value":"slave(579)@172.17.0.3:38943"},{"name":"MESOS_SANDBOX","value":"\/tmp\/ContentType_AgentAPITest_LaunchNestedContainerSession_1_fDrhqu\/slaves\/2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0\/frameworks\/2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000\/executors\/7e4c1b3d-2297-4d19-8cba-f55ae009201d\/runs\/51822148-9c1d-40e6-b255-d788c3dca267"}]},"err":{"path":"\/tmp\/ContentType_AgentAPITest_LaunchNestedContainerSession_1_fDrhqu\/slaves\/2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0\/frameworks\/2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000\/executors\/7e4c1b3d-2297-4d19-8cba-f55ae009201d\/runs\/51822148-9c1d-40e6-b255-d788c3dca267\/stderr","type":"PATH"},"out":{"path":"\/tmp\/ContentType_AgentAPITest_LaunchNestedContainerSession_1_fDrhqu\/slaves\/2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0\/frameworks\/2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000\/executors\/7e4c1b3d-2297-4d19-8cba-f55ae009201d\/
runs\/51822148-9c1d-40e6-b255-d788c3dca267\/stdout","type":"PATH"},"user":"mesos","working_directory":"\/tmp\/ContentType_AgentAPITest_LaunchNestedContainerSession_1_fDrhqu\/slaves\/2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0\/frameworks\/2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000\/executors\/7e4c1b3d-2297-4d19-8cba-f55ae009201d\/runs\/51822148-9c1d-40e6-b255-d788c3dca267"}" --pipe_read="9" --pipe_write="10" --runtime_directory="/tmp/ContentType_AgentAPITest_LaunchNestedContainerSession_1_tIgxW4/containers/51822148-9c1d-40e6-b255-d788c3dca267" --unshare_namespace_mnt="false"'
I0120 20:51:26.964438 26167 launcher.cpp:135] Forked child with pid '30032' for container '51822148-9c1d-40e6-b255-d788c3dca267'
I0120 20:51:26.966521 26170 fetcher.cpp:349] Starting to fetch URIs for container: 51822148-9c1d-40e6-b255-d788c3dca267, directory: /tmp/ContentType_AgentAPITest_LaunchNestedContainerSession_1_fDrhqu/slaves/2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0/frameworks/2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000/executors/7e4c1b3d-2297-4d19-8cba-f55ae009201d/runs/51822148-9c1d-40e6-b255-d788c3dca267
I0120 20:51:27.064268 26172 slave.cpp:3325] Got registration for executor '7e4c1b3d-2297-4d19-8cba-f55ae009201d' of framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000 from executor(1)@172.17.0.3:45428
I0120 20:51:27.065098 26159 slave.cpp:2271] Sending queued task '7e4c1b3d-2297-4d19-8cba-f55ae009201d' to executor '7e4c1b3d-2297-4d19-8cba-f55ae009201d' of framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000 at executor(1)@172.17.0.3:45428
I0120 20:51:27.071434 26173 slave.cpp:3756] Handling status update TASK_RUNNING (UUID: fc8ced65-824c-40e5-971c-c5bb8333c49a) for task 7e4c1b3d-2297-4d19-8cba-f55ae009201d of framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000 from executor(1)@172.17.0.3:45428
I0120 20:51:27.071974 26167 status_update_manager.cpp:323] Received status update TASK_RUNNING (UUID: fc8ced65-824c-40e5-971c-c5bb8333c49a) for task 7e4c1b3d-2297-4d19-8cba-f55ae009201d of framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000
I0120 20:51:27.072001 26167 status_update_manager.cpp:500] Creating StatusUpdate stream for task 7e4c1b3d-2297-4d19-8cba-f55ae009201d of framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000
I0120 20:51:27.072142 26167 status_update_manager.cpp:377] Forwarding update TASK_RUNNING (UUID: fc8ced65-824c-40e5-971c-c5bb8333c49a) for task 7e4c1b3d-2297-4d19-8cba-f55ae009201d of framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000 to the agent
I0120 20:51:27.072268 26160 slave.cpp:4196] Forwarding the update TASK_RUNNING (UUID: fc8ced65-824c-40e5-971c-c5bb8333c49a) for task 7e4c1b3d-2297-4d19-8cba-f55ae009201d of framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000 to master@172.17.0.3:38943
I0120 20:51:27.072345 26160 slave.cpp:4090] Status update manager successfully handled status update TASK_RUNNING (UUID: fc8ced65-824c-40e5-971c-c5bb8333c49a) for task 7e4c1b3d-2297-4d19-8cba-f55ae009201d of framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000
I0120 20:51:27.072368 26160 slave.cpp:4106] Sending acknowledgement for status update TASK_RUNNING (UUID: fc8ced65-824c-40e5-971c-c5bb8333c49a) for task 7e4c1b3d-2297-4d19-8cba-f55ae009201d of framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000 to executor(1)@172.17.0.3:45428
I0120 20:51:27.072432 26168 master.cpp:5855] Status update TASK_RUNNING (UUID: fc8ced65-824c-40e5-971c-c5bb8333c49a) for task 7e4c1b3d-2297-4d19-8cba-f55ae009201d of framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000 from agent 2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0 at slave(579)@172.17.0.3:38943 (9a52de5d9dcd)
I0120 20:51:27.072468 26168 master.cpp:5917] Forwarding status update TASK_RUNNING (UUID: fc8ced65-824c-40e5-971c-c5bb8333c49a) for task 7e4c1b3d-2297-4d19-8cba-f55ae009201d of framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000
I0120 20:51:27.072549 26168 master.cpp:7956] Updating the state of task 7e4c1b3d-2297-4d19-8cba-f55ae009201d of framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000 (latest state: TASK_RUNNING, status update state: TASK_RUNNING)
I0120 20:51:27.072660 26169 sched.cpp:1041] Scheduler::statusUpdate took 57036ns
I0120 20:51:27.072844 26168 master.cpp:4948] Processing ACKNOWLEDGE call fc8ced65-824c-40e5-971c-c5bb8333c49a for task 7e4c1b3d-2297-4d19-8cba-f55ae009201d of framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000 (default) at scheduler-4d2c5f17-3a15-4724-bffb-0599159fc79f@172.17.0.3:38943 on agent 2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0
I0120 20:51:27.073006 26169 status_update_manager.cpp:395] Received status update acknowledgement (UUID: fc8ced65-824c-40e5-971c-c5bb8333c49a) for task 7e4c1b3d-2297-4d19-8cba-f55ae009201d of framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000
I0120 20:51:27.073171 26160 slave.cpp:3045] Status update manager successfully handled status update acknowledgement (UUID: fc8ced65-824c-40e5-971c-c5bb8333c49a) for task 7e4c1b3d-2297-4d19-8cba-f55ae009201d of framework 2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000
I0120 20:51:27.073739 26161 process.cpp:3697] Handling HTTP event for process 'slave(579)' with path: '/slave(579)/api/v1'
I0120 20:51:27.074090 26172 http.cpp:303] HTTP POST for /slave(579)/api/v1 from 172.17.0.3:60484
I0120 20:51:27.074259 26172 http.cpp:464] Processing call LAUNCH_NESTED_CONTAINER_SESSION
I0120 20:51:27.074430 26172 containerizer.cpp:1808] Starting nested container 51822148-9c1d-40e6-b255-d788c3dca267.e7902473-84f1-4c46-a2fb-dc3b91133c0e
I0120 20:51:27.074621 26172 containerizer.cpp:1832] Trying to chown '/tmp/ContentType_AgentAPITest_LaunchNestedContainerSession_1_fDrhqu/slaves/2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0/frameworks/2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000/executors/7e4c1b3d-2297-4d19-8cba-f55ae009201d/runs/51822148-9c1d-40e6-b255-d788c3dca267/containers/e7902473-84f1-4c46-a2fb-dc3b91133c0e' to user 'mesos'
I0120 20:51:27.075654 26166 switchboard.cpp:571] Launching 'mesos-io-switchboard' with flags '--heartbeat_interval="30secs" --help="false" --socket_address="/tmp/mesos-io-switchboard-646d928f-4493-4aa3-a337-7b12fa3c5ecf" --stderr_from_fd="17" --stderr_to_fd="2" --stdin_to_fd="14" --stdout_from_fd="15" --stdout_to_fd="1" --tty="false" --wait_for_connection="true"' for container 51822148-9c1d-40e6-b255-d788c3dca267.e7902473-84f1-4c46-a2fb-dc3b91133c0e
I0120 20:51:27.077797 26166 switchboard.cpp:601] Created I/O switchboard server (pid: 30052) listening on socket file '/tmp/mesos-io-switchboard-646d928f-4493-4aa3-a337-7b12fa3c5ecf' for container 51822148-9c1d-40e6-b255-d788c3dca267.e7902473-84f1-4c46-a2fb-dc3b91133c0e
I0120 20:51:27.078768 26160 containerizer.cpp:1541] Launching 'mesos-containerizer' with flags '--help="false" --launch_info="{"command":{"shell":true,"value":"printf output && printf error 1>&2"},"environment":{},"err":{"fd":18,"type":"FD"},"in":{"fd":13,"type":"FD"},"out":{"fd":16,"type":"FD"},"user":"mesos"}" --pipe_read="14" --pipe_write="15" --runtime_directory="/tmp/ContentType_AgentAPITest_LaunchNestedContainerSession_1_tIgxW4/containers/51822148-9c1d-40e6-b255-d788c3dca267/containers/e7902473-84f1-4c46-a2fb-dc3b91133c0e" --unshare_namespace_mnt="false"'
I0120 20:51:27.080716 26160 launcher.cpp:135] Forked child with pid '30053' for container '51822148-9c1d-40e6-b255-d788c3dca267.e7902473-84f1-4c46-a2fb-dc3b91133c0e'
I0120 20:51:27.082023 26169 fetcher.cpp:349] Starting to fetch URIs for container: 51822148-9c1d-40e6-b255-d788c3dca267.e7902473-84f1-4c46-a2fb-dc3b91133c0e, directory: /tmp/ContentType_AgentAPITest_LaunchNestedContainerSession_1_fDrhqu/slaves/2b435d8e-3792-458e-96ff-0ecff4b4aa54-S0/frameworks/2b435d8e-3792-458e-96ff-0ecff4b4aa54-0000/executors/7e4c1b3d-2297-4d19-8cba-f55ae009201d/runs/51822148-9c1d-40e6-b255-d788c3dca267/containers/e7902473-84f1-4c46-a2fb-dc3b91133c0e
I0120 20:51:27.128425 26170 containerizer.cpp:2482] Container 51822148-9c1d-40e6-b255-d788c3dca267.e7902473-84f1-4c46-a2fb-dc3b91133c0e has exited
I0120 20:51:27.128443 26170 containerizer.cpp:2119] Destroying container 51822148-9c1d-40e6-b255-d788c3dca267.e7902473-84f1-4c46-a2fb-dc3b91133c0e in RUNNING state
I0120 20:51:27.128509 26170 launcher.cpp:151] Asked to destroy container 51822148-9c1d-40e6-b255-d788c3dca267.e7902473-84f1-4c46-a2fb-dc3b91133c0e
I0120 20:51:27.134526 26163 process.cpp:3754] Failed to process request for '/slave(579)/api/v1': Container has or is being destroyed
I0120 20:51:27.134551 26169 process.cpp:1456] Returning '500 Internal Server Error' for '/slave(579)/api/v1' (Container has or is being destroyed)
W0120 20:51:27.134552 26174 http.cpp:2673] Failed to attach to nested container 51822148-9c1d-40e6-b255-d788c3dca267.e7902473-84f1-4c46-a2fb-dc3b91133c0e: Container has or is being destroyed
/mesos/src/tests/api_tests.cpp:4098: Failure
Value of: (response).get().status
  Actual: "500 Internal Server Error"
Expected: http::OK().status
Which is: "200 OK"
/mesos/src/tests/api_tests.cpp:4099: Failure
Value of: response->headers.at("Content-Type")
  Actual: "text/plain; charset=utf-8"
Expected: stringify(contentType)
Which is: "application/json"
{code}
https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/3170/changes
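The log sequence points at a race: the nested container's command ({{printf output && printf error 1>&2}}) exits almost immediately, so by the time the agent processes the attach for LAUNCH_NESTED_CONTAINER_SESSION the container is already being destroyed, and the agent answers 500 instead of 200. A toy POSIX sketch of that timing (plain fork/waitpid, not Mesos code; the {{attach}} helper is hypothetical):

{code:cpp}
#include <cassert>
#include <csignal>
#include <string>
#include <sys/wait.h>
#include <unistd.h>

// Stand-in for the agent's attach handler: once the child ("container")
// has exited, attaching fails, mirroring the
// "Container has or is being destroyed" error in the log above.
std::string attach(pid_t pid)
{
  int status;
  if (waitpid(pid, &status, WNOHANG) == pid) {
    return "500 Internal Server Error";
  }
  return "200 OK";
}

int main()
{
  // Case 1: short-lived child, like `printf output && printf error 1>&2`.
  // After a small delay the attach reliably loses the race.
  pid_t fast = fork();
  if (fast == 0) { _exit(0); }
  sleep(1); // Give the child time to exit, as happens in the flaky runs.
  assert(attach(fast) == "500 Internal Server Error");

  // Case 2: long-lived child. The attach wins deterministically.
  pid_t slow = fork();
  if (slow == 0) { pause(); _exit(0); } // Block until signaled.
  assert(attach(slow) == "200 OK");
  kill(slow, SIGKILL);
  waitpid(slow, nullptr, 0);

  return 0;
}
{code}

If this is the cause, deflaking would mean either keeping the nested container alive until the session is attached, or having the test tolerate the 500 when the command races to completion.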

> AgentAPITest.LaunchNestedContainerSession is flaky
> --------------------------------------------------
>
>                 Key: MESOS-6948
>                 URL: https://issues.apache.org/jira/browse/MESOS-6948
>             Project: Mesos
>          Issue Type: Bug
>          Components: tests
>         Environment: CentOS 7 VM, libevent and SSL enabled
>            Reporter: Greg Mann
>            Assignee: Kevin Klues
>            Priority: Critical
>              Labels: debugging, tests
>         Attachments: AgentAPITest.LaunchNestedContainerSession.txt
>
>
> This was observed in a CentOS 7 VM, with libevent and SSL enabled:
> {code}
> I0118 22:17:23.528846  2887 http.cpp:464] Processing call LAUNCH_NESTED_CONTAINER_SESSION
> I0118 22:17:23.530452  2887 containerizer.cpp:1807] Starting nested container 492a5d0a-0060-416c-ad80-dd0441f558dc.62c170bb-7298-4209-b797-80d7ca73353e
> I0118 22:17:23.532265  2887 containerizer.cpp:1831] Trying to chown '/tmp/ContentType_AgentAPITest_LaunchNestedContainerSession_0_ykIax9/slaves/707fd1a2-1a93-4e9f-a9b2-5453a207b4c5-S0/frameworks/707fd1a2-1a93-4e9f-a9b2-5453a207b4c5-0000/executors/14a26e2a-58b7-4166-909c-c90787d84fcb/runs/492a5d0a-0060-416c-ad80-dd0441f558dc/containers/62c170bb-7298-4209-b797-80d7ca73353e' to user 'vagrant'
> I0118 22:17:23.535213  2887 switchboard.cpp:570] Launching 'mesos-io-switchboard' with flags '--heartbeat_interval="30secs" --help="false" --socket_address="/tmp/mesos-io-switchboard-5a08fbd5-0d70-411e-8389-ac115a5f6430" --stderr_from_fd="15" --stderr_to_fd="2" --stdin_to_fd="12" --stdout_from_fd="13" --stdout_to_fd="1" --tty="false" --wait_for_connection="true"' for container 492a5d0a-0060-416c-ad80-dd0441f558dc.62c170bb-7298-4209-b797-80d7ca73353e
> I0118 22:17:23.537210  2887 switchboard.cpp:600] Created I/O switchboard server (pid: 3335) listening on socket file '/tmp/mesos-io-switchboard-5a08fbd5-0d70-411e-8389-ac115a5f6430' for container 492a5d0a-0060-416c-ad80-dd0441f558dc.62c170bb-7298-4209-b797-80d7ca73353e
> I0118 22:17:23.543665  2887 containerizer.cpp:1540] Launching 'mesos-containerizer' with flags '--help="false" --launch_info="{"command":{"shell":true,"value":"printf output && printf error 1>&2"},"environment":{},"err":{"fd":16,"type":"FD"},"in":{"fd":11,"type":"FD"},"out":{"fd":14,"type":"FD"},"user":"vagrant"}" --pipe_read="12" --pipe_write="13" --runtime_directory="/tmp/ContentType_AgentAPITest_LaunchNestedContainerSession_0_QVZGrY/containers/492a5d0a-0060-416c-ad80-dd0441f558dc/containers/62c170bb-7298-4209-b797-80d7ca73353e" --unshare_namespace_mnt="false"'
> I0118 22:17:23.556032  2887 launcher.cpp:133] Forked child with pid '3337' for container '492a5d0a-0060-416c-ad80-dd0441f558dc.62c170bb-7298-4209-b797-80d7ca73353e'
> I0118 22:17:23.563900  2887 fetcher.cpp:349] Starting to fetch URIs for container: 492a5d0a-0060-416c-ad80-dd0441f558dc.62c170bb-7298-4209-b797-80d7ca73353e, directory: /tmp/ContentType_AgentAPITest_LaunchNestedContainerSession_0_ykIax9/slaves/707fd1a2-1a93-4e9f-a9b2-5453a207b4c5-S0/frameworks/707fd1a2-1a93-4e9f-a9b2-5453a207b4c5-0000/executors/14a26e2a-58b7-4166-909c-c90787d84fcb/runs/492a5d0a-0060-416c-ad80-dd0441f558dc/containers/62c170bb-7298-4209-b797-80d7ca73353e
> I0118 22:17:23.962441  2887 containerizer.cpp:2481] Container 492a5d0a-0060-416c-ad80-dd0441f558dc.62c170bb-7298-4209-b797-80d7ca73353e has exited
> I0118 22:17:23.962484  2887 containerizer.cpp:2118] Destroying container 492a5d0a-0060-416c-ad80-dd0441f558dc.62c170bb-7298-4209-b797-80d7ca73353e in RUNNING state
> I0118 22:17:23.962715  2887 launcher.cpp:149] Asked to destroy container 492a5d0a-0060-416c-ad80-dd0441f558dc.62c170bb-7298-4209-b797-80d7ca73353e
> I0118 22:17:23.977562  2887 process.cpp:3733] Failed to process request for '/slave(69)/api/v1': Container has or is being destroyed
> W0118 22:17:23.978216  2887 http.cpp:2734] Failed to attach to nested container 492a5d0a-0060-416c-ad80-dd0441f558dc.62c170bb-7298-4209-b797-80d7ca73353e: Container has or is being destroyed
> I0118 22:17:23.978330  2887 process.cpp:1435] Returning '500 Internal Server Error' for '/slave(69)/api/v1' (Container has or is being destroyed)
> ../../src/tests/api_tests.cpp:3960: Failure
> Value of: (response).get().status
>   Actual: "500 Internal Server Error"
> Expected: http::OK().status
> Which is: "200 OK"
> {code}
> Find attached the full log from a failed run.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)