You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@mesos.apache.org by "Jan Schlicht (JIRA)" <ji...@apache.org> on 2015/12/01 17:15:11 UTC

[jira] [Comment Edited] (MESOS-4032) SlaveRecoveryTest/0.Reboot fails under CentOS 7.1 with libevent & SSL enabled.

    [ https://issues.apache.org/jira/browse/MESOS-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15033936#comment-15033936 ] 

Jan Schlicht edited comment on MESOS-4032 at 12/1/15 4:14 PM:
--------------------------------------------------------------

The tests work fine if Mesos is compiled with libev, without SSL.
Running the test in isolation also fails.

Verbose output:
{noformat}
[ RUN      ] SlaveRecoveryTest/0.Reboot
I1201 16:13:43.764530 30105 cgroups.cpp:2429] Freezing cgroup /sys/fs/cgroup/freezer/mesos_test_4ea77e5a-030e-468d-aa54-6cf580143b86
I1201 16:13:43.955772 30100 cgroups.cpp:1411] Successfully froze cgroup /sys/fs/cgroup/freezer/mesos_test_4ea77e5a-030e-468d-aa54-6cf580143b86 after 190.95296ms
I1201 16:13:44.151808 30106 cgroups.cpp:2447] Thawing cgroup /sys/fs/cgroup/freezer/mesos_test_4ea77e5a-030e-468d-aa54-6cf580143b86
I1201 16:13:44.338899 30103 cgroups.cpp:1440] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos_test_4ea77e5a-030e-468d-aa54-6cf580143b86 after 186.987008ms
I1201 16:13:46.429718 30085 leveldb.cpp:176] Opened db in 6.794189ms
I1201 16:13:46.431185 30085 leveldb.cpp:183] Compacted db in 1.403926ms
I1201 16:13:46.431273 30085 leveldb.cpp:198] Created db iterator in 55789ns
I1201 16:13:46.431289 30085 leveldb.cpp:204] Seeked to beginning of db in 3775ns
I1201 16:13:46.431293 30085 leveldb.cpp:273] Iterated through 0 keys in the db in 120ns
I1201 16:13:46.431409 30085 replica.cpp:780] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
I1201 16:13:46.432781 30104 recover.cpp:449] Starting replica recovery
I1201 16:13:46.433365 30104 recover.cpp:475] Replica is in EMPTY status
I1201 16:13:46.438645 30104 replica.cpp:676] Replica in EMPTY status received a broadcasted recover request from (9)@127.0.0.1:52014
I1201 16:13:46.439353 30099 master.cpp:367] Master 0c54b5bb-d0f8-4c94-8f2a-c49672419e62 (centos71) started on 127.0.0.1:52014
I1201 16:13:46.439602 30100 recover.cpp:195] Received a recover response from a replica in EMPTY status
I1201 16:13:46.439393 30099 master.cpp:369] Flags at startup: --acls="" --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" --authorizers="local" --credentials="/tmp/qZBjUp/credentials" --framework_sorter="drf" --help="false" --hostname_lookup="true" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" --quiet="false" --recovery_slave_removal_limit="100%" --registry="replicated_log" --registry_fetch_timeout="1mins" --registry_store_timeout="25secs" --registry_strict="true" --root_submissions="true" --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/qZBjUp/master" --zk_session_timeout="10secs"
W1201 16:13:46.439997 30099 master.cpp:372]
**************************************************
Master bound to loopback interface! Cannot communicate with remote schedulers or slaves. You might want to set '--ip' flag to a routable IP address.
**************************************************
I1201 16:13:46.440037 30099 master.cpp:414] Master only allowing authenticated frameworks to register
I1201 16:13:46.440042 30099 master.cpp:419] Master only allowing authenticated slaves to register
I1201 16:13:46.440047 30099 credentials.hpp:37] Loading credentials for authentication from '/tmp/qZBjUp/credentials'
I1201 16:13:46.440315 30106 recover.cpp:566] Updating replica status to STARTING
I1201 16:13:46.440580 30099 master.cpp:458] Using default 'crammd5' authenticator
I1201 16:13:46.440743 30099 authenticator.cpp:520] Initializing server SASL
I1201 16:13:46.442067 30099 master.cpp:495] Authorization enabled
I1201 16:13:46.447201 30099 master.cpp:1606] The newly elected leader is master@127.0.0.1:52014 with id 0c54b5bb-d0f8-4c94-8f2a-c49672419e62
I1201 16:13:46.447230 30099 master.cpp:1619] Elected as the leading master!
I1201 16:13:46.447255 30099 master.cpp:1379] Recovering from registrar
I1201 16:13:46.447590 30099 registrar.cpp:309] Recovering registrar
I1201 16:13:46.451647 30100 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 10.746719ms
I1201 16:13:46.451686 30100 replica.cpp:323] Persisted replica status to STARTING
I1201 16:13:46.451942 30106 recover.cpp:475] Replica is in STARTING status
I1201 16:13:46.452819 30100 replica.cpp:676] Replica in STARTING status received a broadcasted recover request from (10)@127.0.0.1:52014
I1201 16:13:46.453064 30105 recover.cpp:195] Received a recover response from a replica in STARTING status
I1201 16:13:46.453727 30104 recover.cpp:566] Updating replica status to VOTING
I1201 16:13:46.454529 30105 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 720044ns
I1201 16:13:46.454548 30105 replica.cpp:323] Persisted replica status to VOTING
I1201 16:13:46.454638 30105 recover.cpp:580] Successfully joined the Paxos group
I1201 16:13:46.454787 30105 recover.cpp:464] Recover process terminated
I1201 16:13:46.455405 30105 log.cpp:661] Attempting to start the writer
I1201 16:13:46.457233 30106 replica.cpp:496] Replica received implicit promise request from (11)@127.0.0.1:52014 with proposal 1
I1201 16:13:46.458009 30106 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 744722ns
I1201 16:13:46.458032 30106 replica.cpp:345] Persisted promised to 1
I1201 16:13:46.458905 30099 coordinator.cpp:240] Coordinator attempting to fill missing positions
I1201 16:13:46.460537 30104 replica.cpp:391] Replica received explicit promise request from (12)@127.0.0.1:52014 for position 0 with proposal 2
I1201 16:13:46.461196 30104 leveldb.cpp:343] Persisting action (8 bytes) to leveldb took 615463ns
I1201 16:13:46.461220 30104 replica.cpp:715] Persisted action at 0
I1201 16:13:46.462738 30103 replica.cpp:540] Replica received write request for position 0 from (13)@127.0.0.1:52014
I1201 16:13:46.462808 30103 leveldb.cpp:438] Reading position from leveldb took 44123ns
I1201 16:13:46.463500 30103 leveldb.cpp:343] Persisting action (14 bytes) to leveldb took 634612ns
I1201 16:13:46.463520 30103 replica.cpp:715] Persisted action at 0
I1201 16:13:46.464489 30106 replica.cpp:694] Replica received learned notice for position 0 from @0.0.0.0:0
I1201 16:13:46.465271 30106 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 750397ns
I1201 16:13:46.465294 30106 replica.cpp:715] Persisted action at 0
I1201 16:13:46.465309 30106 replica.cpp:700] Replica learned NOP action at position 0
I1201 16:13:46.466377 30102 log.cpp:677] Writer started with ending position 0
I1201 16:13:46.468979 30104 leveldb.cpp:438] Reading position from leveldb took 69844ns
I1201 16:13:46.471437 30100 registrar.cpp:342] Successfully fetched the registry (0B) in 23.800064ms
I1201 16:13:46.471635 30100 registrar.cpp:441] Applied 1 operations in 63798ns; attempting to update the 'registry'
I1201 16:13:46.474053 30100 log.cpp:685] Attempting to append 158 bytes to the log
I1201 16:13:46.474249 30100 coordinator.cpp:350] Coordinator attempting to write APPEND action at position 1
I1201 16:13:46.475517 30100 replica.cpp:540] Replica received write request for position 1 from (14)@127.0.0.1:52014
I1201 16:13:46.476613 30100 leveldb.cpp:343] Persisting action (177 bytes) to leveldb took 1.056031ms
I1201 16:13:46.476636 30100 replica.cpp:715] Persisted action at 1
I1201 16:13:46.477459 30099 replica.cpp:694] Replica received learned notice for position 1 from @0.0.0.0:0
I1201 16:13:46.478224 30099 leveldb.cpp:343] Persisting action (179 bytes) to leveldb took 731548ns
I1201 16:13:46.478245 30099 replica.cpp:715] Persisted action at 1
I1201 16:13:46.478258 30099 replica.cpp:700] Replica learned APPEND action at position 1
I1201 16:13:46.479604 30103 registrar.cpp:486] Successfully updated the 'registry' in 7.875072ms
I1201 16:13:46.479717 30103 registrar.cpp:372] Successfully recovered registrar
I1201 16:13:46.479877 30101 log.cpp:704] Attempting to truncate the log to 1
I1201 16:13:46.480336 30099 coordinator.cpp:350] Coordinator attempting to write TRUNCATE action at position 2
I1201 16:13:46.480451 30100 master.cpp:1416] Recovered 0 slaves from the Registry (120B) ; allowing 10mins for slaves to re-register
I1201 16:13:46.481528 30105 replica.cpp:540] Replica received write request for position 2 from (15)@127.0.0.1:52014
I1201 16:13:46.482573 30105 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 1.016896ms
I1201 16:13:46.482623 30105 replica.cpp:715] Persisted action at 2
I1201 16:13:46.483778 30104 replica.cpp:694] Replica received learned notice for position 2 from @0.0.0.0:0
I1201 16:13:46.485167 30104 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 1.362734ms
I1201 16:13:46.485237 30104 leveldb.cpp:401] Deleting ~1 keys from leveldb took 41835ns
I1201 16:13:46.485252 30104 replica.cpp:715] Persisted action at 2
I1201 16:13:46.485272 30104 replica.cpp:700] Replica learned TRUNCATE action at position 2
I1201 16:13:46.494474 30085 containerizer.cpp:142] Using isolation: cgroups/cpu,cgroups/mem,filesystem/posix
I1201 16:13:49.671596 30085 linux_launcher.cpp:103] Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
I1201 16:13:49.676256 30085 systemd.cpp:128] systemd version `208` detected
W1201 16:13:49.676290 30085 systemd.cpp:136] Required functionality `Delegate` was introduced in Version `218`. Your system may not function properly; however since some distributions have patched systemd packages, your system may still be functional. This is why we keep running. See MESOS-3352 for more information
I1201 16:13:49.682055 30085 systemd.cpp:210] Started systemd slice `mesos_executors.slice`
I1201 16:13:49.752518 30106 slave.cpp:191] Slave started on 1)@127.0.0.1:52014
I1201 16:13:49.752552 30106 slave.cpp:192] Flags at startup: --appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos_test_fe0ffaac-0ed6-4cac-939b-d5d65d306e0a" --container_disk_watch_interval="15secs" --containerizers="mesos" --credential="/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/credential" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_auth_server="auth.docker.io" --docker_auth_server_port="443" --docker_kill_orphans="true" --docker_local_archives_dir="/tmp/mesos/images/docker" --docker_puller="local" --docker_puller_timeout="60" --docker_registry="registry-1.docker.io" --docker_registry_port="443" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker" --enforce_container_disk_quota="false" --executor_registration_timeout="1mins" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname_lookup="true" --image_provisioner_backend="copy" --initialize_driver_logging="true" --isolation="cgroups/cpu,cgroups/mem" --launcher_dir="/home/vagrant/mesos/build-ssl/src" --logbufsecs="0" --logging_level="INFO" --oversubscribed_resources_interval="15secs" --perf_duration="10secs" --perf_interval="1mins" --qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="10ms" --resources="cpus:2;mem:1024;disk:1024;ports:[31000-32000]" --revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox" --slave_subsystems="memory,cpuacct" --strict="false" --switch_user="true" --systemd_runtime_directory="/run/systemd/system" --version="false" --work_dir="/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5"
W1201 16:13:49.752852 30106 slave.cpp:195]
**************************************************
Slave bound to loopback interface! Cannot communicate with remote master(s). You might want to set '--ip' flag to a routable IP address.
**************************************************
I1201 16:13:49.752867 30106 slave.cpp:212] Moving slave process into its own cgroup for subsystem: memory
W1201 16:13:49.753592 30085 sched.cpp:1542]
**************************************************
Scheduler driver bound to loopback interface! Cannot communicate with remote master(s). You might want to set 'LIBPROCESS_IP' environment variable to use a routable IP address.
**************************************************
I1201 16:13:49.755309 30085 sched.cpp:166] Version: 0.26.0
I1201 16:13:49.756275 30105 sched.cpp:264] New master detected at master@127.0.0.1:52014
I1201 16:13:49.756366 30105 sched.cpp:320] Authenticating with master master@127.0.0.1:52014
I1201 16:13:49.756381 30105 sched.cpp:327] Using default CRAM-MD5 authenticatee
I1201 16:13:49.756722 30101 authenticatee.cpp:99] Initializing client SASL
I1201 16:13:49.756868 30101 authenticatee.cpp:123] Creating new client SASL connection
I1201 16:13:49.757745 30105 master.cpp:5150] Authenticating scheduler-0497a7d6-a1f8-4d9e-908f-447e387b8804@127.0.0.1:52014
I1201 16:13:49.758275 30105 authenticator.cpp:100] Creating new server SASL connection
I1201 16:13:49.758677 30101 authenticatee.cpp:214] Received SASL authentication mechanisms: CRAM-MD5
I1201 16:13:49.758704 30101 authenticatee.cpp:240] Attempting to authenticate with mechanism 'CRAM-MD5'
I1201 16:13:49.758795 30101 authenticator.cpp:205] Received SASL authentication start
I1201 16:13:49.758844 30101 authenticator.cpp:327] Authentication requires more steps
I1201 16:13:49.758931 30101 authenticatee.cpp:260] Received SASL authentication step
I1201 16:13:49.759091 30100 authenticator.cpp:233] Received SASL authentication step
I1201 16:13:49.759285 30100 authenticator.cpp:319] Authentication success
I1201 16:13:49.759389 30105 authenticatee.cpp:300] Authentication success
I1201 16:13:49.759475 30105 master.cpp:5180] Successfully authenticated principal 'test-principal' at scheduler-0497a7d6-a1f8-4d9e-908f-447e387b8804@127.0.0.1:52014
I1201 16:13:49.759702 30101 sched.cpp:409] Successfully authenticated with master master@127.0.0.1:52014
I1201 16:13:49.760151 30104 master.cpp:2176] Received SUBSCRIBE call for framework 'default' at scheduler-0497a7d6-a1f8-4d9e-908f-447e387b8804@127.0.0.1:52014
I1201 16:13:49.760252 30104 master.cpp:1645] Authorizing framework principal 'test-principal' to receive offers for role '*'
I1201 16:13:49.760586 30103 master.cpp:2247] Subscribing framework default with checkpointing enabled and capabilities [  ]
I1201 16:13:49.761400 30104 hierarchical.cpp:195] Added framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000
I1201 16:13:49.761494 30103 sched.cpp:643] Framework registered with 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000
I1201 16:13:50.846541 30106 slave.cpp:212] Moving slave process into its own cgroup for subsystem: cpuacct
I1201 16:13:51.606369 30106 credentials.hpp:85] Loading credential for authentication from '/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/credential'
I1201 16:13:51.606848 30106 slave.cpp:322] Slave using credential for: test-principal
I1201 16:13:51.607802 30106 slave.cpp:392] Slave resources: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
I1201 16:13:51.607900 30106 slave.cpp:400] Slave attributes: [  ]
I1201 16:13:51.607913 30106 slave.cpp:405] Slave hostname: centos71
I1201 16:13:51.607918 30106 slave.cpp:410] Slave checkpoint: true
I1201 16:13:51.609427 30100 state.cpp:54] Recovering state from '/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/meta'
I1201 16:13:51.609800 30100 status_update_manager.cpp:202] Recovering status update manager
I1201 16:13:51.610244 30104 containerizer.cpp:384] Recovering containerizer
I1201 16:13:51.805718 30102 slave.cpp:4230] Finished recovery
I1201 16:13:51.806789 30105 slave.cpp:729] New master detected at master@127.0.0.1:52014
I1201 16:13:51.806830 30102 status_update_manager.cpp:176] Pausing sending status updates
I1201 16:13:51.806860 30105 slave.cpp:792] Authenticating with master master@127.0.0.1:52014
I1201 16:13:51.806871 30105 slave.cpp:797] Using default CRAM-MD5 authenticatee
I1201 16:13:51.807067 30099 authenticatee.cpp:123] Creating new client SASL connection
I1201 16:13:51.807073 30105 slave.cpp:765] Detecting new master
I1201 16:13:51.809116 30102 master.cpp:5150] Authenticating slave(1)@127.0.0.1:52014
I1201 16:13:51.809415 30105 authenticator.cpp:100] Creating new server SASL connection
I1201 16:13:51.810292 30099 authenticatee.cpp:214] Received SASL authentication mechanisms: CRAM-MD5
I1201 16:13:51.810312 30099 authenticatee.cpp:240] Attempting to authenticate with mechanism 'CRAM-MD5'
I1201 16:13:51.810420 30102 authenticator.cpp:205] Received SASL authentication start
I1201 16:13:51.810454 30102 authenticator.cpp:327] Authentication requires more steps
I1201 16:13:51.810523 30102 authenticatee.cpp:260] Received SASL authentication step
I1201 16:13:51.810619 30105 authenticator.cpp:233] Received SASL authentication step
I1201 16:13:51.810670 30105 authenticator.cpp:319] Authentication success
I1201 16:13:51.810789 30102 authenticatee.cpp:300] Authentication success
I1201 16:13:51.810870 30103 master.cpp:5180] Successfully authenticated principal 'test-principal' at slave(1)@127.0.0.1:52014
I1201 16:13:51.811317 30102 slave.cpp:860] Successfully authenticated with master master@127.0.0.1:52014
I1201 16:13:51.811728 30105 master.cpp:3859] Registering slave at slave(1)@127.0.0.1:52014 (centos71) with id 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0
I1201 16:13:51.812587 30099 registrar.cpp:441] Applied 1 operations in 67177ns; attempting to update the 'registry'
I1201 16:13:51.814066 30101 log.cpp:685] Attempting to append 324 bytes to the log
I1201 16:13:51.814266 30103 coordinator.cpp:350] Coordinator attempting to write APPEND action at position 3
I1201 16:13:51.815278 30104 replica.cpp:540] Replica received write request for position 3 from (34)@127.0.0.1:52014
I1201 16:13:51.817994 30104 leveldb.cpp:343] Persisting action (343 bytes) to leveldb took 2.682526ms
I1201 16:13:51.818027 30104 replica.cpp:715] Persisted action at 3
I1201 16:13:51.818933 30106 replica.cpp:694] Replica received learned notice for position 3 from @0.0.0.0:0
I1201 16:13:51.819807 30106 leveldb.cpp:343] Persisting action (345 bytes) to leveldb took 845203ns
I1201 16:13:51.819829 30106 replica.cpp:715] Persisted action at 3
I1201 16:13:51.819845 30106 replica.cpp:700] Replica learned APPEND action at position 3
I1201 16:13:51.820930 30102 registrar.cpp:486] Successfully updated the 'registry' in 8.269056ms
I1201 16:13:51.821074 30101 log.cpp:704] Attempting to truncate the log to 3
I1201 16:13:51.821261 30101 coordinator.cpp:350] Coordinator attempting to write TRUNCATE action at position 4
I1201 16:13:51.821889 30101 replica.cpp:540] Replica received write request for position 4 from (35)@127.0.0.1:52014
I1201 16:13:51.821899 30102 master.cpp:3927] Registered slave 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 at slave(1)@127.0.0.1:52014 (centos71) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
I1201 16:13:51.822307 30102 slave.cpp:904] Registered with master master@127.0.0.1:52014; given slave ID 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0
I1201 16:13:51.822427 30103 hierarchical.cpp:344] Added slave 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 (centos71) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (allocated: )
I1201 16:13:51.822471 30100 status_update_manager.cpp:183] Resuming sending status updates
I1201 16:13:51.822811 30102 slave.cpp:963] Forwarding total oversubscribed resources
I1201 16:13:51.822967 30102 master.cpp:4269] Received update of slave 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 at slave(1)@127.0.0.1:52014 (centos71) with total oversubscribed resources
I1201 16:13:51.823230 30105 hierarchical.cpp:400] Slave 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 (centos71) updated with oversubscribed resources  (total: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000])
I1201 16:13:51.823547 30102 master.cpp:4979] Sending 1 offers to framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 (default) at scheduler-0497a7d6-a1f8-4d9e-908f-447e387b8804@127.0.0.1:52014
I1201 16:13:51.824229 30101 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 2.310016ms
I1201 16:13:51.824267 30101 replica.cpp:715] Persisted action at 4
I1201 16:13:51.826622 30099 replica.cpp:694] Replica received learned notice for position 4 from @0.0.0.0:0
I1201 16:13:51.827656 30099 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 1.002838ms
I1201 16:13:51.827719 30099 leveldb.cpp:401] Deleting ~2 keys from leveldb took 34558ns
I1201 16:13:51.827733 30099 replica.cpp:715] Persisted action at 4
I1201 16:13:51.827749 30099 replica.cpp:700] Replica learned TRUNCATE action at position 4
I1201 16:13:51.829840 30101 master.cpp:2915] Processing ACCEPT call for offers: [ 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-O0 ] on slave 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 at slave(1)@127.0.0.1:52014 (centos71) for framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 (default) at scheduler-0497a7d6-a1f8-4d9e-908f-447e387b8804@127.0.0.1:52014
I1201 16:13:51.829892 30101 master.cpp:2711] Authorizing framework principal 'test-principal' to launch task ccd11e20-3250-4007-b026-e665da093cd1 as user 'root'
I1201 16:13:51.831492 30104 master.hpp:176] Adding task ccd11e20-3250-4007-b026-e665da093cd1 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 (centos71)
I1201 16:13:51.831588 30104 master.cpp:3245] Launching task ccd11e20-3250-4007-b026-e665da093cd1 of framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 (default) at scheduler-0497a7d6-a1f8-4d9e-908f-447e387b8804@127.0.0.1:52014 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 at slave(1)@127.0.0.1:52014 (centos71)
I1201 16:13:51.831887 30100 slave.cpp:1294] Got assigned task ccd11e20-3250-4007-b026-e665da093cd1 for framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000
I1201 16:13:51.832974 30100 slave.cpp:1410] Launching task ccd11e20-3250-4007-b026-e665da093cd1 for framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000
I1201 16:13:51.833433 30100 paths.cpp:436] Trying to chown '/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/slaves/0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0/frameworks/0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000/executors/ccd11e20-3250-4007-b026-e665da093cd1/runs/312026ae-212e-4e4c-bac9-84c8bb2930d9' to user 'root'
I1201 16:13:51.837865 30100 slave.cpp:4999] Launching executor ccd11e20-3250-4007-b026-e665da093cd1 of framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 with resources cpus(*):0.1; mem(*):32 in work directory '/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/slaves/0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0/frameworks/0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000/executors/ccd11e20-3250-4007-b026-e665da093cd1/runs/312026ae-212e-4e4c-bac9-84c8bb2930d9'
I1201 16:13:51.838582 30106 containerizer.cpp:618] Starting container '312026ae-212e-4e4c-bac9-84c8bb2930d9' for executor 'ccd11e20-3250-4007-b026-e665da093cd1' of framework '0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000'
I1201 16:13:51.838891 30100 slave.cpp:1628] Queuing task 'ccd11e20-3250-4007-b026-e665da093cd1' for executor 'ccd11e20-3250-4007-b026-e665da093cd1' of framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000
I1201 16:13:52.052389 30106 cpushare.cpp:392] Updated 'cpu.shares' to 2150 (cpus 2.1) for container 312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:52.276257 30104 mem.cpp:605] Started listening for OOM events for container 312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:52.342397 30104 mem.cpp:725] Started listening on low memory pressure events for container 312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:52.409441 30104 mem.cpp:725] Started listening on medium memory pressure events for container 312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:52.472625 30104 mem.cpp:725] Started listening on critical memory pressure events for container 312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:52.533138 30104 mem.cpp:356] Updated 'memory.soft_limit_in_bytes' to 1056MB for container 312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:52.654757 30104 mem.cpp:391] Updated 'memory.limit_in_bytes' to 1056MB for container 312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:52.811933 30103 linux_launcher.cpp:365] Cloning child process with flags =
I1201 16:13:52.935303 30103 linux_launcher.cpp:422] Assigned child process '30121' to 'mesos_executors.slice'
I1201 16:13:52.935662 30103 containerizer.cpp:851] Checkpointing executor's forked pid 30121 to '/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/meta/slaves/0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0/frameworks/0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000/executors/ccd11e20-3250-4007-b026-e665da093cd1/runs/312026ae-212e-4e4c-bac9-84c8bb2930d9/pids/forked.pid'
I1201 16:13:53.059762 30121 exec.cpp:136] Version: 0.26.0
I1201 16:13:53.066401 30102 slave.cpp:2405] Got registration for executor 'ccd11e20-3250-4007-b026-e665da093cd1' of framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 from executor(1)@127.0.0.1:38774
I1201 16:13:53.071707 30148 exec.cpp:210] Executor registered on slave 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0
Registered executor on centos71
I1201 16:13:53.136584 30100 mem.cpp:356] Updated 'memory.soft_limit_in_bytes' to 1056MB for container 312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:53.142530 30101 cpushare.cpp:392] Updated 'cpu.shares' to 2150 (cpus 2.1) for container 312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:53.202687 30104 slave.cpp:1793] Sending queued task 'ccd11e20-3250-4007-b026-e665da093cd1' to executor 'ccd11e20-3250-4007-b026-e665da093cd1' of framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 at executor(1)@127.0.0.1:38774
Starting task ccd11e20-3250-4007-b026-e665da093cd1
sh -c 'sleep 1000'
Forked command at 30157
I1201 16:13:53.208745 30100 slave.cpp:2762] Handling status update TASK_RUNNING (UUID: bc5157f4-d1f5-4f97-93df-55a6e2d17330) for task ccd11e20-3250-4007-b026-e665da093cd1 of framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 from executor(1)@127.0.0.1:38774
I1201 16:13:53.209280 30101 status_update_manager.cpp:322] Received status update TASK_RUNNING (UUID: bc5157f4-d1f5-4f97-93df-55a6e2d17330) for task ccd11e20-3250-4007-b026-e665da093cd1 of framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000
I1201 16:13:53.209790 30101 status_update_manager.cpp:826] Checkpointing UPDATE for status update TASK_RUNNING (UUID: bc5157f4-d1f5-4f97-93df-55a6e2d17330) for task ccd11e20-3250-4007-b026-e665da093cd1 of framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000
I1201 16:13:53.214834 30104 slave.cpp:3087] Forwarding the update TASK_RUNNING (UUID: bc5157f4-d1f5-4f97-93df-55a6e2d17330) for task ccd11e20-3250-4007-b026-e665da093cd1 of framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 to master@127.0.0.1:52014
I1201 16:13:53.215004 30104 slave.cpp:3011] Sending acknowledgement for status update TASK_RUNNING (UUID: bc5157f4-d1f5-4f97-93df-55a6e2d17330) for task ccd11e20-3250-4007-b026-e665da093cd1 of framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 to executor(1)@127.0.0.1:38774
I1201 16:13:53.215369 30099 master.cpp:4414] Status update TASK_RUNNING (UUID: bc5157f4-d1f5-4f97-93df-55a6e2d17330) for task ccd11e20-3250-4007-b026-e665da093cd1 of framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 from slave 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 at slave(1)@127.0.0.1:52014 (centos71)
I1201 16:13:53.215407 30099 master.cpp:4462] Forwarding status update TASK_RUNNING (UUID: bc5157f4-d1f5-4f97-93df-55a6e2d17330) for task ccd11e20-3250-4007-b026-e665da093cd1 of framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000
I1201 16:13:53.215579 30099 master.cpp:6066] Updating the state of task ccd11e20-3250-4007-b026-e665da093cd1 of framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 (latest state: TASK_RUNNING, status update state: TASK_RUNNING)
I1201 16:13:53.215934 30104 master.cpp:3571] Processing ACKNOWLEDGE call bc5157f4-d1f5-4f97-93df-55a6e2d17330 for task ccd11e20-3250-4007-b026-e665da093cd1 of framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 (default) at scheduler-0497a7d6-a1f8-4d9e-908f-447e387b8804@127.0.0.1:52014 on slave 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0
I1201 16:13:53.216327 30104 status_update_manager.cpp:394] Received status update acknowledgement (UUID: bc5157f4-d1f5-4f97-93df-55a6e2d17330) for task ccd11e20-3250-4007-b026-e665da093cd1 of framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000
I1201 16:13:53.216462 30104 status_update_manager.cpp:826] Checkpointing ACK for status update TASK_RUNNING (UUID: bc5157f4-d1f5-4f97-93df-55a6e2d17330) for task ccd11e20-3250-4007-b026-e665da093cd1 of framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000
I1201 16:13:53.220029 30085 slave.cpp:601] Slave terminating
I1201 16:13:53.220582 30103 master.cpp:1083] Slave 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 at slave(1)@127.0.0.1:52014 (centos71) disconnected
I1201 16:13:53.220607 30103 master.cpp:2531] Disconnecting slave 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 at slave(1)@127.0.0.1:52014 (centos71)
I1201 16:13:53.220788 30103 master.cpp:2550] Deactivating slave 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 at slave(1)@127.0.0.1:52014 (centos71)
I1201 16:13:53.221032 30103 hierarchical.cpp:429] Slave 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 deactivated
I1201 16:13:53.471288 30149 exec.cpp:383] Executor asked to shutdown
Shutting down
Sending SIGTERM to process tree at pid 30157
Killing the following process trees:
[
--- 30157 sleep 1000
]
Command terminated with signal Terminated (pid: 30157)
I1201 16:13:54.534430 30085 containerizer.cpp:142] Using isolation: cgroups/cpu,cgroups/mem,filesystem/posix
I1201 16:13:57.248833 30085 linux_launcher.cpp:103] Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
I1201 16:13:57.254952 30085 systemd.cpp:210] Started systemd slice `mesos_executors.slice`
I1201 16:13:57.322489 30104 slave.cpp:191] Slave started on 2)@127.0.0.1:52014
I1201 16:13:57.322525 30104 slave.cpp:192] Flags at startup: --appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos_test_fe0ffaac-0ed6-4cac-939b-d5d65d306e0a" --container_disk_watch_interval="15secs" --containerizers="mesos" --credential="/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/credential" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_auth_server="auth.docker.io" --docker_auth_server_port="443" --docker_kill_orphans="true" --docker_local_archives_dir="/tmp/mesos/images/docker" --docker_puller="local" --docker_puller_timeout="60" --docker_registry="registry-1.docker.io" --docker_registry_port="443" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker" --enforce_container_disk_quota="false" --executor_registration_timeout="1mins" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname_lookup="true" --image_provisioner_backend="copy" --initialize_driver_logging="true" --isolation="cgroups/cpu,cgroups/mem" --launcher_dir="/home/vagrant/mesos/build-ssl/src" --logbufsecs="0" --logging_level="INFO" --oversubscribed_resources_interval="15secs" --perf_duration="10secs" --perf_interval="1mins" --qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="10ms" --resources="cpus:2;mem:1024;disk:1024;ports:[31000-32000]" --revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox" --slave_subsystems="memory,cpuacct" --strict="false" --switch_user="true" --systemd_runtime_directory="/run/systemd/system" --version="false" --work_dir="/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5"
W1201 16:13:57.322767 30104 slave.cpp:195]
**************************************************
Slave bound to loopback interface! Cannot communicate with remote master(s). You might want to set '--ip' flag to a routable IP address.
**************************************************
I1201 16:13:57.322778 30104 slave.cpp:212] Moving slave process into its own cgroup for subsystem: memory
I1201 16:13:58.308562 30104 slave.cpp:212] Moving slave process into its own cgroup for subsystem: cpuacct
I1201 16:13:58.973721 30104 credentials.hpp:85] Loading credential for authentication from '/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/credential'
I1201 16:13:58.973891 30104 slave.cpp:322] Slave using credential for: test-principal
I1201 16:13:58.974534 30104 slave.cpp:392] Slave resources: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
I1201 16:13:58.974582 30104 slave.cpp:400] Slave attributes: [  ]
I1201 16:13:58.974588 30104 slave.cpp:405] Slave hostname: centos71
I1201 16:13:58.974591 30104 slave.cpp:410] Slave checkpoint: true
I1201 16:13:58.975349 30104 state.cpp:54] Recovering state from '/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/meta'
I1201 16:13:58.975401 30104 state.cpp:681] No checkpointed resources found at '/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/meta/resources/resources.info'
I1201 16:13:58.975617 30104 state.cpp:85] Slave host rebooted
I1201 16:13:58.975929 30099 status_update_manager.cpp:202] Recovering status update manager
I1201 16:13:58.976277 30103 containerizer.cpp:384] Recovering containerizer
I1201 16:13:59.167284 30103 containerizer.cpp:522] Removing orphan container 312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:59.358265 30101 cgroups.cpp:2429] Freezing cgroup /sys/fs/cgroup/freezer/mesos_test_fe0ffaac-0ed6-4cac-939b-d5d65d306e0a/312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:59.358578 30103 slave.cpp:4230] Finished recovery
I1201 16:13:59.359745 30103 slave.cpp:4263] Garbage collecting old slave 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0
I1201 16:13:59.359963 30102 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/slaves/0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0' for gc 6.99999583571852days in the future
I1201 16:13:59.360177 30102 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/meta/slaves/0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0' for gc 6.99999583361185days in the future
I1201 16:13:59.360265 30102 status_update_manager.cpp:176] Pausing sending status updates
I1201 16:13:59.360267 30099 slave.cpp:729] New master detected at master@127.0.0.1:52014
I1201 16:13:59.360348 30099 slave.cpp:792] Authenticating with master master@127.0.0.1:52014
I1201 16:13:59.360359 30099 slave.cpp:797] Using default CRAM-MD5 authenticatee
I1201 16:13:59.360474 30099 slave.cpp:765] Detecting new master
I1201 16:13:59.360571 30102 authenticatee.cpp:123] Creating new client SASL connection
I1201 16:13:59.360900 30100 master.cpp:5150] Authenticating slave(2)@127.0.0.1:52014
I1201 16:13:59.361102 30100 authenticator.cpp:100] Creating new server SASL connection
I1201 16:13:59.361421 30105 authenticatee.cpp:214] Received SASL authentication mechanisms: CRAM-MD5
I1201 16:13:59.361444 30105 authenticatee.cpp:240] Attempting to authenticate with mechanism 'CRAM-MD5'
I1201 16:13:59.361508 30105 authenticator.cpp:205] Received SASL authentication start
I1201 16:13:59.361547 30105 authenticator.cpp:327] Authentication requires more steps
I1201 16:13:59.361611 30105 authenticatee.cpp:260] Received SASL authentication step
I1201 16:13:59.361699 30102 authenticator.cpp:233] Received SASL authentication step
I1201 16:13:59.361757 30102 authenticator.cpp:319] Authentication success
I1201 16:13:59.361841 30105 authenticatee.cpp:300] Authentication success
I1201 16:13:59.361862 30102 master.cpp:5180] Successfully authenticated principal 'test-principal' at slave(2)@127.0.0.1:52014
I1201 16:13:59.362381 30105 slave.cpp:860] Successfully authenticated with master master@127.0.0.1:52014
I1201 16:13:59.362884 30104 master.cpp:3859] Registering slave at slave(2)@127.0.0.1:52014 (centos71) with id 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S1
I1201 16:13:59.363746 30105 registrar.cpp:441] Applied 1 operations in 69347ns; attempting to update the 'registry'
I1201 16:13:59.365444 30103 log.cpp:685] Attempting to append 486 bytes to the log
I1201 16:13:59.365583 30106 coordinator.cpp:350] Coordinator attempting to write APPEND action at position 5
I1201 16:13:59.366288 30100 replica.cpp:540] Replica received write request for position 5 from (68)@127.0.0.1:52014
I1201 16:13:59.366833 30100 leveldb.cpp:343] Persisting action (505 bytes) to leveldb took 513939ns
I1201 16:13:59.366853 30100 replica.cpp:715] Persisted action at 5
I1201 16:13:59.367413 30100 replica.cpp:694] Replica received learned notice for position 5 from @0.0.0.0:0
I1201 16:13:59.368077 30100 leveldb.cpp:343] Persisting action (507 bytes) to leveldb took 636735ns
I1201 16:13:59.368098 30100 replica.cpp:715] Persisted action at 5
I1201 16:13:59.368113 30100 replica.cpp:700] Replica learned APPEND action at position 5
I1201 16:13:59.369359 30099 registrar.cpp:486] Successfully updated the 'registry' in 5.55392ms
I1201 16:13:59.369951 30100 log.cpp:704] Attempting to truncate the log to 5
I1201 16:13:59.371310 30104 coordinator.cpp:350] Coordinator attempting to write TRUNCATE action at position 6
I1201 16:13:59.371423 30103 master.cpp:3927] Registered slave 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S1 at slave(2)@127.0.0.1:52014 (centos71) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
I1201 16:13:59.371464 30100 hierarchical.cpp:344] Added slave 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S1 (centos71) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (allocated: )
I1201 16:13:59.371532 30105 slave.cpp:904] Registered with master master@127.0.0.1:52014; given slave ID 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S1
I1201 16:13:59.371687 30103 status_update_manager.cpp:183] Resuming sending status updates
I1201 16:13:59.371963 30104 replica.cpp:540] Replica received write request for position 6 from (69)@127.0.0.1:52014
I1201 16:13:59.372654 30103 master.cpp:4979] Sending 1 offers to framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 (default) at scheduler-0497a7d6-a1f8-4d9e-908f-447e387b8804@127.0.0.1:52014
I1201 16:13:59.373229 30105 slave.cpp:963] Forwarding total oversubscribed resources
I1201 16:13:59.373275 30104 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 1.280817ms
I1201 16:13:59.373320 30104 replica.cpp:715] Persisted action at 6
I1201 16:13:59.373451 30105 master.cpp:4269] Received update of slave 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S1 at slave(2)@127.0.0.1:52014 (centos71) with total oversubscribed resources
I1201 16:13:59.373554 30085 sched.cpp:1805] Asked to stop the driver
I1201 16:13:59.373631 30105 hierarchical.cpp:400] Slave 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S1 (centos71) updated with oversubscribed resources  (total: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000])
I1201 16:13:59.373713 30104 sched.cpp:1043] Stopping framework '0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000'
I1201 16:13:59.373881 30104 replica.cpp:694] Replica received learned notice for position 6 from @0.0.0.0:0
I1201 16:13:59.374068 30085 master.cpp:922] Master terminating
I1201 16:13:59.374452 30099 hierarchical.cpp:373] Removed slave 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S1
W1201 16:13:59.374821 30085 master.cpp:6118] Removing task ccd11e20-3250-4007-b026-e665da093cd1 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] of framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 on slave 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 at slave(1)@127.0.0.1:52014 (centos71) in non-terminal state TASK_RUNNING
I1201 16:13:59.377071 30099 slave.cpp:3215] master@127.0.0.1:52014 exited
W1201 16:13:59.377092 30099 slave.cpp:3218] Master disconnected! Waiting for a new master to be elected
I1201 16:13:59.381366 30104 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 7.454044ms
I1201 16:13:59.381429 30104 leveldb.cpp:401] Deleting ~2 keys from leveldb took 36702ns
I1201 16:13:59.381441 30104 replica.cpp:715] Persisted action at 6
I1201 16:13:59.381458 30104 replica.cpp:700] Replica learned TRUNCATE action at position 6
I1201 16:13:59.383708 30085 slave.cpp:601] Slave terminating
I1201 16:13:59.561939 30101 cgroups.cpp:1411] Successfully froze cgroup /sys/fs/cgroup/freezer/mesos_test_fe0ffaac-0ed6-4cac-939b-d5d65d306e0a/312026ae-212e-4e4c-bac9-84c8bb2930d9 after 203.608064ms
I1201 16:13:59.755419 30106 cgroups.cpp:2447] Thawing cgroup /sys/fs/cgroup/freezer/mesos_test_fe0ffaac-0ed6-4cac-939b-d5d65d306e0a/312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:59.880949 30106 cgroups.cpp:2429] Freezing cgroup /sys/fs/cgroup/freezer/mesos_test_fe0ffaac-0ed6-4cac-939b-d5d65d306e0a/312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:59.947829 30105 cgroups.cpp:1440] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos_test_fe0ffaac-0ed6-4cac-939b-d5d65d306e0a/312026ae-212e-4e4c-bac9-84c8bb2930d9 after 192.301824ms
../../src/tests/mesos.cpp:781: Failure
(cgroups::destroy(hierarchy, cgroup)).failure(): Failed to kill tasks in nested cgroups: Collect failed: Failed to write 'FROZEN' to control 'freezer.state': 'mesos_test_fe0ffaac-0ed6-4cac-939b-d5d65d306e0a/312026ae-212e-4e4c-bac9-84c8bb2930d9' is not a valid cgroup
[  FAILED  ] SlaveRecoveryTest/0.Reboot, where TypeParam = mesos::internal::slave::MesosContainerizer (18301 ms)
{noformat}


was (Author: nfnt):
{noformat}
[ RUN      ] SlaveRecoveryTest/0.Reboot
I1201 15:59:03.294540 21012 exec.cpp:136] Version: 0.26.0
I1201 15:59:03.302486 21039 exec.cpp:210] Executor registered on slave b17072f2-ce17-4f80-aa41-2197194f7cd0-S0
Registered executor on centos71
Starting task 6060349a-ab26-45d2-a2fa-96e561f794a8
sh -c 'sleep 1000'
Forked command at 21048
I1201 15:59:03.420940 21044 exec.cpp:383] Executor asked to shutdown
Shutting down
Sending SIGTERM to process tree at pid 21048
Killing the following process trees:
[
--- 21048 sleep 1000
]
Command terminated with signal Terminated (pid: 21048)
../../src/tests/mesos.cpp:781: Failure
(cgroups::destroy(hierarchy, cgroup)).failure(): Failed to kill tasks in nested cgroups: Collect failed: Invalid freezer cgroup: 'mesos_test_d456f5bc-7718-4850-990e-8961404efd15/8fa1aee7-b393-4a20-85e1-33cd2fca0b10' is not a valid cgroup
[  FAILED  ] SlaveRecoveryTest/0.Reboot, where TypeParam = mesos::internal::slave::MesosContainerizer (4835 ms)
{noformat}

> SlaveRecoveryTest/0.Reboot fails under CentOS 7.1 with libevent & SSL enabled.
> ------------------------------------------------------------------------------
>
>                 Key: MESOS-4032
>                 URL: https://issues.apache.org/jira/browse/MESOS-4032
>             Project: Mesos
>          Issue Type: Bug
>         Environment: CentOS 7.1, {{--enable-libevent --enable-ssl}}
>            Reporter: Jan Schlicht
>
> Running {{sudo ./bin/mesos-tests.sh}} has SlaveRecoveryTest/0.Reboot failing. A virtual env was used to run the tests.
> Vagrantfile generator:
> {noformat}
> cat << EOF > Vagrantfile
> # -*- mode: ruby -*-" >
> # vi: set ft=ruby :
> Vagrant.configure(2) do |config|
>   # Disable shared folder to prevent certain kernel module dependencies.
>   config.vm.synced_folder ".", "/vagrant", disabled: true
>   config.vm.hostname = "centos71"
>   config.vm.box = "bento/centos-7.1"
>   config.vm.provider "virtualbox" do |vb|
>     vb.memory = 8192
>     vb.cpus = 8
>   end
>   config.vm.provision "shell", inline: <<-SHELL
>      yum -y update systemd
>      yum install -y tar wget
>      wget http://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo -O /etc/yum.repos.d/epel-apache-maven.repo
>      yum groupinstall -y "Development Tools"
>      yum install -y apache-maven python-devel java-1.7.0-openjdk-devel zlib-devel libcurl-devel openssl-devel cyrus-sasl-devel cyrus-sasl-md5 apr-devel subversion-devel apr-util-devel
>      yum install -y libevent-devel
>      yum install -y perf nmap-ncat
>      yum install -y git
>      yum install -y docker
>      systemctl start docker
>      systemctl enable docker
>   SHELL
> end
> EOF
> vagrant up
> vagrant reload
> vagrant ssh -c "
> git clone  https://github.com/apache/mesos.git mesos
> cd mesos
> git checkout -b 0.26.0-rc2 0.26.0-rc2
> ./bootstrap
> mkdir build
> cd build
> ../configure --enable-libevent --enable-ssl
> GTEST_FILTER="" make check
> sudo ./bin/mesos-tests.sh
> "
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)