You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@mesos.apache.org by "Vinod Kone (JIRA)" <ji...@apache.org> on 2015/10/29 02:33:27 UTC

[jira] [Comment Edited] (MESOS-3770) SlaveRecoveryTest/0.RecoverCompletedExecutor is flaky

    [ https://issues.apache.org/jira/browse/MESOS-3770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14979618#comment-14979618 ] 

Vinod Kone edited comment on MESOS-3770 at 10/29/15 1:33 AM:
-------------------------------------------------------------

The test is flaky because of an actual bug in the code which was introduced in MESOS-2864. I'll re-open it add details of the bug there.

The bug affects this test because the master and slave views of the task become inconsistent which results in an extra status update (TASK_LOST) during reconciliation between master and slave after slave restart.


was (Author: vinodkone):
The test is flaky because of an actual bug in the code which was introduced in MESOS-2864. I'll re-open it add details of the bug there.

> SlaveRecoveryTest/0.RecoverCompletedExecutor is flaky
> -----------------------------------------------------
>
>                 Key: MESOS-3770
>                 URL: https://issues.apache.org/jira/browse/MESOS-3770
>             Project: Mesos
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.26.0
>            Reporter: Vinod Kone
>            Assignee: Vinod Kone
>
> Observed this in internal CI
> {code}
> DEBUG: [ RUN      ] SlaveRecoveryTest/0.RecoverCompletedExecutor
> DEBUG: Using temporary directory '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_rTtR9B'
> DEBUG: I1020 08:56:36.634321 28115 leveldb.cpp:176] Opened db in 185.662339ms
> DEBUG: I1020 08:56:36.701638 28115 leveldb.cpp:183] Compacted db in 67.257643ms
> DEBUG: I1020 08:56:36.701705 28115 leveldb.cpp:198] Created db iterator in 8212ns
> DEBUG: I1020 08:56:36.701719 28115 leveldb.cpp:204] Seeked to beginning of db in 1417ns
> DEBUG: I1020 08:56:36.701730 28115 leveldb.cpp:273] Iterated through 0 keys in the db in 357ns
> DEBUG: I1020 08:56:36.701756 28115 replica.cpp:746] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
> DEBUG: I1020 08:56:36.702062 28132 recover.cpp:449] Starting replica recovery
> DEBUG: I1020 08:56:36.702116 28132 recover.cpp:475] Replica is in EMPTY status
> DEBUG: I1020 08:56:36.702952 28132 replica.cpp:642] Replica in EMPTY status received a broadcasted recover request from (7143)@172.16.132.117:37586
> DEBUG: I1020 08:56:36.703795 28141 recover.cpp:195] Received a recover response from a replica in EMPTY status
> DEBUG: I1020 08:56:36.704100 28138 recover.cpp:566] Updating replica status to STARTING
> DEBUG: I1020 08:56:36.705229 28133 master.cpp:376] Master 0d54e2f1-43d7-4f74-8532-9c37ac40791b (smfc-ahy-19-sr2.corpdc.twitter.com) started on 172.16.132.117:37586
> DEBUG: I1020 08:56:36.705289 28133 master.cpp:378] Flags at startup: --acls="" --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" --authorizers="local" --credentials="/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_rTtR9B/credentials" --framework_sorter="drf" --help="false" --hostname_lookup="true" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" --quiet="false" --recovery_slave_removal_limit="100%" --registry="replicated_log" --registry_fetch_timeout="1mins" --registry_store_timeout="25secs" --registry_strict="true" --root_submissions="true" --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_rTtR9B/master" --zk_session_timeout="10secs"
> DEBUG: I1020 08:56:36.705440 28133 master.cpp:423] Master only allowing authenticated frameworks to register
> DEBUG: I1020 08:56:36.705446 28133 master.cpp:428] Master only allowing authenticated slaves to register
> DEBUG: I1020 08:56:36.705451 28133 credentials.hpp:37] Loading credentials for authentication from '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_rTtR9B/credentials'
> DEBUG: I1020 08:56:36.705587 28133 master.cpp:467] Using default 'crammd5' authenticator
> DEBUG: I1020 08:56:36.705651 28133 master.cpp:504] Authorization enabled
> DEBUG: I1020 08:56:36.706521 28134 master.cpp:1609] The newly elected leader is master@172.16.132.117:37586 with id 0d54e2f1-43d7-4f74-8532-9c37ac40791b
> DEBUG: I1020 08:56:36.706539 28134 master.cpp:1622] Elected as the leading master!
> DEBUG: I1020 08:56:36.706545 28134 master.cpp:1382] Recovering from registrar
> DEBUG: I1020 08:56:36.706681 28146 registrar.cpp:309] Recovering registrar
> DEBUG: I1020 08:56:36.768453 28138 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 64.300669ms
> DEBUG: I1020 08:56:36.768492 28138 replica.cpp:323] Persisted replica status to STARTING
> DEBUG: I1020 08:56:36.768568 28138 recover.cpp:475] Replica is in STARTING status
> DEBUG: I1020 08:56:36.769737 28131 replica.cpp:642] Replica in STARTING status received a broadcasted recover request from (7144)@172.16.132.117:37586
> DEBUG: I1020 08:56:36.769816 28131 recover.cpp:195] Received a recover response from a replica in STARTING status
> DEBUG: I1020 08:56:36.770355 28141 recover.cpp:566] Updating replica status to VOTING
> DEBUG: I1020 08:56:36.818709 28136 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 48.054479ms
> DEBUG: I1020 08:56:36.818743 28136 replica.cpp:323] Persisted replica status to VOTING
> DEBUG: I1020 08:56:36.818791 28136 recover.cpp:580] Successfully joined the Paxos group
> DEBUG: I1020 08:56:36.818842 28136 recover.cpp:464] Recover process terminated
> DEBUG: I1020 08:56:36.818954 28130 log.cpp:661] Attempting to start the writer
> DEBUG: I1020 08:56:36.820060 28140 replica.cpp:478] Replica received implicit promise request from (7145)@172.16.132.117:37586 with proposal 1
> DEBUG: I1020 08:56:36.885488 28140 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 65.391754ms
> DEBUG: I1020 08:56:36.885524 28140 replica.cpp:345] Persisted promised to 1
> DEBUG: I1020 08:56:36.885898 28148 coordinator.cpp:231] Coordinator attemping to fill missing position
> DEBUG: I1020 08:56:36.888197 28143 replica.cpp:378] Replica received explicit promise request from (7146)@172.16.132.117:37586 for position 0 with proposal 2
> DEBUG: I1020 08:56:36.952364 28143 leveldb.cpp:343] Persisting action (8 bytes) to leveldb took 64.11496ms
> DEBUG: I1020 08:56:36.952407 28143 replica.cpp:681] Persisted action at 0
> DEBUG: I1020 08:56:36.953027 28139 replica.cpp:512] Replica received write request for position 0 from (7147)@172.16.132.117:37586
> DEBUG: I1020 08:56:36.953059 28139 leveldb.cpp:438] Reading position from leveldb took 14203ns
> DEBUG: I1020 08:56:36.994161 28139 leveldb.cpp:343] Persisting action (14 bytes) to leveldb took 41.082908ms
> DEBUG: I1020 08:56:36.994199 28139 replica.cpp:681] Persisted action at 0
> DEBUG: I1020 08:56:36.994415 28143 replica.cpp:660] Replica received learned notice for position 0 from @0.0.0.0:0
> DEBUG: I1020 08:56:37.036006 28143 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 41.569884ms
> DEBUG: I1020 08:56:37.036043 28143 replica.cpp:681] Persisted action at 0
> DEBUG: I1020 08:56:37.036056 28143 replica.cpp:666] Replica learned NOP action at position 0
> DEBUG: I1020 08:56:37.036365 28138 log.cpp:677] Writer started with ending position 0
> DEBUG: I1020 08:56:37.037250 28133 leveldb.cpp:438] Reading position from leveldb took 45165ns
> DEBUG: I1020 08:56:37.038321 28131 registrar.cpp:342] Successfully fetched the registry (0B) in 331.61216ms
> DEBUG: I1020 08:56:37.038378 28131 registrar.cpp:441] Applied 1 operations in 6704ns; attempting to update the 'registry'
> DEBUG: I1020 08:56:37.038996 28132 log.cpp:685] Attempting to append 228 bytes to the log
> DEBUG: I1020 08:56:37.039103 28132 coordinator.cpp:341] Coordinator attempting to write APPEND action at position 1
> DEBUG: I1020 08:56:37.039717 28152 replica.cpp:512] Replica received write request for position 1 from (7148)@172.16.132.117:37586
> DEBUG: I1020 08:56:37.111176 28152 leveldb.cpp:343] Persisting action (247 bytes) to leveldb took 71.428059ms
> DEBUG: I1020 08:56:37.111225 28152 replica.cpp:681] Persisted action at 1
> DEBUG: I1020 08:56:37.111547 28138 replica.cpp:660] Replica received learned notice for position 1 from @0.0.0.0:0
> DEBUG: I1020 08:56:37.177995 28138 leveldb.cpp:343] Persisting action (249 bytes) to leveldb took 66.426857ms
> DEBUG: I1020 08:56:37.178033 28138 replica.cpp:681] Persisted action at 1
> DEBUG: I1020 08:56:37.178045 28138 replica.cpp:666] Replica learned APPEND action at position 1
> DEBUG: I1020 08:56:37.178407 28137 registrar.cpp:486] Successfully updated the 'registry' in 139.99104ms
> DEBUG: I1020 08:56:37.178473 28137 registrar.cpp:372] Successfully recovered registrar
> DEBUG: I1020 08:56:37.178496 28146 log.cpp:704] Attempting to truncate the log to 1
> DEBUG: I1020 08:56:37.178966 28145 master.cpp:1419] Recovered 0 slaves from the Registry (189B) ; allowing 10mins for slaves to re-register
> DEBUG: I1020 08:56:37.178951 28140 coordinator.cpp:341] Coordinator attempting to write TRUNCATE action at position 2
> DEBUG: I1020 08:56:37.180043 28132 replica.cpp:512] Replica received write request for position 2 from (7149)@172.16.132.117:37586
> DEBUG: I1020 08:56:37.219796 28132 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 39.72368ms
> DEBUG: I1020 08:56:37.219825 28132 replica.cpp:681] Persisted action at 2
> DEBUG: I1020 08:56:37.219990 28132 replica.cpp:660] Replica received learned notice for position 2 from @0.0.0.0:0
> DEBUG: I1020 08:56:37.278291 28132 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 58.280037ms
> DEBUG: I1020 08:56:37.278342 28132 leveldb.cpp:401] Deleting ~1 keys from leveldb took 21056ns
> DEBUG: I1020 08:56:37.278355 28132 replica.cpp:681] Persisted action at 2
> DEBUG: I1020 08:56:37.278363 28132 replica.cpp:666] Replica learned TRUNCATE action at position 2
> DEBUG: I1020 08:56:37.281244 28115 containerizer.cpp:143] Using isolation: posix/cpu,posix/mem,filesystem/posix
> DEBUG: W1020 08:56:37.281390 28115 backend.cpp:50] Failed to create 'bind' backend: BindBackend requires root privileges
> DEBUG: I1020 08:56:37.284584 28153 slave.cpp:191] Slave started on 255)@172.16.132.117:37586
> DEBUG: I1020 08:56:37.284600 28153 slave.cpp:192] Flags at startup: --appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos" --container_disk_watch_interval="15secs" --containerizers="mesos" --credential="/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/credential" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" --docker_local_archives_dir="/tmp/mesos/images/docker" --docker_puller="local" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker" --egress_unique_flow_per_container="false" --enforce_container_disk_quota="false" --ephemeral_ports_per_container="1024" --executor_registration_timeout="1mins" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname_lookup="true" --image_provisioner_backend="copy" --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem" --launcher_dir="/builddir/build/BUILD/mesos-0.26.0/src" --logbufsecs="0" --logging_level="INFO" --network_enable_socket_statistics_details="false" --network_enable_socket_statistics_summary="false" --oversubscribed_resources_interval="15secs" --perf_duration="10secs" --perf_interval="1mins" --qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="10ms" --resources="cpus:2;mem:1024;disk:1024;ports:[31000-32000]" --revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true" --systemd_runtime_directory="/run/systemd/system" --version="false" --work_dir="/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx"
> DEBUG: I1020 08:56:37.284816 28153 credentials.hpp:85] Loading credential for authentication from '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/credential'
> DEBUG: I1020 08:56:37.284910 28153 slave.cpp:322] Slave using credential for: test-principal
> DEBUG: I1020 08:56:37.285879 28153 slave.cpp:392] Slave resources: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> DEBUG: I1020 08:56:37.285928 28153 slave.cpp:400] Slave attributes: [  ]
> DEBUG: I1020 08:56:37.285943 28153 slave.cpp:405] Slave hostname: smfc-ahy-19-sr2.corpdc.twitter.com
> DEBUG: I1020 08:56:37.285955 28153 slave.cpp:410] Slave checkpoint: true
> DEBUG: I1020 08:56:37.286746 28115 sched.cpp:166] Version: 0.26.0-rc0
> DEBUG: I1020 08:56:37.286973 28151 state.cpp:54] Recovering state from '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/meta'
> DEBUG: I1020 08:56:37.287063 28143 status_update_manager.cpp:202] Recovering status update manager
> DEBUG: I1020 08:56:37.287384 28137 containerizer.cpp:385] Recovering containerizer
> DEBUG: I1020 08:56:37.287608 28139 sched.cpp:264] New master detected at master@172.16.132.117:37586
> DEBUG: I1020 08:56:37.287746 28139 sched.cpp:320] Authenticating with master master@172.16.132.117:37586
> DEBUG: I1020 08:56:37.287757 28139 sched.cpp:327] Using default CRAM-MD5 authenticatee
> DEBUG: I1020 08:56:37.287824 28135 slave.cpp:4222] Finished recovery
> DEBUG: I1020 08:56:37.287967 28149 authenticatee.cpp:115] Creating new client SASL connection
> DEBUG: I1020 08:56:37.288192 28139 master.cpp:5144] Authenticating scheduler-560af120-1924-4956-b664-3c4a67d0fe93@172.16.132.117:37586
> DEBUG: I1020 08:56:37.288382 28136 status_update_manager.cpp:176] Pausing sending status updates
> DEBUG: I1020 08:56:37.288444 28140 slave.cpp:726] New master detected at master@172.16.132.117:37586
> DEBUG: I1020 08:56:37.288467 28140 slave.cpp:789] Authenticating with master master@172.16.132.117:37586
> DEBUG: I1020 08:56:37.288475 28140 slave.cpp:794] Using default CRAM-MD5 authenticatee
> DEBUG: I1020 08:56:37.288508 28140 slave.cpp:762] Detecting new master
> DEBUG: I1020 08:56:37.288604 28130 authenticatee.cpp:115] Creating new client SASL connection
> DEBUG: I1020 08:56:37.288717 28140 authenticator.cpp:92] Creating new server SASL connection
> DEBUG: I1020 08:56:37.288785 28139 master.cpp:5144] Authenticating slave(255)@172.16.132.117:37586
> DEBUG: I1020 08:56:37.288861 28135 authenticatee.cpp:206] Received SASL authentication mechanisms: CRAM-MD5
> DEBUG: I1020 08:56:37.288882 28135 authenticatee.cpp:232] Attempting to authenticate with mechanism 'CRAM-MD5'
> DEBUG: I1020 08:56:37.289080 28133 authenticator.cpp:92] Creating new server SASL connection
> DEBUG: I1020 08:56:37.289175 28131 authenticatee.cpp:206] Received SASL authentication mechanisms: CRAM-MD5
> DEBUG: I1020 08:56:37.289199 28131 authenticatee.cpp:232] Attempting to authenticate with mechanism 'CRAM-MD5'
> DEBUG: I1020 08:56:37.289355 28144 authenticator.cpp:197] Received SASL authentication start
> DEBUG: I1020 08:56:37.289371 28140 authenticator.cpp:197] Received SASL authentication start
> DEBUG: I1020 08:56:37.289440 28144 authenticator.cpp:319] Authentication requires more steps
> DEBUG: I1020 08:56:37.289459 28140 authenticator.cpp:319] Authentication requires more steps
> DEBUG: I1020 08:56:37.289480 28144 authenticatee.cpp:252] Received SASL authentication step
> DEBUG: I1020 08:56:37.289489 28140 authenticatee.cpp:252] Received SASL authentication step
> DEBUG: I1020 08:56:37.289813 28132 authenticator.cpp:225] Received SASL authentication step
> DEBUG: I1020 08:56:37.289852 28132 authenticator.cpp:311] Authentication success
> DEBUG: I1020 08:56:37.289898 28143 authenticatee.cpp:292] Authentication success
> DEBUG: I1020 08:56:37.289971 28142 master.cpp:5174] Successfully authenticated principal 'test-principal' at scheduler-560af120-1924-4956-b664-3c4a67d0fe93@172.16.132.117:37586
> DEBUG: I1020 08:56:37.290009 28143 sched.cpp:409] Successfully authenticated with master master@172.16.132.117:37586
> DEBUG: I1020 08:56:37.290163 28137 authenticator.cpp:225] Received SASL authentication step
> DEBUG: I1020 08:56:37.290220 28137 authenticator.cpp:311] Authentication success
> DEBUG: I1020 08:56:37.290266 28134 master.cpp:2185] Received SUBSCRIBE call for framework 'default' at scheduler-560af120-1924-4956-b664-3c4a67d0fe93@172.16.132.117:37586
> DEBUG: I1020 08:56:37.290289 28134 master.cpp:1648] Authorizing framework principal 'test-principal' to receive offers for role '*'
> DEBUG: I1020 08:56:37.290405 28153 authenticatee.cpp:292] Authentication success
> DEBUG: I1020 08:56:37.290431 28144 master.cpp:5174] Successfully authenticated principal 'test-principal' at slave(255)@172.16.132.117:37586
> DEBUG: I1020 08:56:37.290464 28144 master.cpp:2256] Subscribing framework default with checkpointing enabled and capabilities [  ]
> DEBUG: I1020 08:56:37.290654 28136 hierarchical.cpp:185] Added framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.290650 28148 slave.cpp:857] Successfully authenticated with master master@172.16.132.117:37586
> DEBUG: I1020 08:56:37.290817 28144 master.cpp:3868] Registering slave at slave(255)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com) with id 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0
> DEBUG: I1020 08:56:37.290873 28151 sched.cpp:643] Framework registered with 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.290930 28130 registrar.cpp:441] Applied 1 operations in 16086ns; attempting to update the 'registry'
> DEBUG: I1020 08:56:37.291250 28151 log.cpp:685] Attempting to append 419 bytes to the log
> DEBUG: I1020 08:56:37.291321 28139 coordinator.cpp:341] Coordinator attempting to write APPEND action at position 3
> DEBUG: I1020 08:56:37.291939 28131 replica.cpp:512] Replica received write request for position 3 from (7161)@172.16.132.117:37586
> DEBUG: I1020 08:56:37.300041 28136 master.cpp:3856] Ignoring register slave message from slave(255)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com) as admission is already in progress
> DEBUG: I1020 08:56:37.311420 28141 master.cpp:3856] Ignoring register slave message from slave(255)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com) as admission is already in progress
> DEBUG: I1020 08:56:37.341991 28130 master.cpp:3856] Ignoring register slave message from slave(255)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com) as admission is already in progress
> DEBUG: I1020 08:56:37.354004 28131 leveldb.cpp:343] Persisting action (438 bytes) to leveldb took 62.03948ms
> DEBUG: I1020 08:56:37.354042 28131 replica.cpp:681] Persisted action at 3
> DEBUG: I1020 08:56:37.354390 28136 replica.cpp:660] Replica received learned notice for position 3 from @0.0.0.0:0
> DEBUG: I1020 08:56:37.378435 28130 master.cpp:3856] Ignoring register slave message from slave(255)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com) as admission is already in progress
> DEBUG: I1020 08:56:37.395829 28136 leveldb.cpp:343] Persisting action (440 bytes) to leveldb took 41.417781ms
> DEBUG: I1020 08:56:37.395867 28136 replica.cpp:681] Persisted action at 3
> DEBUG: I1020 08:56:37.395879 28136 replica.cpp:666] Replica learned APPEND action at position 3
> DEBUG: I1020 08:56:37.396229 28136 registrar.cpp:486] Successfully updated the 'registry' in 105.257984ms
> DEBUG: I1020 08:56:37.396466 28148 log.cpp:704] Attempting to truncate the log to 3
> DEBUG: I1020 08:56:37.396519 28151 coordinator.cpp:341] Coordinator attempting to write TRUNCATE action at position 4
> DEBUG: I1020 08:56:37.396827 28136 master.cpp:3936] Registered slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(255)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> DEBUG: I1020 08:56:37.396867 28137 slave.cpp:901] Registered with master master@172.16.132.117:37586; given slave ID 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0
> DEBUG: I1020 08:56:37.396874 28152 hierarchical.cpp:335] Added slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 (smfc-ahy-19-sr2.corpdc.twitter.com) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (allocated: )
> DEBUG: I1020 08:56:37.397186 28150 master.cpp:4973] Sending 1 offers to framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 (default) at scheduler-560af120-1924-4956-b664-3c4a67d0fe93@172.16.132.117:37586
> DEBUG: I1020 08:56:37.397260 28132 status_update_manager.cpp:183] Resuming sending status updates
> DEBUG: I1020 08:56:37.397285 28136 replica.cpp:512] Replica received write request for position 4 from (7162)@172.16.132.117:37586
> DEBUG: I1020 08:56:37.397516 28137 slave.cpp:960] Forwarding total oversubscribed resources 
> DEBUG: I1020 08:56:37.397769 28150 master.cpp:4278] Received update of slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(255)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com) with total oversubscribed resources 
> DEBUG: I1020 08:56:37.397902 28149 hierarchical.cpp:391] Slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 (smfc-ahy-19-sr2.corpdc.twitter.com) updated with oversubscribed resources  (total: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000])
> DEBUG: I1020 08:56:37.399044 28152 master.cpp:2924] Processing ACCEPT call for offers: [ 0d54e2f1-43d7-4f74-8532-9c37ac40791b-O0 ] on slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(255)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com) for framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 (default) at scheduler-560af120-1924-4956-b664-3c4a67d0fe93@172.16.132.117:37586
> DEBUG: I1020 08:56:37.399066 28152 master.cpp:2720] Authorizing framework principal 'test-principal' to launch task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b as user 'mockbuild'
> DEBUG: I1020 08:56:37.399948 28147 master.hpp:176] Adding task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 (smfc-ahy-19-sr2.corpdc.twitter.com)
> DEBUG: I1020 08:56:37.399991 28147 master.cpp:3254] Launching task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 (default) at scheduler-560af120-1924-4956-b664-3c4a67d0fe93@172.16.132.117:37586 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(255)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com)
> DEBUG: I1020 08:56:37.400190 28147 slave.cpp:1291] Got assigned task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b for framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.409716 28147 slave.cpp:1407] Launching task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b for framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.410130 28147 paths.cpp:417] Trying to chown '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000/executors/0bbb69a3-3ba3-4016-86ed-4d80c1bf594b/runs/2b1371cc-4d78-44d1-b1f1-e1b29127adc8' to user 'mockbuild'
> DEBUG: I1020 08:56:37.418064 28147 slave.cpp:4994] Launching executor 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 with resources cpus(*):0.1; mem(*):32 in work directory '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000/executors/0bbb69a3-3ba3-4016-86ed-4d80c1bf594b/runs/2b1371cc-4d78-44d1-b1f1-e1b29127adc8'
> DEBUG: I1020 08:56:37.418360 28137 containerizer.cpp:639] Starting container '2b1371cc-4d78-44d1-b1f1-e1b29127adc8' for executor '0bbb69a3-3ba3-4016-86ed-4d80c1bf594b' of framework '0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000'
> DEBUG: I1020 08:56:37.418512 28147 slave.cpp:1625] Queuing task '0bbb69a3-3ba3-4016-86ed-4d80c1bf594b' for executor 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework '0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.421762 28143 launcher.cpp:132] Forked child with pid '40438' for container '2b1371cc-4d78-44d1-b1f1-e1b29127adc8'
> DEBUG: I1020 08:56:37.421979 28143 containerizer.cpp:872] Checkpointing executor's forked pid 40438 to '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/meta/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000/executors/0bbb69a3-3ba3-4016-86ed-4d80c1bf594b/runs/2b1371cc-4d78-44d1-b1f1-e1b29127adc8/pids/forked.pid'
> DEBUG: I1020 08:56:37.463163 28136 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 65.83211ms
> DEBUG: I1020 08:56:37.463227 28136 replica.cpp:681] Persisted action at 4
> DEBUG: I1020 08:56:37.464862 28136 replica.cpp:660] Replica received learned notice for position 4 from @0.0.0.0:0
> DEBUG: I1020 08:56:37.504696 40438 exec.cpp:136] Version: 0.26.0-rc0
> DEBUG: I1020 08:56:37.506574 28150 slave.cpp:2415] Got registration for executor '0bbb69a3-3ba3-4016-86ed-4d80c1bf594b' of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 from executor(1)@172.16.132.117:41415
> DEBUG: I1020 08:56:37.575783 40482 exec.cpp:210] Executor registered on slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0
> DEBUG: I1020 08:56:37.576037 28147 slave.cpp:1796] Sending queued task '0bbb69a3-3ba3-4016-86ed-4d80c1bf594b' to executor '0bbb69a3-3ba3-4016-86ed-4d80c1bf594b' of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: Registered executor on smfc-ahy-19-sr2.corpdc.twitter.com
> DEBUG: Starting task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b
> DEBUG: sh -c 'exit 0'
> DEBUG: Forked command at 40500
> DEBUG: I1020 08:56:37.580664 28139 slave.cpp:2778] Handling status update TASK_RUNNING (UUID: fb1a8b73-5136-4a8f-ae92-9d26b13fb41b) for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 from executor(1)@172.16.132.117:41415
> DEBUG: I1020 08:56:37.580994 28146 status_update_manager.cpp:322] Received status update TASK_RUNNING (UUID: fb1a8b73-5136-4a8f-ae92-9d26b13fb41b) for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.581665 28146 status_update_manager.cpp:826] Checkpointing UPDATE for status update TASK_RUNNING (UUID: fb1a8b73-5136-4a8f-ae92-9d26b13fb41b) for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.623446 28136 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 158.550858ms
> DEBUG: Command exited with status 0 (pid: 40500)
> DEBUG: I1020 08:56:37.623538 28136 leveldb.cpp:401] Deleting ~2 keys from leveldb took 41621ns
> DEBUG: I1020 08:56:37.623558 28136 replica.cpp:681] Persisted action at 4
> DEBUG: I1020 08:56:37.623571 28136 replica.cpp:666] Replica learned TRUNCATE action at position 4
> DEBUG: I1020 08:56:37.681676 28145 slave.cpp:2778] Handling status update TASK_FINISHED (UUID: 4a38e137-c590-48f6-ba10-2e0ba4903ac5) for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 from executor(1)@172.16.132.117:41415
> DEBUG: I1020 08:56:37.840840 28146 status_update_manager.cpp:322] Received status update TASK_FINISHED (UUID: 4a38e137-c590-48f6-ba10-2e0ba4903ac5) for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.840920 28146 status_update_manager.cpp:826] Checkpointing UPDATE for status update TASK_FINISHED (UUID: 4a38e137-c590-48f6-ba10-2e0ba4903ac5) for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.840901 28148 slave.cpp:3090] Forwarding the update TASK_RUNNING (UUID: fb1a8b73-5136-4a8f-ae92-9d26b13fb41b) for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 to master@172.16.132.117:37586
> DEBUG: I1020 08:56:37.841084 28148 slave.cpp:3020] Sending acknowledgement for status update TASK_RUNNING (UUID: fb1a8b73-5136-4a8f-ae92-9d26b13fb41b) for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 to executor(1)@172.16.132.117:41415
> DEBUG: I1020 08:56:37.841506 28153 master.cpp:4421] Status update TASK_RUNNING (UUID: fb1a8b73-5136-4a8f-ae92-9d26b13fb41b) for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 from slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(255)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com)
> DEBUG: I1020 08:56:37.841531 28153 master.cpp:4460] Forwarding status update TASK_RUNNING (UUID: fb1a8b73-5136-4a8f-ae92-9d26b13fb41b) for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.841815 28153 master.cpp:6086] Updating the latest state of task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 to TASK_FINISHED (status update state: TASK_RUNNING)
> DEBUG: I1020 08:56:37.842349 28131 hierarchical.cpp:739] Recovered cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (total: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: ) on slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 from framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.843065 28152 master.cpp:3612] Processing ACKNOWLEDGE call fb1a8b73-5136-4a8f-ae92-9d26b13fb41b for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 (default) at scheduler-560af120-1924-4956-b664-3c4a67d0fe93@172.16.132.117:37586 on slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0
> DEBUG: I1020 08:56:37.932744 28146 status_update_manager.cpp:394] Received status update acknowledgement (UUID: fb1a8b73-5136-4a8f-ae92-9d26b13fb41b) for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.932793 28146 status_update_manager.cpp:826] Checkpointing ACK for status update TASK_RUNNING (UUID: fb1a8b73-5136-4a8f-ae92-9d26b13fb41b) for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.932839 28152 slave.cpp:3020] Sending acknowledgement for status update TASK_FINISHED (UUID: 4a38e137-c590-48f6-ba10-2e0ba4903ac5) for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 to executor(1)@172.16.132.117:41415
> DEBUG: 2015-10-20 08:56:37,948:28115(0x7f526963c700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:34917] zk retcode=-4, errno=111(Connection refused): server refused to accept the client
> DEBUG: I1020 08:56:38.091502 28144 slave.cpp:3090] Forwarding the update TASK_FINISHED (UUID: 4a38e137-c590-48f6-ba10-2e0ba4903ac5) for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 to master@172.16.132.117:37586
> DEBUG: I1020 08:56:38.091954 28152 master.cpp:4421] Status update TASK_FINISHED (UUID: 4a38e137-c590-48f6-ba10-2e0ba4903ac5) for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 from slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(255)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com)
> DEBUG: I1020 08:56:38.092000 28152 master.cpp:4460] Forwarding status update TASK_FINISHED (UUID: 4a38e137-c590-48f6-ba10-2e0ba4903ac5) for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: E1020 08:56:38.092056 28152 master.cpp:6032] Ignoring status update for the terminated task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b (TASK_FINISHED -> TASK_FINISHED) of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.092725 28153 master.cpp:3612] Processing ACKNOWLEDGE call 4a38e137-c590-48f6-ba10-2e0ba4903ac5 for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 (default) at scheduler-560af120-1924-4956-b664-3c4a67d0fe93@172.16.132.117:37586 on slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0
> DEBUG: I1020 08:56:38.092818 28153 status_update_manager.cpp:394] Received status update acknowledgement (UUID: 4a38e137-c590-48f6-ba10-2e0ba4903ac5) for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.092852 28153 status_update_manager.cpp:826] Checkpointing ACK for status update TASK_FINISHED (UUID: 4a38e137-c590-48f6-ba10-2e0ba4903ac5) for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.691695 28137 containerizer.cpp:1278] Executor for container '2b1371cc-4d78-44d1-b1f1-e1b29127adc8' has exited
> DEBUG: I1020 08:56:38.691723 28137 containerizer.cpp:1095] Destroying container '2b1371cc-4d78-44d1-b1f1-e1b29127adc8'
> DEBUG: I1020 08:56:38.708005 28138 master.cpp:4973] Sending 1 offers to framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 (default) at scheduler-560af120-1924-4956-b664-3c4a67d0fe93@172.16.132.117:37586
> DEBUG: I1020 08:56:38.715711 28141 slave.cpp:3542] Executor '0bbb69a3-3ba3-4016-86ed-4d80c1bf594b' of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 exited with status 0
> DEBUG: I1020 08:56:38.715741 28141 slave.cpp:3646] Cleaning up executor '0bbb69a3-3ba3-4016-86ed-4d80c1bf594b' of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: tests/slave_recovery_tests.cpp:881: Failure
> DEBUG: Mock function called more times than expected - returning directly.
> DEBUG:     Function call: statusUpdate(0x7fffabf3b0c0, @0x7f52f0b9d740 112-byte object <F0-C1 40-FC 52-7F 00-00 00-00 00-00 00-00 00-00 B0-24 01-AC 52-7F 00-00 00-95 01-AC 52-7F 00-00 05-00 00-00 01-00 00-00 80-28 40-01 00-00 00-00 40-AF 01-AC 52-7F 00-00 00-00 00-00 00-00 00-00 09-00 00-00 00-00 00-00 8B-FB 6F-FD 6F-90 D5-41 20-AF 01-AC 52-7F 00-00 00-00 00-00 00-00 00-00 00-00 00-00 00-00 00-00 00-00 00-00 5F-03 00-00>)
> DEBUG:          Expected: to be called twice
> DEBUG:            Actual: called 3 times - over-saturated and active
> DEBUG: I1020 08:56:38.716214 28141 slave.cpp:3735] Cleaning up framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.716245 28131 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000/executors/0bbb69a3-3ba3-4016-86ed-4d80c1bf594b/runs/2b1371cc-4d78-44d1-b1f1-e1b29127adc8' for gc 6.99999171211852days in the future
> DEBUG: I1020 08:56:38.716320 28151 status_update_manager.cpp:284] Closing status update streams for framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.716517 28131 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000/executors/0bbb69a3-3ba3-4016-86ed-4d80c1bf594b' for gc 6.99999171165333days in the future
> DEBUG: I1020 08:56:38.716562 28131 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/meta/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000/executors/0bbb69a3-3ba3-4016-86ed-4d80c1bf594b/runs/2b1371cc-4d78-44d1-b1f1-e1b29127adc8' for gc 6.99999171135407days in the future
> DEBUG: I1020 08:56:38.716585 28131 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/meta/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000/executors/0bbb69a3-3ba3-4016-86ed-4d80c1bf594b' for gc 6.99999171074074days in the future
> DEBUG: I1020 08:56:38.716604 28131 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000' for gc 6.99999171001185days in the future
> DEBUG: I1020 08:56:38.716624 28131 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/meta/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000' for gc 6.9999917082637days in the future
> DEBUG: I1020 08:56:38.716841 28115 slave.cpp:606] Slave terminating
> DEBUG: I1020 08:56:38.716974 28150 master.cpp:1086] Slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(255)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com) disconnected
> DEBUG: I1020 08:56:38.716994 28150 master.cpp:2540] Disconnecting slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(255)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com)
> DEBUG: I1020 08:56:38.717082 28150 master.cpp:2559] Deactivating slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(255)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com)
> DEBUG: I1020 08:56:38.717128 28136 hierarchical.cpp:420] Slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 deactivated
> DEBUG: I1020 08:56:38.717308 28136 hierarchical.cpp:739] Recovered cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (total: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: ) on slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 from framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.718613 28115 containerizer.cpp:143] Using isolation: posix/cpu,posix/mem,filesystem/posix
> DEBUG: W1020 08:56:38.718729 28115 backend.cpp:50] Failed to create 'bind' backend: BindBackend requires root privileges
> DEBUG: I1020 08:56:38.720690 28142 slave.cpp:191] Slave started on 256)@172.16.132.117:37586
> DEBUG: I1020 08:56:38.720705 28142 slave.cpp:192] Flags at startup: --appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos" --container_disk_watch_interval="15secs" --containerizers="mesos" --credential="/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/credential" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" --docker_local_archives_dir="/tmp/mesos/images/docker" --docker_puller="local" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker" --egress_unique_flow_per_container="false" --enforce_container_disk_quota="false" --ephemeral_ports_per_container="1024" --executor_registration_timeout="1mins" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname_lookup="true" --image_provisioner_backend="copy" --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem" --launcher_dir="/builddir/build/BUILD/mesos-0.26.0/src" --logbufsecs="0" --logging_level="INFO" --network_enable_socket_statistics_details="false" --network_enable_socket_statistics_summary="false" --oversubscribed_resources_interval="15secs" --perf_duration="10secs" --perf_interval="1mins" --qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="10ms" --resources="cpus:2;mem:1024;disk:1024;ports:[31000-32000]" --revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true" --systemd_runtime_directory="/run/systemd/system" --version="false" --work_dir="/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx"
> DEBUG: I1020 08:56:38.720965 28142 credentials.hpp:85] Loading credential for authentication from '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/credential'
> DEBUG: I1020 08:56:38.721045 28142 slave.cpp:322] Slave using credential for: test-principal
> DEBUG: I1020 08:56:38.721573 28142 slave.cpp:392] Slave resources: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> DEBUG: I1020 08:56:38.721603 28142 slave.cpp:400] Slave attributes: [  ]
> DEBUG: I1020 08:56:38.721609 28142 slave.cpp:405] Slave hostname: smfc-ahy-19-sr2.corpdc.twitter.com
> DEBUG: I1020 08:56:38.721613 28142 slave.cpp:410] Slave checkpoint: true
> DEBUG: I1020 08:56:38.721895 28131 state.cpp:54] Recovering state from '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/meta'
> DEBUG: I1020 08:56:38.721935 28131 state.cpp:681] No checkpointed resources found at '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/meta/resources/resources.info'
> DEBUG: I1020 08:56:38.723804 28131 slave.cpp:4310] Recovering framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.723827 28131 slave.cpp:5103] Recovering executor '0bbb69a3-3ba3-4016-86ed-4d80c1bf594b' of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.724068 28142 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000/executors/0bbb69a3-3ba3-4016-86ed-4d80c1bf594b/runs/2b1371cc-4d78-44d1-b1f1-e1b29127adc8' for gc 6.99999162064889days in the future
> DEBUG: I1020 08:56:38.724123 28142 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/meta/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000/executors/0bbb69a3-3ba3-4016-86ed-4d80c1bf594b/runs/2b1371cc-4d78-44d1-b1f1-e1b29127adc8' for gc 6.99999161984889days in the future
> DEBUG: I1020 08:56:38.724180 28142 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000/executors/0bbb69a3-3ba3-4016-86ed-4d80c1bf594b' for gc 6.99999161965333days in the future
> DEBUG: I1020 08:56:38.724195 28131 slave.cpp:3735] Cleaning up framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.724220 28142 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/meta/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000/executors/0bbb69a3-3ba3-4016-86ed-4d80c1bf594b' for gc 6.99999161943111days in the future
> DEBUG: I1020 08:56:38.724318 28142 status_update_manager.cpp:284] Closing status update streams for framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.724431 28142 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000' for gc 6.99999161661037days in the future
> DEBUG: I1020 08:56:38.724465 28142 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/meta/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000' for gc 6.99999161520889days in the future
> DEBUG: I1020 08:56:38.724778 28140 status_update_manager.cpp:202] Recovering status update manager
> DEBUG: I1020 08:56:38.724791 28140 status_update_manager.cpp:210] Recovering executor '0bbb69a3-3ba3-4016-86ed-4d80c1bf594b' of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.725278 28132 containerizer.cpp:385] Recovering containerizer
> DEBUG: I1020 08:56:38.725953 28142 slave.cpp:4222] Finished recovery
> DEBUG: I1020 08:56:38.726372 28131 status_update_manager.cpp:176] Pausing sending status updates
> DEBUG: I1020 08:56:38.726676 28133 slave.cpp:726] New master detected at master@172.16.132.117:37586
> DEBUG: I1020 08:56:38.726698 28133 slave.cpp:789] Authenticating with master master@172.16.132.117:37586
> DEBUG: I1020 08:56:38.726706 28133 slave.cpp:794] Using default CRAM-MD5 authenticatee
> DEBUG: I1020 08:56:38.726750 28133 slave.cpp:762] Detecting new master
> DEBUG: I1020 08:56:38.726832 28153 authenticatee.cpp:115] Creating new client SASL connection
> DEBUG: I1020 08:56:38.727023 28153 master.cpp:5144] Authenticating slave(256)@172.16.132.117:37586
> DEBUG: I1020 08:56:38.727404 28149 authenticator.cpp:92] Creating new server SASL connection
> DEBUG: I1020 08:56:38.727483 28149 authenticatee.cpp:206] Received SASL authentication mechanisms: CRAM-MD5
> DEBUG: I1020 08:56:38.727496 28149 authenticatee.cpp:232] Attempting to authenticate with mechanism 'CRAM-MD5'
> DEBUG: I1020 08:56:38.727525 28149 authenticator.cpp:197] Received SASL authentication start
> DEBUG: I1020 08:56:38.727591 28149 authenticator.cpp:319] Authentication requires more steps
> DEBUG: I1020 08:56:38.727670 28135 authenticatee.cpp:252] Received SASL authentication step
> DEBUG: I1020 08:56:38.727736 28135 authenticator.cpp:225] Received SASL authentication step
> DEBUG: I1020 08:56:38.727766 28135 authenticator.cpp:311] Authentication success
> DEBUG: I1020 08:56:38.727821 28147 authenticatee.cpp:292] Authentication success
> DEBUG: I1020 08:56:38.727826 28132 master.cpp:5174] Successfully authenticated principal 'test-principal' at slave(256)@172.16.132.117:37586
> DEBUG: I1020 08:56:38.728272 28144 slave.cpp:857] Successfully authenticated with master master@172.16.132.117:37586
> DEBUG: I1020 08:56:38.728466 28153 master.cpp:4028] Re-registering slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(255)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com)
> DEBUG: W1020 08:56:38.728508 28153 master.cpp:5241] Task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 unknown to the slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(256)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com) during re-registration: reconciling with the slave
> DEBUG: I1020 08:56:38.728565 28153 master.cpp:4216] Sending updated checkpointed resources  to slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(256)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com)
> DEBUG: I1020 08:56:38.728612 28145 slave.cpp:1001] Re-registered with master master@172.16.132.117:37586
> DEBUG: I1020 08:56:38.728628 28140 hierarchical.cpp:408] Slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 reactivated
> DEBUG: I1020 08:56:38.728693 28145 slave.cpp:1037] Forwarding total oversubscribed resources 
> DEBUG: I1020 08:56:38.728703 28132 status_update_manager.cpp:183] Resuming sending status updates
> DEBUG: W1020 08:56:38.728718 28145 slave.cpp:1077] Slave reconciling task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 in state TASK_LOST: task unknown to the slave
> DEBUG: I1020 08:56:38.728761 28135 master.cpp:4278] Received update of slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(256)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com) with total oversubscribed resources 
> DEBUG: I1020 08:56:38.728778 28141 status_update_manager.cpp:322] Received status update TASK_LOST (UUID: 30c316cf-fda2-4ab4-9c43-910e9416bab2) for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.728927 28135 hierarchical.cpp:391] Slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 (smfc-ahy-19-sr2.corpdc.twitter.com) updated with oversubscribed resources  (total: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: )
> DEBUG: I1020 08:56:38.728973 28145 slave.cpp:2284] Updated checkpointed resources from  to 
> DEBUG: I1020 08:56:38.729003 28145 slave.cpp:3090] Forwarding the update TASK_LOST (UUID: 30c316cf-fda2-4ab4-9c43-910e9416bab2) for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 to master@172.16.132.117:37586
> DEBUG: I1020 08:56:38.729135 28148 master.cpp:4973] Sending 1 offers to framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 (default) at scheduler-560af120-1924-4956-b664-3c4a67d0fe93@172.16.132.117:37586
> DEBUG: I1020 08:56:38.729395 28148 master.cpp:4421] Status update TASK_LOST (UUID: 30c316cf-fda2-4ab4-9c43-910e9416bab2) for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 from slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(256)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com)
> DEBUG: I1020 08:56:38.729423 28148 master.cpp:4460] Forwarding status update TASK_LOST (UUID: 30c316cf-fda2-4ab4-9c43-910e9416bab2) for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: E1020 08:56:38.729467 28148 master.cpp:6032] Ignoring status update for the terminated task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b (TASK_FINISHED -> TASK_LOST) of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.729768 28115 sched.cpp:1781] Asked to stop the driver
> DEBUG: I1020 08:56:38.729800 28115 master.cpp:925] Master terminating
> DEBUG: I1020 08:56:38.729825 28115 master.cpp:6154] Removing task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 on slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(256)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com)
> DEBUG: I1020 08:56:38.729920 28137 hierarchical.cpp:364] Removed slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0
> DEBUG: I1020 08:56:38.729949 28135 sched.cpp:1042] Stopping framework '0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000'
> DEBUG: I1020 08:56:38.730121 28137 hierarchical.cpp:220] Removed framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.730507 28135 slave.cpp:3258] master@172.16.132.117:37586 exited
> DEBUG: W1020 08:56:38.730526 28135 slave.cpp:3261] Master disconnected! Waiting for a new master to be elected
> DEBUG: [  FAILED  ] SlaveRecoveryTest/0.RecoverCompletedExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer (2288 ms)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)