Posted to issues@mesos.apache.org by "Alexander Rukletsov (JIRA)" <ji...@apache.org> on 2017/10/27 14:42:00 UTC

[jira] [Updated] (MESOS-976) SlaveRecoveryTest/1.SchedulerFailover is flaky

     [ https://issues.apache.org/jira/browse/MESOS-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Rukletsov updated MESOS-976:
--------------------------------------
    Labels: flaky flaky-test  (was: flaky)

> SlaveRecoveryTest/1.SchedulerFailover is flaky
> ----------------------------------------------
>
>                 Key: MESOS-976
>                 URL: https://issues.apache.org/jira/browse/MESOS-976
>             Project: Mesos
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.18.0
>            Reporter: Vinod Kone
>              Labels: flaky, flaky-test
>
> [==========] Running 1 test from 1 test case.
> [----------] Global test environment set-up.
> [----------] 1 test from SlaveRecoveryTest/1, where TypeParam = mesos::internal::slave::CgroupsIsolator
> [ RUN      ] SlaveRecoveryTest/1.SchedulerFailover
> I0206 20:18:31.525116 56447 master.cpp:239] Master ID: 2014-02-06-20:18:31-1740121354-55566-56447 Hostname: smfd-bkq-03-sr4.devel.twitter.com
> I0206 20:18:31.525295 56481 master.cpp:321] Master started on 10.37.184.103:55566
> I0206 20:18:31.525315 56481 master.cpp:324] Master only allowing authenticated frameworks to register!
> I0206 20:18:31.527093 56481 master.cpp:756] The newly elected leader is master@10.37.184.103:55566
> I0206 20:18:31.527122 56481 master.cpp:764] Elected as the leading master!
> I0206 20:18:31.530642 56473 slave.cpp:112] Slave started on 9)@10.37.184.103:55566
> I0206 20:18:31.530802 56473 slave.cpp:212] Slave resources: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> I0206 20:18:31.531203 56473 slave.cpp:240] Slave hostname: smfd-bkq-03-sr4.devel.twitter.com
> I0206 20:18:31.531221 56473 slave.cpp:241] Slave checkpoint: true
> I0206 20:18:31.531991 56482 cgroups_isolator.cpp:225] Using /tmp/mesos_test_cgroup as cgroups hierarchy root
> I0206 20:18:31.532470 56478 state.cpp:33] Recovering state from '/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta'
> I0206 20:18:31.532698 56469 status_update_manager.cpp:188] Recovering status update manager
> I0206 20:18:31.533962 56472 sched.cpp:265] Authenticating with master master@10.37.184.103:55566
> I0206 20:18:31.534102 56472 sched.cpp:234] Detecting new master
> I0206 20:18:31.534124 56484 authenticatee.hpp:124] Creating new client SASL connection
> I0206 20:18:31.534299 56473 master.cpp:2317] Authenticating framework at scheduler(9)@10.37.184.103:55566
> I0206 20:18:31.534459 56461 authenticator.hpp:140] Creating new server SASL connection
> I0206 20:18:31.534572 56466 authenticatee.hpp:212] Received SASL authentication mechanisms: CRAM-MD5
> I0206 20:18:31.534595 56466 authenticatee.hpp:238] Attempting to authenticate with mechanism 'CRAM-MD5'
> I0206 20:18:31.534667 56474 authenticator.hpp:243] Received SASL authentication start
> I0206 20:18:31.534732 56474 authenticator.hpp:325] Authentication requires more steps
> I0206 20:18:31.534814 56468 authenticatee.hpp:258] Received SASL authentication step
> I0206 20:18:31.534946 56466 authenticator.hpp:271] Received SASL authentication step
> I0206 20:18:31.535007 56466 authenticator.hpp:317] Authentication success
> I0206 20:18:31.535084 56471 authenticatee.hpp:298] Authentication success
> I0206 20:18:31.535107 56461 master.cpp:2357] Successfully authenticated framework at scheduler(9)@10.37.184.103:55566
> I0206 20:18:31.535392 56476 sched.cpp:339] Successfully authenticated with master master@10.37.184.103:55566
> I0206 20:18:31.535512 56465 master.cpp:812] Received registration request from scheduler(9)@10.37.184.103:55566
> I0206 20:18:31.535570 56465 master.cpp:830] Registering framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 at scheduler(9)@10.37.184.103:55566
> I0206 20:18:31.535856 56465 hierarchical_allocator_process.hpp:332] Added framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:31.537802 56482 cgroups_isolator.cpp:840] Recovering isolator
> I0206 20:18:31.538462 56472 slave.cpp:2760] Finished recovery
> I0206 20:18:31.538910 56472 slave.cpp:508] New master detected at master@10.37.184.103:55566
> I0206 20:18:31.539036 56478 status_update_manager.cpp:162] New master detected at master@10.37.184.103:55566
> I0206 20:18:31.539223 56464 master.cpp:1834] Attempting to register slave on smfd-bkq-03-sr4.devel.twitter.com at slave(9)@10.37.184.103:55566
> I0206 20:18:31.539271 56472 slave.cpp:533] Detecting new master
> I0206 20:18:31.539330 56464 master.cpp:2804] Adding slave 2014-02-06-20:18:31-1740121354-55566-56447-0 at smfd-bkq-03-sr4.devel.twitter.com with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> I0206 20:18:31.539454 56472 slave.cpp:551] Registered with master master@10.37.184.103:55566; given slave ID 2014-02-06-20:18:31-1740121354-55566-56447-0
> I0206 20:18:31.539620 56472 slave.cpp:564] Checkpointing SlaveInfo to '/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/slave.info'
> I0206 20:18:31.539834 56475 hierarchical_allocator_process.hpp:445] Added slave 2014-02-06-20:18:31-1740121354-55566-56447-0 (smfd-bkq-03-sr4.devel.twitter.com) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (and cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] available)
> I0206 20:18:31.540341 56472 master.cpp:2272] Sending 1 offers to framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:31.543433 56472 master.cpp:1568] Processing reply for offers: [ 2014-02-06-20:18:31-1740121354-55566-56447-0 ] on slave 2014-02-06-20:18:31-1740121354-55566-56447-0 (smfd-bkq-03-sr4.devel.twitter.com) for framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:31.543642 56472 master.hpp:411] Adding task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 2014-02-06-20:18:31-1740121354-55566-56447-0 (smfd-bkq-03-sr4.devel.twitter.com)
> I0206 20:18:31.543781 56472 master.cpp:2441] Launching task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 2014-02-06-20:18:31-1740121354-55566-56447-0 (smfd-bkq-03-sr4.devel.twitter.com)
> I0206 20:18:31.544002 56484 slave.cpp:736] Got assigned task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 for framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:31.544097 56484 slave.cpp:2899] Checkpointing FrameworkInfo to '/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/framework.info'
> I0206 20:18:31.544272 56484 slave.cpp:2906] Checkpointing framework pid 'scheduler(9)@10.37.184.103:55566' to '/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/framework.pid'
> I0206 20:18:31.544617 56484 slave.cpp:845] Launching task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 for framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:31.546721 56484 slave.cpp:3169] Checkpointing ExecutorInfo to '/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/executors/d045a0bd-2ed2-410a-bd1f-5bd9219896e3/executor.info'
> I0206 20:18:31.547317 56484 slave.cpp:3257] Checkpointing TaskInfo to '/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/executors/d045a0bd-2ed2-410a-bd1f-5bd9219896e3/runs/9adabe16-5d84-45c9-bc83-1a72a6d1c986/tasks/d045a0bd-2ed2-410a-bd1f-5bd9219896e3/task.info'
> I0206 20:18:31.547514 56484 slave.cpp:955] Queuing task 'd045a0bd-2ed2-410a-bd1f-5bd9219896e3' for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework '2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:31.547590 56481 cgroups_isolator.cpp:517] Launching d045a0bd-2ed2-410a-bd1f-5bd9219896e3 (/home/vinod/mesos/build/src/mesos-executor) in /tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/executors/d045a0bd-2ed2-410a-bd1f-5bd9219896e3/runs/9adabe16-5d84-45c9-bc83-1a72a6d1c986 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] for framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 in cgroup mesos_test/framework_2014-02-06-20:18:31-1740121354-55566-56447-0000_executor_d045a0bd-2ed2-410a-bd1f-5bd9219896e3_tag_9adabe16-5d84-45c9-bc83-1a72a6d1c986
> I0206 20:18:31.548408 56481 cgroups_isolator.cpp:717] Changing cgroup controls for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> I0206 20:18:31.548833 56481 cgroups_isolator.cpp:1007] Updated 'cpu.shares' to 2048 for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:31.549294 56481 cgroups_isolator.cpp:1117] Updated 'memory.soft_limit_in_bytes' to 1GB for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:31.550107 56481 cgroups_isolator.cpp:1147] Updated 'memory.limit_in_bytes' to 1GB for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:31.550571 56481 cgroups_isolator.cpp:1174] Started listening for OOM events for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:31.551553 56481 cgroups_isolator.cpp:569] Forked executor at = 56671
> Checkpointing executor's forked pid 56671 to '/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/executors/d045a0bd-2ed2-410a-bd1f-5bd9219896e3/runs/9adabe16-5d84-45c9-bc83-1a72a6d1c986/pids/forked.pid'
> I0206 20:18:31.552222 56472 slave.cpp:2098] Monitoring executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 forked at pid 56671
> Fetching resources into '/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/executors/d045a0bd-2ed2-410a-bd1f-5bd9219896e3/runs/9adabe16-5d84-45c9-bc83-1a72a6d1c986'
> I0206 20:18:31.604012 56472 slave.cpp:1431] Got registration for executor 'd045a0bd-2ed2-410a-bd1f-5bd9219896e3' of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:31.604167 56472 slave.cpp:1516] Checkpointing executor pid 'executor(1)@10.37.184.103:46181' to '/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/executors/d045a0bd-2ed2-410a-bd1f-5bd9219896e3/runs/9adabe16-5d84-45c9-bc83-1a72a6d1c986/pids/libprocess.pid'
> I0206 20:18:31.605183 56472 slave.cpp:1552] Flushing queued task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 for executor 'd045a0bd-2ed2-410a-bd1f-5bd9219896e3' of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> Registered executor on smfd-bkq-03-sr4.devel.twitter.com
> Starting task d045a0bd-2ed2-410a-bd1f-5bd9219896e3
> sh -c 'sleep 1000'
> Forked command at 56712
> I0206 20:18:31.613098 56481 slave.cpp:1765] Handling status update TASK_RUNNING (UUID: fc151a46-751b-4c4b-b048-1727752f34e3) for task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 from executor(1)@10.37.184.103:46181
> I0206 20:18:31.613628 56469 status_update_manager.cpp:314] Received status update TASK_RUNNING (UUID: fc151a46-751b-4c4b-b048-1727752f34e3) for task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:31.614006 56469 status_update_manager.hpp:342] Checkpointing UPDATE for status update TASK_RUNNING (UUID: fc151a46-751b-4c4b-b048-1727752f34e3) for task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:31.795529 56469 status_update_manager.cpp:367] Forwarding status update TASK_RUNNING (UUID: fc151a46-751b-4c4b-b048-1727752f34e3) for task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 to master@10.37.184.103:55566
> I0206 20:18:31.795992 56480 slave.cpp:1890] Sending acknowledgement for status update TASK_RUNNING (UUID: fc151a46-751b-4c4b-b048-1727752f34e3) for task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 to executor(1)@10.37.184.103:46181
> I0206 20:18:31.796131 56471 master.cpp:2020] Status update TASK_RUNNING (UUID: fc151a46-751b-4c4b-b048-1727752f34e3) for task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 from slave(9)@10.37.184.103:55566
> I0206 20:18:31.797099 56483 status_update_manager.cpp:392] Received status update acknowledgement (UUID: fc151a46-751b-4c4b-b048-1727752f34e3) for task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:31.797165 56483 status_update_manager.hpp:342] Checkpointing ACK for status update TASK_RUNNING (UUID: fc151a46-751b-4c4b-b048-1727752f34e3) for task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:31.882767 56481 slave.cpp:394] Slave terminating
> I0206 20:18:31.883112 56481 master.cpp:641] Slave 2014-02-06-20:18:31-1740121354-55566-56447-0 (smfd-bkq-03-sr4.devel.twitter.com) disconnected
> I0206 20:18:31.883200 56476 hierarchical_allocator_process.hpp:484] Slave 2014-02-06-20:18:31-1740121354-55566-56447-0 disconnected
> I0206 20:18:31.888206 56473 sched.cpp:265] Authenticating with master master@10.37.184.103:55566
> I0206 20:18:31.888473 56473 sched.cpp:234] Detecting new master
> I0206 20:18:31.888556 56469 authenticatee.hpp:124] Creating new client SASL connection
> I0206 20:18:31.888978 56484 master.cpp:2317] Authenticating framework at scheduler(10)@10.37.184.103:55566
> I0206 20:18:31.889348 56469 authenticator.hpp:140] Creating new server SASL connection
> I0206 20:18:31.889925 56469 authenticatee.hpp:212] Received SASL authentication mechanisms: CRAM-MD5
> I0206 20:18:31.889989 56469 authenticatee.hpp:238] Attempting to authenticate with mechanism 'CRAM-MD5'
> I0206 20:18:31.890059 56469 authenticator.hpp:243] Received SASL authentication start
> I0206 20:18:31.890233 56469 authenticator.hpp:325] Authentication requires more steps
> I0206 20:18:31.890399 56468 authenticatee.hpp:258] Received SASL authentication step
> I0206 20:18:31.890554 56484 authenticator.hpp:271] Received SASL authentication step
> I0206 20:18:31.890630 56484 authenticator.hpp:317] Authentication success
> I0206 20:18:31.890728 56470 authenticatee.hpp:298] Authentication success
> I0206 20:18:31.890748 56484 master.cpp:2357] Successfully authenticated framework at scheduler(10)@10.37.184.103:55566
> I0206 20:18:31.892210 56469 sched.cpp:339] Successfully authenticated with master master@10.37.184.103:55566
> I0206 20:18:31.892410 56473 master.cpp:900] Re-registering framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 at scheduler(10)@10.37.184.103:55566
> I0206 20:18:31.892460 56473 master.cpp:926] Framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 failed over
> W0206 20:18:31.892691 56465 master.cpp:1048] Ignoring deactivate framework message for framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 from 'scheduler(9)@10.37.184.103:55566' because it is not from the registered framework 'scheduler(10)@10.37.184.103:55566'
> I0206 20:18:31.897049 56466 slave.cpp:112] Slave started on 10)@10.37.184.103:55566
> I0206 20:18:31.897207 56466 slave.cpp:212] Slave resources: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> I0206 20:18:31.897536 56466 slave.cpp:240] Slave hostname: smfd-bkq-03-sr4.devel.twitter.com
> I0206 20:18:31.897554 56466 slave.cpp:241] Slave checkpoint: true
> I0206 20:18:31.898388 56463 cgroups_isolator.cpp:225] Using /tmp/mesos_test_cgroup as cgroups hierarchy root
> I0206 20:18:31.898936 56472 state.cpp:33] Recovering state from '/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta'
> I0206 20:18:31.901702 56465 slave.cpp:2828] Recovering framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:31.901759 56465 slave.cpp:3020] Recovering executor 'd045a0bd-2ed2-410a-bd1f-5bd9219896e3' of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:31.902716 56464 status_update_manager.cpp:188] Recovering status update manager
> I0206 20:18:31.902884 56464 status_update_manager.cpp:196] Recovering executor 'd045a0bd-2ed2-410a-bd1f-5bd9219896e3' of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:34.475915 56463 cgroups_isolator.cpp:840] Recovering isolator
> I0206 20:18:34.476066 56463 cgroups_isolator.cpp:847] Recovering executor 'd045a0bd-2ed2-410a-bd1f-5bd9219896e3' of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:34.477478 56463 cgroups_isolator.cpp:1174] Started listening for OOM events for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:34.478728 56463 slave.cpp:2700] Sending reconnect request to executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 at executor(1)@10.37.184.103:46181
> I0206 20:18:34.480114 56476 slave.cpp:1597] Re-registering executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:34.480566 56476 cgroups_isolator.cpp:717] Changing cgroup controls for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> I0206 20:18:34.481370 56476 cgroups_isolator.cpp:1007] Updated 'cpu.shares' to 2048 for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:34.481827 56476 cgroups_isolator.cpp:1117] Updated 'memory.soft_limit_in_bytes' to 1GB for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> Re-registered executor on smfd-bkq-03-sr4.devel.twitter.com
> I0206 20:18:34.489497 56471 slave.cpp:1713] Cleaning up un-reregistered executors
> I0206 20:18:34.489588 56471 slave.cpp:2760] Finished recovery
> I0206 20:18:34.490048 56463 slave.cpp:508] New master detected at master@10.37.184.103:55566
> I0206 20:18:34.490257 56475 status_update_manager.cpp:162] New master detected at master@10.37.184.103:55566
> I0206 20:18:34.490357 56463 slave.cpp:533] Detecting new master
> W0206 20:18:34.490603 56480 master.cpp:1878] Slave at slave(10)@10.37.184.103:55566 (smfd-bkq-03-sr4.devel.twitter.com) is being allowed to re-register with an already in use id (2014-02-06-20:18:31-1740121354-55566-56447-0)
> I0206 20:18:34.490927 56479 slave.cpp:601] Re-registered with master master@10.37.184.103:55566
> I0206 20:18:34.491322 56461 hierarchical_allocator_process.hpp:498] Slave 2014-02-06-20:18:31-1740121354-55566-56447-0 reconnected
> I0206 20:18:34.491421 56468 slave.cpp:1312] Updating framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 pid to scheduler(10)@10.37.184.103:55566
> I0206 20:18:34.491444 56480 master.cpp:1673] Asked to kill task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:34.491488 56468 slave.cpp:1320] Checkpointing framework pid 'scheduler(10)@10.37.184.103:55566' to '/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/framework.pid'
> I0206 20:18:34.491497 56480 master.cpp:1707] Telling slave 2014-02-06-20:18:31-1740121354-55566-56447-0 (smfd-bkq-03-sr4.devel.twitter.com) to kill task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:34.491657 56468 slave.cpp:1013] Asked to kill task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> Shutting down
> Killing process tree at pid 56712
> Killed the following process trees:
> [ 
> --- 56712 sleep 1000 
> ]
> Command terminated with signal Killed (pid: 56712)
> I0206 20:18:34.615216 56463 slave.cpp:1765] Handling status update TASK_KILLED (UUID: d9d37827-3002-4a67-8659-fa36f1986fc7) for task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 from executor(1)@10.37.184.103:46181
> I0206 20:18:34.615556 56483 cgroups_isolator.cpp:717] Changing cgroup controls for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 with resources 
> I0206 20:18:34.615624 56476 status_update_manager.cpp:314] Received status update TASK_KILLED (UUID: d9d37827-3002-4a67-8659-fa36f1986fc7) for task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:34.615701 56476 status_update_manager.hpp:342] Checkpointing UPDATE for status update TASK_KILLED (UUID: d9d37827-3002-4a67-8659-fa36f1986fc7) for task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:34.706945 56476 status_update_manager.cpp:367] Forwarding status update TASK_KILLED (UUID: d9d37827-3002-4a67-8659-fa36f1986fc7) for task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 to master@10.37.184.103:55566
> I0206 20:18:34.707263 56476 slave.cpp:1890] Sending acknowledgement for status update TASK_KILLED (UUID: d9d37827-3002-4a67-8659-fa36f1986fc7) for task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 to executor(1)@10.37.184.103:46181
> I0206 20:18:34.707352 56469 master.cpp:2020] Status update TASK_KILLED (UUID: d9d37827-3002-4a67-8659-fa36f1986fc7) for task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 from slave(10)@10.37.184.103:55566
> I0206 20:18:34.707620 56469 master.hpp:429] Removing task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 2014-02-06-20:18:31-1740121354-55566-56447-0 (smfd-bkq-03-sr4.devel.twitter.com)
> I0206 20:18:34.708348 56466 hierarchical_allocator_process.hpp:637] Recovered cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (total allocatable: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]) on slave 2014-02-06-20:18:31-1740121354-55566-56447-0 from framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:34.708673 56469 status_update_manager.cpp:392] Received status update acknowledgement (UUID: d9d37827-3002-4a67-8659-fa36f1986fc7) for task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:34.708749 56469 status_update_manager.hpp:342] Checkpointing ACK for status update TASK_KILLED (UUID: d9d37827-3002-4a67-8659-fa36f1986fc7) for task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:34.709411 56470 master.cpp:2272] Sending 1 offers to framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
> I0206 20:18:34.809782 56447 master.cpp:583] Master terminating
> I0206 20:18:34.810066 56447 master.cpp:246] Shutting down master
> I0206 20:18:34.810134 56482 slave.cpp:1965] master@10.37.184.103:55566 exited
> W0206 20:18:34.810184 56482 slave.cpp:1968] Master disconnected! Waiting for a new master to be elected
> I0206 20:18:34.810652 56447 master.cpp:289] Removing slave 2014-02-06-20:18:31-1740121354-55566-56447-0 (smfd-bkq-03-sr4.devel.twitter.com)
> I0206 20:18:34.813144 56447 slave.cpp:394] Slave terminating
> I0206 20:18:34.821583 56467 cgroups.cpp:1209] Trying to freeze cgroup /tmp/mesos_test_cgroup/mesos_test
> I0206 20:18:34.821652 56467 cgroups.cpp:1248] Successfully froze cgroup /tmp/mesos_test_cgroup/mesos_test after 1 attempts
> I0206 20:18:34.823129 56471 cgroups.cpp:1224] Trying to thaw cgroup /tmp/mesos_test_cgroup/mesos_test
> I0206 20:18:34.823247 56471 cgroups.cpp:1334] Successfully thawed /tmp/mesos_test_cgroup/mesos_test
> I0206 20:18:34.923945 56470 cgroups.cpp:1209] Trying to freeze cgroup /tmp/mesos_test_cgroup/mesos_test/framework_2014-02-06-20:18:31-1740121354-55566-56447-0000_executor_d045a0bd-2ed2-410a-bd1f-5bd9219896e3_tag_9adabe16-5d84-45c9-bc83-1a72a6d1c986
> I0206 20:18:34.924018 56470 cgroups.cpp:1248] Successfully froze cgroup /tmp/mesos_test_cgroup/mesos_test/framework_2014-02-06-20:18:31-1740121354-55566-56447-0000_executor_d045a0bd-2ed2-410a-bd1f-5bd9219896e3_tag_9adabe16-5d84-45c9-bc83-1a72a6d1c986 after 1 attempts
> I0206 20:18:34.925506 56461 cgroups.cpp:1224] Trying to thaw cgroup /tmp/mesos_test_cgroup/mesos_test/framework_2014-02-06-20:18:31-1740121354-55566-56447-0000_executor_d045a0bd-2ed2-410a-bd1f-5bd9219896e3_tag_9adabe16-5d84-45c9-bc83-1a72a6d1c986
> I0206 20:18:34.925580 56461 cgroups.cpp:1334] Successfully thawed /tmp/mesos_test_cgroup/mesos_test/framework_2014-02-06-20:18:31-1740121354-55566-56447-0000_executor_d045a0bd-2ed2-410a-bd1f-5bd9219896e3_tag_9adabe16-5d84-45c9-bc83-1a72a6d1c986
> [       OK ] SlaveRecoveryTest/1.SchedulerFailover (3408 ms)
> [----------] 1 test from SlaveRecoveryTest/1 (3409 ms total)
> [----------] Global test environment tear-down
> ../../src/tests/environment.cpp:247: Failure
> Failed
> Tests completed with child processes remaining:
> -+- 56447 /home/vinod/mesos/build/src/.libs/lt-mesos-tests --verbose --gtest_filter=*SlaveRecoveryTest/1.SchedulerFailover* --gtest_repeat=10 
>  \--- 56671 ()


