You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mesos.apache.org by "Benjamin Mahler (JIRA)" <ji...@apache.org> on 2013/11/19 02:06:18 UTC

[jira] [Commented] (MESOS-821) SlaveRecoveryTest ReconcileKillTask failed on CI.

    [ https://issues.apache.org/jira/browse/MESOS-821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826031#comment-13826031 ] 

Benjamin Mahler commented on MESOS-821:
---------------------------------------

See: https://reviews.apache.org/r/15662/

> SlaveRecoveryTest ReconcileKillTask failed on CI.
> -------------------------------------------------
>
>                 Key: MESOS-821
>                 URL: https://issues.apache.org/jira/browse/MESOS-821
>             Project: Mesos
>          Issue Type: Bug
>          Components: test
>            Reporter: Benjamin Mahler
>            Assignee: Benjamin Mahler
>
> [ RUN      ] SlaveRecoveryTest/0.ReconcileKillTask
> I1118 12:54:49.883435 27148 master.cpp:293] Master started on 127.0.1.1:45888
> I1118 12:54:49.883473 27148 master.cpp:308] Master ID: 201311181254-16842879-45888-27125
> I1118 12:54:49.883481 27148 master.cpp:311] Master only allowing authenticated frameworks to register!
> I1118 12:54:49.884964 27148 master.cpp:706] Elected as master!
> I1118 12:54:49.885890 27148 hierarchical_allocator_process.hpp:302] Initializing hierarchical allocator process with master : master@127.0.1.1:45888
> I1118 12:54:49.885939 27148 master.cpp:85] No whitelist given. Advertising offers for all slaves
> I1118 12:54:49.891530 27148 slave.cpp:109] Slave started on 108)@127.0.1.1:45888
> I1118 12:54:49.891607 27148 slave.cpp:209] Slave resources: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> I1118 12:54:49.891989 27148 slave.cpp:488] New master detected at master@127.0.1.1:45888
> I1118 12:54:49.892437 27145 state.cpp:33] Recovering state from '/tmp/SlaveRecoveryTest_0_ReconcileKillTask_iRbzm0/meta'
> I1118 12:54:49.893154 27148 slave.cpp:503] Postponing registration until recovery is complete
> I1118 12:54:49.893230 27148 status_update_manager.cpp:158] New master detected at master@127.0.1.1:45888
> I1118 12:54:49.893256 27148 status_update_manager.cpp:180] Recovering status update manager
> I1118 12:54:49.894341 27148 process_isolator.cpp:317] Recovering isolator
> I1118 12:54:49.895253 27145 slave.cpp:2748] Finished recovery
> I1118 12:54:49.895390 27145 master.cpp:1232] Attempting to register slave on raring at slave(108)@127.0.1.1:45888
> I1118 12:54:49.895405 27145 master.cpp:2479] Adding slave 201311181254-16842879-45888-27125-0 at raring with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> I1118 12:54:49.895503 27145 slave.cpp:554] Registered with master master@127.0.1.1:45888; given slave ID 201311181254-16842879-45888-27125-0
> I1118 12:54:49.895649 27145 slave.cpp:567] Checkpointing SlaveInfo to '/tmp/SlaveRecoveryTest_0_ReconcileKillTask_iRbzm0/meta/slaves/201311181254-16842879-45888-27125-0/slave.info'
> I1118 12:54:49.896234 27146 hierarchical_allocator_process.hpp:445] Added slave 201311181254-16842879-45888-27125-0 (raring) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (and cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] available)
> I1118 12:54:49.896718 27146 hierarchical_allocator_process.hpp:708] Performed allocation for slave 201311181254-16842879-45888-27125-0 in 8938ns
> I1118 12:54:49.897578 27145 sched.cpp:199] New master at master@127.0.1.1:45888
> I1118 12:54:49.897599 27145 sched.cpp:285] Authenticating with master master@127.0.1.1:45888
> I1118 12:54:49.897686 27145 authenticatee.hpp:124] Creating new client SASL connection
> I1118 12:54:49.897799 27145 master.cpp:1700] Authenticating framework at scheduler(110)@127.0.1.1:45888
> I1118 12:54:49.897891 27145 authenticator.hpp:140] Creating new server SASL connection
> I1118 12:54:49.897989 27145 authenticatee.hpp:212] Received SASL authentication mechanisms: CRAM-MD5
> I1118 12:54:49.898006 27145 authenticatee.hpp:238] Attempting to authenticate with mechanism 'CRAM-MD5'
> I1118 12:54:49.898027 27145 authenticator.hpp:243] Received SASL authentication start
> I1118 12:54:49.898061 27145 authenticator.hpp:325] Authentication requires more steps
> I1118 12:54:49.898084 27145 authenticatee.hpp:258] Received SASL authentication step
> I1118 12:54:49.898116 27145 authenticator.hpp:271] Received SASL authentication step
> I1118 12:54:49.898131 27145 auxprop.cpp:81] Request to lookup properties for user: 'test-principal' realm: 'raring' server FQDN: 'raring' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false 
> I1118 12:54:49.898138 27145 auxprop.cpp:153] Looking up auxiliary property '*userPassword'
> I1118 12:54:49.898149 27145 auxprop.cpp:153] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
> I1118 12:54:49.898157 27145 auxprop.cpp:81] Request to lookup properties for user: 'test-principal' realm: 'raring' server FQDN: 'raring' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true 
> I1118 12:54:49.898162 27145 auxprop.cpp:103] Skipping auxiliary property '*userPassword' since SASL_AUXPROP_AUTHZID == true
> I1118 12:54:49.898167 27145 auxprop.cpp:103] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
> I1118 12:54:49.898177 27145 authenticator.hpp:317] Authentication success
> I1118 12:54:49.898201 27145 authenticatee.hpp:298] Authentication success
> I1118 12:54:49.898221 27145 master.cpp:1740] Successfully authenticated framework at scheduler(110)@127.0.1.1:45888
> I1118 12:54:49.898289 27145 sched.cpp:347] Successfully authenticated with master master@127.0.1.1:45888
> I1118 12:54:49.898334 27145 master.cpp:764] Received registration request from scheduler(110)@127.0.1.1:45888
> I1118 12:54:49.898434 27145 master.cpp:782] Registering framework 201311181254-16842879-45888-27125-0000 at scheduler(110)@127.0.1.1:45888
> I1118 12:54:49.901803 27147 sched.cpp:386] Framework registered with 201311181254-16842879-45888-27125-0000
> I1118 12:54:49.901830 27147 sched.cpp:400] Scheduler::registered took 8693ns
> I1118 12:54:49.902267 27145 hierarchical_allocator_process.hpp:332] Added framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:49.903087 27145 hierarchical_allocator_process.hpp:752] Offering cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 201311181254-16842879-45888-27125-0 to framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:49.904006 27146 master.hpp:419] Adding offer 201311181254-16842879-45888-27125-0 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 201311181254-16842879-45888-27125-0 (raring)
> I1118 12:54:49.904053 27146 master.cpp:1666] Sending 1 offers to framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:49.904153 27146 sched.cpp:493] Scheduler::resourceOffers took 17813ns
> I1118 12:54:49.907963 27147 master.cpp:1992] Processing reply for offer 201311181254-16842879-45888-27125-0 on slave 201311181254-16842879-45888-27125-0 (raring) for framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:49.908049 27147 master.hpp:391] Adding task ee700e0f-7b4f-4df2-864a-05ddaae3d269 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 201311181254-16842879-45888-27125-0 (raring)
> I1118 12:54:49.908092 27147 master.cpp:2116] Launching task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 201311181254-16842879-45888-27125-0 (raring)
> I1118 12:54:49.908182 27147 master.hpp:429] Removing offer 201311181254-16842879-45888-27125-0 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 201311181254-16842879-45888-27125-0 (raring)
> I1118 12:54:49.908253 27147 slave.cpp:728] Got assigned task ee700e0f-7b4f-4df2-864a-05ddaae3d269 for framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:49.908319 27147 slave.cpp:2876] Checkpointing FrameworkInfo to '/tmp/SlaveRecoveryTest_0_ReconcileKillTask_iRbzm0/meta/slaves/201311181254-16842879-45888-27125-0/frameworks/201311181254-16842879-45888-27125-0000/framework.info'
> I1118 12:54:49.908447 27147 slave.cpp:2883] Checkpointing framework pid 'scheduler(110)@127.0.1.1:45888' to '/tmp/SlaveRecoveryTest_0_ReconcileKillTask_iRbzm0/meta/slaves/201311181254-16842879-45888-27125-0/frameworks/201311181254-16842879-45888-27125-0000/framework.pid'
> I1118 12:54:49.908632 27147 slave.cpp:839] Launching task ee700e0f-7b4f-4df2-864a-05ddaae3d269 for framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:49.909664 27147 slave.cpp:3140] Checkpointing ExecutorInfo to '/tmp/SlaveRecoveryTest_0_ReconcileKillTask_iRbzm0/meta/slaves/201311181254-16842879-45888-27125-0/frameworks/201311181254-16842879-45888-27125-0000/executors/ee700e0f-7b4f-4df2-864a-05ddaae3d269/executor.info'
> I1118 12:54:49.909983 27147 slave.cpp:3229] Checkpointing TaskInfo to '/tmp/SlaveRecoveryTest_0_ReconcileKillTask_iRbzm0/meta/slaves/201311181254-16842879-45888-27125-0/frameworks/201311181254-16842879-45888-27125-0000/executors/ee700e0f-7b4f-4df2-864a-05ddaae3d269/runs/b093d99b-f37e-4254-86dd-c59820c49fde/tasks/ee700e0f-7b4f-4df2-864a-05ddaae3d269/task.info'
> I1118 12:54:49.910125 27147 slave.cpp:949] Queuing task 'ee700e0f-7b4f-4df2-864a-05ddaae3d269' for executor ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework '201311181254-16842879-45888-27125-0000
> I1118 12:54:49.910222 27147 process_isolator.cpp:100] Launching ee700e0f-7b4f-4df2-864a-05ddaae3d269 (/var/jenkins/workspace/mesos-ubuntu-13.04-gcc/src/mesos-executor) in /tmp/SlaveRecoveryTest_0_ReconcileKillTask_iRbzm0/slaves/201311181254-16842879-45888-27125-0/frameworks/201311181254-16842879-45888-27125-0000/executors/ee700e0f-7b4f-4df2-864a-05ddaae3d269/runs/b093d99b-f37e-4254-86dd-c59820c49fde with resources ' for framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:49.916390 27145 hierarchical_allocator_process.hpp:688] Performed allocation for 1 slaves in 13.362762ms
> I1118 12:54:49.917153 27148 slave.cpp:470] Successfully attached file '/tmp/SlaveRecoveryTest_0_ReconcileKillTask_iRbzm0/slaves/201311181254-16842879-45888-27125-0/frameworks/201311181254-16842879-45888-27125-0000/executors/ee700e0f-7b4f-4df2-864a-05ddaae3d269/runs/b093d99b-f37e-4254-86dd-c59820c49fde'
> I1118 12:54:49.921512 27147 process_isolator.cpp:163] Forked executor at 28531
> I1118 12:54:49.924103 27146 slave.cpp:2078] Monitoring executor ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000 forked at pid 28531
> Killed the following process trees:
> [ 
> -+- 28529 sh -c sleep 1000 
>  \--- 28530 sleep 1000 
> ]
> Checkpointing executor's forked pid 28531 to '/tmp/SlaveRecoveryTest_0_ReconcileKillTask_iRbzm0/meta/slaves/201311181254-16842879-45888-27125-0/frameworks/201311181254-16842879-45888-27125-0000/executors/ee700e0f-7b4f-4df2-864a-05ddaae3d269/runs/b093d99b-f37e-4254-86dd-c59820c49fde/pids/forked.pid'
> Fetching resources into '/tmp/SlaveRecoveryTest_0_ReconcileKillTask_iRbzm0/slaves/201311181254-16842879-45888-27125-0/frameworks/201311181254-16842879-45888-27125-0000/executors/ee700e0f-7b4f-4df2-864a-05ddaae3d269/runs/b093d99b-f37e-4254-86dd-c59820c49fde'
> WARNING: Logging before InitGoogleLogging() is written to STDERR
> I1118 12:54:50.001929 28549 process.cpp:1561] libprocess is initialized on 127.0.1.1:44149 for 4 cpus
> I1118 12:54:50.004824 28565 exec.cpp:178] Executor started at: executor(1)@127.0.1.1:44149 with pid 28549
> I1118 12:54:50.005602 27146 slave.cpp:1411] Got registration for executor 'ee700e0f-7b4f-4df2-864a-05ddaae3d269' of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:50.005656 27146 slave.cpp:1496] Checkpointing executor pid 'executor(1)@127.0.1.1:44149' to '/tmp/SlaveRecoveryTest_0_ReconcileKillTask_iRbzm0/meta/slaves/201311181254-16842879-45888-27125-0/frameworks/201311181254-16842879-45888-27125-0000/executors/ee700e0f-7b4f-4df2-864a-05ddaae3d269/runs/b093d99b-f37e-4254-86dd-c59820c49fde/pids/libprocess.pid'
> I1118 12:54:50.006034 27146 slave.cpp:1532] Flushing queued task ee700e0f-7b4f-4df2-864a-05ddaae3d269 for executor 'ee700e0f-7b4f-4df2-864a-05ddaae3d269' of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:50.006652 28567 process.cpp:992] Socket closed while receiving
> I1118 12:54:50.006996 28564 exec.cpp:202] Executor registered on slave 201311181254-16842879-45888-27125-0
> Registered executor on raring
> I1118 12:54:50.008285 28564 exec.cpp:214] Executor::registered took 298819ns
> I1118 12:54:50.008563 28564 exec.cpp:289] Executor asked to run task 'ee700e0f-7b4f-4df2-864a-05ddaae3d269'
> Starting task ee700e0f-7b4f-4df2-864a-05ddaae3d269
> I1118 12:54:50.009301 28564 exec.cpp:298] Executor::launchTask took 538163ns
> sh -c 'sleep 1000'
> Forked command at 28568
> I1118 12:54:50.012308 28565 exec.cpp:521] Executor sending status update TASK_RUNNING (UUID: 4969befa-4137-4bbf-beff-25f818bcf4f7) for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:50.012805 27146 slave.cpp:1745] Handling status update TASK_RUNNING (UUID: 4969befa-4137-4bbf-beff-25f818bcf4f7) for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000 from executor(1)@127.0.1.1:44149
> I1118 12:54:50.012923 27146 status_update_manager.cpp:305] Received status update TASK_RUNNING (UUID: 4969befa-4137-4bbf-beff-25f818bcf4f7) for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:50.012939 27146 status_update_manager.cpp:476] Creating StatusUpdate stream for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:50.013097 27146 status_update_manager.hpp:337] Checkpointing UPDATE for status update TASK_RUNNING (UUID: 4969befa-4137-4bbf-beff-25f818bcf4f7) for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:50.014873 27146 status_update_manager.cpp:356] Forwarding status update TASK_RUNNING (UUID: 4969befa-4137-4bbf-beff-25f818bcf4f7) for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000 to master@127.0.1.1:45888
> I1118 12:54:50.015094 27148 master.cpp:1418] Status update TASK_RUNNING (UUID: 4969befa-4137-4bbf-beff-25f818bcf4f7) for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000 from slave(108)@127.0.1.1:45888
> I1118 12:54:50.015175 27148 sched.cpp:548] Scheduler::statusUpdate took 13653ns
> I1118 12:54:50.015641 27145 slave.cpp:1864] Status update manager successfully handled status update TASK_RUNNING (UUID: 4969befa-4137-4bbf-beff-25f818bcf4f7) for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:50.015661 27145 slave.cpp:1870] Sending acknowledgement for status update TASK_RUNNING (UUID: 4969befa-4137-4bbf-beff-25f818bcf4f7) for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000 to executor(1)@127.0.1.1:44149
> I1118 12:54:50.015841 27146 status_update_manager.cpp:380] Received status update acknowledgement (UUID: 4969befa-4137-4bbf-beff-25f818bcf4f7) for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:50.015862 27146 status_update_manager.hpp:337] Checkpointing ACK for status update TASK_RUNNING (UUID: 4969befa-4137-4bbf-beff-25f818bcf4f7) for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:50.016950 28567 process.cpp:992] Socket closed while receiving
> I1118 12:54:50.017689 27145 slave.cpp:1351] Status update manager successfully handled status update acknowledgement (UUID: 4969befa-4137-4bbf-beff-25f818bcf4f7) for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:50.017819 27125 slave.cpp:395] Slave terminating
> I1118 12:54:50.017257 28566 exec.cpp:335] Executor received status update acknowledgement 4969befa-4137-4bbf-beff-25f818bcf4f7 for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:50.022469 27148 master.cpp:624] Slave 201311181254-16842879-45888-27125-0 (raring) disconnected
> I1118 12:54:50.022909 27148 hierarchical_allocator_process.hpp:484] Slave 201311181254-16842879-45888-27125-0 disconnected
> I1118 12:54:50.035120 27148 master.cpp:1097] Asked to kill task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:50.035800 27148 slave.cpp:109] Slave started on 109)@127.0.1.1:45888
> I1118 12:54:50.035884 27148 slave.cpp:209] Slave resources: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> I1118 12:54:50.036300 27148 slave.cpp:488] New master detected at master@127.0.1.1:45888
> I1118 12:54:50.036324 27148 slave.cpp:503] Postponing registration until recovery is complete
> I1118 12:54:50.036417 27148 status_update_manager.cpp:158] New master detected at master@127.0.1.1:45888
> I1118 12:54:50.036501 27148 state.cpp:33] Recovering state from '/tmp/SlaveRecoveryTest_0_ReconcileKillTask_iRbzm0/meta'
> I1118 12:54:50.038290 27148 slave.cpp:2810] Recovering framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:50.039201 27148 slave.cpp:2991] Recovering executor 'ee700e0f-7b4f-4df2-864a-05ddaae3d269' of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:50.040416 27148 status_update_manager.cpp:180] Recovering status update manager
> I1118 12:54:50.041308 27148 status_update_manager.cpp:188] Recovering executor 'ee700e0f-7b4f-4df2-864a-05ddaae3d269' of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:50.042100 27148 status_update_manager.cpp:476] Creating StatusUpdate stream for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:50.042945 27148 status_update_manager.hpp:301] Replaying status update stream for task ee700e0f-7b4f-4df2-864a-05ddaae3d269
> I1118 12:54:50.043856 27148 process_isolator.cpp:317] Recovering isolator
> I1118 12:54:50.044662 27148 process_isolator.cpp:325] Recovering executor 'ee700e0f-7b4f-4df2-864a-05ddaae3d269' of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:50.040827 27146 slave.cpp:470] Successfully attached file '/tmp/SlaveRecoveryTest_0_ReconcileKillTask_iRbzm0/slaves/201311181254-16842879-45888-27125-0/frameworks/201311181254-16842879-45888-27125-0000/executors/ee700e0f-7b4f-4df2-864a-05ddaae3d269/runs/b093d99b-f37e-4254-86dd-c59820c49fde'
> I1118 12:54:50.046097 27148 slave.cpp:2688] Sending reconnect request to executor ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000 at executor(1)@127.0.1.1:44149
> I1118 12:54:50.047281 28566 exec.cpp:248] Received reconnect request from slave 201311181254-16842879-45888-27125-0
> I1118 12:54:50.047894 28567 process.cpp:992] Socket closed while receiving
> I1118 12:54:50.048496 27148 slave.cpp:1577] Re-registering executor ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:50.049439 28566 exec.cpp:225] Executor re-registered on slave 201311181254-16842879-45888-27125-0
> I1118 12:54:50.050164 28566 exec.cpp:237] Executor::reregistered took 33741ns
> Re-registered executor on raring
> I1118 12:54:50.050742 28567 process.cpp:992] Socket closed while receiving
> Command terminated with signal Killed (pid: 28529)
> I1118 12:54:50.859169 28526 exec.cpp:521] Executor sending status update TASK_KILLED (UUID: 460a713b-acc7-41ab-a9d4-55b5ec179ac2) for task aeb755ff-249d-42d3-9771-6c8057d08ab3 of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:50.861075 27149 process.cpp:992] Socket closed while receiving
> I1118 12:54:50.886586 27147 hierarchical_allocator_process.hpp:688] Performed allocation for 1 slaves in 37068ns
> I1118 12:54:50.937424 27147 monitor.cpp:193] Publishing resource usage for executor 'ee700e0f-7b4f-4df2-864a-05ddaae3d269' of framework '201311181254-16842879-45888-27125-0000'
> I1118 12:54:51.059996 27145 monitor.cpp:193] Publishing resource usage for executor 'ee700e0f-7b4f-4df2-864a-05ddaae3d269' of framework '201311181254-16842879-45888-27125-0000'
> I1118 12:54:51.887696 27145 hierarchical_allocator_process.hpp:688] Performed allocation for 1 slaves in 39985ns
> I1118 12:54:51.950691 27148 monitor.cpp:193] Publishing resource usage for executor 'ee700e0f-7b4f-4df2-864a-05ddaae3d269' of framework '201311181254-16842879-45888-27125-0000'
> I1118 12:54:52.048954 27148 slave.cpp:1693] Cleaning up un-reregistered executors
> I1118 12:54:52.049003 27148 slave.cpp:2748] Finished recovery
> W1118 12:54:52.049218 27148 master.cpp:1276] Slave at slave(109)@127.0.1.1:45888 (raring) is being allowed to re-register with an already in use id (201311181254-16842879-45888-27125-0)
> W1118 12:54:52.049279 27148 master.cpp:2231]  Slave 201311181254-16842879-45888-27125-0 (raring) has non-terminal task ee700e0f-7b4f-4df2-864a-05ddaae3d269 that is supposed to be killed. Killing it now!
> I1118 12:54:52.049347 27148 slave.cpp:601] Re-registered with master master@127.0.1.1:45888
> I1118 12:54:52.049371 27148 slave.cpp:1000] Asked to kill task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:52.049571 27147 hierarchical_allocator_process.hpp:498] Slave 201311181254-16842879-45888-27125-0 reconnected
> I1118 12:54:52.050099 27148 slave.cpp:1297] Updating framework 201311181254-16842879-45888-27125-0000 pid to scheduler(110)@127.0.1.1:45888
> I1118 12:54:52.050173 27148 slave.cpp:1305] Checkpointing framework pid 'scheduler(110)@127.0.1.1:45888' to '/tmp/SlaveRecoveryTest_0_ReconcileKillTask_iRbzm0/meta/slaves/201311181254-16842879-45888-27125-0/frameworks/201311181254-16842879-45888-27125-0000/framework.pid'
> I1118 12:54:52.050499 28566 exec.cpp:309] Executor asked to kill task 'ee700e0f-7b4f-4df2-864a-05ddaae3d269'
> I1118 12:54:52.050529 28566 exec.cpp:318] Executor::killTask took 11476ns
> Shutting down
> Killing process tree at pid 28568
> I1118 12:54:52.055577 28567 process.cpp:992] Socket closed while receiving
> I1118 12:54:52.083104 27148 monitor.cpp:193] Publishing resource usage for executor 'ee700e0f-7b4f-4df2-864a-05ddaae3d269' of framework '201311181254-16842879-45888-27125-0000'
> Killed the following process trees:
> [ 
> -+- 28568 sh -c sleep 1000 
>  \--- 28569 sleep 1000 
> ]
> I1118 12:54:52.888473 27148 hierarchical_allocator_process.hpp:688] Performed allocation for 1 slaves in 39714ns
> I1118 12:54:52.963883 27146 monitor.cpp:193] Publishing resource usage for executor 'ee700e0f-7b4f-4df2-864a-05ddaae3d269' of framework '201311181254-16842879-45888-27125-0000'
> Command terminated with signal Killed (pid: 28568)
> I1118 12:54:53.005944 28563 exec.cpp:521] Executor sending status update TASK_KILLED (UUID: bcdd77aa-69dc-462b-9284-bf9852d6ac0f) for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:53.007668 27149 process.cpp:992] Socket closed while receiving
> I1118 12:54:53.038023 27146 process_isolator.cpp:482] Telling slave of terminated executor 'ee700e0f-7b4f-4df2-864a-05ddaae3d269' of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:53.039077 27147 slave.cpp:2134] Executor 'ee700e0f-7b4f-4df2-864a-05ddaae3d269' of framework 201311181254-16842879-45888-27125-0000 has exited with status 0
> I1118 12:54:53.039885 27147 slave.cpp:1745] Handling status update TASK_FAILED (UUID: 84c81011-4725-4c98-a9b7-651ddf6cf3ca) for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000 from @0.0.0.0:0
> I1118 12:54:53.039916 27147 slave.cpp:3185] Terminating task ee700e0f-7b4f-4df2-864a-05ddaae3d269
> I1118 12:54:53.040099 27147 status_update_manager.cpp:305] Received status update TASK_FAILED (UUID: 84c81011-4725-4c98-a9b7-651ddf6cf3ca) for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:53.040120 27147 status_update_manager.hpp:337] Checkpointing UPDATE for status update TASK_FAILED (UUID: 84c81011-4725-4c98-a9b7-651ddf6cf3ca) for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:53.042255 27147 status_update_manager.cpp:356] Forwarding status update TASK_FAILED (UUID: 84c81011-4725-4c98-a9b7-651ddf6cf3ca) for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000 to master@127.0.1.1:45888
> I1118 12:54:53.042903 27145 master.cpp:1418] Status update TASK_FAILED (UUID: 84c81011-4725-4c98-a9b7-651ddf6cf3ca) for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000 from slave(109)@127.0.1.1:45888
> I1118 12:54:53.042963 27145 master.hpp:409] Removing task ee700e0f-7b4f-4df2-864a-05ddaae3d269 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 201311181254-16842879-45888-27125-0 (raring)
> I1118 12:54:53.043138 27145 sched.cpp:548] Scheduler::statusUpdate took 17058ns
> I1118 12:54:53.043256 27145 hierarchical_allocator_process.hpp:637] Recovered cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (total allocatable: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]) on slave 201311181254-16842879-45888-27125-0 from framework 201311181254-16842879-45888-27125-0000
> tests/slave_recovery_tests.cpp:1769: Failure
> Value of: status.get().state()
>   Actual: TASK_FAILED
> Expected: TASK_KILLED
> I1118 12:54:53.044571 27147 status_update_manager.cpp:380] Received status update acknowledgement (UUID: 84c81011-4725-4c98-a9b7-651ddf6cf3ca) for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:53.044596 27147 status_update_manager.hpp:337] Checkpointing ACK for status update TASK_FAILED (UUID: 84c81011-4725-4c98-a9b7-651ddf6cf3ca) for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:53.045207 27145 slave.cpp:1864] Status update manager successfully handled status update TASK_FAILED (UUID: 84c81011-4725-4c98-a9b7-651ddf6cf3ca) for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:53.047080 27147 status_update_manager.cpp:507] Cleaning up status update stream for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:53.047162 27147 slave.cpp:1351] Status update manager successfully handled status update acknowledgement (UUID: 84c81011-4725-4c98-a9b7-651ddf6cf3ca) for task ee700e0f-7b4f-4df2-864a-05ddaae3d269 of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:53.047178 27147 slave.cpp:3208] Completing task ee700e0f-7b4f-4df2-864a-05ddaae3d269
> I1118 12:54:53.047200 27147 slave.cpp:2269] Cleaning up executor 'ee700e0f-7b4f-4df2-864a-05ddaae3d269' of framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:53.047395 27147 slave.cpp:2340] Cleaning up framework 201311181254-16842879-45888-27125-0000
> I1118 12:54:53.047477 27147 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_ReconcileKillTask_iRbzm0/slaves/201311181254-16842879-45888-27125-0/frameworks/201311181254-16842879-45888-27125-0000/executors/ee700e0f-7b4f-4df2-864a-05ddaae3d269/runs/b093d99b-f37e-4254-86dd-c59820c49fde' for gc 6.99999945260444days in the future
> I1118 12:54:53.047526 27147 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_ReconcileKillTask_iRbzm0/slaves/201311181254-16842879-45888-27125-0/frameworks/201311181254-16842879-45888-27125-0000/executors/ee700e0f-7b4f-4df2-864a-05ddaae3d269' for gc 6.99999945214222days in the future
> I1118 12:54:53.047555 27147 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_ReconcileKillTask_iRbzm0/meta/slaves/201311181254-16842879-45888-27125-0/frameworks/201311181254-16842879-45888-27125-0000/executors/ee700e0f-7b4f-4df2-864a-05ddaae3d269/runs/b093d99b-f37e-4254-86dd-c59820c49fde' for gc 6.99999945184days in the future
> I1118 12:54:53.047581 27147 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_ReconcileKillTask_iRbzm0/meta/slaves/201311181254-16842879-45888-27125-0/frameworks/201311181254-16842879-45888-27125-0000/executors/ee700e0f-7b4f-4df2-864a-05ddaae3d269' for gc 6.99999945159704days in the future
> I1118 12:54:53.047605 27147 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_ReconcileKillTask_iRbzm0/slaves/201311181254-16842879-45888-27125-0/frameworks/201311181254-16842879-45888-27125-0000' for gc 6.99999945107852days in the future
> I1118 12:54:53.047639 27147 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_ReconcileKillTask_iRbzm0/meta/slaves/201311181254-16842879-45888-27125-0/frameworks/201311181254-16842879-45888-27125-0000' for gc 6.99999945080889days in the future
> I1118 12:54:53.047667 27147 status_update_manager.cpp:267] Closing status update streams for framework 201311181254-16842879-45888-27125-0000
> W1118 12:54:53.057777 27146 process_isolator.cpp:268] Failed to kill the process tree rooted at pid 28531: Failed to find process 28531



--
This message was sent by Atlassian JIRA
(v6.1#6144)