Posted to issues@mesos.apache.org by "Vinod Kone (JIRA)" <ji...@apache.org> on 2017/01/19 17:16:26 UTC

[jira] [Commented] (MESOS-6952) Mesos task state was stuck in staging even after executor terminated

    [ https://issues.apache.org/jira/browse/MESOS-6952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15830280#comment-15830280 ] 

Vinod Kone commented on MESOS-6952:
-----------------------------------

Can you paste the corresponding master logs?

> Mesos task state was stuck in staging even after executor terminated
> --------------------------------------------------------------------
>
>                 Key: MESOS-6952
>                 URL: https://issues.apache.org/jira/browse/MESOS-6952
>             Project: Mesos
>          Issue Type: Bug
>          Components: executor
>    Affects Versions: 0.28.2
>         Environment: ubuntu 14.04
>            Reporter: Sathish Kumar
>
> The task was stuck in staging for almost 6 hours, even after the executor on the slave had terminated.
> The Mesos master kept the task in the staging state, so the framework never received a status update from mesos-master.
> The issue was resolved after restarting the slave.
> In the slave logs I can see the warning "Asked to run task '...' which is terminating/terminated":
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:17.089251 107759 status_update_manager.cpp:824] Checkpointing ACK for status update TASK_FAILED (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task ct:1484816820000:0:foocare_zendesk_round_robin: of framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:17.097193 107774 slave.cpp:1361] Got assigned task ct:1484816820000:0:foocare_zendesk_round_robin: for framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:17.097453 107774 slave.cpp:1480] Launching task ct:1484816820000:0:foocare_zendesk_round_robin: for framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:W0119 14:42:17.097527 107774 slave.cpp:1673] Asked to run task 'ct:1484816820000:0:foocare_zendesk_round_robin:' for framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001 with executor 'ct:1484816820000:0:foocare_zendesk_round_robin:' which is terminating/terminated
> Full slave log:
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:15.066277 107763 slave.cpp:3012] Handling status update TASK_FAILED (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task ct:1484816820000:0:foocare_zendesk_round_robin: of framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001 from executor(1)@10.14.38.239:43937
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:15.134692 107766 status_update_manager.cpp:320] Received status update TASK_FAILED (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task ct:1484816820000:0:foocare_zendesk_round_robin: of framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:15.134753 107766 status_update_manager.cpp:824] Checkpointing UPDATE for status update TASK_FAILED (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task ct:1484816820000:0:foocare_zendesk_round_robin: of framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:15.142010 107767 slave.cpp:3410] Forwarding the update TASK_FAILED (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task ct:1484816820000:0:foocare_zendesk_round_robin: of framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001 to master@10.14.23.181:5050
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:15.142119 107767 slave.cpp:3320] Sending acknowledgement for status update TASK_FAILED (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task ct:1484816820000:0:foocare_zendesk_round_robin: of framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001 to executor(1)@10.14.38.239:43937
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:15.226682 107761 status_update_manager.cpp:392] Received status update acknowledgement (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task ct:1484816820000:0:foocare_zendesk_round_robin: of framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:15.226759 107761 status_update_manager.cpp:824] Checkpointing ACK for status update TASK_FAILED (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task ct:1484816820000:0:foocare_zendesk_round_robin: of framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:15.858510 107759 slave.cpp:1361] Got assigned task ct:1484816820000:0:foocare_zendesk_round_robin: for framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:15.858762 107759 slave.cpp:1480] Launching task ct:1484816820000:0:foocare_zendesk_round_robin: for framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:15.859004 107759 slave.cpp:1711] Queuing task 'ct:1484816820000:0:foocare_zendesk_round_robin:' for executor 'ct:1484816820000:0:foocare_zendesk_round_robin:' of framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001 at executor(1)@10.14.38.239:43937
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:15.939483 107759 slave.cpp:1863] Sending queued task 'ct:1484816820000:0:foocare_zendesk_round_robin:' to executor 'ct:1484816820000:0:foocare_zendesk_round_robin:' of framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001 at executor(1)@10.14.38.239:43937
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:16.141394 107762 slave.cpp:3871] Executor 'ct:1484816820000:0:foocare_zendesk_round_robin:' of framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001 exited with status 0
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:16.141451 107762 slave.cpp:3012] Handling status update TASK_FAILED (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task ct:1484816820000:0:foocare_zendesk_round_robin: of framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001 from @0.0.0.0:0
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:16.141849 107762 status_update_manager.cpp:320] Received status update TASK_FAILED (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task ct:1484816820000:0:foocare_zendesk_round_robin: of framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:16.141989 107762 status_update_manager.cpp:824] Checkpointing UPDATE for status update TASK_FAILED (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task ct:1484816820000:0:foocare_zendesk_round_robin: of framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:16.147343 107766 slave.cpp:3410] Forwarding the update TASK_FAILED (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task ct:1484816820000:0:foocare_zendesk_round_robin: of framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001 to master@10.14.23.181:5050
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:17.089175 107759 status_update_manager.cpp:392] Received status update acknowledgement (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task ct:1484816820000:0:foocare_zendesk_round_robin: of framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:17.089251 107759 status_update_manager.cpp:824] Checkpointing ACK for status update TASK_FAILED (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task ct:1484816820000:0:foocare_zendesk_round_robin: of framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:17.097193 107774 slave.cpp:1361] Got assigned task ct:1484816820000:0:foocare_zendesk_round_robin: for framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:17.097453 107774 slave.cpp:1480] Launching task ct:1484816820000:0:foocare_zendesk_round_robin: for framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:W0119 14:42:17.097527 107774 slave.cpp:1673] Asked to run task 'ct:1484816820000:0:foocare_zendesk_round_robin:' for framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001 with executor 'ct:1484816820000:0:foocare_zendesk_round_robin:' which is terminating/terminated
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:17.097568 107774 slave.cpp:3012] Handling status update TASK_LOST (UUID: b999fb64-34f0-496d-be19-f5a7f998230e) for task ct:1484816820000:0:foocare_zendesk_round_robin: of framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001 from @0.0.0.0:0
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:17.097633 107774 slave.cpp:3975] Cleaning up executor 'ct:1484816820000:0:foocare_zendesk_round_robin:' of framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001 at executor(1)@10.14.38.239:43937
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:17.097790 107772 gc.cpp:55] Scheduling '/data/mesos/slaves/22c4f06b-d107-4cf4-86b1-81a6cce5441a-S56/frameworks/19393553-2061-4d2f-8c05-a0ba688334f4-0001/executors/ct:1484816820000:0:foocare_zendesk_round_robin:/runs/6b8922ff-3f57-42a0-97d1-d79c1de3d93b' for gc 6.99999886874074days in the future
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:17.097836 107772 gc.cpp:55] Scheduling '/data/mesos/slaves/22c4f06b-d107-4cf4-86b1-81a6cce5441a-S56/frameworks/19393553-2061-4d2f-8c05-a0ba688334f4-0001/executors/ct:1484816820000:0:foocare_zendesk_round_robin:' for gc 6.99999886832296days in the future
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:17.097869 107772 gc.cpp:55] Scheduling '/data/mesos/meta/slaves/22c4f06b-d107-4cf4-86b1-81a6cce5441a-S56/frameworks/19393553-2061-4d2f-8c05-a0ba688334f4-0001/executors/ct:1484816820000:0:foocare_zendesk_round_robin:/runs/6b8922ff-3f57-42a0-97d1-d79c1de3d93b' for gc 6.99999886819259days in the future
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119 14:42:17.097888 107772 gc.cpp:55] Scheduling '/data/mesos/meta/slaves/22c4f06b-d107-4cf4-86b1-81a6cce5441a-S56/frameworks/19393553-2061-4d2f-8c05-a0ba688334f4-0001/executors/ct:1484816820000:0:foocare_zendesk_round_robin:' for gc 6.99999886809185days in the future
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.WARNING.20161004-154318.107733:W0119 14:42:17.097527 107774 slave.cpp:1673] Asked to run task 'ct:1484816820000:0:foocare_zendesk_round_robin:' for framework 19393553-2061-4d2f-8c05-a0ba688334f4-0001 with executor 'ct:1484816820000:0:foocare_zendesk_round_robin:' which is terminating/terminated
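
The grep output above mixes a short excerpt, the full log, and the same warning taken from both the INFO and WARNING files. Below is a minimal sketch, assuming the glog line layout visible in those lines, that collapses such output into one ordered, de-duplicated timeline. The `timeline` function and the `GLOG` pattern are illustrative names and are not part of Mesos.

```python
import re

# Assumed glog layout, as seen in the lines above:
#   <severity><mmdd> <hh:mm:ss.uuuuuu> <thread id> <file>:<line>] <message>
GLOG = re.compile(
    r"(?P<sev>[IWEF])(?P<mmdd>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<tid>\d+) (?P<src>[\w.]+:\d+)\] (?P<msg>.*)"
)


def timeline(lines):
    """Return unique (mmdd, time, severity, source, message) tuples in time order."""
    events = set()
    for line in lines:
        # search() rather than match(), so a leading "> " quote marker or a
        # grep file-name prefix in front of the glog record is skipped.
        m = GLOG.search(line)
        if m:
            events.add(
                (m["mmdd"], m["time"], m["sev"], m["src"], m["msg"].rstrip())
            )
    # Tuples sort by date, then timestamp; this is enough within a single year.
    return sorted(events)


if __name__ == "__main__":
    import sys

    # Example: grep <task id> mesos-slave.*.log.* | python timeline.py
    for mmdd, time, sev, src, msg in timeline(sys.stdin):
        print(mmdd, time, sev, src, msg)
```

Filtering the result for messages that contain "Got assigned task", "exited with status" and "terminating/terminated" shows the order in which the slave handled the executor exit and the next launch request. The master log still has to be checked separately.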



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)