Posted to issues@mesos.apache.org by "Tom Arnfeld (JIRA)" <ji...@apache.org> on 2014/09/19 16:47:35 UTC

[jira] [Comment Edited] (MESOS-1812) Queued tasks are not actually launched in the order they were queued

    [ https://issues.apache.org/jira/browse/MESOS-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140671#comment-14140671 ] 

Tom Arnfeld edited comment on MESOS-1812 at 9/19/14 2:47 PM:
-------------------------------------------------------------

I think there are use cases for it, for example the modifications I am making to the Hadoop framework.

Ultimately I am trying to control how long an Executor process lives, and to be able to trigger it to commit suicide from the framework. Framework/Executor messages are currently not a reliable form of communication over Mesos (as far as I know), and after my tasks are done I need the executor to stay around for a specific amount of time.

Perhaps what I really need here is some kind of {{shutdownExecutor}} driver call.
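To make that concrete, a rough sketch of what I mean (hypothetical: {{shutdownExecutor}} does not exist on the current driver, and {{allTasksDone()}} / {{executorId_}} are placeholder framework state), next to the best-effort framework message we have today:

{code}
// Sketch only, not working code: allTasksDone() and executorId_ are
// hypothetical framework state; other Scheduler callbacks are elided.
#include <mesos/scheduler.hpp>

class JobTrackerScheduler : public mesos::Scheduler {
public:
  virtual void statusUpdate(mesos::SchedulerDriver* driver,
                            const mesos::TaskStatus& status) {
    if (allTasksDone()) {
      // What we can do today: best-effort, may be silently dropped.
      driver->sendFrameworkMessage(executorId_, status.slave_id(), "shutdown");

      // What I am suggesting (hypothetical, no such driver call exists):
      // driver->shutdownExecutor(executorId_, status.slave_id());
    }
  }

private:
  bool allTasksDone() const;  // hypothetical helper
  mesos::ExecutorID executorId_;
};
{code}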


was (Author: tarnfeld):
I think there are use cases for it, for example the modifications I am making to the Hadoop framework.

Ultimately I am trying to control how long an Executor process lives, and to be able to trigger it to commit suicide. Framework messages are currently not a reliable form of communication over Mesos (as far as I know), and after my tasks are done I need the executor to stay around for a specific amount of time.

Perhaps what I really need here is some kind of {{shutdownExecutor}} driver call.

> Queued tasks are not actually launched in the order they were queued
> --------------------------------------------------------------------
>
>                 Key: MESOS-1812
>                 URL: https://issues.apache.org/jira/browse/MESOS-1812
>             Project: Mesos
>          Issue Type: Bug
>          Components: slave
>            Reporter: Tom Arnfeld
>
> Even though tasks are assigned and queued in the order in which they are launched (e.g. multiple tasks in reply to one offer), timing issues with the futures can break this ordering, and tasks sometimes end up not being launched in the order they were queued.
> Example trace from a slave is below. In this example the Task_Tracker_10 task should be launched before slots_Task_Tracker_10, but the trace shows the reverse (a simplified illustration of the race follows the trace).
> {code}
> I0918 02:10:50.371445 17072 slave.cpp:933] Got assigned task Task_Tracker_10 for framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.372110 17072 slave.cpp:933] Got assigned task slots_Task_Tracker_10 for framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.372172 17073 gc.cpp:84] Unscheduling '/mnt/mesos-slave/slaves/20140915-112519-3171422218-5050-5016-6/frameworks/20140916-233111-3171422218-5050-14295-0015' from gc
> I0918 02:10:50.375018 17072 slave.cpp:1043] Launching task slots_Task_Tracker_10 for framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.386282 17072 slave.cpp:1153] Queuing task 'slots_Task_Tracker_10' for executor executor_Task_Tracker_10 of framework '20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.386312 17070 mesos_containerizer.cpp:537] Starting container '5f507f09-b48e-44ea-b74e-740b0e8bba4d' for executor 'executor_Task_Tracker_10' of framework '20140916-233111-3171422218-5050-14295-0015'
> I0918 02:10:50.388942 17072 slave.cpp:1043] Launching task Task_Tracker_10 for framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.406277 17070 launcher.cpp:117] Forked child with pid '817' for container '5f507f09-b48e-44ea-b74e-740b0e8bba4d'
> I0918 02:10:50.406563 17072 slave.cpp:1153] Queuing task 'Task_Tracker_10' for executor executor_Task_Tracker_10 of framework '20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.408499 17069 mesos_containerizer.cpp:647] Fetching URIs for container '5f507f09-b48e-44ea-b74e-740b0e8bba4d' using command '/usr/local/libexec/mesos/mesos-fetcher'
> I0918 02:11:11.650687 17071 slave.cpp:2873] Current usage 17.34%. Max allowed age: 5.086371210668750days
> I0918 02:11:16.590270 17075 slave.cpp:2355] Monitoring executor 'executor_Task_Tracker_10' of framework '20140916-233111-3171422218-5050-14295-0015' in container '5f507f09-b48e-44ea-b74e-740b0e8bba4d'
> I0918 02:11:17.701015 17070 slave.cpp:1664] Got registration for executor 'executor_Task_Tracker_10' of framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:11:17.701897 17070 slave.cpp:1783] Flushing queued task slots_Task_Tracker_10 for executor 'executor_Task_Tracker_10' of framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:11:17.702350 17070 slave.cpp:1783] Flushing queued task Task_Tracker_10 for executor 'executor_Task_Tracker_10' of framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:11:18.588388 17070 mesos_containerizer.cpp:1112] Executor for container '5f507f09-b48e-44ea-b74e-740b0e8bba4d' has exited
> I0918 02:11:18.588665 17070 mesos_containerizer.cpp:996] Destroying container '5f507f09-b48e-44ea-b74e-740b0e8bba4d'
> I0918 02:11:18.599234 17072 slave.cpp:2413] Executor 'executor_Task_Tracker_10' of framework 20140916-233111-3171422218-5050-14295-0015 has exited with status 1
> {code}
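> A standalone illustration of the race (plain C++ with std::async, not Mesos source; task names borrowed from the trace above): each launch is chained on its own asynchronous step, so completion order, and therefore launch order, is not guaranteed.
> {code}
> // Standalone sketch, NOT Mesos code: two tasks are "assigned" in order,
> // but each launch waits on its own asynchronous step (here simulated
> // with a sleep, standing in for e.g. the GC unschedule), so the
> // first-assigned task can reach "Launching" second.
> #include <chrono>
> #include <future>
> #include <iostream>
> #include <thread>
> #include <vector>
>
> int main() {
>   std::vector<std::future<void>> launches;
>   launches.push_back(std::async(std::launch::async, [] {
>     // First-assigned task: its preparatory step happens to take longer.
>     std::this_thread::sleep_for(std::chrono::milliseconds(10));
>     std::cout << "Launching task Task_Tracker_10\n";
>   }));
>   launches.push_back(std::async(std::launch::async, [] {
>     // Second-assigned task: no delay, so it "launches" first.
>     std::cout << "Launching task slots_Task_Tracker_10\n";
>   }));
>   for (auto& f : launches) f.wait();
> }
> {code}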



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)