Posted to dev@mesos.apache.org by "Benjamin Mahler (JIRA)" <ji...@apache.org> on 2013/11/12 02:54:17 UTC

[jira] [Updated] (MESOS-808) The scheduler driver should queue messages when disconnected.

     [ https://issues.apache.org/jira/browse/MESOS-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Mahler updated MESOS-808:
----------------------------------

    Description: 
Currently, when a scheduler tries to take an action while the driver is disconnected (i.e. after Scheduler::disconnected has been called), the driver drops the request.

In the case of launching a task, we'll reply with TASK_LOST directly in the driver. However, with things like killTask, we simply drop the kill task request.
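
To make the asymmetry concrete, here is a minimal C++ sketch of the behavior described above. SketchDriver, StatusUpdate and the two vectors are hypothetical stand-ins for illustration only, not the actual Mesos driver code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins; these are not the real Mesos types.
enum class TaskState { TASK_RUNNING, TASK_LOST };

struct StatusUpdate {
  std::string taskId;
  TaskState state;
};

class SketchDriver {
public:
  bool connected = false;
  std::vector<StatusUpdate> updates;  // Status updates handed to the scheduler.
  std::vector<std::string> sent;      // Messages that went out to the master.

  void launchTask(const std::string& taskId) {
    assert(!taskId.empty());
    if (!connected) {
      // Launch while disconnected: the driver answers with TASK_LOST itself.
      updates.push_back({taskId, TaskState::TASK_LOST});
      return;
    }
    sent.push_back("launch:" + taskId);
  }

  void killTask(const std::string& taskId) {
    assert(!taskId.empty());
    if (!connected) {
      // Kill while disconnected: dropped, and the scheduler is not told.
      return;
    }
    sent.push_back("kill:" + taskId);
  }
};
```

In this sketch a scheduler can observe a failed launch through the TASK_LOST update, but it has no way to notice that a kill was dropped.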

This behavior seems a little unfriendly for schedulers, as they need to queue any operations themselves until Scheduler::connected is called. We should consider queueing in the driver instead.

The implementation here can consist of a queue<Message> holding the messages that were constructed while !connected. Once we reconnect, we simply run through this queue, sending all messages in order.
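
As a rough illustration of the queue described above, a minimal C++ sketch might look like the following. The Message struct, the QueueingDriver class and its method names are hypothetical, and a vector stands in for the connection to the master; this is not the actual driver code.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a driver-to-master message.
struct Message {
  std::string body;
};

class QueueingDriver {
public:
  // Send right away when connected; otherwise hold the message.
  void send(Message message) {
    assert(!message.body.empty());
    if (connected_) {
      delivered_.push_back(std::move(message));
    } else {
      pending_.push(std::move(message));
    }
  }

  void disconnected() { connected_ = false; }

  // On reconnect, flush everything that was held, in FIFO order.
  void reconnected() {
    connected_ = true;
    while (!pending_.empty()) {
      delivered_.push_back(std::move(pending_.front()));
      pending_.pop();
    }
  }

  const std::vector<Message>& delivered() const { return delivered_; }
  std::size_t pendingCount() const { return pending_.size(); }

private:
  bool connected_ = false;
  std::queue<Message> pending_;     // Held only in memory.
  std::vector<Message> delivered_;  // Stands in for the wire to the master.
};
```

Note that the pending queue in this sketch lives only in the process's memory, so it does nothing for messages held at the moment the scheduler itself fails.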

However, without persistent state in the driver, schedulers will have to live with the possibility of dropped messages anyway (i.e. if the scheduler fails while disconnected, any queued messages will be lost).

Therefore, an alternative becomes possible when we add a v1 API: if we return a Future or some other form of status, we can indicate whether the message was sent. This is definitely simpler and more reliable than queueing.
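
A minimal C++ sketch of the status-returning alternative, assuming a plain enum in place of a Future. SendStatus, StatusDriver and the sent vector are hypothetical names for illustration; no v1 API is specified here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical result type; a real API might return a Future instead.
enum class SendStatus { SENT, NOT_CONNECTED };

class StatusDriver {
public:
  bool connected = false;
  std::vector<std::string> sent;  // Stands in for the wire to the master.

  // Reports whether the kill request was handed to the transport.
  SendStatus killTask(const std::string& taskId) {
    assert(!taskId.empty());
    if (!connected) {
      return SendStatus::NOT_CONNECTED;
    }
    sent.push_back("kill:" + taskId);
    return SendStatus::SENT;
  }
};
```

In this sketch SENT only means the request left the driver, not that the master acted on it. The scheduler decides what to do on NOT_CONNECTED, for example retrying once it is connected again.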

  was:
Currently when schedulers try to take an action while the driver is disconnected (i.e. a call to SchedulerDriver::disconnected has occurred), the driver will drop the request.

In the case of launching a task, we'll reply with TASK_LOST directly in the driver. However, with things like killTask, we simply drop the kill task request.

This behavior seems a little unfriendly for schedulers, as they need to be concerned about queueing any operations until Scheduler::connected is called. We should consider queuing in the driver instead.

The implementation here can consist of a queue<Message> holding the messages that were constructed while !connected. Once we re-connect, we simply run through this queue sending all messages.

However, without state in the driver, schedulers will have to live with the possibility of dropped messages anyway (i.e. if they fail while disconnected, any messages will be lost).


> The scheduler driver should queue messages when disconnected.
> -------------------------------------------------------------
>
>                 Key: MESOS-808
>                 URL: https://issues.apache.org/jira/browse/MESOS-808
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: Benjamin Mahler



--
This message was sent by Atlassian JIRA
(v6.1#6144)