Posted to issues@mesos.apache.org by "Benjamin Hindman (JIRA)" <ji...@apache.org> on 2015/09/29 01:24:05 UTC

[jira] [Updated] (MESOS-3544) Support task and/or executor restart on failure.

     [ https://issues.apache.org/jira/browse/MESOS-3544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Hindman updated MESOS-3544:
------------------------------------
    Issue Type: Epic  (was: Bug)

> Support task and/or executor restart on failure.
> ------------------------------------------------
>
>                 Key: MESOS-3544
>                 URL: https://issues.apache.org/jira/browse/MESOS-3544
>             Project: Mesos
>          Issue Type: Epic
>          Components: HTTP API, master, slave
>            Reporter: Benjamin Hindman
>
> In certain instances it might be preferable to restart a task/executor after it fails (i.e., non-zero exit code) rather than going through an entire status update -> offer -> accept (launch) cycle to restart the task/executor on the same machine. This is especially true if the resources are reserved (dynamically or statically).
> Of course, we still want to highlight the restart to the framework, so introducing something like TASK_RESTARTED might be necessary (not sure what the analog would be for executors).
> Finally, if the task/executor has a bug, we don't want to keep restarting it in an infinite loop, so we'll likely want to introduce this functionality in such a way as to limit the total restart attempts (or require a framework to have the proper authority to restart forever).
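
As a rough illustration of the behavior described above (not Mesos code), the following Python sketch restarts a failed task in place, reports each restart, and gives up after a bounded number of attempts. Everything in it is an assumption made for illustration: `RestartPolicy`, `run_with_restarts`, and the `launch` and `on_update` callbacks are hypothetical names, and TASK_RESTARTED is only the state proposed in this issue, not an existing Mesos task state.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RestartPolicy:
    # Hypothetical knob: maximum number of in-place restarts.
    # None stands for "restart forever", which the issue suggests
    # should require explicit authority on the framework's side.
    max_restarts: Optional[int] = 3


def run_with_restarts(
    launch: Callable[[], int],
    policy: RestartPolicy,
    on_update: Callable[[str], None],
) -> int:
    """Run a task, restarting it in place on non-zero exit codes.

    `launch` runs the task once and returns its exit code.
    `on_update` receives a status string for each outcome, so the
    framework still sees every restart.
    Returns the exit code of the last run.
    """
    restarts = 0
    while True:
        exit_code = launch()
        if exit_code == 0:
            on_update("TASK_FINISHED")
            return exit_code
        # Stop once the restart budget is used up.
        if policy.max_restarts is not None and restarts >= policy.max_restarts:
            on_update("TASK_FAILED")
            return exit_code
        restarts += 1
        # Proposed status from this issue; not an existing state.
        on_update("TASK_RESTARTED")


if __name__ == "__main__":
    # A fake task that fails twice and then succeeds.
    exit_codes = iter([1, 1, 0])
    run_with_restarts(lambda: next(exit_codes), RestartPolicy(max_restarts=3), print)
```

In this sketch the restart happens without leaving the loop, which stands in for skipping the status update -> offer -> accept cycle; how an agent would actually hold on to the resources between attempts is left open.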



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)