Posted to issues@mesos.apache.org by "haosdent (JIRA)" <ji...@apache.org> on 2016/04/13 05:43:25 UTC

[jira] [Updated] (MESOS-540) Executor health checking.

     [ https://issues.apache.org/jira/browse/MESOS-540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

haosdent updated MESOS-540:
---------------------------
    Labels: health-check twitter  (was: twitter)

> Executor health checking.
> -------------------------
>
>                 Key: MESOS-540
>                 URL: https://issues.apache.org/jira/browse/MESOS-540
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: Benjamin Mahler
>              Labels: health-check, twitter
>
> We currently do not health check running executors.
> At Twitter, this has led to out-of-band health checking of executors for an internal framework.
> For the Storm framework, this has led to out-of-band health checking via ZooKeeper. Health checking would allow Storm to use finer-grained executors for better isolation.
> This would also help the Hadoop and Jenkins frameworks, should they want health checking.
> As for implementation, I would propose adding a call on the Executor interface:
> /**
>  * Invoked by the ExecutorDriver to determine the health of the executor.
>  * When this function returns, the Executor is considered healthy.
>  */
> virtual void heartbeat(ExecutorDriver* driver) = 0;
> The driver can then send heartbeats periodically and kill the Executor when it stops responding to them. The driver should also detect the Executor deadlocking in any of the other callbacks.
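
To make the proposal concrete, here is a minimal sketch of how a driver might decide that an executor has stopped responding: it invokes heartbeat() on a separate thread and waits for it to return within a timeout. Everything here is an assumption for illustration. The ExecutorDriver and Executor types are simplified stand-ins rather than the real Mesos headers, and respondsToHeartbeat, HealthyExecutor, and StuckExecutor are hypothetical names that do not appear in the issue.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <memory>
#include <thread>

// Stand-ins for the Mesos types; these are NOT the real Mesos headers.
struct ExecutorDriver {};

class Executor {
public:
  virtual ~Executor() = default;
  // The proposed callback: returning at all signals "healthy".
  virtual void heartbeat(ExecutorDriver* driver) = 0;
};

// Hypothetical helper: true if heartbeat() returns within `timeout`.
bool respondsToHeartbeat(std::shared_ptr<Executor> executor,
                         ExecutorDriver* driver,
                         std::chrono::milliseconds timeout) {
  assert(timeout.count() > 0);
  auto done = std::make_shared<std::promise<void>>();
  std::future<void> result = done->get_future();

  // Run the callback on its own thread so a hung executor cannot block the
  // caller. The thread shares ownership of the executor and the promise, so
  // both stay alive if the callback is still running after we return.
  // (The raw driver pointer is not protected this way; a real driver would
  // have to guarantee its own lifetime.)
  std::thread([executor, driver, done] {
    executor->heartbeat(driver);
    done->set_value();
  }).detach();

  return result.wait_for(timeout) == std::future_status::ready;
}

// Two toy executors for illustration.
class HealthyExecutor : public Executor {
public:
  void heartbeat(ExecutorDriver*) override {}
};

class StuckExecutor : public Executor {
public:
  void heartbeat(ExecutorDriver*) override {
    // Simulates a hang that outlasts the heartbeat timeout.
    std::this_thread::sleep_for(std::chrono::milliseconds(500));
  }
};
```

A driver loop could call respondsToHeartbeat(executor, &driver, timeout) at a fixed interval and kill the executor after one or more consecutive failures. The same wait-with-timeout pattern could in principle wrap the other callbacks to detect deadlocks there, though how many missed heartbeats should trigger a kill is a policy question the issue leaves open.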



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)