Posted to issues@mesos.apache.org by "Vinod Kone (JIRA)" <ji...@apache.org> on 2015/06/17 02:29:02 UTC

[jira] [Resolved] (MESOS-2246) Improve slave health-checking

     [ https://issues.apache.org/jira/browse/MESOS-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kone resolved MESOS-2246.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 0.22.0
         Assignee: Vinod Kone

> Improve slave health-checking
> -----------------------------
>
>                 Key: MESOS-2246
>                 URL: https://issues.apache.org/jira/browse/MESOS-2246
>             Project: Mesos
>          Issue Type: Epic
>          Components: master, slave
>            Reporter: Dominic Hamon
>            Assignee: Vinod Kone
>             Fix For: 0.22.0
>
>
> In the event of a network partition or other systemic issue, we may see widespread slave removal. There are several approaches we can take to mitigate this, including but not limited to:
> . rate-limit slave removal
> . change how we do health checking so that it does not rely on a single point of view
> . work with frameworks to determine the SLA of running services before removing a slave
> . provide manual controls to allow operator intervention



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)