Posted to dev@mesos.apache.org by "Timothy Chen (JIRA)" <ji...@apache.org> on 2014/06/21 02:21:25 UTC
[jira] [Commented] (MESOS-1503) Improve slave health checking to prevent rapid widespread slave removals.

    [ https://issues.apache.org/jira/browse/MESOS-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039590#comment-14039590 ] 

Timothy Chen commented on MESOS-1503:
-------------------------------------

I'm thinking about the design of the single slave observer, and here are my current thoughts:

1. Have a single SlavesObserver protobuf process that holds a map of all the slaves to be pinged, a map of promises for the current ping, and the current ping generation.

It can have the following interface:

void registerSlave(SlaveID, UPID) : add a new slave to be health checked

void unregisterSlave(SlaveID) : remove a slave from health checking

void pingAllSlaves(): Sends a ping to all registered slaves, creating and holding a promise for each slave's ping. It increments the generation id and sends it in the message body to all slaves. Finally, it collects the futures from all the promises into one future and defers it to completePing().

void pong(UPID from, string body): The response callback for a slave ping. The body is the ping generation id, which the slave simply echoes back from the ping body. We verify that the pong is for the current ping generation; if a pong is delayed and we receive one from an old generation, we skip it. We also skip pongs from unregistered slaves.

void timeout(UPID from): The timeout handler for a slave ping; it just sets that slave's promise to false.

void completePing(): At the end of the round we look at all the futures, collect the failed ones, verify that each of those slaves is still registered, and send them all to the master for termination. This gives us the opportunity to throttle, or to make further decisions based on all the failures at once. (We could also move this logic into the master; I don't really know what's best yet.)
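
To make the generation handling concrete, here is a rough sketch of the interface above. It is not Mesos code and does not use libprocess: SlaveID and UPID are plain strings standing in for the real types, the promises are replaced by a per-slave "answered" flag, the individual timeout() calls are folded into a single round timeout after which completePing() is invoked, and the return values are assumptions added only so the behaviour can be checked.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for the Mesos/libprocess types.
using SlaveID = std::string;
using UPID = std::string;

class SlavesObserver {
public:
  void registerSlave(const SlaveID& id, const UPID& pid) { slaves[id] = pid; }

  void unregisterSlave(const SlaveID& id) {
    slaves.erase(id);
    pending.erase(id);
  }

  // Starts a new ping round and returns the generation id that would be
  // sent in the ping body to every registered slave.
  std::uint64_t pingAllSlaves() {
    ++generation;
    pending.clear();
    for (const auto& entry : slaves) {
      pending[entry.first] = false;  // No pong seen yet in this round.
    }
    return generation;
  }

  // Records a pong. Returns false if the pong is skipped because it is
  // from an old generation or from a slave that is not being pinged.
  bool pong(const UPID& from, std::uint64_t body) {
    if (body != generation) {
      return false;  // Delayed pong from an earlier round.
    }
    for (const auto& entry : slaves) {
      if (entry.second == from) {
        auto it = pending.find(entry.first);
        if (it == pending.end()) {
          return false;  // Registered after this round started.
        }
        it->second = true;
        return true;
      }
    }
    return false;  // Unregistered slave.
  }

  // Called once the round has timed out. Returns the slaves that did not
  // answer and are still registered, all at once, so the caller can
  // throttle before asking the master to remove them.
  std::vector<SlaveID> completePing() {
    std::vector<SlaveID> failed;
    for (const auto& entry : pending) {
      if (!entry.second && slaves.count(entry.first) > 0) {
        failed.push_back(entry.first);
      }
    }
    pending.clear();
    return failed;
  }

private:
  std::map<SlaveID, UPID> slaves;   // All slaves to be pinged.
  std::map<SlaveID, bool> pending;  // Whether each slave answered this round.
  std::uint64_t generation = 0;     // Current ping generation.
};
```

Because completePing() hands back the whole set of failed slaves in one call, the caller could compare its size against the number of registered slaves and decide whether removing them all at once looks safe.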

One issue that comes to mind is that we're now sending all the pings at once, and I wonder if that could cause a burst of messages, especially with a large number of slaves. One option is to group the slaves and ping each group at a different interval, but that could be left for the future.
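
As an illustration of the grouping idea only, the slaves could be split round-robin into a fixed number of batches, with each batch pinged at a different offset within the ping interval. The function name partitionSlaves and the round-robin policy are assumptions made for this sketch, and slave ids are plain strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Splits the slave ids round-robin into `groups` batches. Batch i would
// then be pinged at offset (i * interval / groups) within each interval,
// so that no single instant sees pings to every slave.
std::vector<std::vector<std::string>> partitionSlaves(
    const std::vector<std::string>& slaves, std::size_t groups) {
  assert(groups > 0);  // At least one batch is required.
  std::vector<std::vector<std::string>> batches(groups);
  for (std::size_t i = 0; i < slaves.size(); ++i) {
    batches[i % groups].push_back(slaves[i]);
  }
  return batches;
}
```

With this scheme the batch sizes differ by at most one slave, so the number of messages sent at any one offset is roughly the total divided by the number of groups.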



> Improve slave health checking to prevent rapid widespread slave removals.
> -------------------------------------------------------------------------
>
>                 Key: MESOS-1503
>                 URL: https://issues.apache.org/jira/browse/MESOS-1503
>             Project: Mesos
>          Issue Type: Improvement
>          Components: master
>            Reporter: Benjamin Mahler
>            Assignee: Timothy Chen
>              Labels: reliability
>
> Per some discussions with [~tweingartner] and [~vinodkone].
> Currently the master uses a SlaveObserver for each registered slave. Each SlaveObserver operates independently and makes decisions about whether the slave is healthy.
> The independence of these observers means that in some very rare events (e.g. masters are partitioned from 75% of slaves), the master can very rapidly remove a large portion of the slaves in the cluster. Ideally such an event could be deemed dangerous and throttled accordingly through a more intelligent notion of overall cluster health.
> It may be nice to have a single observer that is responsible for health checking all the slaves. This will allow us to make safer decisions as to when to determine that slaves are unhealthy.



--
This message was sent by Atlassian JIRA
(v6.2#6252)