Posted to user@ignite.apache.org by npordash <ni...@gmail.com> on 2018/04/27 00:27:27 UTC

Node failure handling semantics

Hi,

I was wondering if there is any additional documentation on how Ignite
internally handles node failures? The relevant section in the documentation
only skims over this[1].

I specifically have the following inquiries:

1) Does each node in the ring send heartbeats to all other nodes in the ring,
or only to its neighbor(s)?
2) If each node is sending heartbeats to all other nodes, then how is the
ejection of a node from the cluster coordinated? It seems possible that some
nodes in the cluster would detect the failure while others don't.
3) If a node only sends heartbeats to its neighbor(s) for failure detection,
then how is cluster ejection determined, given that you could have healthy
nodes in the ring between unhealthy ones?

I understand there are split-brain resolvers, but given that there are no
out-of-the-box implementations in the OSS version I'm curious what the
expected behavior is here without one.

Thanks!
-Nick

[1]
https://apacheignite.readme.io/docs/cluster-config#section-failure-detection-timeout




Re: Node failure handling semantics

Posted by vkulichenko <va...@gmail.com>.
Nick,

If B is unhealthy, C will not be able to send a heartbeat message to it.
After a timeout, C will consider B as failed and will connect to A. Along
with this connection it will send a NODE_FAILED message that will go through
all the nodes.

Once B is back again, it will try to send a message to A, but will get a
response that it has already been removed from the topology (A knows about
this because it got that information from C). This will trigger a local
NODE_SEGMENTED event on B.
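
If you want to react to that on B, a minimal sketch of a local listener for
the segmentation event would look roughly like the following (the class name
and handler body are just for illustration; note that event types have to be
enabled explicitly in the configuration):

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.events.DiscoveryEvent;
import org.apache.ignite.events.EventType;
import org.apache.ignite.lang.IgnitePredicate;

public class SegmentedNodeExample {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();
        // Events are disabled by default and have to be enabled explicitly.
        cfg.setIncludeEventTypes(EventType.EVT_NODE_SEGMENTED);
        Ignite ignite = Ignition.start(cfg);

        // Fired locally on the node that finds out it was already removed
        // from the topology (node B in the scenario above).
        IgnitePredicate<DiscoveryEvent> lsnr = evt -> {
            System.err.println("Local node was segmented from the cluster: " + evt.name());
            return true; // keep listening
        };
        ignite.events().localListen(lsnr, EventType.EVT_NODE_SEGMENTED);
    }
}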

-Val




Re: Node failure handling semantics

Posted by npordash <ni...@gmail.com>.
Thanks, Val.

I'd like to make sure I understand this correctly. Let's say we have a ring
of nodes A <- B <- C <- D <- A.

If B is unhealthy then C won't see a heartbeat within the configured failure
detection time and will then proceed to connect to A. When this happens, how
is B's ejection coordinated across the cluster? Or does it even need to be?
I know at some point all nodes will log that B failed.

Now let's say B has been ejected but has since recovered (e.g. the network was
restored, a GC pause ended, etc.). How does it know it has been ejected? I
think at this point it will assume A has failed, because it hasn't received a
heartbeat from it while B itself was unavailable; B may not be aware of its
own ejection and may try to start an ejection process for node A. How is this
situation handled?
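
(Side note: on the surviving nodes I'd expect to be able to observe the
ejection through discovery events, along the lines of the sketch below; the
event types have to be enabled explicitly, and the class name and logging are
just illustrative.)

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.events.DiscoveryEvent;
import org.apache.ignite.events.EventType;
import org.apache.ignite.lang.IgnitePredicate;

public class EjectionObserverExample {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();
        // Events are disabled by default and have to be enabled explicitly.
        cfg.setIncludeEventTypes(EventType.EVT_NODE_FAILED, EventType.EVT_NODE_LEFT);
        Ignite ignite = Ignition.start(cfg);

        // eventNode() is the node that failed or left the topology.
        IgnitePredicate<DiscoveryEvent> lsnr = evt -> {
            System.out.println(evt.name() + ": " + evt.eventNode().id());
            return true; // keep listening
        };
        ignite.events().localListen(lsnr, EventType.EVT_NODE_FAILED, EventType.EVT_NODE_LEFT);
    }
}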

-Nick




Re: Node failure handling semantics

Posted by vkulichenko <va...@gmail.com>.
Nick,

Here are some comments on your questions:

1. Heartbeats are always sent to the next node in the ring. That's the whole
point of the ring architecture vs. peer-to-peer.
2. That's not possible, because each discovery message is sent across the
ring, and the ordering of those messages is guaranteed to be consistent
across nodes.
3. In short, when one of the nodes is unhealthy, the healthy node behind it
detects this (it can't send a heartbeat) and then tries to connect to the
node that follows the unhealthy one. If successful, that effectively creates
a new ring and a new topology without the unhealthy node.

In a split-brain scenario you will have two independent clusters (two
segments).
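
How a node reacts once it decides it has been segmented is controlled by the
segmentation policy. A minimal sketch (the class name is made up, STOP is the
default, and a custom SegmentationResolver would still have to be provided
separately, since none ship with the OSS version):

import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.plugin.segmentation.SegmentationPolicy;

public class SegmentationPolicyExample {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();
        // What the local node does when it detects it has been segmented:
        // STOP (default), RESTART_JVM or NOOP.
        cfg.setSegmentationPolicy(SegmentationPolicy.STOP);
        // A custom resolver could be plugged in via cfg.setSegmentationResolvers(...).
        Ignition.start(cfg);
    }
}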

-Val


