Posted to issues@ignite.apache.org by "Roman Puchkovskiy (Jira)" <ji...@apache.org> on 2022/12/15 14:32:00 UTC
[jira] [Created] (IGNITE-18423) It is possible for a node to receive a message before the sender appears in topology
Roman Puchkovskiy created IGNITE-18423:
------------------------------------------
Summary: It is possible for a node to receive a message before the sender appears in topology
Key: IGNITE-18423
URL: https://issues.apache.org/jira/browse/IGNITE-18423
Project: Ignite
Issue Type: Bug
Components: networking
Reporter: Roman Puchkovskiy
Fix For: 3.0.0-beta2
The scenario is as follows:
# Two nodes (A and B) know about each other (i.e. each of them has the other node in its physical topology)
# A network partition happens, so the nodes disappear from their counterparts' topologies
# The partition heals, and A starts seeing B (i.e. B gets added to A's view of the topology)
# A sends a message to B using invoke
# B fails to process the message (at least, it fails to send a response) because A is not yet seen by B (A is not in B's view of the topology)
# B starts seeing A
The problem is that a channel might be established before the node on the other side is added to the topology.
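The scenario can be reproduced in miniature with the sketch below. It is only a model, not Ignite code: the class {{TopologyRaceSketch}}, its method names, and the assumption that a response can only be sent to a sender present in the local topology view are hypothetical and serve to illustrate steps 4 to 6.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical model of node B: its topology view is updated independently of its channels. */
class TopologyRaceSketch {
    /** Names of the nodes currently present in this node's view of the physical topology. */
    private final Set<String> topology = ConcurrentHashMap.newKeySet();

    /** Invoked by a (hypothetical) discovery listener when a node appears. */
    void onAppeared(String nodeName) {
        topology.add(nodeName);
    }

    /** Invoked by a (hypothetical) discovery listener when a node disappears. */
    void onDisappeared(String nodeName) {
        topology.remove(nodeName);
    }

    /**
     * Handles an invoke request that arrived over an already established channel.
     * Assumption: a response can only be sent to a sender present in the local topology view.
     *
     * @return true if a response could be sent back to the sender.
     */
    boolean handleInvoke(String senderName) {
        return topology.contains(senderName);
    }
}
```

If B handles A's invoke between the partition healing on A's side and the discovery event arriving on B's side, {{handleInvoke("A")}} returns false even though the channel itself works.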
It seems that there are two paths we could follow:
# We don't use Scalecube's transport for sending messages; instead, we only use its discovery capabilities, and Scalecube uses our transport. As a result, our event listener (notified about a node being added to the topology) might be out of sync with the actual connection event. Scalecube probably guarantees that these events are in sync when its native transport is used. We could investigate how it does this (if it does) and see whether we can do the same.
# Alternatively, a node establishing a connection might present itself (send the corresponding {{ClusterNode}}) as a part of a handshake procedure, and we could immediately simulate the corresponding event on the receiving side, hence closing the gap.
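The second option could look roughly like the sketch below. It is an illustration under stated assumptions, not the actual Ignite handshake: {{HandshakeTopologySketch}}, {{NodeInfo}} (a stand-in for {{ClusterNode}}) and all method names are hypothetical. The receiving side registers the node presented in the handshake and fires the "appeared" event right away; the later discovery event for the same node is then a no-op.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical receiving side that learns about a node from either discovery or a handshake. */
class HandshakeTopologySketch {
    /** Hypothetical stand-in for ClusterNode: only a name and an address. */
    static final class NodeInfo {
        final String name;
        final String address;

        NodeInfo(String name, String address) {
            this.name = name;
            this.address = address;
        }
    }

    private final Map<String, NodeInfo> topology = new ConcurrentHashMap<>();
    private final List<Consumer<NodeInfo>> appearedListeners = new CopyOnWriteArrayList<>();

    void addAppearedListener(Consumer<NodeInfo> listener) {
        appearedListeners.add(listener);
    }

    /** Discovery path: the membership layer reports that a node appeared. */
    void onDiscoveryAppeared(NodeInfo node) {
        registerIfAbsent(node);
    }

    /** Handshake path: the connecting node presents itself before any message is exchanged. */
    void onHandshake(NodeInfo remote) {
        registerIfAbsent(remote);
    }

    /** Fires the "appeared" event only for the first source that reports the node. */
    private void registerIfAbsent(NodeInfo node) {
        if (topology.putIfAbsent(node.name, node) == null) {
            appearedListeners.forEach(listener -> listener.accept(node));
        }
    }

    /** Same assumption as in the scenario: a response needs the sender in the topology view. */
    boolean canRespondTo(String nodeName) {
        return topology.containsKey(nodeName);
    }
}
```

With this shape, a message can no longer arrive from a sender the receiver does not know. How such a simulated event should interact with a subsequent "disappeared" event from discovery is left open here.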
--
This message was sent by Atlassian Jira
(v8.20.10#820010)