You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ignite.apache.org by "Roman Puchkovskiy (Jira)" <ji...@apache.org> on 2023/02/10 15:17:00 UTC

[jira] [Updated] (IGNITE-18772) Design mechanisms for messaging consistency

     [ https://issues.apache.org/jira/browse/IGNITE-18772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Puchkovskiy updated IGNITE-18772:
---------------------------------------
    Summary: Design mechanisms for messaging consistency  (was: Design mechanisms for network messaging consistency)

> Design mechanisms for messaging consistency
> -------------------------------------------
>
>                 Key: IGNITE-18772
>                 URL: https://issues.apache.org/jira/browse/IGNITE-18772
>             Project: Ignite
>          Issue Type: New Feature
>          Components: networking
>            Reporter: Roman Puchkovskiy
>            Priority: Major
>              Labels: ignite-3
>             Fix For: 3.0.0-beta2
>
>
> We have a use case where node A asks node B to notify node A when some event on node B occurs. This requires two round trips: first RT (A invokes B) installs an event listener on B, and second round trip (B makes a strong send to A) notifies A about the event.
> To account for possible topology instability, code at node A subscribes to onDisappeared(B), same does code at node B (but to onDisappeared(A)).
> A timeout might be installed on node A.
> Outcomes are:
>  # If B is not in the topology for A, invoke future fails right away (B knows nothing about invocation, there is no request)
>  # If A loses B from sight before invoke response is delivered to A, invoke future fails at A, and B eventually deregisters the listener
>  # If invocation is ok, but nodes lose each other from sight before the event happens, node A stops waiting and node B deregisters the listener
>  # Same happens if callback times out
>  # If invocation is ok and event happens while nodes see each other, callback is delivered from B to A (with best effort guarantees, with retries till delivered or timed out or nodes lose each other of sight)
> The outcome must be consistent. That is, it cannot happen that one node acted as if it thought that another node disappeared, but another node acted as if first node was available.
>  # Relation 'X sees Y' must be symmetric (in an eventual sense)
>  # If node X currently does not see node Y, it cannot accept messages from it
> We could use the following invariant: if a node has disappeared from the topology, it cannot appear there again with same identity (IGNITE-18712 might help on the physical topology level).
> Things that should be carefully considered:
>  # Nodes might have different views of the topology: 'X sees Y' might not be symmetric at some points in time
>  # What kinds of topologies are concerned? Should this work over physical topology, logical one, or over any of them (on the user's discretion)?
>  # How do we deal with non-transient failures (like an NPE) different from failures caused by node disappearance? Do we just keep retrying until timeout is triggered, or we crash the node if some unexpected failure occurs, or...?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)