Posted to issues@ignite.apache.org by "Yakov Zhdanov (JIRA)" <ji...@apache.org> on 2016/12/29 10:26:58 UTC

[jira] [Updated] (IGNITE-4501) Improvement of connection in a cluster of new node

     [ https://issues.apache.org/jira/browse/IGNITE-4501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yakov Zhdanov updated IGNITE-4501:
----------------------------------
    Description: 
h3. Main description:
Cluster nodes are connected in a ring.
For example, suppose we have 6 nodes: A, B, C, D, E, F.
They can form the ring in any order: A-B-C-D-E-F-A, or A-F-B-E-C-D-A, etc.
If a node leaves the topology, its adjacent nodes must reconnect.
If nodes A, B, C are in one physical location and nodes D, E, F are in another, and the two locations lose connectivity to each other, the number of reconnections depends on the order of the ring.
In the best case, with the ring A-B-CxD-E-FxA ('x' marks a broken link), only one reconnection is needed (C connects to A, or F connects to D, depending on which part of the cluster stays alive).
In the worst case, with a badly ordered ring such as AxFxBxExCxDxA, many reconnections are needed (A to B, B to C, C to A; in general n/2 reconnections, where n is the number of nodes).
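The two cases above can be checked with a small sketch. It is not Ignite code: the class name RingSplit and the method reconnections are hypothetical, and the model simply assumes that every surviving node whose successor in the ring is unreachable has to open one new connection.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class RingSplit {
    /**
     * Counts the surviving nodes whose successor in the ring did not survive.
     * Under the assumption stated above, each of them needs one reconnection.
     */
    static int reconnections(List<String> ring, Set<String> alive) {
        int count = 0;
        for (int i = 0; i < ring.size(); i++) {
            String node = ring.get(i);
            String next = ring.get((i + 1) % ring.size()); // the ring wraps around
            if (alive.contains(node) && !alive.contains(next))
                count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Set<String> alive = new HashSet<>(Arrays.asList("A", "B", "C"));
        // Ring A-B-C-D-E-F-A: only C loses its successor.
        System.out.println(reconnections(Arrays.asList("A", "B", "C", "D", "E", "F"), alive)); // 1
        // Ring A-F-B-E-C-D-A: A, B and C all lose their successors.
        System.out.println(reconnections(Arrays.asList("A", "F", "B", "E", "C", "D"), alive)); // 3
    }
}
```

With 6 nodes split into two halves, the well-ordered ring yields 1 reconnection and the interleaved ring yields 3, which is the n/2 figure from the description.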
h3. Approach:
We need to develop an approach for inserting a new node at the correct place so that a correct ring topology is created.
h3. Solutions:
The main idea is to sort nodes by latency.
* Group nodes into arcs by an ARC_ID (assigned manually?).
* Implement a NodeComparator (nodes on the same host, then nodes on the same subnet, then other nodes). We will use it when connecting a new node.
* [dev list thread|http://mail-archives.apache.org/mod_mbox/ignite-dev/201612.mbox/%3CCAN+WSNyWYXSXEBpGErVt72zTgi2pTQzUWLv8JY=Ke83-5-Rh9g@mail.gmail.com%3E]
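A minimal sketch of the comparator idea follows. It is an illustration only, not the proposed NodeComparator: nodes are represented by dotted IPv4 address strings, "same subnet" is assumed to mean "same first three octets" (a /24 network), and the names ProximitySketch, rank, nodeComparator and sortByProximity are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ProximitySketch {
    /** Returns 0 for the same host, 1 for the same /24 subnet (assumed), 2 otherwise. */
    static int rank(String newNodeAddr, String candidateAddr) {
        if (newNodeAddr.equals(candidateAddr))
            return 0;
        // Assumes dotted IPv4 addresses; the prefix is everything before the last octet.
        String a = newNodeAddr.substring(0, newNodeAddr.lastIndexOf('.'));
        String b = candidateAddr.substring(0, candidateAddr.lastIndexOf('.'));
        return a.equals(b) ? 1 : 2;
    }

    /** Orders candidate addresses by proximity to the new node, then by address as a tie-breaker. */
    static Comparator<String> nodeComparator(final String newNodeAddr) {
        return Comparator
            .comparingInt((String addr) -> rank(newNodeAddr, addr))
            .thenComparing(Comparator.<String>naturalOrder());
    }

    /** Returns a sorted copy; the input list is left untouched. */
    static List<String> sortByProximity(String newNodeAddr, List<String> addrs) {
        List<String> sorted = new ArrayList<>(addrs);
        sorted.sort(nodeComparator(newNodeAddr));
        return sorted;
    }
}
```

The tie-breaker on the address keeps the order deterministic when several nodes fall into the same proximity class.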

Update Dec 29, Yakov Zhdanov:
# Introduce a CLUSTER_REGION_ID node attribute. This can be done by adding a public static final constant to TcpDiscoverySpi.
# Alter org.apache.ignite.spi.discovery.tcp.internal.TcpDiscoveryNodesRing#nextNode(java.util.Collection<org.apache.ignite.spi.discovery.tcp.internal.TcpDiscoveryNode>) to order nodes based on the per-node attribute value.
# Node comparison should be stable and consistent. E.g. if the CLUSTER_REGION_IDs are equal, then we should compare the nodes' IDs. This way we get the same order on all nodes in the topology.
# nextNode() also has to group nodes on the same host and in the same subnet. This can be postponed and implemented after the other points are done.
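The ordering described in the points above could look roughly like the sketch below. It does not use the real Ignite classes: NodeStub is a hypothetical stand-in for TcpDiscoveryNode, the region ID is assumed to be a numeric attribute value, and this nextNode only illustrates the idea of picking the successor in a region-then-ID order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.UUID;

final class RegionRingSketch {
    /** Hypothetical stand-in for a discovery node; not the real TcpDiscoveryNode. */
    static final class NodeStub {
        final UUID id;
        final long regionId; // assumed numeric value of the CLUSTER_REGION_ID attribute

        NodeStub(UUID id, long regionId) {
            this.id = id;
            this.regionId = regionId;
        }
    }

    /** Region first, node ID as tie-breaker, so every node computes the same order. */
    static final Comparator<NodeStub> REGION_THEN_ID =
        Comparator.comparingLong((NodeStub n) -> n.regionId)
                  .thenComparing((NodeStub n) -> n.id);

    /** Returns the successor of locNode in the sorted ring, wrapping around at the end. */
    static NodeStub nextNode(List<NodeStub> nodes, NodeStub locNode) {
        List<NodeStub> sorted = new ArrayList<>(nodes);
        sorted.sort(REGION_THEN_ID);
        int idx = sorted.indexOf(locNode); // identity lookup, NodeStub does not override equals
        if (idx < 0)
            throw new IllegalArgumentException("Local node is not in the ring");
        return sorted.get((idx + 1) % sorted.size());
    }
}
```

Because the comparator never treats two distinct nodes as equal, the result does not depend on the order in which a node happened to learn about the others.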


  was:
h3. Main description:
Cluster nodes are connected in a ring.
For example, suppose we have 6 nodes: A, B, C, D, E, F.
They can form the ring in any order: A-B-C-D-E-F-A, or A-F-B-E-C-D-A, etc.
If a node leaves the topology, its adjacent nodes must reconnect.
If nodes A, B, C are in one physical location and nodes D, E, F are in another, and the two locations lose connectivity to each other, the number of reconnections depends on the order of the ring.
In the best case, with the ring A-B-CxD-E-FxA ('x' marks a broken link), only one reconnection is needed (C connects to A, or F connects to D, depending on which part of the cluster stays alive).
In the worst case, with a badly ordered ring such as AxFxBxExCxDxA, many reconnections are needed (A to B, B to C, C to A; in general n/2 reconnections, where n is the number of nodes).
h3. Approach:
We need to develop an approach for inserting a new node at the correct place so that a correct ring topology is created.
h3. Solutions:
The main idea is to sort nodes by latency.
* Group nodes into arcs by an ARC_ID (assigned manually?).
* Implement a NodeComparator (nodes on the same host, then nodes on the same subnet, then other nodes). We will use it when connecting a new node.
* [dev list thread|http://mail-archives.apache.org/mod_mbox/ignite-dev/201612.mbox/%3CCAN+WSNyWYXSXEBpGErVt72zTgi2pTQzUWLv8JY=Ke83-5-Rh9g@mail.gmail.com%3E]


> Improvement of connection in a cluster of new node
> --------------------------------------------------
>
>                 Key: IGNITE-4501
>                 URL: https://issues.apache.org/jira/browse/IGNITE-4501
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Vyacheslav Daradur
>            Assignee: Alexander Menshikov



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)