Posted to dev@ignite.apache.org by "Vladislav Pyatkov (JIRA)" <ji...@apache.org> on 2016/12/26 11:24:58 UTC
[jira] [Created] (IGNITE-4491) Communication loss between two nodes
leads to hang of the whole cluster.
Vladislav Pyatkov created IGNITE-4491:
-----------------------------------------
Summary: Communication loss between two nodes leads to hang of the whole cluster.
Key: IGNITE-4491
URL: https://issues.apache.org/jira/browse/IGNITE-4491
Project: Ignite
Issue Type: Bug
Affects Versions: 1.8
Reporter: Vladislav Pyatkov
Priority: Critical
Reproduction steps:
1) Start the nodes:
   DC1                  DC2
   1 (10.116.172.1)     8 (10.116.64.11)
   2 (10.116.172.2)     7 (10.116.64.12)
   3 (10.116.172.3)     6 (10.116.64.13)
   4 (10.116.172.4)     5 (10.116.64.14)
   Each node has a client running on the same host as the server (see the source in the attachment).
2) Drop the connections by blocking all incoming and outgoing traffic between two pairs of nodes:
   Between nodes 1 and 8:
   1 (10.116.172.1)     8 (10.116.64.11)
   Run on 10.116.172.1:
   iptables -A INPUT -s 10.116.64.11 -j DROP
   iptables -A OUTPUT -d 10.116.64.11 -j DROP
   Between nodes 4 and 5:
   4 (10.116.172.4)     5 (10.116.64.14)
   Run on 10.116.172.4:
   iptables -A INPUT -s 10.116.64.14 -j DROP
   iptables -A OUTPUT -d 10.116.64.14 -j DROP
3) Stop the grid after several seconds.
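The rules from step 2 can be generated and later undone with a small helper. This is only a sketch: the function name partition_rules is hypothetical and not part of the attachment, and it prints the commands (a dry run) instead of executing them, so it can be checked without root privileges. It relies on the standard iptables -A (append) and -D (delete by rule specification) options.

```shell
#!/bin/sh
# Hypothetical helper: print the iptables commands that block (-A) or
# unblock (-D) all traffic between the local host and one peer.
partition_rules() {
    action="$1"   # -A appends the DROP rules, -D deletes the same rules
    peer="$2"     # IP address of the remote node
    echo "iptables $action INPUT -s $peer -j DROP"
    echo "iptables $action OUTPUT -d $peer -j DROP"
}

# On 10.116.172.1 (node 1): cut the link to node 8.
partition_rules -A 10.116.64.11

# On 10.116.172.4 (node 4): cut the link to node 5.
partition_rules -A 10.116.64.14

# After the test: the matching commands that remove the rules again.
partition_rules -D 10.116.64.11
partition_rules -D 10.116.64.14
```

To apply the rules for real, the printed lines can be piped to a root shell on the corresponding host, for example: partition_rules -A 10.116.64.11 | sudo sh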
In the logs you can see which nodes were segmented after the traffic was dropped (note which clients were not segmented):
[12:04:33,914][INFO][disco-event-worker-#211%null%][GridDiscoveryManager] Topology snapshot [ver=18, servers=6, clients=8, CPUs=456, heap=68.0GB]
All operations stopped at the same time.
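To compare topology snapshots across node logs, the counters can be pulled out of such a line with sed. The following is a minimal sketch, assuming the "Topology snapshot [key=value, ...]" layout shown above; the function name topology_field is hypothetical.

```shell
#!/bin/sh
# Sample line copied from the log excerpt in this report.
line='[12:04:33,914][INFO][disco-event-worker-#211%null%][GridDiscoveryManager] Topology snapshot [ver=18, servers=6, clients=8, CPUs=456, heap=68.0GB]'

# Hypothetical helper: print the value of one key=value field.
# $1 is the field name, $2 is the log line. The key must be preceded by
# "[" or a space so that "ver" does not match inside "servers".
topology_field() {
    printf '%s\n' "$2" | sed -n "s/.*[[ ]$1=\([^],]*\).*/\1/p"
}

echo "servers: $(topology_field servers "$line")"
echo "clients: $(topology_field clients "$line")"
```

For the sample line this reports 6 servers and 8 clients: two of the eight servers started in step 1 have left the topology, while all eight clients remain.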