Posted to dev@ignite.apache.org by "Semen Boikov (JIRA)" <ji...@apache.org> on 2016/07/29 15:03:20 UTC

[jira] [Created] (IGNITE-3606) Node sometimes fails to detect broken connection

Semen Boikov created IGNITE-3606:
------------------------------------

             Summary: Node sometimes fails to detect broken connection
                 Key: IGNITE-3606
                 URL: https://issues.apache.org/jira/browse/IGNITE-3606
             Project: Ignite
          Issue Type: Bug
          Components: general
            Reporter: Semen Boikov
            Priority: Critical
             Fix For: 1.8


Here is a test that reproduces the issue: https://github.com/rossdanderson/IgniteDeadlock

When I run this test, I observe the following sequence:
- the server starts
- the client starts
- the server sends 2000 messages to the client; on the client node, communication backpressure pauses reads
- the server gets a write timeout and closes the socket
- for some reason the client does not detect that the existing connection was broken and still considers it established (most probably because reads are paused and the node never tries to access the connection)
- when the server tries to reconnect, the client sees that a connection is already established and rejects the new one, so the server keeps retrying and never exits the reconnect loop:
{noformat}
"main" prio=6 tid=0x0000000001f4a000 nid=0x3588 waiting on condition [0x00000000021ed000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
	at java.lang.Thread.sleep(Native Method)
	at org.apache.ignite.internal.util.IgniteUtils.sleep(IgniteUtils.java:7414)
	at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2055)
	at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:1970)
	at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:1936)
	at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1304)
	at org.apache.ignite.internal.managers.communication.GridIoManager.sendOrderedMessage(GridIoManager.java:1540)
{noformat}
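
The suspected cause (a peer that is not reading never learns that the other side closed the socket) can be illustrated outside Ignite. The following is only a minimal sketch with plain blocking java.net sockets on loopback; it does not use TcpCommunicationSpi or Ignite's NIO layer, and the class name PausedReadDemo and its Result holder are hypothetical names introduced for this illustration. It shows that the local socket state still looks healthy after the remote side closes, and that only an actual read observes end-of-stream.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical demo, not Ignite code: a socket that is not read from cannot see a remote close. */
public class PausedReadDemo {
    /** Holder for what the "client" side observed. */
    static final class Result {
        final boolean connectedBeforeRead;
        final boolean closedBeforeRead;
        final int readResult;

        Result(boolean connectedBeforeRead, boolean closedBeforeRead, int readResult) {
            this.connectedBeforeRead = connectedBeforeRead;
            this.closedBeforeRead = closedBeforeRead;
            this.readResult = readResult;
        }
    }

    static Result run() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();

        try (ServerSocket listener = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, listener.getLocalPort());
             Socket serverSide = listener.accept()) {
            // The "server" gives up on the connection and closes its end,
            // as it does after the write timeout in the report.
            serverSide.close();

            // The "client" has reads paused, so it only inspects local state.
            // Both flags describe the local socket, not the health of the link.
            boolean connected = client.isConnected();
            boolean closed = client.isClosed();

            // Only touching the stream reveals the close: read() returns -1 at end-of-stream.
            client.setSoTimeout(5000);
            int read = client.getInputStream().read();

            return new Result(connected, closed, read);
        }
    }

    public static void main(String[] args) throws IOException {
        Result r = run();

        System.out.println("isConnected before read: " + r.connectedBeforeRead);
        System.out.println("isClosed before read:    " + r.closedBeforeRead);
        System.out.println("read() result:           " + r.readResult);
    }
}
```

If the same effect is what happens here, one possible direction is for the client to probe or drop the existing connection when an incoming reconnect from the same node arrives, instead of rejecting it outright; that is an assumption to verify against the reproducer, not a confirmed fix.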




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)