Posted to issues@ignite.apache.org by "Denis Magda (JIRA)" <ji...@apache.org> on 2015/08/14 15:36:46 UTC

[jira] [Comment Edited] (IGNITE-1241) Server node prints out failure detection warning if a client node connected

    [ https://issues.apache.org/jira/browse/IGNITE-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697004#comment-14697004 ] 

Denis Magda edited comment on IGNITE-1241 at 8/14/15 1:36 PM:
--------------------------------------------------------------

This warning is printed out when the local server node is considered to be disconnected from the ring, that is, when it neither receives messages nor sends them to the next node.

In the buggy configuration (one server node and one client node) the server periodically received messages from the client node but was unable to send connection check messages to the next node because there was none. This affected the failure detection timeout logic.

As a fix, the failure should be detected and reported only when there are remote server nodes in the topology and the local node seems to be disconnected from them. To support this, the {{TcpDiscovery.hasRemoteServerNodes()}} method was implemented and is now used by the failure detection timeout logic.
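
To illustrate the idea, here is a minimal, self-contained sketch of the guard. It is not the actual Ignite code: the class {{FailureDetectionSketch}}, the nested {{Node}} type and the method {{shouldReportDisconnect()}} are hypothetical names made up for this example, and only {{hasRemoteServerNodes()}} mirrors the name of the method mentioned above. The sketch assumes the check boils down to "time since the last ring activity has reached the failure detection timeout".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the guard described above; not Ignite's real classes. */
public class FailureDetectionSketch {
    /** Minimal stand-in for a discovery node: only the client flag matters here. */
    public static final class Node {
        public final String id;
        public final boolean client;

        public Node(String id, boolean client) {
            this.id = id;
            this.client = client;
        }
    }

    private final Node locNode;
    private final List<Node> topology = new ArrayList<>();
    private final long failureDetectionTimeout;

    public FailureDetectionSketch(Node locNode, long failureDetectionTimeout) {
        this.locNode = locNode;
        this.failureDetectionTimeout = failureDetectionTimeout;
        topology.add(locNode);
    }

    /** Adds a node (server or client) to the known topology. */
    public void addNode(Node node) {
        topology.add(node);
    }

    /** Returns true if the topology contains a server node other than the local one. */
    public boolean hasRemoteServerNodes() {
        for (Node n : topology) {
            if (n != locNode && !n.client)
                return true;
        }
        return false;
    }

    /**
     * Decides whether the "seems to be disconnected" warning should be printed.
     * Assumption: lastRingActivityMs is the time of the last message exchanged
     * with another server node in the ring.
     */
    public boolean shouldReportDisconnect(long nowMs, long lastRingActivityMs) {
        // With no remote servers there is no next node in the ring,
        // so a silent ring says nothing about connectivity.
        if (!hasRemoteServerNodes())
            return false;

        return nowMs - lastRingActivityMs >= failureDetectionTimeout;
    }

    public static void main(String[] args) {
        Node srv = new Node("server-1", false);
        FailureDetectionSketch disco = new FailureDetectionSketch(srv, 10_000L);

        // The configuration from this issue: one server plus one client.
        disco.addNode(new Node("client-1", true));
        System.out.println(disco.shouldReportDisconnect(20_000L, 0L));

        // Once a second server joins, a silent ring becomes meaningful.
        disco.addNode(new Node("server-2", false));
        System.out.println(disco.shouldReportDisconnect(20_000L, 0L));
    }
}
```

In this model the one-server-plus-one-client topology never triggers the warning, no matter how long the ring has been silent, while the timeout comparison still applies as soon as a second server node is present.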


> Server node prints out failure detection warning if a client node connected
> ---------------------------------------------------------------------------
>
>                 Key: IGNITE-1241
>                 URL: https://issues.apache.org/jira/browse/IGNITE-1241
>             Project: Ignite
>          Issue Type: Bug
>          Components: general
>    Affects Versions: ignite-1.4
>            Reporter: Sergey Kozlov
>            Assignee: Denis Magda
>            Priority: Critical
>             Fix For: ignite-1.4
>
>
> 1. Start 1 server node.
> 2. Start 1 client node.
> 3. Server node prints out the following message a few seconds after the topology update:
> {noformat}
> [16:57:49,376][INFO][disco-event-worker-#45%null%][GridDiscoveryManager] Added new node to topology: TcpDiscoveryNode [id=531641e9-a279-41d4-b2ad-75bcecb1f8b3, addrs=[0:0:0:0:0:0:0:1, 10.0.0.9, 127.0.0.1, 192.168.222.100, 2001:0:5ef5:79fd:34d0:3c29:f5ff:fff6], sockAddrs=[rr/192.168.222.100:0, /0:0:0:0:0:0:0:1:0, rr/192.168.222.100:0, /10.0.0.9:0, rr/192.168.222.100:0, /127.0.0.1:0, /192.168.222.100:0, /2001:0:5ef5:79fd:34d0:3c29:f5ff:fff6:0], discPort=0, order=2, intOrder=2, lastDataReceivedTime=1439387869353, loc=false, ver=1.4.1#20150812-sha1:d5986c26, isClient=true]
> [16:57:49,381][INFO][disco-event-worker-#45%null%][GridDiscoveryManager] Topology snapshot [ver=2, servers=1, clients=1, CPUs=8, heap=2.0GB]
> [16:57:59,362][INFO][tcp-disco-msg-worker-#2%null][TcpDiscoverySpi] Local node seems to be disconnected from topology (failure detection timeout is reached): [failureDetectionTimeout=10000, connCheckFreq=3333]
> [16:57:59,464][INFO][tcp-disco-msg-worker-#2%null][TcpDiscoverySpi] Local node seems to be disconnected from topology (failure detection timeout is reached): [failureDetectionTimeout=10000, connCheckFreq=3333]
> [16:58:01,464][INFO][tcp-disco-msg-worker-#2%null][TcpDiscoverySpi] Local node seems to be disconnected from topology (failure detection timeout is reached): [failureDetectionTimeout=10000, connCheckFreq=3333]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)