Posted to user@ignite.apache.org by Lukas Lentner <ko...@lukaslentner.de> on 2017/01/24 18:47:11 UTC

Connection problems

Hi,

My problem is that my 2 nodes keep losing their connection to each other. What can I do to make the connection more stable?

My Log:

[ignite-update-notifier-timer] INFO org.apache.ignite.internal.processors.cluster.GridUpdateNotifier - Update status is not available.
[tcp-disco-msg-worker-#2%null%] WARN org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi - Timed out waiting for message delivery receipt (most probably, the reason is in long GC pauses on remote node; consider tuning GC and increasing 'ackTimeout' configuration property). Will retry to send message with increased timeout. Current timeout: 10000.
[tcp-disco-msg-worker-#2%null%] WARN org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi - Failed to send message to next node [msg=TcpDiscoveryStatusCheckMessage [creatorNode=TcpDiscoveryNode [id=663d9d58-49ad-4993-9495-2304120958c3, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 172.17.0.6], sockAddrs=[/0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500, 86f7fe7944ed/172.17.0.6:47500], discPort=47500, order=2, intOrder=2, lastExchangeTime=1485283294557, loc=true, ver=1.7.0#20160801-sha1:383273e3, isClient=false], failedNodeId=null, status=0, super=TcpDiscoveryAbstractMessage [sndNodeId=null, id=18e26c1d951-663d9d58-49ad-4993-9495-2304120958c3, verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null, isClient=false]], next=TcpDiscoveryNode [id=0392e735-de23-475d-bae3-dfcb2512d586, addrs=[127.0.0.1, 172.17.0.5], sockAddrs=[483aeb5c0a2b/172.17.0.5:47500, /127.0.0.1:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1485283222882, loc=false, ver=1.7.0#20160801-sha1:383273e3, isClient=false], errMsg=Failed to send message to next node [msg=TcpDiscoveryStatusCheckMessage [creatorNode=TcpDiscoveryNode [id=663d9d58-49ad-4993-9495-2304120958c3, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 172.17.0.6], sockAddrs=[/0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500, 86f7fe7944ed/172.17.0.6:47500], discPort=47500, order=2, intOrder=2, lastExchangeTime=1485283294557, loc=true, ver=1.7.0#20160801-sha1:383273e3, isClient=false], failedNodeId=null, status=0, super=TcpDiscoveryAbstractMessage [sndNodeId=null, id=18e26c1d951-663d9d58-49ad-4993-9495-2304120958c3, verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null, isClient=false]], next=ClusterNode [id=0392e735-de23-475d-bae3-dfcb2512d586, order=1, addr=[127.0.0.1, 172.17.0.5], daemon=false]]]
[tcp-disco-msg-worker-#2%null%] WARN org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi - Local node has detected failed nodes and started cluster-wide procedure. To speed up failure detection please see 'Failure Detection' section under javadoc for 'TcpDiscoverySpi'
[disco-event-worker-#43%null%] WARN org.apache.ignite.internal.managers.discovery.GridDiscoveryManager - Node FAILED: TcpDiscoveryNode [id=0392e735-de23-475d-bae3-dfcb2512d586, addrs=[127.0.0.1, 172.17.0.5], sockAddrs=[483aeb5c0a2b/172.17.0.5:47500, /127.0.0.1:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1485283222882, loc=false, ver=1.7.0#20160801-sha1:383273e3, isClient=false]
[disco-event-worker-#43%null%] INFO org.apache.ignite.internal.managers.discovery.GridDiscoveryManager - Topology snapshot [ver=3, servers=1, clients=0, CPUs=1, heap=0.47GB]
[exchange-worker-#45%null%] INFO org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager - Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=3, minorTopVer=0], evt=NODE_FAILED, node=0392e735-de23-475d-bae3-dfcb2512d586]
[tcp-disco-msg-worker-#2%null%] WARN org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi - Node is out of topology (probably, due to short-time network problems).
[disco-event-worker-#43%null%] WARN org.apache.ignite.internal.managers.discovery.GridDiscoveryManager - Local node SEGMENTED: TcpDiscoveryNode [id=663d9d58-49ad-4993-9495-2304120958c3, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 172.17.0.6], sockAddrs=[/0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500, 86f7fe7944ed/172.17.0.6:47500], discPort=47500, order=2, intOrder=2, lastExchangeTime=1485283316748, loc=true, ver=1.7.0#20160801-sha1:383273e3, isClient=false]
[disco-event-worker-#43%null%] WARN org.apache.ignite.internal.managers.discovery.GridDiscoveryManager - Stopping local node according to configured segmentation policy.
[Thread-4] INFO org.apache.ignite.internal.processors.rest.protocols.tcp.GridTcpRestProtocol - Command protocol successfully stopped: TCP binary
[Thread-4] INFO org.apache.ignite.internal.processors.cache.GridCacheProcessor - Stopped cache: ignite-marshaller-sys-cache
[Thread-4] INFO org.apache.ignite.internal.processors.cache.GridCacheProcessor - Stopped cache: ignite-sys-cache
[Thread-4] INFO org.apache.ignite.internal.processors.cache.GridCacheProcessor - Stopped cache: ignite-atomics-sys-cache
[Thread-4] INFO org.apache.ignite.internal.IgniteKernal -

>>> +---------------------------------------------------------------------------------+
>>> Ignite ver. 1.7.0#20160801-sha1:383273e3f66f702de2482466dce954d570a8ccf2 stopped OK
>>> +---------------------------------------------------------------------------------+
>>> Grid uptime: 00:01:33:365



----

Lukas Lentner, B. Sc.
St.-Cajetan-Straße 13
81669 München
Deutschland
Fon:     +49 / 89  / 71 67 44 96
Mobile:  +49 / 176 / 24 77 09 22
E-Mail:  Kontakt@LukasLentner.de
Website: www.LukasLentner.de

IBAN:    DE41 7019 0000 0002 1125 58
BIC:     GENODEF1M01 (Münchner Bank)


Re: Connection problems

Posted by vkulichenko <va...@gmail.com>.
Lukas,

Check that the nodes can connect to each other (i.e. there are no network
issues, no firewall is blocking traffic, the required ports are open, etc.).
Another possible reason is long GC pauses - make sure that you have enough
heap memory.

-Val
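
The connectivity check suggested above can be sketched with a plain TCP probe, run from one node's host (or container) against the other. This is only a minimal sketch using the JDK, not an Ignite API: the class name DiscoveryPortProbe and the 2000 ms timeout are made up for illustration, and the default target 172.17.0.5:47500 is simply the peer address and discovery port that appear in the log. A successful connect shows only that the port is reachable at that moment; it says nothing about GC pauses on the remote node.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class DiscoveryPortProbe {

    /** Returns true if a TCP connection to host:port is established within timeoutMs. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            // connect() throws an IOException on refusal, timeout, or an unresolvable host.
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are taken from the log in this thread; pass your own host and port as arguments.
        String host = args.length > 0 ? args[0] : "172.17.0.5";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 47500;

        boolean reachable = canConnect(host, port, 2000);
        System.out.println(host + ":" + port + (reachable ? " is reachable" : " is NOT reachable"));
    }
}
```

Running it in both directions (node A to node B and node B to node A) is worthwhile, since a firewall rule may block only one direction. The same probe can be pointed at whatever port the nodes use for communication, if that differs from the discovery port.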


