Posted to user@ignite.apache.org by oguzhan <og...@gmail.com> on 2021/02/24 10:08:59 UTC

Client node disconnected

Hello,

We have one client node and one server node, running Ignite 2.9.1.

Our application is scheduled to run the same jobs every day. It runs
without errors for about two weeks, and then we get the error shown in the
log below. This recurs roughly every two weeks.

I hope you can help me solve this problem. Thanks and best regards.


2021-02-14 02:07:34 WARN  tcp-client-disco-reconnector-#7-#77756
TcpDiscoverySpi:576 - Failed to connect to any address from IP finder (will
retry to join topology every 2000 ms; change 'reconnectDelay' to configure
the frequency of retries): [/127.0.0.1:47500, /127.0.0.1:47501,
/127.0.0.1:47502, /127.0.0.1:47503, /127.0.0.1:47504, /127.0.0.1:47505,
/127.0.0.1:47506, /127.0.0.1:47507, /127.0.0.1:47508, /127.0.0.1:47509]
2021-02-14 02:07:37 INFO  grid-timeout-worker-#206 IgniteKernal:566 - 
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=2fefd66f, uptime=4 days, 13:33:34.341]
    ^-- Cluster [hosts=1, CPUs=16, servers=1, clients=1, topVer=2,
minorTopVer=18985]
    ^-- Network [addrs=[10.86.26.180, 127.0.0.1], discoPort=0,
commPort=47101]
    ^-- CPU [CPUs=16, curLoad=1.07%, avgLoad=0.05%, GC=0.1%]
    ^-- Heap [used=865MB, free=92.96%, comm=12274MB]
    ^-- Off-heap memory [used=0MB, free=100%, allocated=0MB]
    ^-- Page memory [pages=0]
    ^--   sysMemPlc region [type=internal, persistence=false,
lazyAlloc=false,
      ...  initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%,
allocRam=0MB]
    ^--   TxLog region [type=internal, persistence=false, lazyAlloc=false,
      ...  initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%,
allocRam=0MB]
    ^--   Default_Region region [type=default, persistence=false,
lazyAlloc=true,
      ...  initCfg=256MB, maxCfg=32768MB, usedRam=0MB, freeRam=100%,
allocRam=0MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=0, qSize=0]
    ^-- System thread pool [active=0, idle=81, qSize=0]
2021-02-14 02:07:38 ERROR tcp-client-disco-sock-writer-#2-#230
TcpDiscoverySpi:586 - Failed to send message: null
java.io.IOException: Failed to get acknowledge for message:
TcpDiscoveryClientMetricsUpdateMessage [super=TcpDiscoveryAbstractMessage
[sndNodeId=null, id=1d467368771-2fefd66f-0954-45dd-aa32-a33e58567950,
verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null,
isClient=true]]
	at
org.apache.ignite.spi.discovery.tcp.ClientImpl$SocketWriter.body(ClientImpl.java:1471)
	at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
2021-02-14 02:07:44 WARN  tcp-comm-worker-#1-#216 TcpCommunicationSpi:576 -
Handshake timed out (will stop attempts to perform the handshake)
[node=6953d599-d606-4781-a6ba-43de7aff59e4,
connTimeoutStrategy=ExponentialBackoffTimeoutStrategy [maxTimeout=600000,
totalTimeout=10000, startNanos=1671033974906026, currTimeout=600000],
err=Operation timed out [timeoutStrategy= ExponentialBackoffTimeoutStrategy
[maxTimeout=600000, totalTimeout=10000, startNanos=1671033974906026,
currTimeout=600000]], addr=/127.0.0.1:47100,
failureDetectionTimeoutEnabled=true, timeout=0]
2021-02-14 02:07:54 WARN  tcp-comm-worker-#1-#216 TcpCommunicationSpi:576 -
Handshake timed out (will stop attempts to perform the handshake)
[node=6953d599-d606-4781-a6ba-43de7aff59e4,
connTimeoutStrategy=ExponentialBackoffTimeoutStrategy [maxTimeout=600000,
totalTimeout=10000, startNanos=1671044002786218, currTimeout=600000],
err=Operation timed out [timeoutStrategy= ExponentialBackoffTimeoutStrategy
[maxTimeout=600000, totalTimeout=10000, startNanos=1671044002786218,
currTimeout=600000]], addr=dwccatp01/10.86.26.180:47100,
failureDetectionTimeoutEnabled=true, timeout=0]
2021-02-14 02:08:06 ERROR grid-timeout-worker-#206 G:581 - Blocked
system-critical thread has been detected. This can lead to cluster-wide
undefined behaviour [workerName=tcp-comm-worker,
threadName=tcp-comm-worker-#1-#216, blockedFor=11s]
2021-02-14 02:08:06 WARN  grid-timeout-worker-#206 root:576 - Possible
failure suppressed accordingly to a configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
	at
org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
	at
org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
	at
org.apache.ignite.spi.discovery.tcp.ClientImpl.pingNode(ClientImpl.java:449)
	at
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.pingNode(TcpDiscoverySpi.java:493)
	at
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.pingNode(GridDiscoveryManager.java:1688)
	at
org.apache.ignite.internal.managers.GridManagerAdapter$1.pingNode(GridManagerAdapter.java:409)
	at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.processDisconnect(TcpCommunicationSpi.java:5165)
	at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.body(TcpCommunicationSpi.java:4951)
	at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
	at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$5.body(TcpCommunicationSpi.java:2503)
	at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
[02:08:06] Possible failure suppressed accordingly to a configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
2021-02-14 02:08:07 WARN  grid-timeout-worker-#206
CacheDiagnosticManager:571 - Page locks dump:


2021-02-14 02:08:16 ERROR grid-timeout-worker-#206 G:581 - Blocked
system-critical thread has been detected. This can lead to cluster-wide
undefined behaviour [workerName=tcp-comm-worker,
threadName=tcp-comm-worker-#1-#216, blockedFor=21s]
2021-02-14 02:08:16 WARN  grid-timeout-worker-#206 root:576 - Possible
failure suppressed accordingly to a configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
	at
org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
	at
org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
	at
org.apache.ignite.spi.discovery.tcp.ClientImpl.pingNode(ClientImpl.java:449)
	at
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.pingNode(TcpDiscoverySpi.java:493)
	at
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.pingNode(GridDiscoveryManager.java:1688)
	at
org.apache.ignite.internal.managers.GridManagerAdapter$1.pingNode(GridManagerAdapter.java:409)
	at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.processDisconnect(TcpCommunicationSpi.java:5165)
	at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.body(TcpCommunicationSpi.java:4951)
	at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
	at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$5.body(TcpCommunicationSpi.java:2503)
	at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
[02:08:16] Possible failure suppressed accordingly to a configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
2021-02-14 02:08:16 WARN  grid-timeout-worker-#206
CacheDiagnosticManager:571 - Page locks dump:


2021-02-14 02:08:28 ERROR grid-timeout-worker-#206 G:581 - Blocked
system-critical thread has been detected. This can lead to cluster-wide
undefined behaviour [workerName=tcp-comm-worker,
threadName=tcp-comm-worker-#1-#216, blockedFor=33s]
2021-02-14 02:08:28 WARN  grid-timeout-worker-#206 root:576 - Possible
failure suppressed accordingly to a configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
	at
org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
	at
org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
	at
org.apache.ignite.spi.discovery.tcp.ClientImpl.pingNode(ClientImpl.java:449)
	at
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.pingNode(TcpDiscoverySpi.java:493)
	at
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.pingNode(GridDiscoveryManager.java:1688)
	at
org.apache.ignite.internal.managers.GridManagerAdapter$1.pingNode(GridManagerAdapter.java:409)
	at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.processDisconnect(TcpCommunicationSpi.java:5165)
	at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.body(TcpCommunicationSpi.java:4951)
	at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
	at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$5.body(TcpCommunicationSpi.java:2503)
	at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
[02:08:28] Possible failure suppressed accordingly to a configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
2021-02-14 02:08:28 WARN  grid-timeout-worker-#206
CacheDiagnosticManager:571 - Page locks dump:


2021-02-14 02:08:32 WARN  http-nio-8082-exec-5 TcpCommunicationSpi:576 -
Handshake timed out (will stop attempts to perform the handshake)
[node=6953d599-d606-4781-a6ba-43de7aff59e4,
connTimeoutStrategy=ExponentialBackoffTimeoutStrategy [maxTimeout=600000,
totalTimeout=10000, startNanos=1671081715938786, currTimeout=600000],
err=Operation timed out [timeoutStrategy= ExponentialBackoffTimeoutStrategy
[maxTimeout=600000, totalTimeout=10000, startNanos=1671081715938786,
currTimeout=600000]], addr=/127.0.0.1:47100,
failureDetectionTimeoutEnabled=true, timeout=0]
2021-02-14 02:08:37 ERROR grid-timeout-worker-#206 G:581 - Blocked
system-critical thread has been detected. This can lead to cluster-wide
undefined behaviour [workerName=tcp-comm-worker,
threadName=tcp-comm-worker-#1-#216, blockedFor=42s]
2021-02-14 02:08:37 WARN  grid-timeout-worker-#206 root:576 - Possible
failure suppressed accordingly to a configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
	at
org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
	at
org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
	at
org.apache.ignite.spi.discovery.tcp.ClientImpl.pingNode(ClientImpl.java:449)
	at
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.pingNode(TcpDiscoverySpi.java:493)
	at
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.pingNode(GridDiscoveryManager.java:1688)
	at
org.apache.ignite.internal.managers.GridManagerAdapter$1.pingNode(GridManagerAdapter.java:409)
	at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.processDisconnect(TcpCommunicationSpi.java:5165)
	at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.body(TcpCommunicationSpi.java:4951)
	at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
	at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$5.body(TcpCommunicationSpi.java:2503)
	at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
[02:08:37] Possible failure suppressed accordingly to a configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
2021-02-14 02:08:37 WARN  grid-timeout-worker-#206
CacheDiagnosticManager:571 - Page locks dump:





Re: Client node disconnected

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

You can set clientReconnectDisabled to 'true' on the client nodes. In this
case the client will not try to reconnect and will instead produce an
error. When you see this error, you can create a new client, which will
hopefully not have these problems.
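
As a rough illustration of the "create a new client" step, here is a
minimal Java sketch of a wrapper that discards a failed client and builds a
fresh one. It deliberately uses no Ignite classes: RecreatingClient, factory
and maxAttempts are hypothetical names, and the sketch assumes that a client
started with clientReconnectDisabled set to 'true' surfaces the disconnect
as a runtime exception on the next operation. With Ignite, the factory would
presumably wrap Ignition.start(...) with a client configuration whose
TcpDiscoverySpi has setClientReconnectDisabled(true).

```java
import java.util.function.Function;
import java.util.function.Supplier;

/**
 * Runs operations against a client and recreates the client after a failure.
 * C stands in for the Ignite client node; nothing in this class is Ignite API.
 */
final class RecreatingClient<C extends AutoCloseable> {
    // Hypothetical factory, e.g. () -> Ignition.start(clientCfg) in an Ignite application.
    private final Supplier<C> factory;
    private final int maxAttempts;
    private C client;
    private int restarts;

    RecreatingClient(Supplier<C> factory, int maxAttempts) {
        if (maxAttempts < 1)
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        this.factory = factory;
        this.maxAttempts = maxAttempts;
    }

    /** Applies op to the current client, replacing the client after each failed attempt. */
    synchronized <R> R call(Function<C, R> op) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (client == null)
                    client = factory.get();   // lazily create (or recreate) the client
                return op.apply(client);
            } catch (RuntimeException e) {
                // Assumption: any runtime exception means the client is unusable.
                // Real code should narrow this to the disconnect-related exception.
                last = e;
                discard();
            }
        }
        throw last; // every attempt failed; rethrow the most recent error
    }

    /** Number of clients that were closed and thrown away so far. */
    synchronized int restarts() {
        return restarts;
    }

    private void discard() {
        if (client != null) {
            try {
                client.close();
            } catch (Exception ignored) {
                // The client is already broken; a failing close() changes nothing.
            }
            client = null;
            restarts++;
        }
    }
}
```

Recreating the client lazily inside call(...) keeps the retry logic in one
place. It does not address the underlying cause of the disconnects, so it
is a workaround rather than a fix.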

Regards,
-- 
Ilya Kasnacheev


On Wed, 24 Feb 2021 at 14:14, Oğuzhan Melez <og...@gmail.com> wrote:

>
> Thank you. So what should I do? The client node disconnected after this
> error, and it cannot reconnect to the cluster until I restart my
> application, the client node and the server node. How can the client node
> reconnect to the cluster?
>
> On Wed, 24 Feb 2021 at 13:57, Ilya Kasnacheev <il...@gmail.com> wrote:
>
>> Hello!
>>
>> This looks like network problems, a long GC pause on the server node, or
>> some kind of deadlock on the server node that prevents it from responding.
>>
>> Regards,
>> --
>> Ilya Kasnacheev

Re: Client node disconnected

Posted by Oğuzhan Melez <og...@gmail.com>.
Thank you. So what should I do? The client node was disconnected after this
error and cannot reconnect to the cluster until I restart my application,
the client node, and the server node. How can the client node reconnect to
the cluster?
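
One way to narrow this down the next time the client fails to rejoin is to
check, from the client host, whether anything is still accepting TCP
connections on the discovery ports listed in the log of the original post
(127.0.0.1:47500 to 47509). The following is a minimal sketch using only the
JDK, not an Ignite API; the class name DiscoveryPortProbe, its method names,
and the 1000 ms timeout are assumptions made for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

/** Checks which TCP ports in a range accept connections from this host (hypothetical helper). */
final class DiscoveryPortProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, connect timeout, or unresolved host.
            return false;
        }
    }

    /** Returns the ports in [firstPort, lastPort] that accepted a connection. */
    static List<Integer> reachablePorts(String host, int firstPort, int lastPort, int timeoutMs) {
        List<Integer> open = new ArrayList<>();
        for (int port = firstPort; port <= lastPort; port++) {
            if (isReachable(host, port, timeoutMs)) {
                open.add(port);
            }
        }
        return open;
    }

    public static void main(String[] args) {
        // Host and port range are taken from the log; the timeout is an arbitrary choice.
        System.out.println("Reachable discovery ports: "
            + reachablePorts("127.0.0.1", 47500, 47509, 1000));
    }
}
```

A successful connect only shows that a listener exists on the port. It does
not prove that the server node is responsive, so an empty result points
towards a network or process problem, while a non-empty result leaves a
stalled server as a possibility.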

Ilya Kasnacheev <il...@gmail.com> wrote on Wed, 24 Feb 2021 at 13:57:

> Hello!
>
> This looks like network problems, a long GC pause on the server node, or
> some kind of deadlock on the server node that prevents it from responding.
>
> Regards,
> --
> Ilya Kasnacheev

Re: Client node disconnected

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

This looks like network problems, a long GC pause on the server node, or
some kind of deadlock on the server node that prevents it from responding.

Regards,
-- 
Ilya Kasnacheev
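
Two of the causes suggested above, long GC pauses and deadlocks, can be
checked from inside a JVM with the standard java.lang.management beans. The
following is a minimal sketch, not an Ignite feature; the class name
JvmHealthSnapshot and its methods are assumptions made for illustration, and
the code reports on the JVM it runs in, so it would have to run inside the
server process (or the same beans would have to be read remotely over JMX).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

/** Reads GC time and deadlock information for the current JVM (hypothetical helper). */
final class JvmHealthSnapshot {
    /** Approximate total time in milliseconds spent in GC since JVM start, summed over all collectors. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 if the collector does not report it
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    /** Number of threads currently deadlocked on monitors or ownable synchronizers. */
    static int deadlockedThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when no deadlock is found
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        System.out.println("GC time since JVM start (ms): " + totalGcTimeMillis());
        System.out.println("Deadlocked threads: " + deadlockedThreadCount());
    }
}
```

Sampling totalGcTimeMillis() twice and taking the difference gives a rough
figure for GC time over that interval. The deadlock check only finds cycles
of Java locks; a thread that is parked waiting on a future, like the
tcp-comm-worker in the stack traces, would not be reported by it.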

