You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ignite.apache.org by "Nikolay Izhikov (Jira)" <ji...@apache.org> on 2020/04/16 15:29:00 UTC
[jira] [Updated] (IGNITE-12398) Apache Ignite Cluster(Amazon S3 Based Discovery) Nodes getting down if we connect Ignite Visor Command Line Interface

     [ https://issues.apache.org/jira/browse/IGNITE-12398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nikolay Izhikov updated IGNITE-12398:
-------------------------------------
    Description: 
We have Apache Ignite 3 node cluster setup with Amazon S3 Based Discovery. If we connect any one cluster node using Ignite Visor Command Line Interface it got hang and automatically cluster nodes(all the three nodes) are going down. Please find the below exception stacktrace.

{noformat}
[SEVERE][tcp-disco-msg-worker-#2%DataStoreIgniteCache%][] Critical system error detected. Will be handled accordingly to configured handler [hnd=NoOpFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]], failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=java.lang.NullPointerException]]
java.lang.NullPointerException
at org.apache.ignite.spi.discovery.tcp.ipfinder.s3.TcpDiscoveryS3IpFinder.key(TcpDiscoveryS3IpFinder.java:247)
at org.apache.ignite.spi.discovery.tcp.ipfinder.s3.TcpDiscoveryS3IpFinder.registerAddresses(TcpDiscoveryS3IpFinder.java:205)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processNodeAddFinishedMessage(ServerImpl.java:4616)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processNodeAddedMessage(ServerImpl.java:4232)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2816)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2611)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:7188)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2700)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7119)
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
[10:36:54,600][SEVERE][tcp-disco-msg-worker-#2%DataStoreIgniteCache%][] Critical system error detected. Will be handled accordingly to configured handler [hnd=NoOpFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]], failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=class o.a.i.IgniteException: GridWorker [name=tcp-disco-msg-worker, igniteInstanceName=DataStoreIgniteCache, finished=true, heartbeatTs=1574332614423]]]
class org.apache.ignite.IgniteException: GridWorker [name=tcp-disco-msg-worker, igniteInstanceName=DataStoreIgniteCache, finished=true, heartbeatTs=1574332614423]
at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1831)
at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1826)
at org.apache.ignite.internal.worker.WorkersRegistry.onStopped(WorkersRegistry.java:169)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:153)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7119)
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
[10:36:59] Ignite node stopped OK [name=DataStoreIgniteCache, uptime=00:01:13.934]
{noformat}

  was:
We have Apache Ignite 3 node cluster setup with Amazon S3 Based Discovery. If we connect any one cluster node using Ignite Visor Command Line Interface it got hang and automatically cluster nodes(all the three nodes) are going down. Please find the below exception stacktrace.

[SEVERE][tcp-disco-msg-worker-#2%DataStoreIgniteCache%][] Critical system error detected. Will be handled accordingly to configured handler [hnd=NoOpFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]], failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=java.lang.NullPointerException]]
java.lang.NullPointerException
at org.apache.ignite.spi.discovery.tcp.ipfinder.s3.TcpDiscoveryS3IpFinder.key(TcpDiscoveryS3IpFinder.java:247)
at org.apache.ignite.spi.discovery.tcp.ipfinder.s3.TcpDiscoveryS3IpFinder.registerAddresses(TcpDiscoveryS3IpFinder.java:205)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processNodeAddFinishedMessage(ServerImpl.java:4616)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processNodeAddedMessage(ServerImpl.java:4232)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2816)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2611)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:7188)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2700)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7119)
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
[10:36:54,600][SEVERE][tcp-disco-msg-worker-#2%DataStoreIgniteCache%][] Critical system error detected. Will be handled accordingly to configured handler [hnd=NoOpFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]], failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=class o.a.i.IgniteException: GridWorker [name=tcp-disco-msg-worker, igniteInstanceName=DataStoreIgniteCache, finished=true, heartbeatTs=1574332614423]]]
class org.apache.ignite.IgniteException: GridWorker [name=tcp-disco-msg-worker, igniteInstanceName=DataStoreIgniteCache, finished=true, heartbeatTs=1574332614423]
at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1831)
at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1826)
at org.apache.ignite.internal.worker.WorkersRegistry.onStopped(WorkersRegistry.java:169)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:153)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7119)
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
[10:36:59] Ignite node stopped OK [name=DataStoreIgniteCache, uptime=00:01:13.934]


> Apache Ignite Cluster(Amazon S3 Based Discovery) Nodes getting down if we connect Ignite Visor Command Line Interface
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-12398
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12398
>             Project: Ignite
>          Issue Type: Bug
>          Components: aws, general, s3, visor
>    Affects Versions: 2.7
>         Environment: Production
>            Reporter: Ravi Kumar Powli
>            Assignee: Emmanouil Gkatziouras
>            Priority: Blocker
>             Fix For: 2.8.1
>
>
> We have Apache Ignite 3 node cluster setup with Amazon S3 Based Discovery. If we connect any one cluster node using Ignite Visor Command Line Interface it got hang and automatically cluster nodes(all the three nodes) are going down. Please find the below exception stacktrace.
> {noformat}
> [SEVERE][tcp-disco-msg-worker-#2%DataStoreIgniteCache%][] Critical system error detected. Will be handled accordingly to configured handler [hnd=NoOpFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]], failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=java.lang.NullPointerException]]
> java.lang.NullPointerException
> at org.apache.ignite.spi.discovery.tcp.ipfinder.s3.TcpDiscoveryS3IpFinder.key(TcpDiscoveryS3IpFinder.java:247)
> at org.apache.ignite.spi.discovery.tcp.ipfinder.s3.TcpDiscoveryS3IpFinder.registerAddresses(TcpDiscoveryS3IpFinder.java:205)
> at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processNodeAddFinishedMessage(ServerImpl.java:4616)
> at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processNodeAddedMessage(ServerImpl.java:4232)
> at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2816)
> at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2611)
> at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:7188)
> at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2700)
> at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7119)
> at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
> [10:36:54,600][SEVERE][tcp-disco-msg-worker-#2%DataStoreIgniteCache%][] Critical system error detected. Will be handled accordingly to configured handler [hnd=NoOpFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]], failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=class o.a.i.IgniteException: GridWorker [name=tcp-disco-msg-worker, igniteInstanceName=DataStoreIgniteCache, finished=true, heartbeatTs=1574332614423]]]
> class org.apache.ignite.IgniteException: GridWorker [name=tcp-disco-msg-worker, igniteInstanceName=DataStoreIgniteCache, finished=true, heartbeatTs=1574332614423]
> at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1831)
> at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1826)
> at org.apache.ignite.internal.worker.WorkersRegistry.onStopped(WorkersRegistry.java:169)
> at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:153)
> at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7119)
> at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
> [10:36:59] Ignite node stopped OK [name=DataStoreIgniteCache, uptime=00:01:13.934]
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)