Posted to user@ignite.apache.org by "Abhishek Gupta (BLOOMBERG/ 731 LEX)" <ag...@bloomberg.net> on 2019/09/19 22:17:18 UTC

Grid suddenly went into a bad state

Hello,
      I've got a 6-node grid with maxSize (DataRegionConfiguration) set to 300G on each node. The grid seemed to be performing normally until, at one point, it started logging the "Partition states validation has failed for group" warning - see the attached log file. This kept happening for about 3 minutes and then stopped (see line 85 in the attached log file). Just then a client seems to have connected (see line 135, where the connection was accepted). But soon after, it kept logging the exception below. After a while (~1 hour), it started logging "Partition states validation has failed for group" again (line 284).
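For reference, the data region is set up roughly along these lines (a simplified sketch; the region name and everything other than maxSize are illustrative, not our exact config):

import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

IgniteConfiguration cfg = new IgniteConfiguration();

DataStorageConfiguration storageCfg = new DataStorageConfiguration();
DataRegionConfiguration regionCfg = new DataRegionConfiguration();
regionCfg.setName("default");                     // illustrative name
regionCfg.setMaxSize(300L * 1024 * 1024 * 1024);  // 300 GB of off-heap memory per node
storageCfg.setDefaultDataRegionConfiguration(regionCfg);

cfg.setDataStorageConfiguration(storageCfg);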


2019-09-19 13:28:28.601 [INFO ] [exchange-worker-#176] GridDhtPartitionsExchangeFuture - Completed partition exchange [localNode=0c643dd0-a884-4fd0-acb3-a6d7e2c5e71d, exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion [topVer=126, minorTopVer=0], evt=NODE_JOINED, evtNode=ZookeeperClusterNode [id=af5f33f4-842a-4691-8e84-da4fb19eafb2, addrs=[10.126.90.78, 10.115.76.13, 127.0.0.1], order=126, loc=false, client=true], done=true], topVer=AffinityTopologyVersion [topVer=126, minorTopVer=0], durationFromInit=0]
2019-09-19 13:28:28.601 [INFO ] [exchange-worker-#176] time - Finished exchange init [topVer=AffinityTopologyVersion [topVer=126, minorTopVer=0], crd=true]
2019-09-19 13:28:28.602 [INFO ] [exchange-worker-#176] GridCachePartitionExchangeManager - Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=126, minorTopVer=0], force=false, evt=NODE_JOINED, node=af5f33f4-842a-4691-8e84-da4fb19eafb2]
2019-09-19 13:28:29.513 [INFO ] [grid-nio-worker-tcp-comm-14-#130] TcpCommunicationSpi - Accepted incoming communication connection [locAddr=/10.115.248.110:12122, rmtAddr=/10.115.76.13:45464]
2019-09-19 13:28:29.540 [INFO ] [grid-nio-worker-tcp-comm-15-#131] TcpCommunicationSpi - Accepted incoming communication connection [locAddr=/10.115.248.110:12122, rmtAddr=/10.115.76.13:45466]
2019-09-19 13:28:29.600 [INFO ] [grid-nio-worker-tcp-comm-16-#132] TcpCommunicationSpi - Accepted incoming communication connection [locAddr=/10.115.248.110:12122, rmtAddr=/10.115.76.13:45472]
2019-09-19 13:28:51.624 [ERROR] [grid-nio-worker-tcp-comm-17-#133] TcpCommunicationSpi - Failed to read data from remote connection (will wait for 2000ms).
org.apache.ignite.IgniteCheckedException: Failed to select events on selector.
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2182) ~[ignite-core-2.7.5-0-2.jar:2.7.5]
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1794) [ignite-core-2.7.5-0-2.jar:2.7.5]
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) [ignite-core-2.7.5-0-2.jar:2.7.5]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_172]
Caused by: java.nio.channels.ClosedChannelException
        at java.nio.channels.spi.AbstractSelectableChannel.register(AbstractSelectableChannel.java:197) ~[?:1.8.0_172]
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:1997) ~[ignite-core-2.7.5-0-2.jar:2.7.5]
        ... 3 more


After a lot of these exceptions and warnings, the node started throwing the error below (a client had started ingestion using a DataStreamer). The same exceptions were seen on all the nodes.
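For context, the ingestion follows the usual DataStreamer-with-receiver pattern, roughly like the sketch below (a simplified, hypothetical receiver standing in for the CassStreamReceiver that appears in the stack trace further down; cache name, key type, and the ignite/key/value variables are illustrative):

import java.util.Collection;
import java.util.Map;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.binary.BinaryObject;
import org.apache.ignite.stream.StreamReceiver;

// Hypothetical receiver; the real one builds BinaryObjects via BinaryObjectBuilder.
class ExampleReceiver implements StreamReceiver<String, BinaryObject> {
    @Override public void receive(IgniteCache<String, BinaryObject> cache,
        Collection<Map.Entry<String, BinaryObject>> entries) {
        for (Map.Entry<String, BinaryObject> e : entries)
            cache.put(e.getKey(), e.getValue());   // per-entry processing happens here
    }
}

// Client-side ingestion (ignite, key, value assumed to exist):
try (IgniteDataStreamer<String, BinaryObject> streamer = ignite.dataStreamer("myCache")) {
    streamer.receiver(new ExampleReceiver());
    streamer.addData(key, value);
}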

2019-09-19 15:10:38.922 [ERROR] [grid-timeout-worker-#115]  - Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-42, igniteInstanceName=null, finished=false, heartbeatTs=1568920228643]]]
org.apache.ignite.IgniteException: GridWorker [name=data-streamer-stripe-42, igniteInstanceName=null, finished=false, heartbeatTs=1568920228643]
        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1831) [ignite-core-2.7.5-0-2.jar:2.7.5]
        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1826) [ignite-core-2.7.5-0-2.jar:2.7.5]
        at org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:233) [ignite-core-2.7.5-0-2.jar:2.7.5]
        at org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297) [ignite-core-2.7.5-0-2.jar:2.7.5]
        at org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:221) [ignite-core-2.7.5-0-2.jar:2.7.5]
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) [ignite-core-2.7.5-0-2.jar:2.7.5]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_172]


There was very high GC activity around that time (~15:10), which would explain the last part. But this kept going on until the node went into a tailspin with the logs below.
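For reference, the handler shown in the SYSTEM_WORKER_BLOCKED message above corresponds roughly to a configuration like the following sketch (the ignored failure types match the log; the systemWorkerBlockedTimeout value here is just an example, not our actual setting):

import java.util.Arrays;
import java.util.HashSet;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.failure.FailureType;
import org.apache.ignite.failure.StopNodeOrHaltFailureHandler;

IgniteConfiguration cfg = new IgniteConfiguration();

// Matches the handler in the log: tryStop=false, timeout=0,
// SYSTEM_WORKER_BLOCKED and SYSTEM_CRITICAL_OPERATION_TIMEOUT are ignored (not fatal).
StopNodeOrHaltFailureHandler hnd = new StopNodeOrHaltFailureHandler(false, 0);
hnd.setIgnoredFailureTypes(new HashSet<>(Arrays.asList(
    FailureType.SYSTEM_WORKER_BLOCKED,
    FailureType.SYSTEM_CRITICAL_OPERATION_TIMEOUT)));
cfg.setFailureHandler(hnd);

// Example only: how long a worker may miss heartbeats before SYSTEM_WORKER_BLOCKED is raised.
cfg.setSystemWorkerBlockedTimeout(30_000);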


2019-09-19 17:34:27.550 [WARN ] [zk-null-EventThread] ZookeeperClient - Failed to execute ZooKeeper operation [err=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /apacheIgnite/cassgrid-nj/cp/389e297a-e3f5-4735-9194-5caaa14f8eb7:34d021e7-3f7e-4cc8-83aa-adad0de00553:0000, state=Disconnected]
2019-09-19 17:34:27.550 [WARN ] [zk-null-EventThread] ZookeeperClient - ZooKeeper operation failed, will retry [err=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /apacheIgnite/cassgrid-nj/cp/389e297a-e3f5-4735-9194-5caaa14f8eb7:34d021e7-3f7e-4cc8-83aa-adad0de00553:0000, retryTimeout=2000, connLossTimeout=30000, path=/apacheIgnite/cassgrid-nj/cp/389e297a-e3f5-4735-9194-5caaa14f8eb7:34d021e7-3f7e-4cc8-83aa-adad0de00553:0000, remainingWaitTime=20335]
2019-09-19 17:34:27.550 [WARN ] [data-streamer-stripe-41-#96] ZookeeperClient - Failed to execute ZooKeeper operation [err=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /apacheIgnite/cassgrid-nj/cp/d284ad22-03c4-4281-9221-6eddad9bf313:0c643dd0-a884-4fd0-acb3-a6d7e2c5e71d:0001, state=Disconnected]
2019-09-19 17:34:27.550 [ERROR] [data-streamer-stripe-41-#96] ZookeeperClient - Operation failed with unexpected error, close ZooKeeper client: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /apacheIgnite/cassgrid-nj/cp/d284ad22-03c4-4281-9221-6eddad9bf313:0c643dd0-a884-4fd0-acb3-a6d7e2c5e71d:0001
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /apacheIgnite/cassgrid-nj/cp/d284ad22-03c4-4281-9221-6eddad9bf313:0c643dd0-a884-4fd0-acb3-a6d7e2c5e71d:0001
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:102) ~[zookeeper-3.4.13.jar:3.4.13-2d71af4dbe22557fda74f9a9b4309b15a7487f03]
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:54) ~[zookeeper-3.4.13.jar:3.4.13-2d71af4dbe22557fda74f9a9b4309b15a7487f03]
        at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:792) ~[zookeeper-3.4.13.jar:3.4.13-2d71af4dbe22557fda74f9a9b4309b15a7487f03]
        at org.apache.ignite.spi.discovery.zk.internal.ZookeeperClient.createIfNeeded(ZookeeperClient.java:443) [ignite-zookeeper-2.7.5.jar:2.7.5]
        at org.apache.ignite.spi.discovery.zk.internal.ZookeeperDiscoveryImpl.saveMultipleParts(ZookeeperDiscoveryImpl.java:939) [ignite-zookeeper-2.7.5.jar:2.7.5]
        at org.apache.ignite.spi.discovery.zk.internal.ZookeeperDiscoveryImpl.saveCustomMessage(ZookeeperDiscoveryImpl.java:671) [ignite-zookeeper-2.7.5.jar:2.7.5]
        at org.apache.ignite.spi.discovery.zk.internal.ZookeeperDiscoveryImpl.sendCustomMessage(ZookeeperDiscoveryImpl.java:629) [ignite-zookeeper-2.7.5.jar:2.7.5]
        at org.apache.ignite.spi.discovery.zk.ZookeeperDiscoverySpi.sendCustomEvent(ZookeeperDiscoverySpi.java:414) [ignite-zookeeper-2.7.5.jar:2.7.5]
        at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.sendCustomEvent(GridDiscoveryManager.java:2207) [ignite-core-2.7.5-0-2.jar:2.7.5]
        at org.apache.ignite.internal.processors.cache.binary.BinaryMetadataTransport.requestMetadataUpdate(BinaryMetadataTransport.java:170) [ignite-core-2.7.5-0-2.jar:2.7.5]
        at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.addMeta(CacheObjectBinaryProcessorImpl.java:469) [ignite-core-2.7.5-0-2.jar:2.7.5]
        at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl$2.addMeta(CacheObjectBinaryProcessorImpl.java:186) [ignite-core-2.7.5-0-2.jar:2.7.5]
        at org.apache.ignite.internal.binary.BinaryContext.updateMetadata(BinaryContext.java:1332) [ignite-core-2.7.5-0-2.jar:2.7.5]
        at org.apache.ignite.internal.binary.builder.BinaryObjectBuilderImpl.serializeTo(BinaryObjectBuilderImpl.java:369) [ignite-core-2.7.5-0-2.jar:2.7.5]
        at org.apache.ignite.internal.binary.builder.BinaryObjectBuilderImpl.build(BinaryObjectBuilderImpl.java:190) [ignite-core-2.7.5-0-2.jar:2.7.5]
        at com.bloomberg.fdns.cass.ignition.datareader.CassStreamReceiver.receive(CassStreamReceiver.java:42) [datagrid-cass-dependencies.jar:?]
        at org.apache.ignite.internal.processors.datastreamer.DataStreamerUpdateJob.call(DataStreamerUpdateJob.java:137) [ignite-core-2.7.5-0-2.jar:2.7.5]
        at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.localUpdate(DataStreamProcessor.java:400) [ignite-core-2.7.5-0-2.jar:2.7.5]
....

2019-09-19 17:34:27.686 [WARN ] [data-streamer-stripe-41-#96] ZookeeperDiscoveryImpl - Connection to Zookeeper server is lost, local node SEGMENTED.
2019-09-19 17:34:27.687 [WARN ] [disco-event-worker-#175] GridDiscoveryManager - Local node SEGMENTED: ZookeeperClusterNode [id=0c643dd0-a884-4fd0-acb3-a6d7e2c5e71d, addrs=[10.115.248.110, 10.126.230.37, 127.0.0.1], order=109, loc=true, client=false]
2019-09-19 17:34:27.688 [ERROR] [disco-event-worker-#175]  - Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SEGMENTATION, err=null]]


Any help in understanding what triggered this series of events would be very helpful.


Thanks,
Abhishek



Re: Grid suddenly went into a bad state

Posted by Igor Belyakov <ig...@gmail.com>.
Hi,

Could you please also attach the logs from the other nodes? And which version
of Ignite are you currently using?

Also, you've mentioned high GC activity; is it possible to provide GC logs?
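On Java 8 they can be captured with JVM options along these lines (the file
path is just an example; adjust as needed):

-Xloggc:/var/log/ignite/gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M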

Regards,
Igor
