Posted to issues@ignite.apache.org by "Mahesh Renduchintala (JIRA)" <ji...@apache.org> on 2019/01/04 06:53:00 UTC

[jira] [Updated] (IGNITE-8728) Healthy nodes fail after a failed node recovers and rejoins the baseline topology

     [ https://issues.apache.org/jira/browse/IGNITE-8728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahesh Renduchintala updated IGNITE-8728:
-----------------------------------------
    Affects Version/s:     (was: 2.5)
                       2.7

> Healthy nodes fail after a failed node recovers and rejoins the baseline topology
> ---------------------------------------------------------------------------------
>
>                 Key: IGNITE-8728
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8728
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.7
>            Reporter: Mahesh Renduchintala
>            Priority: Major
>
> I have two nodes hosting three partitioned tables. Indexes are also built on these tables.
> The caches work fine for 24 hours, and the tables are definitely distributed across both nodes.
> When Node 2 reboots and the Ignite service starts on it again, Node 1 crashes with the log below.
>  
> [10:38:35,437][INFO][tcp-disco-srvr-#2|#2][TcpDiscoverySpi] TCP discovery accepted incoming connection [rmtAddr=/192.168.1.7, rmtPort=45102]
>  [10:38:35,437][INFO][tcp-disco-srvr-#2|#2][TcpDiscoverySpi] TCP discovery spawning a new thread for connection [rmtAddr=/192.168.1.7, rmtPort=45102]
>  [10:38:35,437][INFO][tcp-disco-sock-reader-#12|#12][TcpDiscoverySpi] Started serving remote node connection [rmtAddr=/192.168.1.7:45102, rmtPort=45102]
>  [10:38:35,451][INFO][tcp-disco-sock-reader-#12|#12][TcpDiscoverySpi] Finished serving remote node connection [rmtAddr=/192.168.1.7:45102, rmtPort=45102]
>  [10:38:35,457][SEVERE][tcp-disco-msg-worker-#3|#3][TcpDiscoverySpi] TcpDiscoverSpi's message worker thread failed abnormally. Stopping the node in order to prevent cluster wide instability.
>  java.lang.IllegalStateException: Duplicate key
>  at org.apache.ignite.cache.QueryEntity.checkIndexes(QueryEntity.java:223)
>  at org.apache.ignite.cache.QueryEntity.makePatch(QueryEntity.java:174)
>  at org.apache.ignite.internal.processors.query.QuerySchema.makePatch(QuerySchema.java:114)
>  at org.apache.ignite.internal.processors.cache.DynamicCacheDescriptor.makeSchemaPatch(DynamicCacheDescriptor.java:360)
>  at org.apache.ignite.internal.processors.cache.GridCacheProcessor.validateNode(GridCacheProcessor.java:2536)
>  at org.apache.ignite.internal.managers.GridManagerAdapter$1.validateNode(GridManagerAdapter.java:566)
>  at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processJoinRequestMessage(ServerImpl.java:3629)
>  at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2736)
>  at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2536)
>  at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:6775)
>  at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2621)
>  at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
>  [10:38:35,459][SEVERE][tcp-disco-msg-worker-#3|#3][] Critical system error detected. Will be handled accordingly to configured handler [hnd=class o.a.i.failure.StopNodeOrHaltFailureHandler, failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=java.lang.IllegalStateException: Duplicate key]]
>  java.lang.IllegalStateException: Duplicate key
>  at org.apache.ignite.cache.QueryEntity.checkIndexes(QueryEntity.java:223)
>  at org.apache.ignite.cache.QueryEntity.makePatch(QueryEntity.java:174)
>  at org.apache.ignite.internal.processors.query.QuerySchema.makePatch(QuerySchema.java:114)
>  at org.apache.ignite.internal.processors.cache.DynamicCacheDescriptor.makeSchemaPatch(DynamicCacheDescriptor.java:360)
>  at org.apache.ignite.internal.processors.cache.GridCacheProcessor.validateNode(GridCacheProcessor.java:2536)
>  at org.apache.ignite.internal.managers.GridManagerAdapter$1.validateNode(GridManagerAdapter.java:566)
>  at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processJoinRequestMessage(ServerImpl.java:3629)
>  at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2736)
>  at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2536)
>  at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:6775)
>  at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2621)
>  at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
>  [10:38:35,460][SEVERE][tcp-disco-msg-worker-#3|#3][] JVM will be halted immediately due to the failure: [failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=java.lang.IllegalStateException: Duplicate key]]
>  
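
The top frame of the trace, QueryEntity.checkIndexes, together with the message "Duplicate key", suggests that validation of the joining node found two index definitions with the same name. The following is a minimal Java sketch of that kind of check, not Ignite's actual code: the class DuplicateIndexCheck, the IndexDef type, and the index name IDX_PERSON_NAME are hypothetical and are used only for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DuplicateIndexCheck {
    /** Hypothetical stand-in for an index definition; this is not Ignite's QueryIndex. */
    static final class IndexDef {
        final String name;
        final String field;

        IndexDef(String name, String field) {
            this.name = name;
            this.field = field;
        }
    }

    /** Maps index names to definitions and rejects a repeated name. */
    static Map<String, IndexDef> checkIndexes(List<IndexDef> indexes) {
        Map<String, IndexDef> byName = new HashMap<>();

        for (IndexDef idx : indexes) {
            // put() returns the previous mapping, so non-null means the name was already seen.
            if (byName.put(idx.name, idx) != null)
                throw new IllegalStateException("Duplicate key");
        }

        return byName;
    }

    public static void main(String[] args) {
        // Assumed scenario: the same index name appears twice in one entity's index list.
        List<IndexDef> indexes = Arrays.asList(
            new IndexDef("IDX_PERSON_NAME", "name"),
            new IndexDef("IDX_PERSON_NAME", "name"));

        try {
            checkIndexes(indexes);
        }
        catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

In this sketch the exception is caught and printed. In the log above the exception escapes the discovery message worker instead, which is why the failure handler halts the JVM on Node 1.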



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)