Posted to issues@ignite.apache.org by "Alexey Kukushkin (Jira)" <ji...@apache.org> on 2020/04/02 06:13:00 UTC
[jira] [Commented] (IGNITE-12828) Intermittent [Failed to notify direct custom event listener] exception on node shutdown
[ https://issues.apache.org/jira/browse/IGNITE-12828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17073403#comment-17073403 ]
Alexey Kukushkin commented on IGNITE-12828:
-------------------------------------------
The problem is caused by a lack of synchronization between node shutdown and continuous query handler initialization:
* The failure is an [NPE here|https://github.com/apache/ignite/blob/341b01dfd8abf2d9b01d468ad1bb26dfe84ac4f6/modules/core/src/main/java/org/apache/ignite/internal/processors/continuous/StartRoutineDiscoveryMessage.java#L95], which is [called from continuous query handler initialization|https://github.com/apache/ignite/blob/341b01dfd8abf2d9b01d468ad1bb26dfe84ac4f6/modules/core/src/main/java/org/apache/ignite/internal/processors/continuous/GridContinuousProcessor.java#L219].
* cntrs is set to null on node shutdown.
* There is no reliable synchronization on node shutdown. Instead, GridKernalContext#isStopping checks are spread all over the code to detect whether the node is shutting down.
* [Here|https://github.com/apache/ignite/blob/341b01dfd8abf2d9b01d468ad1bb26dfe84ac4f6/modules/core/src/main/java/org/apache/ignite/internal/processors/continuous/GridContinuousProcessor.java#L216] is the last node shutdown check in continuous query handler initialization. However, time-consuming operations happen between that check and the NPE, such as [deploying classes|https://github.com/apache/ignite/blob/341b01dfd8abf2d9b01d468ad1bb26dfe84ac4f6/modules/core/src/main/java/org/apache/ignite/internal/processors/continuous/GridContinuousProcessor.java#L1412]. The NPE occurs if node shutdown starts in parallel within that window.
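The race described above is a check-then-act gap: the stopping check passes, shutdown clears state during the slow step, and the later dereference fails. Below is a minimal Java sketch of that gap and of one common way to close it with a read-write lock. Everything in it is hypothetical: the class ShutdownGuardSketch and its methods are illustrative stand-ins, and only the field name cntrs is borrowed from the description above. It is not Ignite code and not necessarily how the attached patch addresses the issue.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a component whose state is cleared on shutdown. */
final class ShutdownGuardSketch {
    /** Plays the role of the counters that shutdown sets to null. */
    private volatile Map<Integer, Long> cntrs = new ConcurrentHashMap<>();

    /** Set once shutdown has begun. */
    private volatile boolean stopping;

    /** Initialization takes the read lock; shutdown takes the write lock. */
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    /** Check-then-act without synchronization: shutdown may slip in during slowStep. */
    boolean initUnguarded(Runnable slowStep) {
        if (stopping)
            return false;

        slowStep.run();   // stands in for time-consuming work such as class deployment
        cntrs.put(0, 1L); // NullPointerException if stop() ran during slowStep

        return true;
    }

    /** Same steps, but the check and the dereference happen under one read lock. */
    boolean initGuarded(Runnable slowStep) {
        lock.readLock().lock();
        try {
            if (stopping)
                return false;

            slowStep.run();
            cntrs.put(0, 1L); // safe: stop() cannot take the write lock yet

            return true;
        }
        finally {
            lock.readLock().unlock();
        }
    }

    /** Shutdown waits for in-flight guarded initializations before clearing state. */
    void stop() {
        lock.writeLock().lock();
        try {
            stopping = true;
            cntrs = null;
        }
        finally {
            lock.writeLock().unlock();
        }
    }

    /** Shutdown is simulated inside the slow step, so the unguarded path hits the NPE. */
    static boolean unguardedRaceThrowsNpe() {
        ShutdownGuardSketch node = new ShutdownGuardSketch();
        try {
            node.initUnguarded(node::stop);
            return false;
        }
        catch (NullPointerException e) {
            return true;
        }
    }

    /** Shutdown starts on another thread during the slow step and has to wait. */
    static boolean guardedInitCompletes() {
        ShutdownGuardSketch node = new ShutdownGuardSketch();
        Thread stopper = new Thread(node::stop);
        boolean initialized = node.initGuarded(stopper::start);
        try {
            stopper.join();
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        // After shutdown has finished, a new initialization is rejected cleanly.
        return initialized && !node.initGuarded(() -> { });
    }

    public static void main(String[] args) {
        System.out.println("unguarded NPE: " + unguardedRaceThrowsNpe());
        System.out.println("guarded init ok: " + guardedInitCompletes());
    }
}
```

Holding the read lock across the whole initialization lets several handlers initialize concurrently while stop() waits for the in-flight ones. The trade-off is that a slow step can delay shutdown, so a real implementation would have to weigh that against simply re-checking for null after the slow step.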
> Intermittent [Failed to notify direct custom event listener] exception on node shutdown
> ---------------------------------------------------------------------------------------
>
> Key: IGNITE-12828
> URL: https://issues.apache.org/jira/browse/IGNITE-12828
> Project: Ignite
> Issue Type: Bug
> Affects Versions: 2.8
> Reporter: Alexey Kukushkin
> Assignee: PetrovMikhail
> Priority: Major
> Labels: sbcf
> Attachments: ignite-12828-vs-2.8.patch
>
>
> +*Reproducer*+:
> Run a server node
> Run a client node that:
> * Creates cache "cache1"
> * Deploys a grid service that starts a continuous query against "cache1" in method init()
> * Leaves the cluster
> +*Actual result*+
> Intermittent exception in the client node:
> {noformat}
> [16:54:38,758][SEVERE][disco-notifier-worker-#43%CashFlowCluster_16b67e98563f4cfbac95ae055a00e67f%][GridDiscoveryManager] Failed to notify direct custom event listener: StartRoutineDiscoveryMessage [startReqData=StartRequestData [prjPred=sbt.cashflow.grid.services.cachefactory.ignite.NodeAttributeFilter@63ae71a9, clsName=null, depInfo=null, hnd=CacheContinuousQueryHandler [returnValTrans=o.a.i.i.processors.cache.query.continuous.CacheContinuousQueryHandler$1@594bf5b8, cacheName=CALC_REQUESTS, rmtFilter=null, rmtFilterDep=null, internal=false, notifyExisting=false, oldValRequired=true, sync=false, ignoreExpired=true, taskHash=0, skipPrimaryCheck=false, locOnly=false, keepBinary=true, ackBuf=null, cacheId=-1608655250, initTopVer=null, nodeLeft=false, ignoreClsNotFound=false, nodeId=null, routineId=null], bufSize=1, interval=0, autoUnsubscribe=true], keepBinary=true, routineId=021dd2ce-3d8a-41c1-a4d0-b625ea1284f4]
> java.lang.NullPointerException
> at org.apache.ignite.internal.processors.continuous.StartRoutineDiscoveryMessage.addUpdateCounters(StartRoutineDiscoveryMessage.java:82)
> at org.apache.ignite.internal.processors.continuous.StartRoutineDiscoveryMessage.addUpdateCounters(StartRoutineDiscoveryMessage.java:96)
> at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.processStartRequest(GridContinuousProcessor.java:1424)
> at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.access$400(GridContinuousProcessor.java:110)
> at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:202)
> at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:193)
> at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:722)
> at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:601)
> at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2683)
> at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2721)
> at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
> at java.lang.Thread.run(Thread.java:745)
> [16:54:39,725][SEVERE][disco-notifier-worker-#43%CashFlowCluster_16b67e98563f4cfbac95ae055a00e67f%][GridDiscoveryManager] Failed to notify direct custom event listener: StartRoutineDiscoveryMessage [startReqData=StartRequestData [prjPred=sbt.cashflow.grid.services.cachefactory.ignite.NodeAttributeFilter@7462c96c, clsName=null, depInfo=null, hnd=CacheContinuousQueryHandler [returnValTrans=o.a.i.i.processors.cache.query.continuous.CacheContinuousQueryHandler$1@6451dd70, cacheName=DISTRIBUTED_REQUESTS, rmtFilter=null, rmtFilterDep=null, internal=false, notifyExisting=false, oldValRequired=true, sync=false, ignoreExpired=true, taskHash=0, skipPrimaryCheck=false, locOnly=false, keepBinary=true, ackBuf=null, cacheId=1419803136, initTopVer=null, nodeLeft=false, ignoreClsNotFound=false, nodeId=null, routineId=null], bufSize=1, interval=0, autoUnsubscribe=true], keepBinary=true, routineId=1fca5f04-d220-49ac-850a-0d4527e22eef]
> java.lang.NullPointerException
> at org.apache.ignite.internal.processors.continuous.StartRoutineDiscoveryMessage.addUpdateCounters(StartRoutineDiscoveryMessage.java:82)
> at org.apache.ignite.internal.processors.continuous.StartRoutineDiscoveryMessage.addUpdateCounters(StartRoutineDiscoveryMessage.java:96)
> at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.processStartRequest(GridContinuousProcessor.java:1424)
> at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.access$400(GridContinuousProcessor.java:110)
> at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:202)
> at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:193)
> at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:722)
> at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:601)
> at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2683)
> at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2721)
> at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
> at java.lang.Thread.run(Thread.java:745)
> [16:54:40,809][SEVERE][disco-notifier-worker-#43%CashFlowCluster_16b67e98563f4cfbac95ae055a00e67f%][GridDiscoveryManager] Failed to notify direct custom event listener: StartRoutineDiscoveryMessage [startReqData=StartRequestData [prjPred=sbt.cashflow.grid.services.cachefactory.ignite.NodeAttributeFilter@4a29e4c8, clsName=null, depInfo=null, hnd=CacheContinuousQueryHandler [returnValTrans=o.a.i.i.processors.cache.query.continuous.CacheContinuousQueryHandler$1@28627d48, cacheName=DISTRIBUTED_REQUESTS, rmtFilter=null, rmtFilterDep=null, internal=false, notifyExisting=false, oldValRequired=true, sync=false, ignoreExpired=true, taskHash=0, skipPrimaryCheck=false, locOnly=false, keepBinary=true, ackBuf=null, cacheId=1419803136, initTopVer=null, nodeLeft=false, ignoreClsNotFound=false, nodeId=null, routineId=null], bufSize=1, interval=0, autoUnsubscribe=true], keepBinary=true, routineId=aa0bdf4f-bfdb-4eb3-8d99-6bcb67532704]
> java.lang.NullPointerException
> at org.apache.ignite.internal.processors.continuous.StartRoutineDiscoveryMessage.addUpdateCounters(StartRoutineDiscoveryMessage.java:82)
> at org.apache.ignite.internal.processors.continuous.StartRoutineDiscoveryMessage.addUpdateCounters(StartRoutineDiscoveryMessage.java:96)
> at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.processStartRequest(GridContinuousProcessor.java:1424)
> at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.access$400(GridContinuousProcessor.java:110)
> at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:202)
> at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:193)
> at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:722)
> at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:601)
> at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2683)
> at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2721)
> at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}