Posted to issues@ignite.apache.org by "Gaurav Aggarwal (Jira)" <ji...@apache.org> on 2021/11/16 13:47:00 UTC

[jira] [Updated] (IGNITE-15886) Intermittent [Failed to send message to next node] exception on node shutdown

     [ https://issues.apache.org/jira/browse/IGNITE-15886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gaurav Aggarwal updated IGNITE-15886:
-------------------------------------
    Summary: Intermittent [Failed to send message to next node] exception on node shutdown  (was: Intermittent [Failed to send message to next nod] exception on node shutdown)

> Intermittent [Failed to send message to next node] exception on node shutdown
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-15886
>                 URL: https://issues.apache.org/jira/browse/IGNITE-15886
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.9
>            Reporter: Gaurav Aggarwal
>            Assignee: Mikhail Petrov
>            Priority: Major
>
> +*Reproducer*+:
> Run a cluster with a few nodes. When one of the nodes is brought down, the other nodes intermittently throw this exception and eventually go down as well.
> +*Actual result*+
> Intermittent exception in one of the nodes:
> {noformat}
> SEVERE: Failed to notify direct custom event listener: StartRoutineDiscoveryMessage [startReqData=StartRequestData [prjPred=com.bfm.libignite.cluster.filter.ClusterGroupFilter@3a0d7950, clsName=null, depInfo=null, hnd=CacheContinuousQueryHandlerV2 [rmtFilterFactoryDep=null, types=0], bufSize=1, interval=0, autoUnsubscribe=true], keepBinary=false, deserEx=null, routineId=a452f5c7-4a27-4a5e-be3e-77ddae5d50a3]
> java.lang.NullPointerException
>         at org.apache.ignite.internal.processors.continuous.StartRoutineDiscoveryMessage.addUpdateCounters(StartRoutineDiscoveryMessage.java:95)
>         at org.apache.ignite.internal.processors.continuous.StartRoutineDiscoveryMessage.addUpdateCounters(StartRoutineDiscoveryMessage.java:109)
>         at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.processStartRequest(GridContinuousProcessor.java:1482)
>         at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.access$400(GridContinuousProcessor.java:117)
>         at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:220)
>         at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:211)
>         at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:717)
>         at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:526)
>         at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2677)
>         at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2715)
>         at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>         at java.lang.Thread.run(Thread.java:748)
> Nov 03, 2021 12:11:00 PM org.apache.ignite.logger.java.JavaLogger error
> SEVERE: Failed to notify direct custom event listener: StartRoutineDiscoveryMessage [startReqData=StartRequestData [prjPred=com.bfm.libignite.cluster.filter.ClusterGroupFilter@67cc413e, clsName=null, depInfo=null, hnd=CacheContinuousQueryHandlerV2 [rmtFilterFactoryDep=null, types=0], bufSize=1, interval=0, autoUnsubscribe=true], keepBinary=false, deserEx=null, routineId=c2ce7d01-5c5b-4654-8942-bd7ab6b90351]
> java.lang.NullPointerException
>         at org.apache.ignite.internal.processors.continuous.StartRoutineDiscoveryMessage.addUpdateCounters(StartRoutineDiscoveryMessage.java:95)
>         at org.apache.ignite.internal.processors.continuous.StartRoutineDiscoveryMessage.addUpdateCounters(StartRoutineDiscoveryMessage.java:109)
>         at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.processStartRequest(GridContinuousProcessor.java:1482)
>         at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.access$400(GridContinuousProcessor.java:117)
>         at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:220)
>         at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:211)
>         at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:717)
>         at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:526)
>         at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2677)
>         at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2715)
>         at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>         at java.lang.Thread.run(Thread.java:748)
> {noformat}
> The node terminates after logging a series of these warnings: Failed to send message to next node
> {noformat}
> Nov 03, 2021 12:14:40 PM org.apache.ignite.logger.java.JavaLogger warning
> WARNING: Failed to send message to next node, try previous [msg=TcpDiscoveryMetricsUpdateMessage [super=TcpDiscoveryAbstractMessage [...]]
> Nov 03, 2021 12:14:40 PM org.apache.ignite.logger.java.JavaLogger warning
> WARNING: Unable to connect to next nodes in a ring, it seems local node is experiencing connectivity issues. Segmenting local node to avoid case when one node fails a big part of cluster. To disable this behavior set TcpDiscoverySpi.setConnectionRecoveryTimeout() to 0. [connRecoveryTimeout=10000, effectiveConnRecoveryTimeout=10000, failedNodes=[TcpDiscoveryNode [...]
> Nov 03, 2021 12:14:40 PM org.apache.ignite.logger.java.JavaLogger info
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)