Posted to issues@ignite.apache.org by "Alexey Goncharuk (JIRA)" <ji...@apache.org> on 2019/03/25 08:25:00 UTC

[jira] [Resolved] (IGNITE-11616) NPE in MvccProcessorImpl when stopping a starting node

     [ https://issues.apache.org/jira/browse/IGNITE-11616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Goncharuk resolved IGNITE-11616.
---------------------------------------
    Resolution: Duplicate

> NPE in MvccProcessorImpl when stopping a starting node
> ------------------------------------------------------
>
>                 Key: IGNITE-11616
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11616
>             Project: Ignite
>          Issue Type: Test
>          Components: sql
>            Reporter: Alexey Goncharuk
>            Priority: Major
>             Fix For: 2.8
>
>
> I observe the following NPE in IgniteBaselineAffinityTopologyActivationTest.
> It happens because we shut down while the MVCC coordinator is not assigned yet:
> {code}
> java.lang.NullPointerException
> 	at java.util.concurrent.ConcurrentHashMap.replaceNode(ConcurrentHashMap.java:1106)
> 	at java.util.concurrent.ConcurrentHashMap.remove(ConcurrentHashMap.java:1097)
> 	at org.apache.ignite.internal.processors.cache.mvcc.MvccProcessorImpl.onCoordinatorFailed(MvccProcessorImpl.java:527)
> 	at org.apache.ignite.internal.processors.cache.mvcc.MvccProcessorImpl.onKernalStop(MvccProcessorImpl.java:459)
> 	at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:2335)
> 	at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:2283)
> 	at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1194)
> 	at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1992)
> 	at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1683)
> 	at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1109)
> 	at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:607)
> 	at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:984)
> 	at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:925)
> 	at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:913)
> 	at org.apache.ignite.internal.processors.cache.persistence.IgniteBaselineAffinityTopologyActivationTest.startGridWithConsistentId(IgniteBaselineAffinityTopologyActivationTest.java:729)
> 	at org.apache.ignite.internal.processors.cache.persistence.IgniteBaselineAffinityTopologyActivationTest.testNodeWithBltIsNotAllowedToJoinClusterDuringFirstActivation(IgniteBaselineAffinityTopologyActivationTest.java:532)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 	at org.apache.ignite.testframework.junits.GridAbstractTest$6.run(GridAbstractTest.java:2102)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}
> At first glance it looks like we can simply ignore the {{null}} node ID. However, there is a race: in {{onKernalStop}} we block the busy lock and remove the discovery listener, then do the coordinator cleanup. The discovery notification worker is stopped only in the {{stop}} phase, whereas the MVCC manager does its cleanup in the {{onKernalStop}} phase, so the listener can still execute code after {{onKernalStop}} has run, because the discovery listener itself has no busy lock protection.
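
For readers unfamiliar with the failure mode in the trace above: the top frames are in ConcurrentHashMap.remove, which rejects null keys. The sketch below reproduces only that behaviour in isolation. The class, the map name activeQueries and the variable crdNodeId are hypothetical stand-ins and are not taken from MvccProcessorImpl.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

final class NullKeyRemoveDemo {
    /** Returns true if removing the given key from the map throws a NullPointerException. */
    static boolean removeThrowsNpe(ConcurrentHashMap<UUID, String> map, UUID key) {
        try {
            map.remove(key);
            return false;
        }
        catch (NullPointerException e) {
            // ConcurrentHashMap does not permit null keys, not even for lookups or removals.
            return true;
        }
    }

    public static void main(String[] args) {
        // Hypothetical map keyed by node ID, standing in for per-coordinator state.
        ConcurrentHashMap<UUID, String> activeQueries = new ConcurrentHashMap<>();

        // Hypothetical: the coordinator has not been assigned yet, so its node ID is null.
        UUID crdNodeId = null;

        System.out.println(removeThrowsNpe(activeQueries, crdNodeId));         // prints true
        System.out.println(removeThrowsNpe(activeQueries, UUID.randomUUID())); // prints false
    }
}
```

A null check before the removal would avoid this particular exception, but as the description notes, that alone would not address the race with the discovery listener.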

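The race described in the issue could, in principle, be closed by having the listener itself enter the busy lock before touching any state. The following is a minimal sketch of that idea and not the actual fix. It builds a simple busy lock from the JDK's ReentrantReadWriteLock rather than using Ignite's own lock class, and every name in it (BusyLockListenerDemo, enterBusy, leaveBusy, block, onDiscoveryEvent) is an assumption made for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class BusyLockListenerDemo {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Set once by block(); written under the write lock, read under the read lock. */
    private boolean stopped;

    /** Number of events the listener actually processed. */
    final AtomicInteger handled = new AtomicInteger();

    /** Tries to enter the busy section; fails once block() has started or completed. */
    boolean enterBusy() {
        if (!lock.readLock().tryLock())
            return false; // A stopping thread currently holds the write lock.

        if (stopped) {
            lock.readLock().unlock();
            return false;
        }

        return true;
    }

    void leaveBusy() {
        lock.readLock().unlock();
    }

    /** Waits for in-flight listeners to leave, then rejects all later ones. */
    void block() {
        lock.writeLock().lock();
        try {
            stopped = true;
        }
        finally {
            lock.writeLock().unlock();
        }
    }

    /** Hypothetical discovery listener body, guarded by the busy lock. */
    boolean onDiscoveryEvent() {
        if (!enterBusy())
            return false; // The component is stopping: skip the event.

        try {
            handled.incrementAndGet();
            return true;
        }
        finally {
            leaveBusy();
        }
    }

    public static void main(String[] args) {
        BusyLockListenerDemo proc = new BusyLockListenerDemo();

        System.out.println(proc.onDiscoveryEvent()); // prints true
        proc.block();                                // what a stop phase would do first
        System.out.println(proc.onDiscoveryEvent()); // prints false
        System.out.println(proc.handled.get());      // prints 1
    }
}
```

With this shape, any cleanup performed after block() returns cannot interleave with listener code, because late notifications fail enterBusy() and return without touching state.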

