Posted to issues@ignite.apache.org by "Ignite TC Bot (Jira)" <ji...@apache.org> on 2022/04/05 08:45:00 UTC

[jira] [Commented] (IGNITE-16789) Failure to dynamically create a new cache can be a cause of NullPointerException/AssertionError in the discovery thread

    [ https://issues.apache.org/jira/browse/IGNITE-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517311#comment-17517311 ] 

Ignite TC Bot commented on IGNITE-16789:
----------------------------------------

{panel:title=Branch: [pull/9930/head] Base: [master] : Possible Blockers (1)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}ZooKeeper (Discovery) 1{color} [[tests 0 Exit Code |https://ci.ignite.apache.org/viewLog.html?buildId=6504843]]

{panel}
{panel:title=Branch: [pull/9930/head] Base: [master] : New Tests (3)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#00008b}Cache 5{color} [[tests 1|https://ci.ignite.apache.org/viewLog.html?buildId=6503707]]
* {color:#013220}IgniteCacheWithIndexingTestSuite: CacheQueryAfterDynamicCacheStartFailureTest.testStartAndStopFailedCache - PASSED{color}

{color:#00008b}Cache 7{color} [[tests 1|https://ci.ignite.apache.org/viewLog.html?buildId=6503701]]
* {color:#013220}IgniteCacheTestSuite7: IgniteDynamicCacheStartFailWithPersistenceTest.testStartAndStopFailedCache - PASSED{color}

{color:#00008b}Cache 4{color} [[tests 1|https://ci.ignite.apache.org/viewLog.html?buildId=6503699]]
* {color:#013220}IgniteCacheTestSuite4: IgniteDynamicCacheStartFailTest.testStartAndStopFailedCache - PASSED{color}

{panel}
[TeamCity *--> Run :: All* Results|https://ci.ignite.apache.org/viewLog.html?buildId=6503787&buildTypeId=IgniteTests24Java8_RunAll]

> Failure to dynamically create a new cache can be a cause of NullPointerException/AssertionError in the discovery thread
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-16789
>                 URL: https://issues.apache.org/jira/browse/IGNITE-16789
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Vyacheslav Koptilin
>            Assignee: Vyacheslav Koptilin
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Simultaneously creating and removing a cache with the same name may lead to the following AssertionError (or a NullPointerException when assertions are disabled) in the disco-notifier thread, which in turn triggers the FailureHandler.
> {noformat}
> [2022-04-04 14:22:41,571][ERROR][disco-notifier-worker-#36%cache.IgniteDynamicCacheStartFailTest0%][GridDiscoveryManager] Exception in discovery notifier worker thread.
> java.lang.AssertionError: Dynamic cache descriptor is missing [cacheName=TestDynamicCache]
> 	at org.apache.ignite.internal.processors.cache.ClusterCachesInfo.onCacheChangeRequested(ClusterCachesInfo.java:570)
> 	at org.apache.ignite.internal.processors.cache.GridCacheProcessor.onCustomEvent(GridCacheProcessor.java:4307)
> 	at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:680)
> 	at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.access$7500(GridDiscoveryManager.java:559)
> 	at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4$NotificationTask.run(GridDiscoveryManager.java:994)
> 	at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2852)
> 	at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2890)
> 	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
> 	at java.lang.Thread.run(Thread.java:748)
> {noformat}
> It looks like the issue is caused by concurrently starting and stopping caches with the same name.
> The following scenario results in the AssertionError (when assertions are disabled, it leads to the mentioned NullPointerException instead):
>  - the user starts a new cache with the name "A"
>  - the DynamicCacheChangeRequest is sent over the cluster ring
>  - every node that receives this message updates its list of registered cache descriptors (see ClusterCachesInfo.onCacheChangeRequested(DynamicCacheChangeBatch, AffinityTopologyVersion))
>  - a node initiates a new partition map exchange
>  - the user tries to stop the cache with the same name "A"
>  - a new DynamicCacheChangeRequest is sent and, as a result, the cache is removed from the list of registered caches
>  - at this point, the previous exchange (the PME related to the cache start) fails for some reason
>  - the DynamicCacheChangeFailureMessage is sent over the ring and, on every node, tries to find the required cache descriptor, which has already been removed.
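
The ordering problem in the scenario above can be illustrated with a small, self-contained toy model. This is only a sketch and not Ignite code: the class CacheRegistrySketch and all of its methods are hypothetical names introduced for illustration, and the "tolerant" handler is just one possible way to cope with a missing descriptor, not necessarily what the pull request does.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model of one node's list of registered cache descriptors (hypothetical, not Ignite code). */
final class CacheRegistrySketch {
    /** Cache name -> descriptor; a plain string stands in for a real descriptor object. */
    private final Map<String, String> descriptors = new ConcurrentHashMap<>();

    /** Models a start request: the node registers a descriptor for the cache. */
    void onStartRequested(String cacheName) {
        descriptors.put(cacheName, "descriptor-of-" + cacheName);
    }

    /** Models a stop request: the node removes the descriptor. */
    void onStopRequested(String cacheName) {
        descriptors.remove(cacheName);
    }

    /** Strict failure handling: insists that the descriptor still exists, as in the reported trace. */
    void onStartFailedStrict(String cacheName) {
        String desc = descriptors.get(cacheName);
        if (desc == null)
            throw new IllegalStateException("Dynamic cache descriptor is missing [cacheName=" + cacheName + ']');
        descriptors.remove(cacheName);
    }

    /** Tolerant failure handling (an assumption): returns false when the descriptor is already gone. */
    boolean onStartFailedTolerant(String cacheName) {
        return descriptors.remove(cacheName) != null;
    }

    public static void main(String[] args) {
        CacheRegistrySketch node = new CacheRegistrySketch();
        node.onStartRequested("A"); // the start request registers the descriptor
        node.onStopRequested("A");  // the concurrent stop request removes it
        // The start then fails, and the failure message arrives after the descriptor is gone.
        try {
            node.onStartFailedStrict("A");
        } catch (IllegalStateException e) {
            System.out.println("strict handler failed: " + e.getMessage());
        }
        System.out.println("tolerant handler rolled back: " + node.onStartFailedTolerant("A"));
    }
}
```

The strict variant corresponds to the assertion in the stack trace: it assumes the failure message always finds a descriptor. The tolerant variant treats "already removed" as a valid state and reports it to the caller.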



--
This message was sent by Atlassian Jira
(v8.20.1#820001)