Posted to issues@ignite.apache.org by "Ivan Daschinskiy (Jira)" <ji...@apache.org> on 2020/11/10 10:38:00 UTC

[jira] [Updated] (IGNITE-13690) Failed to init coordinator caches on concurrent start of nodes with different cache configurations.

     [ https://issues.apache.org/jira/browse/IGNITE-13690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ivan Daschinskiy updated IGNITE-13690:
--------------------------------------
    Attachment: DifferentCacheConfigurationConcurrentStart.java

> Failed to init coordinator caches on concurrent start of nodes with different cache configurations.
> ---------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-13690
>                 URL: https://issues.apache.org/jira/browse/IGNITE-13690
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.9
>            Reporter: Ivan Daschinskiy
>            Priority: Major
>         Attachments: DifferentCacheConfigurationConcurrentStart.java
>
>
> Scenario:
> 1. Start nodes with different cache configurations simultaneously
> (for simplicity, let the client nodes have caches configured and the server nodes have none).
> 2. While processing the first exchange, the coordinator fails with:
> {code:java}
> [2020-11-10 13:23:57,232][ERROR][start-node-1][DifferentCacheConfigurationConcurrentStart0] Got exception while starting (will rollback startup routine).
> java.lang.AssertionError: Invalid exchange futures state [cur=6, total=7]
> 	at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager$17.applyx(CacheAffinitySharedManager.java:1964)
> 	at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager$17.applyx(CacheAffinitySharedManager.java:1935)
> 	at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.lambda$forAllRegisteredCacheGroups$e0a6939d$1(CacheAffinitySharedManager.java:1265)
> 	at org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11157)
> 	at org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11059)
> 	at org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11039)
> 	at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.forAllRegisteredCacheGroups(CacheAffinitySharedManager.java:1264)
> 	at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.initCoordinatorCaches(CacheAffinitySharedManager.java:1935)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.initCoordinatorCaches(GridDhtPartitionsExchangeFuture.java:716)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:850)
> 	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3175)
> 	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:3021)
> 	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> 	at java.lang.Thread.run(Thread.java:748)
> {code}
> The root cause is a race in creating {{LocalJoinCachesContext}}: the local join caches end up differing from the caches registered by other nodes.
> The reproducer for the ZooKeeper and ring discoveries is attached.
> NB: the failure is not always reproducible. To increase its probability, add a sleep in
> {{GridDhtPartitionsExchangeFuture#init}}:
> {code:java}
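> public void init(boolean newCrd) throws IgniteInterruptedCheckedException {
>     // Added for reproduction only: the sleep widens the race window.
>     if (newCrd)
>         U.sleep(500);
>     // ... the rest of the method is unchanged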
> {code}
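
For readers without the attachment, step 1 of the scenario (starting all nodes at the same moment) can be sketched with plain JDK concurrency primitives. This is not the attached reproducer: `ConcurrentStartSketch` and `startSimultaneously` are hypothetical names, and each start action is a placeholder where a real test would presumably start an Ignite node with its own configuration (clients with caches configured, servers without).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical harness (not the attached reproducer). */
final class ConcurrentStartSketch {
    /**
     * Runs every start action on its own thread and releases all threads
     * at the same moment, so the "nodes" begin starting simultaneously.
     * Results are returned in the order of the given actions.
     */
    static <T> List<T> startSimultaneously(List<Callable<T>> startActions) {
        List<T> started = new ArrayList<>();

        if (startActions.isEmpty())
            return started;

        ExecutorService pool = Executors.newFixedThreadPool(startActions.size());
        CountDownLatch ready = new CountDownLatch(startActions.size());
        CountDownLatch go = new CountDownLatch(1);

        try {
            List<Future<T>> futures = new ArrayList<>();

            for (Callable<T> action : startActions) {
                futures.add(pool.submit(() -> {
                    ready.countDown(); // This thread is parked at the start line.
                    go.await();        // Wait for the common start signal.

                    return action.call();
                }));
            }

            ready.await();  // Every thread has reached the start line.
            go.countDown(); // Release all of them at once.

            for (Future<T> fut : futures)
                started.add(fut.get());

            return started;
        }
        catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Concurrent start failed", e);
        }
        finally {
            pool.shutdownNow();
        }
    }
}
```

In an Ignite test, each action would likely be something along the lines of `() -> Ignition.start(cfg)` with a per-node `IgniteConfiguration`. Because the failure depends on timing, a harness like this only raises the chance of hitting the race and does not guarantee it.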


