Posted to issues@ignite.apache.org by "Ignite TC Bot (Jira)" <ji...@apache.org> on 2021/03/30 11:00:00 UTC
[jira] [Commented] (IGNITE-14394) Baseline auto-adjustment does not happen when 2+ nodes join the cluster

    [ https://issues.apache.org/jira/browse/IGNITE-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17311410#comment-17311410 ] 

Ignite TC Bot commented on IGNITE-14394:
----------------------------------------

{panel:title=Branch: [pull/8934/head] Base: [master] : No blockers found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
{panel:title=Branch: [pull/8934/head] Base: [master] : New Tests (2)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#00008b}Basic 1{color} [[tests 2|https://ci.ignite.apache.org/viewLog.html?buildId=5935653]]
* {color:#013220}IgniteBasicTestSuite: BaselineAutoAdjustInMemoryTest.testExchangeMerge - PASSED{color}
* {color:#013220}IgniteBasicTestSuite: BaselineAutoAdjustTest.testExchangeMerge - PASSED{color}

{panel}
[TeamCity *--> Run :: All* Results|https://ci.ignite.apache.org/viewLog.html?buildId=5935715&buildTypeId=IgniteTests24Java8_RunAll]

> Baseline auto-adjustment does not happen when 2+ nodes join the cluster
> -----------------------------------------------------------------------
>
>                 Key: IGNITE-14394
>                 URL: https://issues.apache.org/jira/browse/IGNITE-14394
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Vyacheslav Koptilin
>            Assignee: Vyacheslav Koptilin
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When baseline topology auto-adjustment is enabled and several server nodes join the cluster, the auto-adjustment may not be triggered.
> Steps to reproduce:
> 1. Start an Ignite cluster and enable baseline auto-adjustment.
> 2. Start 2+ server nodes (to reproduce the issue, the corresponding exchanges must be merged).
> 3. Wait for baseline auto-adjustment.
> *RESOLUTION*
> The root cause of the issue relates to merging exchanges. Let's consider the following scenario:
> 1. 2 new nodes join the cluster
> 2. the first step triggers BaselineTopologyUpdater#triggerBaselineUpdate twice, which creates the following baseline data objects:
>     data1 - target topology version - X   
>     data2 - target topology version - X+1. This data object invalidates the previous one (data1)
> 3. if the exchanges are merged, the first data object (data1) listens to the real {{GridDhtPartitionsExchangeFuture}} instead of an affinityReadyFuture (see {{GridCachePartitionExchangeManager#affinityReadyFuture}}), while data2 listens to affinityReadyFuture(X+1). As a result, the implementation schedules data2 first and then replaces it with data1, which is already invalidated at this point.
> The fix is straightforward: we should not schedule baseline data instances that relate to an "invalidated" target topology.
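
The scenario described in the resolution can be sketched with a small stand-alone model. This is not the Ignite implementation: the classes BaselineData and Scheduler, the skipInvalidated flag, and the concrete version number are hypothetical names introduced only for illustration. The sketch assumes a scheduler that keeps a single pending baseline data object, so that a later schedule() call replaces an earlier one.

```java
// Stand-alone model of the scheduling order described above.
// All types here are hypothetical; they are NOT Ignite classes.
class BaselineScheduleSketch {
    /** Hypothetical stand-in for a baseline data object. */
    static final class BaselineData {
        final long targetTopVer;
        private boolean invalidated;

        BaselineData(long targetTopVer) {
            this.targetTopVer = targetTopVer;
        }

        void invalidate() {
            invalidated = true;
        }

        boolean isInvalidated() {
            return invalidated;
        }
    }

    /** Hypothetical scheduler that holds at most one pending data object. */
    static final class Scheduler {
        private final boolean skipInvalidated;
        private BaselineData scheduled;

        Scheduler(boolean skipInvalidated) {
            this.skipInvalidated = skipInvalidated;
        }

        void schedule(BaselineData data) {
            // The idea of the fix: ignore data whose target topology was invalidated.
            if (skipInvalidated && data.isInvalidated())
                return;

            // A later call replaces whatever was scheduled before.
            scheduled = data;
        }

        BaselineData scheduled() {
            return scheduled;
        }
    }

    /** Replays the merged-exchange ordering and returns the scheduled target version. */
    static long run(boolean skipInvalidated) {
        long x = 5; // arbitrary topology version X, chosen for the example

        BaselineData data1 = new BaselineData(x);
        BaselineData data2 = new BaselineData(x + 1);
        data1.invalidate(); // creating data2 invalidates data1

        Scheduler scheduler = new Scheduler(skipInvalidated);

        // With merged exchanges, data2 is scheduled first and data1 second.
        scheduler.schedule(data2);
        scheduler.schedule(data1);

        return scheduler.scheduled().targetTopVer;
    }

    public static void main(String[] args) {
        System.out.println("without check: " + run(false)); // prints "without check: 5"
        System.out.println("with check: " + run(true));     // prints "with check: 6"
    }
}
```

Without the check, the invalidated data1 (target version X) ends up as the scheduled object; with the check, data2 (target version X+1) stays scheduled.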



--
This message was sent by Atlassian Jira
(v8.3.4#803005)