Posted to issues@ignite.apache.org by "Vyacheslav Koptilin (Jira)" <ji...@apache.org> on 2021/10/26 10:45:00 UTC

[jira] [Commented] (IGNITE-15364) The rebalancing can be broken if historical rebalancing is reassigned after the client node joined the cluster.

    [ https://issues.apache.org/jira/browse/IGNITE-15364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17434254#comment-17434254 ] 

Vyacheslav Koptilin commented on IGNITE-15364:
----------------------------------------------

Hi [~ascherbakov], [~av],

I have addressed the review comments and added new tests. Could you please take a look?

> The rebalancing can be broken if historical rebalancing is reassigned after the client node joined the cluster.
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-15364
>                 URL: https://issues.apache.org/jira/browse/IGNITE-15364
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Vyacheslav Koptilin
>            Assignee: Vyacheslav Koptilin
>            Priority: Major
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> It looks like the following scenario can break data consistency after rebalancing:
>  - Start and activate a cluster of three server nodes.
>  - Create a cache with two backups and load initial data into it.
>  - Stop one server node and load additional data into the cache so that historical rebalancing is triggered when the node returns to the cluster.
>  - Restart the node. Make sure that historical rebalancing starts from the two other nodes.
>  - Before rebalancing completes, start a new client node and let it join the cluster. This clears the partition update counters on the server nodes, i.e. _GridDhtPartitionTopologyImpl#cntrMap_. ( * )
>  - Historical rebalancing from one of the nodes fails.
>  - In that case, rebalancing is reassigned and the starting node tries to rebalance the missed partitions from another node.
> Unfortunately, the update counters for historical rebalancing cannot be calculated properly because of ( * ).
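
The effect of ( * ) on a reassigned rebalance can be pictured with a small toy model. This is only a sketch and not Ignite code: the classes `CounterMapSketch` and `Topology` and the methods `recordCounter`, `onClientJoin` and `historicalRange` are hypothetical names made up for illustration, and the only thing borrowed from the description is the idea of a per-partition update counter map that is cleared when a client joins. The model assumes that the range of updates to replay from history is derived from the counter stored in that map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

/** Toy model only: a simplified illustration, not the actual Ignite implementation. */
public class CounterMapSketch {
    /** Hypothetical stand-in for a topology holding per-partition update counters. */
    static final class Topology {
        /** Partition id -> last known update counter (assumed structure). */
        private final Map<Integer, Long> cntrMap = new HashMap<>();

        /** Remembers the update counter known for a partition. */
        void recordCounter(int part, long cntr) {
            cntrMap.put(part, cntr);
        }

        /** Models the cleanup from step ( * ): a client join drops all known counters. */
        void onClientJoin() {
            cntrMap.clear();
        }

        /** Returns the known counter for a partition, or empty if it was dropped. */
        OptionalLong counter(int part) {
            Long cntr = cntrMap.get(part);
            return cntr == null ? OptionalLong.empty() : OptionalLong.of(cntr);
        }
    }

    /**
     * Number of updates that would have to be replayed from history,
     * or empty when the counter needed for the calculation is no longer known.
     */
    static OptionalLong historicalRange(Topology top, int part, long demanderCntr) {
        OptionalLong supplierCntr = top.counter(part);
        if (!supplierCntr.isPresent())
            return OptionalLong.empty();
        return OptionalLong.of(supplierCntr.getAsLong() - demanderCntr);
    }

    public static void main(String[] args) {
        Topology top = new Topology();
        top.recordCounter(0, 150L);

        // First assignment: the range can be computed from the stored counter.
        System.out.println(historicalRange(top, 0, 100L));

        // A client joins before rebalancing completes, then rebalancing is reassigned.
        top.onClientJoin();
        System.out.println(historicalRange(top, 0, 100L));
    }
}
```

In this model the second call has nothing to compute the range from, which corresponds to the last step of the scenario: the reassignment happens after the counters were cleared.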



--
This message was sent by Atlassian Jira
(v8.3.4#803005)