Posted to issues@ignite.apache.org by "Mirza Aliev (Jira)" <ji...@apache.org> on 2023/05/26 13:41:00 UTC

[jira] [Updated] (IGNITE-19288) A race on scheduling data nodes updates if there new nodes and stopped nodes in logical topology

     [ https://issues.apache.org/jira/browse/IGNITE-19288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mirza Aliev updated IGNITE-19288:
---------------------------------
    Epic Link: IGNITE-19577

> A race on scheduling data nodes updates if there new nodes and stopped nodes in logical topology
> ------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-19288
>                 URL: https://issues.apache.org/jira/browse/IGNITE-19288
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Sergey Uttsel
>            Assignee: Mirza Aliev
>            Priority: Major
>              Labels: ignite-3
>
> h3. Motivation
> If the new logical topology contains both newly joined nodes and nodes that have left the cluster, DistributionZoneManager#scheduleTimers() schedules both saveDataNodesOnScaleUp and saveDataNodesOnScaleDown. These tasks are invoked asynchronously but use the same entry in topologyAugmentationMap: scale up puts an entry keyed by some revision, and then scale down puts an entry with the same revision as the key, so one replaces the other.
> The issue is reproduced by DistributionZoneAwaitDataNodesTest#testSeveralScaleUpAndSeveralScaleDownThenScaleUpAndScaleDown.
> h3. Definition of Done
>  * Concurrency bug is fixed.
>  * Test is enabled.
> UPD: 
> In general, the problem is reproducible only in a very rare case: when {{LogicalTopologyEventListener#onTopologyLeap}} is received and the new topology has both added and removed nodes compared with the topology from the metastorage.
> The solution is to change the representation of {{DistributionZoneManager.ZoneState#topologyAugmentationMap}}.
> Currently we have:
> {code:java}
>     private static class Augmentation {
>         /** Nodes affected by this augmentation. */
>         Set<NodeWithAttributes> nodes;
>         /** Flag that indicates whether {@code nodes} should be added or removed. */
>         boolean addition;
>         Augmentation(Set<NodeWithAttributes> nodes, boolean addition) {
>             this.nodes = nodes;
>             this.addition = addition;
>         }
>     }
> {code}
> I suggest storing the {{addition}} flag in {{NodeWithAttributes}} itself, so that a single revision in {{DistributionZoneManager.ZoneState#topologyAugmentationMap}} can hold both added and removed nodes.
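
The overwrite described under Motivation and the per-node flag suggested in the update can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Ignite code: Node is a hypothetical stand-in for NodeWithAttributes, NodeChange is a hypothetical way of attaching the flag to each node, a plain ConcurrentSkipListMap stands in for topologyAugmentationMap, and the two puts are done sequentially to show the end state that the asynchronous tasks can reach.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.stream.Collectors;

public class AugmentationSketch {
    /** Hypothetical stand-in for NodeWithAttributes; only the node name is modelled. */
    public record Node(String name) {}

    /** Current shape from the issue: one flag for the whole node set of a revision. */
    public record Augmentation(Set<Node> nodes, boolean addition) {}

    /** Assumed proposed shape: every node carries its own added/removed flag. */
    public record NodeChange(Node node, boolean addition) {}

    /** Both tasks put under the same revision, so the second put replaces the first. */
    public static Map<Long, Augmentation> currentShape(long revision, Set<Node> added, Set<Node> removed) {
        Map<Long, Augmentation> map = new ConcurrentSkipListMap<>();
        map.put(revision, new Augmentation(added, true));    // scale up
        map.put(revision, new Augmentation(removed, false)); // scale down, same key
        return map;
    }

    /** With per-node flags, both tasks can merge into one entry for the revision. */
    public static Map<Long, Set<NodeChange>> proposedShape(long revision, Set<Node> added, Set<Node> removed) {
        Map<Long, Set<NodeChange>> map = new ConcurrentSkipListMap<>();
        map.merge(revision, tag(added, true), AugmentationSketch::union);    // scale up
        map.merge(revision, tag(removed, false), AugmentationSketch::union); // scale down
        return map;
    }

    /** Wraps each node together with the given added/removed flag. */
    private static Set<NodeChange> tag(Set<Node> nodes, boolean addition) {
        return nodes.stream().map(n -> new NodeChange(n, addition)).collect(Collectors.toSet());
    }

    /** Returns a new set holding the elements of both arguments. */
    private static Set<NodeChange> union(Set<NodeChange> a, Set<NodeChange> b) {
        Set<NodeChange> result = new HashSet<>(a);
        result.addAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<Node> added = Set.of(new Node("C"));
        Set<Node> removed = Set.of(new Node("A"));
        System.out.println("current:  " + currentShape(5L, added, removed));
        System.out.println("proposed: " + proposedShape(5L, added, removed));
    }
}
```

In the first method the revision ends up holding only the scale-down set, which is the lost update described in the issue. In the second, the entry for the revision keeps both the added and the removed node, each with its own flag.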


