Posted to issues@ignite.apache.org by "Mirza Aliev (Jira)" <ji...@apache.org> on 2023/10/10 07:05:00 UTC

[jira] [Updated] (IGNITE-20603) Restore topologyAugmentationMap on a node restart

     [ https://issues.apache.org/jira/browse/IGNITE-20603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mirza Aliev updated IGNITE-20603:
---------------------------------
    Summary: Restore topologyAugmentationMap on a node restart  (was: Restore topologyAugmentationMap)

> Restore topologyAugmentationMap on a node restart
> -------------------------------------------------
>
>                 Key: IGNITE-20603
>                 URL: https://issues.apache.org/jira/browse/IGNITE-20603
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Mirza Aliev
>            Priority: Major
>              Labels: ignite-3
>
> h3. *Motivation*
> Some events may already have been propagated to {{ms.logicalTopology}} when a restart interrupts the update of topologyAugmentationMap in {{DistributionZoneManager#createMetastorageTopologyListener}}. In that case the augmentation that should have been added to {{zone.topologyAugmentationMap}} is missing after the restart, and this information must be recovered.
> h3. *Definition of done*
> On a node restart, topologyAugmentationMap must be correctly restored according to {{ms.logicalTopology}} state.
> h3. *Implementation notes*
> For every zone, compare {{MS.local.logicalTopology.revision}} with max(maxScUpFromMap, maxScDownFromMap). If {{MS.local.logicalTopology.revision}} is greater, some topology changes were not propagated to topologyAugmentationMap before the restart, and the corresponding timers were not scheduled.
> To fill the gap, compare {{MS.local.logicalTopology}} with {{lastSeenTopology}} and add to topologyAugmentationMap the nodes that were not propagated to it before the restart.
> {{lastSeenTopology}} is calculated as follows: read {{MS.local.dataNodes}}, take max(scaleUpTriggerKey, scaleDownTriggerKey), and retrieve all additions and removals of nodes from topologyAugmentationMap using max(scaleUpTriggerKey, scaleDownTriggerKey) as the left bound. Apply these changes to the map of node counters from {{MS.local.dataNodes}} and keep only the nodes with positive counters. The result is {{lastSeenTopology}}.
> Comparing {{lastSeenTopology}} with {{MS.local.logicalTopology}} shows which node additions and removals were not propagated to topologyAugmentationMap before the restart. Add these differences to topologyAugmentationMap, using {{MS.local.logicalTopology.revision}} as the revision (the key in topologyAugmentationMap). It is safe to take this revision, because if a node was added to {{ms.logicalTopology}} after an immediate data nodes recalculation, this added node must restore the intent of that immediate data nodes recalculation.
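
The recovery steps in the implementation notes can be sketched roughly as follows. This is an illustration under assumptions, not the actual Ignite code: the class, the {{Augmentation}} type (one entry holding both added and removed nodes) and both method names are hypothetical, the left bound is assumed to be exclusive, and an empty map is treated as having revision 0. Because one entry here holds both additions and removals, max(maxScUpFromMap, maxScDownFromMap) is approximated by the last key of the map.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.NavigableMap;
import java.util.Set;

/** Illustrative sketch only; all names here are hypothetical, not the Ignite API. */
public class TopologyAugmentationRecovery {
    /** Nodes added to and removed from the logical topology at one revision. */
    static final class Augmentation {
        final Set<String> added;
        final Set<String> removed;

        Augmentation(Set<String> added, Set<String> removed) {
            this.added = added;
            this.removed = removed;
        }
    }

    /** Rebuilds the topology the zone had observed before the restart. */
    static Set<String> lastSeenTopology(
            Map<String, Integer> dataNodes,
            NavigableMap<Long, Augmentation> augmentationMap,
            long scaleUpTriggerKey,
            long scaleDownTriggerKey
    ) {
        Map<String, Integer> counters = new HashMap<>(dataNodes);
        long leftBound = Math.max(scaleUpTriggerKey, scaleDownTriggerKey);

        // Assumption: the bound is exclusive, i.e. entries at or below it are already reflected in dataNodes.
        for (Augmentation aug : augmentationMap.tailMap(leftBound, false).values()) {
            aug.added.forEach(node -> counters.merge(node, 1, Integer::sum));
            aug.removed.forEach(node -> counters.merge(node, -1, Integer::sum));
        }

        // Keep only the nodes whose counter is positive.
        Set<String> result = new HashSet<>();
        counters.forEach((node, counter) -> {
            if (counter > 0) {
                result.add(node);
            }
        });
        return result;
    }

    /** Adds the missing augmentation, if any; returns true when the map was changed. */
    static boolean restore(
            Set<String> logicalTopology,
            long logicalTopologyRevision,
            Map<String, Integer> dataNodes,
            NavigableMap<Long, Augmentation> augmentationMap,
            long scaleUpTriggerKey,
            long scaleDownTriggerKey
    ) {
        // Assumption: an empty map counts as revision 0.
        long maxRevisionInMap = augmentationMap.isEmpty() ? 0L : augmentationMap.lastKey();

        if (logicalTopologyRevision <= maxRevisionInMap) {
            return false; // Every topology change was recorded before the restart.
        }

        Set<String> lastSeen = lastSeenTopology(dataNodes, augmentationMap, scaleUpTriggerKey, scaleDownTriggerKey);

        Set<String> added = new HashSet<>(logicalTopology);
        added.removeAll(lastSeen);

        Set<String> removed = new HashSet<>(lastSeen);
        removed.removeAll(logicalTopology);

        if (added.isEmpty() && removed.isEmpty()) {
            return false;
        }

        // The logical topology revision is used as the key, as the notes propose.
        augmentationMap.put(logicalTopologyRevision, new Augmentation(added, removed));
        return true;
    }
}
```

Calling the hypothetical {{restore}} a second time with the same arguments changes nothing, because the map then already contains an entry at the logical topology revision. Scheduling the scale up and scale down timers for the restored entry is left out of the sketch.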



--
This message was sent by Atlassian Jira
(v8.20.10#820010)