Posted to issues@ignite.apache.org by "Vladislav Pyatkov (Jira)" <ji...@apache.org> on 2020/06/29 08:03:00 UTC
[jira] [Updated] (IGNITE-12935) Disadvantages in log of historical rebalance
[ https://issues.apache.org/jira/browse/IGNITE-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vladislav Pyatkov updated IGNITE-12935:
---------------------------------------
Reviewer: Alexey Scherbakov
> Disadvantages in log of historical rebalance
> --------------------------------------------
>
> Key: IGNITE-12935
> URL: https://issues.apache.org/jira/browse/IGNITE-12935
> Project: Ignite
> Issue Type: Improvement
> Reporter: Vladislav Pyatkov
> Assignee: Vladislav Pyatkov
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> # Mention in the log only those partitions for which no node qualifies as a historical supplier.
> For each of these partitions, print the minimal counter (the counter from which historical rebalancing would have to start) with the corresponding node, and the maximum reserved counter (the counter from which the cluster can perform historical rebalancing) with the corresponding node.
> This will let us know:
> ## Whether history was reserved at all
> ## How much reserved history is missing to perform historical rebalancing
> ## I expect the resulting output to look like this:
> {noformat}
> Historical rebalancing wasn't scheduled for some partitions:
> History wasn't reserved for: [list of partitions and groups]
> History was reserved, but minimum present counter is less than maximum reserved: [[grp=GRP, part=ID, minCntr=cntr, minNodeId=ID, maxReserved=cntr, maxReservedNodeId=ID], ...]{noformat}
> ## We can also aggregate the previous message by {{minNodeId}} to easily find the exact node (or nodes) that caused the full rebalance.
> # Log the results of {{reserveHistoryForExchange()}}. They can be compactly represented as mappings: {{(grpId -> checkpoint (id, timestamp))}}. For every group, also log a message explaining why the previous checkpoint wasn't successfully reserved.
> There can be three reasons:
> ## The previous checkpoint simply isn't present in the history (the oldest one is reserved)
> ## WAL reservation failure (call below returned false)
> {code:java}
> chpEntry = entry(cpTs);
> boolean reserved = cctx.wal().reserve(chpEntry.checkpointMark());
> // If checkpoint WAL history can't be reserved, stop searching.
> if (!reserved)
>     break;{code}
> ## Checkpoint was marked as inapplicable for historical rebalancing
> {code:java}
> for (Integer grpId : new HashSet<>(groupsAndPartitions.keySet()))
>     if (!isCheckpointApplicableForGroup(grpId, chpEntry))
>         groupsAndPartitions.remove(grpId);{code}
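
The two proposals in the description could be sketched roughly as follows. This is only an illustration of the message shapes, not Ignite code: the class HistoricalRebalanceLogSketch, the InsufficientHistory holder, the NotReservedReason enum and all method names are hypothetical, and the sample group names, node ids and counters are made up. Only the log wording and the field names (grp, part, minCntr, minNodeId, maxReserved, maxReservedNodeId) are taken from the ticket.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative sketch only: none of these types or methods exist in Ignite. */
final class HistoricalRebalanceLogSketch {
    /** Hypothetical names for the three reasons listed in the ticket. */
    enum NotReservedReason {
        NOT_IN_HISTORY("previous checkpoint is not present in the history"),
        WAL_RESERVATION_FAILED("WAL reservation failed"),
        CHECKPOINT_NOT_APPLICABLE("checkpoint is not applicable for historical rebalancing");

        final String text;

        NotReservedReason(String text) {
            this.text = text;
        }
    }

    /** A partition whose reserved history does not cover the required counter. */
    static final class InsufficientHistory {
        final String grp;
        final int part;
        final long minCntr;
        final String minNodeId;
        final long maxReserved;
        final String maxReservedNodeId;

        InsufficientHistory(String grp, int part, long minCntr, String minNodeId,
            long maxReserved, String maxReservedNodeId) {
            this.grp = grp;
            this.part = part;
            this.minCntr = minCntr;
            this.minNodeId = minNodeId;
            this.maxReserved = maxReserved;
            this.maxReservedNodeId = maxReservedNodeId;
        }

        @Override public String toString() {
            return "[grp=" + grp + ", part=" + part + ", minCntr=" + minCntr
                + ", minNodeId=" + minNodeId + ", maxReserved=" + maxReserved
                + ", maxReservedNodeId=" + maxReservedNodeId + ']';
        }
    }

    /** Builds the second line of the proposed message for the given partitions. */
    static String formatInsufficient(List<InsufficientHistory> parts) {
        return "History was reserved, but minimum present counter is less than maximum reserved: " + parts;
    }

    /** Groups "grp:part" labels by minNodeId to show which nodes forced a full rebalance. */
    static Map<String, List<String>> groupByMinNode(List<InsufficientHistory> parts) {
        Map<String, List<String>> res = new TreeMap<>();

        for (InsufficientHistory p : parts)
            res.computeIfAbsent(p.minNodeId, k -> new ArrayList<>()).add(p.grp + ":" + p.part);

        return res;
    }

    /** Renders one (grpId -> checkpoint (id, timestamp)) mapping together with the reason. */
    static String formatReservation(int grpId, String cpId, long cpTs, NotReservedReason reason) {
        return "grpId=" + grpId + " -> checkpoint (id=" + cpId + ", timestamp=" + cpTs
            + "); earlier checkpoint not reserved: " + reason.text;
    }

    public static void main(String[] args) {
        // Made-up sample data, for demonstration only.
        List<InsufficientHistory> parts = Arrays.asList(
            new InsufficientHistory("cache1", 3, 100L, "nodeA", 250L, "nodeB"),
            new InsufficientHistory("cache1", 7, 90L, "nodeA", 300L, "nodeC"),
            new InsufficientHistory("cache2", 1, 10L, "nodeD", 20L, "nodeB"));

        System.out.println(formatInsufficient(parts));
        System.out.println("Grouped by minNodeId: " + groupByMinNode(parts));
        System.out.println(formatReservation(42, "cp-1", 1593417780000L,
            NotReservedReason.WAL_RESERVATION_FAILED));
    }
}
```

Keeping the reason as an enum value rather than free text would make the per-group log lines easy to grep and count, but that is a design assumption of this sketch and not something the ticket prescribes.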
--
This message was sent by Atlassian Jira
(v8.3.4#803005)