Posted to issues@ignite.apache.org by "Vyacheslav Koptilin (Jira)" <ji...@apache.org> on 2020/09/08 09:51:00 UTC

[jira] [Updated] (IGNITE-13358) Improvements for partition clearing related parts

     [ https://issues.apache.org/jira/browse/IGNITE-13358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vyacheslav Koptilin updated IGNITE-13358:
-----------------------------------------
    Fix Version/s: 2.10

> Improvements for partition clearing related parts
> -------------------------------------------------
>
>                 Key: IGNITE-13358
>                 URL: https://issues.apache.org/jira/browse/IGNITE-13358
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Alexey Scherbakov
>            Assignee: Alexey Scherbakov
>            Priority: Major
>             Fix For: 2.10
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> We have several issues related to partition clearing that are worth fixing.
> 1. PartitionsEvictManager doesn't provide obvious correctness guarantees when a node or a cache group is stopped while partitions are concurrently clearing.
> 2. GridDhtLocalPartition#awaitDestroy is called while holding the topology write lock, which is deadlock-prone because destroying a partition currently requires the write lock.
> 3. GridDhtLocalPartition contains a lot of messy code related to partition clearing, most notably ClearFuture, but the clearing is done by PartitionsEvictManager. We should get rid of the clearing code in GridDhtLocalPartition. This should also improve code readability and help to understand what is happening during clearing.
> 4. Currently, moving partitions are cleared before rebalancing in an order different from rebalanceOrder, breaking the contract. It is better to submit such partitions for clearing to the rebalancing pool before each group starts to rebalance. This will allow faster rebalancing (according to the configured rebalance pool size) and will provide rebalanceOrder guarantees.
> 5. The clearing logic for moving partitions (before rebalancing) seems incorrect: it's possible to lose updates received during clearing.
> 6. To clear partitions before full rebalancing we use the same threads as for partition eviction. This can slow rebalancing even when resources are available. It is better to clear partitions in the rebalance pool (explicitly dedicated by the user).
> 7. It's possible to reserve a renting partition, which makes no sense. All operations on renting partitions (except clearing) are a waste of resources.
> 8. Partition eviction causes starvation of system pool tasks if the system pool has only one thread. This can break crucial functionality.
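
The starvation described in item 8 can be illustrated with a minimal, generic Java sketch. It does not use Ignite code: the class name SingleThreadStarvation and the method innerTaskRuns are hypothetical, and a plain java.util.concurrent fixed thread pool stands in for the system pool. A long-running task occupies a worker thread and waits for a second task submitted to the same pool. With one thread the second task should never get a chance to run before the timeout; with two threads it should.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class SingleThreadStarvation {
    /**
     * Submits an "outer" task that blocks on an "inner" task queued to the same pool.
     * Returns true if the inner task finished within the timeout, false if it starved.
     * (Hypothetical helper for illustration only, not part of Ignite.)
     */
    static boolean innerTaskRuns(int poolSize, long timeoutMs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            Future<Boolean> outer = pool.submit(() -> {
                // The outer task holds a worker thread while it waits.
                Future<?> inner = pool.submit(() -> { });
                try {
                    inner.get(timeoutMs, TimeUnit.MILLISECONDS);
                    return true;
                } catch (TimeoutException e) {
                    // No free worker picked up the inner task in time.
                    return false;
                }
            });
            return outer.get();
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("pool size 1, inner task ran: " + innerTaskRuns(1, 500));
        System.out.println("pool size 2, inner task ran: " + innerTaskRuns(2, 500));
    }
}
```

Under these assumptions, moving long-running clearing work to a separate pool (as items 4 and 6 suggest for the rebalance pool) keeps the worker threads of the shared pool free for other tasks.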



--
This message was sent by Atlassian Jira
(v8.3.4#803005)