Posted to issues@ignite.apache.org by "Alexey Scherbakov (Jira)" <ji...@apache.org> on 2020/08/14 07:53:00 UTC
[jira] [Updated] (IGNITE-13358) Improvements for partition clearing related parts
[ https://issues.apache.org/jira/browse/IGNITE-13358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexey Scherbakov updated IGNITE-13358:
---------------------------------------
Description:
We have several issues related to partition clearing that are worth fixing.
1. PartitionsEvictManager doesn't provide obvious correctness guarantees when a node or a cache group is stopped while partitions are concurrently being cleared.
2. GridDhtLocalPartition#awaitDestroy is called while holding the topology write lock, which is deadlock-prone because we currently require the write lock to destroy a partition.
3. GridDhtLocalPartition contains a lot of messy code related to partition clearing, most notably ClearFuture, but the clearing is done by PartitionsEvictManager. We should get rid of the clearing code in GridDhtLocalPartition. This should also improve code readability and make it easier to understand what happens during clearing.
4. Currently, moving partitions are cleared before rebalancing in an order different from rebalanceOrder, breaking the contract.
5. The clearing logic for moving partitions (before rebalancing) seems incorrect: it's possible to lose updates received during clearing.
6. To clear partitions before full rebalancing we use the same threads as for partition eviction. This can slow down rebalancing even if resources are available. It is better to clear partitions in the rebalance pool (explicitly dedicated by the user).
7. It's possible to reserve a renting partition, which is meaningless. All operations on a renting partition (except clearing) are a waste of resources.
8. Partition eviction causes system pool starvation if the number of threads in the system pool is < 8. This can break crucial functionality.
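To illustrate the hazard in item 2, here is a minimal sketch that uses only java.util.concurrent and no Ignite code. The class AwaitDestroySketch, the lock TOP_LOCK, and the methods tryDestroy, awaitWhileHoldingLock and awaitAfterReleasingLock are hypothetical stand-ins, not the actual GridDhtLocalPartition or topology API. The destroy step uses a bounded tryLock so that the sketch terminates; with a plain lock() the first variant would simply hang.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class AwaitDestroySketch {
    // Stand-in for the topology lock (assumption: not Ignite's real lock class).
    static final ReentrantReadWriteLock TOP_LOCK = new ReentrantReadWriteLock();

    // Hypothetical destroy step: it needs the write lock, as item 2 describes.
    // Returns false if the lock could not be taken within the timeout.
    static boolean tryDestroy() throws InterruptedException {
        if (!TOP_LOCK.writeLock().tryLock(200, TimeUnit.MILLISECONDS))
            return false;
        try {
            return true; // Partition would be destroyed here.
        } finally {
            TOP_LOCK.writeLock().unlock();
        }
    }

    // Runs the destroy step on a separate worker thread and waits for its result.
    static boolean destroyOnWorkerAndAwait() {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Callable<Boolean> task = AwaitDestroySketch::tryDestroy;
            return worker.submit(task).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            worker.shutdown();
        }
    }

    // Problematic pattern: await the destroy while still holding the write lock.
    // The worker can never acquire the lock, so the destroy does not happen.
    public static boolean awaitWhileHoldingLock() {
        TOP_LOCK.writeLock().lock();
        try {
            return destroyOnWorkerAndAwait();
        } finally {
            TOP_LOCK.writeLock().unlock();
        }
    }

    // Safer pattern: do the state change under the lock, release it, then await.
    public static boolean awaitAfterReleasingLock() {
        TOP_LOCK.writeLock().lock();
        try {
            // Topology state would be updated here.
        } finally {
            TOP_LOCK.writeLock().unlock();
        }
        return destroyOnWorkerAndAwait();
    }

    public static void main(String[] args) {
        System.out.println("holding lock:    destroyed=" + awaitWhileHoldingLock());
        System.out.println("lock released:   destroyed=" + awaitAfterReleasingLock());
    }
}
```

In this sketch the first variant reports that the destroy did not complete, because the waiting thread owns the lock the worker needs, while the second variant completes. How the actual fix restructures awaitDestroy is not stated in this description.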
was:
We have several issues related to partition clearing that are worth fixing.
1. PartitionsEvictManager doesn't provide obvious correctness guarantees when a node or a cache group is stopped while partitions are concurrently being cleared.
2. GridDhtLocalPartition#awaitDestroy is called while holding the topology write lock, which is deadlock-prone because we currently require the write lock to destroy a partition.
3. GridDhtLocalPartition contains a lot of messy code related to partition clearing, most notably ClearFuture, but the clearing is done by PartitionsEvictManager. We should get rid of the clearing code in GridDhtLocalPartition. This should also improve code readability and make it easier to understand what happens during clearing.
4. Currently, moving partitions are cleared before rebalancing in an order different from rebalanceOrder, breaking the contract.
5. The clearing logic for moving partitions (before rebalancing) seems incorrect: it's possible to lose updates received during clearing.
6. To clear partitions before full rebalancing we use the same threads as for partition eviction. This can slow down rebalancing even if resources are available. It is better to clear partitions in the rebalance pool (explicitly dedicated by the user).
7. It's possible to reserve a renting partition, which is meaningless. All operations on a renting partition (except clearing) are a waste of resources.
> Improvements for partition clearing related parts
> -------------------------------------------------
>
> Key: IGNITE-13358
> URL: https://issues.apache.org/jira/browse/IGNITE-13358
> Project: Ignite
> Issue Type: Improvement
> Reporter: Alexey Scherbakov
> Assignee: Alexey Scherbakov
> Priority: Major