Posted to issues@ignite.apache.org by "Anton Vinogradov (Jira)" <ji...@apache.org> on 2019/08/21 11:59:00 UTC
[jira] [Comment Edited] (IGNITE-3195) Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated

    [ https://issues.apache.org/jira/browse/IGNITE-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910374#comment-16910374 ] 

Anton Vinogradov edited comment on IGNITE-3195 at 8/21/19 11:58 AM:
--------------------------------------------------------------------

[~Mmuzaf], [~xtern]

I've updated the PR.

Now we have 2 thread pools for rebalancing (any objections?).
1) A plain thread pool, used to handle supply messages that are not historical.
2) A striped pool, used to handle historical supply messages and all demand messages.
The striped pool is hashed by node ID.
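
The routing described above can be sketched roughly as follows. This is only an illustration under assumptions, not the code from the PR: the class `RebalancePoolsSketch`, the `Kind` enum, and all method names are hypothetical, and the striped pool is modeled as an array of single-threaded executors so that messages from one node are always handled in order by the same thread.

```java
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of the two-pool routing; not Ignite's actual implementation. */
final class RebalancePoolsSketch {
    /** Hypothetical message kinds, named after the comment above. */
    enum Kind { SUPPLY, HISTORICAL_SUPPLY, DEMAND }

    /** Plain pool: tasks may run in any order on any thread. */
    private final ExecutorService plainPool;

    /** Striped pool: one single-threaded executor per stripe preserves per-stripe order. */
    private final ExecutorService[] stripes;

    RebalancePoolsSketch(int plainThreads, int stripeCnt) {
        plainPool = Executors.newFixedThreadPool(plainThreads);
        stripes = new ExecutorService[stripeCnt];

        for (int i = 0; i < stripeCnt; i++)
            stripes[i] = Executors.newSingleThreadExecutor();
    }

    /** Maps a node ID to a stripe index; floorMod keeps the result non-negative. */
    static int stripeFor(UUID nodeId, int stripeCnt) {
        return Math.floorMod(nodeId.hashCode(), stripeCnt);
    }

    /** Only non-historical supply messages go to the plain pool. */
    static boolean usesStripedPool(Kind kind) {
        return kind != Kind.SUPPLY;
    }

    /** Routes a handler to the pool that matches the message kind. */
    void dispatch(UUID nodeId, Kind kind, Runnable handler) {
        if (usesStripedPool(kind))
            stripes[stripeFor(nodeId, stripes.length)].execute(handler);
        else
            plainPool.execute(handler);
    }

    /** Stops accepting tasks and waits for queued ones; returns false on timeout or interrupt. */
    boolean shutdownAndWait(long timeoutMs) {
        plainPool.shutdown();

        for (ExecutorService stripe : stripes)
            stripe.shutdown();

        try {
            boolean done = plainPool.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);

            for (ExecutorService stripe : stripes)
                done &= stripe.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);

            return done;
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();

            return false;
        }
    }
}
```

In this sketch, hashing by node ID means all demand and historical supply messages from one node land on the same stripe and are processed sequentially, while plain supply messages can be spread across the plain pool's threads.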

It looks like we will be able to get rid of the striped pool in the future, but for now it looks like a good and simple solution.
Historical rebalance could be reordered if we introduce tombstones for removes.
The supplier could, in theory, be rewritten as well.


was (Author: avinogradov):
[~Mmuzaf], [~xtern]

I've updated the PR.

Now we have 2 thread pools for rebalancing (any objections?).
1) A plain thread pool, used to handle supply messages that are not historical.
2) A striped pool, used to handle historical supply messages and all demand messages.
The striped pool is hashed by node ID.

It looks like we will be able to get rid of the striped pool in the future, but for now it looks like a good and simple solution.
Historical rebalance could be reordered if we introduce tombstones for removes.
The supplier could, in theory, be rewritten as well.

BTW, according to my checks, single-partition rebalance speed almost doubled because an unstriped pool is now used to handle supply messages.

So, the addition to the previous message is one more pool (striped).

> Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
> ---------------------------------------------------------------------------
>
>                 Key: IGNITE-3195
>                 URL: https://issues.apache.org/jira/browse/IGNITE-3195
>             Project: Ignite
>          Issue Type: Bug
>          Components: cache
>            Reporter: Denis Magda
>            Assignee: Anton Vinogradov
>            Priority: Major
>              Labels: iep-16
>             Fix For: 2.8
>
>          Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Presently it's assumed that the maximum number of threads processing all demand and supply messages coming from all the nodes must not exceed {{IgniteConfiguration.rebalanceThreadPoolSize}}.
> The current implementation relies on the ordered messages functionality, creating a number of topics equal to {{IgniteConfiguration.rebalanceThreadPoolSize}}.
> However, the implementation doesn't take into account that ordered messages corresponding to a particular topic are processed in parallel for different nodes. Refer to the implementation of {{GridIoManager.processOrderedMessage}} to see that for every topic there will be a unique {{GridCommunicationMessageSet}} for every node.
> To confirm this, you can also refer to this execution stack:
> {noformat}
> java.lang.RuntimeException: HAPPENED DEMAND
> 	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:378)
> 	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:364)
> 	at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:622)
> 	at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:320)
> 	at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$300(GridCacheIoManager.java:81)
> 	at org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1125)
> 	at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1219)
> 	at org.apache.ignite.internal.managers.communication.GridIoManager.access$1600(GridIoManager.java:105)
> 	at org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:2456)
> 	at org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1179)
> 	at org.apache.ignite.internal.managers.communication.GridIoManager.access$1900(GridIoManager.java:105)
> 	at org.apache.ignite.internal.managers.communication.GridIoManager$6.run(GridIoManager.java:1148)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> {noformat}
> All this means that the number of threads busy with replication activity will in fact be equal to {{IgniteConfiguration.rebalanceThreadPoolSize}} x number_of_nodes_participating_in_rebalancing
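
The multiplication in the issue description can be illustrated with a small model. This is a sketch under assumptions: the class and method names are hypothetical, and each (topic, node) pair stands in for one independently processed message set, as the description says happens in {{GridIoManager}}.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of the thread multiplication described in the issue. */
final class OrderedMessageSetsSketch {
    /**
     * Counts distinct (topic, node) pairs. Each pair models one message set
     * that can be processed in parallel with all the others.
     */
    static int independentMessageSets(int rebalanceThreadPoolSize, int nodeCnt) {
        Set<String> keys = new HashSet<>();

        // One topic per configured rebalance thread, one message set per node for each topic.
        for (int topic = 0; topic < rebalanceThreadPoolSize; topic++)
            for (int node = 0; node < nodeCnt; node++)
                keys.add(topic + ":" + node);

        return keys.size();
    }
}
```

For example, `OrderedMessageSetsSketch.independentMessageSets(4, 10)` returns 40, so a pool size of 4 with 10 participating nodes could keep up to 40 threads busy rather than 4.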



--
This message was sent by Atlassian Jira
(v8.3.2#803003)