Posted to issues@ignite.apache.org by "Stanilovsky Evgeny (JIRA)" <ji...@apache.org> on 2019/07/05 07:15:00 UTC

[jira] [Commented] (IGNITE-3195) Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated

    [ https://issues.apache.org/jira/browse/IGNITE-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879031#comment-16879031 ] 

Stanilovsky Evgeny commented on IGNITE-3195:
--------------------------------------------

Looks like we ran into the same problem on a baseline topology (BLT) change:

{code:java}
2019-07-04 06:29:03.649[WARN ][sys-#328%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#328%DPL_GRID%DplGridNodeName% for timeout(ms)=16335
2019-07-04 06:29:03.649[WARN ][sys-#326%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#326%DPL_GRID%DplGridNodeName% for timeout(ms)=13438
2019-07-04 06:29:03.649[WARN ][sys-#277%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#277%DPL_GRID%DplGridNodeName% for timeout(ms)=11609
2019-07-04 06:29:03.649[WARN ][sys-#331%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#331%DPL_GRID%DplGridNodeName% for timeout(ms)=18009
2019-07-04 06:29:03.649[WARN ][sys-#321%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#321%DPL_GRID%DplGridNodeName% for timeout(ms)=15557
2019-07-04 06:29:03.650[WARN ][sys-#307%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#307%DPL_GRID%DplGridNodeName% for timeout(ms)=27938
2019-07-04 06:29:03.649[WARN ][sys-#316%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#316%DPL_GRID%DplGridNodeName% for timeout(ms)=12189
2019-07-04 06:29:03.649[WARN ][sys-#311%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#311%DPL_GRID%DplGridNodeName% for timeout(ms)=11056
2019-07-04 06:29:03.650[WARN ][sys-#295%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#295%DPL_GRID%DplGridNodeName% for timeout(ms)=20848
2019-07-04 06:29:03.649[WARN ][sys-#290%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#290%DPL_GRID%DplGridNodeName% for timeout(ms)=14816
2019-07-04 06:29:03.649[WARN ][sys-#332%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#332%DPL_GRID%DplGridNodeName% for timeout(ms)=14110
2019-07-04 06:29:03.649[WARN ][sys-#298%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#298%DPL_GRID%DplGridNodeName% for timeout(ms)=10028
2019-07-04 06:29:03.650[WARN ][sys-#304%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#304%DPL_GRID%DplGridNodeName% for timeout(ms)=19855
2019-07-04 06:29:03.650[WARN ][sys-#331%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#331%DPL_GRID%DplGridNodeName% for timeout(ms)=41277

... and so on
{code}



> Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
> ---------------------------------------------------------------------------
>
>                 Key: IGNITE-3195
>                 URL: https://issues.apache.org/jira/browse/IGNITE-3195
>             Project: Ignite
>          Issue Type: Bug
>          Components: cache
>            Reporter: Denis Magda
>            Assignee: Anton Vinogradov
>            Priority: Major
>              Labels: iep-16
>             Fix For: 2.8
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Presently it's assumed that the total number of threads that process all demand and supply messages coming from all the nodes must not exceed {{IgniteConfiguration.rebalanceThreadPoolSize}}.
> The current implementation relies on the ordered messages functionality, creating a number of topics equal to {{IgniteConfiguration.rebalanceThreadPoolSize}}.
> However, the implementation doesn't take into account that ordered messages that correspond to a particular topic are processed in parallel for different nodes. Refer to the implementation of {{GridIoManager.processOrderedMessage}} to see that for every topic there will be a unique {{GridCommunicationMessageSet}} for every node.
> To confirm this, refer to the following execution stack:
> {noformat}
> java.lang.RuntimeException: HAPPENED DEMAND
> 	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:378)
> 	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:364)
> 	at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:622)
> 	at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:320)
> 	at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$300(GridCacheIoManager.java:81)
> 	at org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1125)
> 	at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1219)
> 	at org.apache.ignite.internal.managers.communication.GridIoManager.access$1600(GridIoManager.java:105)
> 	at org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:2456)
> 	at org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1179)
> 	at org.apache.ignite.internal.managers.communication.GridIoManager.access$1900(GridIoManager.java:105)
> 	at org.apache.ignite.internal.managers.communication.GridIoManager$6.run(GridIoManager.java:1148)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> {noformat}
> All this means that the number of threads busy with replication activity will in fact be equal to {{IgniteConfiguration.rebalanceThreadPoolSize}} x number_of_nodes_participating_in_rebalancing.
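
To illustrate the arithmetic in the description above, here is a minimal sketch of the thread-count estimate. It is a simplified model, not Ignite code: the class {{RebalanceThreadModel}} and both of its methods are hypothetical names, and it assumes, as the description states, that each (topic, node) pair gets its own message set that can be unwound by its own thread.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical, simplified model of the behaviour described in the issue; not Ignite code. */
final class RebalanceThreadModel {
    private RebalanceThreadModel() {
        // Utility class, no instances.
    }

    /**
     * Worst-case number of busy threads according to the issue description:
     * one per (topic, node) pair, where the number of topics equals
     * rebalanceThreadPoolSize.
     */
    static int worstCaseBusyThreads(int rebalanceThreadPoolSize, int rebalancingNodes) {
        if (rebalanceThreadPoolSize < 1 || rebalancingNodes < 0)
            throw new IllegalArgumentException("pool size must be >= 1, node count must be >= 0");

        return Math.multiplyExact(rebalanceThreadPoolSize, rebalancingNodes);
    }

    /**
     * Counts the distinct (topic, node) message sets. Assumption: every such set
     * can be processed in parallel with all the others.
     */
    static int countMessageSets(int topics, String... nodeIds) {
        Set<String> messageSets = new HashSet<>();

        for (int topic = 0; topic < topics; topic++) {
            for (String nodeId : nodeIds)
                messageSets.add(topic + "@" + nodeId);
        }

        return messageSets.size();
    }
}
```

Under these assumptions, {{RebalanceThreadModel.worstCaseBusyThreads(4, 3)}} returns 12, so a pool size of 4 with three nodes taking part in rebalancing could keep up to 12 threads busy instead of the expected 4.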


