Posted to issues@flink.apache.org by "Piotr Nowojski (Jira)" <ji...@apache.org> on 2022/03/21 15:05:00 UTC

[jira] [Updated] (FLINK-22887) Backlog based optimizations for RebalancePartitioner and RescalePartitioner (load rebalance)

     [ https://issues.apache.org/jira/browse/FLINK-22887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Piotr Nowojski updated FLINK-22887:
-----------------------------------
    Summary: Backlog based optimizations for RebalancePartitioner and RescalePartitioner (load rebalance)  (was: Backlog based optimizations for RebalancePartitioner and RescalePartitioner)

> Backlog based optimizations for RebalancePartitioner and RescalePartitioner (load rebalance)
> --------------------------------------------------------------------------------------------
>
>                 Key: FLINK-22887
>                 URL: https://issues.apache.org/jira/browse/FLINK-22887
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Network
>    Affects Versions: 1.13.1
>            Reporter: Jiayi Liao
>            Priority: Major
>
> {{RebalancePartitioner}} distributes records round-robin, but this may not work as expected, because the environments and processing capacity of the downstream tasks may differ from each other. In such cases, the throughput of the whole job is limited by the slowest downstream subtask, which is very similar to the "HASH" scenario.
> Instead, now that the credit-based mechanism has been introduced, we can use the {{backlog}} on the sender side to estimate the "load" on each receiver, which helps us distribute the data more fairly in {{RebalancePartitioner}} and {{RescalePartitioner}}.
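
A minimal sketch of the idea described above, not Flink's actual implementation. The class name BacklogAwareSelector and the int[] backlogs argument are hypothetical; how the sender would read the per-channel backlog is an assumption and is not specified in this issue. The selector picks the channel with the smallest backlog and scans from a round-robin cursor, so equal backlogs fall back to plain round-robin.

```java
/** Hypothetical sketch; this is NOT Flink's ChannelSelector API. */
final class BacklogAwareSelector {
    // Round-robin cursor: the scan starts here so ties are broken fairly.
    private int next = 0;

    /**
     * Returns the index of the channel with the smallest backlog.
     * backlogs[i] is assumed to be the number of buffers queued on the
     * sender side for receiver i (an assumption for illustration).
     */
    int selectChannel(int[] backlogs) {
        int n = backlogs.length;
        int best = next % n;
        for (int i = 1; i < n; i++) {
            int candidate = (next + i) % n;
            // Strict comparison keeps the earliest candidate on ties.
            if (backlogs[candidate] < backlogs[best]) {
                best = candidate;
            }
        }
        next = (best + 1) % n;
        return best;
    }
}
```

With equal backlogs the selector visits channels 0, 1, 2, ... in order, as the current round-robin behaviour does; when one receiver falls behind and its backlog grows, records are steered to the less loaded channels until it catches up.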



--
This message was sent by Atlassian Jira
(v8.20.1#820001)