You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Yingjie Cao (Jira)" <ji...@apache.org> on 2022/01/24 12:00:00 UTC

[jira] [Updated] (FLINK-25774) Restrict the maximum number of buffers can be used per result partition for sort-shuffle

     [ https://issues.apache.org/jira/browse/FLINK-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yingjie Cao updated FLINK-25774:
--------------------------------
    Summary: Restrict the maximum number of buffers can be used per result partition for sort-shuffle  (was: Restrict the maximum number of buffers can be used per result partition for blocking shuffle)

> Restrict the maximum number of buffers can be used per result partition for sort-shuffle
> ----------------------------------------------------------------------------------------
>
>                 Key: FLINK-25774
>                 URL: https://issues.apache.org/jira/browse/FLINK-25774
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Network
>            Reporter: Yingjie Cao
>            Priority: Major
>             Fix For: 1.15.0
>
>
> Currently, for blocking shuffle, the maximum number of buffers can be used per result partition is Integer.MAX_VALUE. For hash-shuffle, the maximum number of buffers to be used is (numSubpartition + 1), because the hash-shuffle implementation always flush the previous buffer after a new buffer is added, so setting the maximum number of buffers can be used to Integer.MAX_VALUE is meaningless. For sort-shuffle, if too many buffers are taken by one result partition, other result partitions and input gates may spend too much time waiting for buffers which can influence performance. This ticket aims to restrict the maximum number of buffers can be used per result partition and the selected value is an empirical one based on the TPC-DS test results.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)