Posted to issues@flink.apache.org by "Yingjie Cao (Jira)" <ji...@apache.org> on 2022/07/21 10:17:00 UTC

[jira] [Assigned] (FLINK-28512) Select HashBasedDataBuffer and SortBasedDataBuffer dynamically based on the number of network buffers can be allocated for SortMergeResultPartition

     [ https://issues.apache.org/jira/browse/FLINK-28512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yingjie Cao reassigned FLINK-28512:
-----------------------------------

    Assignee: Yuxin Tan

> Select HashBasedDataBuffer and SortBasedDataBuffer dynamically based on the number of network buffers can be allocated for SortMergeResultPartition
> ---------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-28512
>                 URL: https://issues.apache.org/jira/browse/FLINK-28512
>             Project: Flink
>          Issue Type: Sub-task
>            Reporter: Yingjie Cao
>            Assignee: Yuxin Tan
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.16.0
>
>
> Currently, SortMergeResultPartition chooses between HashBasedDataBuffer and SortBasedDataBuffer based on the number of required buffers per result partition, which is set by 'taskmanager.network.sort-shuffle.min-buffers'. If the configured value is large enough, HashBasedDataBuffer is used; otherwise, SortBasedDataBuffer is used. HashBasedDataBuffer usually performs better. However, this value is hard to tune: a user who increases it for better performance can easily run into the 'Insufficient number of network buffers' error. This patch improves the situation by choosing between HashBasedDataBuffer and SortBasedDataBuffer dynamically, based on the number of network buffers that can be allocated. More specifically, if enough buffers are available at runtime, HashBasedDataBuffer is used; otherwise, SortBasedDataBuffer is used. To get better performance, the user only needs to increase the total amount of network memory per task manager.
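
The runtime selection described above could look roughly like the following sketch. It is not taken from the Flink code base: the class name, the enum, the `select` method and the "buffers per subpartition" threshold are all hypothetical, chosen only to illustrate falling back to the sort-based buffer when too few network buffers can be allocated.

```java
// Hypothetical sketch of the selection idea in FLINK-28512.
// None of these names or the threshold rule come from Flink itself.
final class DataBufferSelectionSketch {

    enum DataBufferKind { HASH_BASED, SORT_BASED }

    /**
     * Picks a data buffer implementation from the number of network buffers
     * that could actually be allocated at runtime.
     *
     * Assumption: the hash-based buffer is only worthwhile when at least
     * buffersPerSubpartition buffers are available for every subpartition.
     */
    static DataBufferKind select(
            int availableBuffers, int numSubpartitions, int buffersPerSubpartition) {
        if (availableBuffers < 0 || numSubpartitions <= 0 || buffersPerSubpartition <= 0) {
            throw new IllegalArgumentException("buffer and subpartition counts must be positive");
        }
        // Use long arithmetic so that large subpartition counts cannot overflow.
        long neededForHash = (long) numSubpartitions * buffersPerSubpartition;
        return availableBuffers >= neededForHash
                ? DataBufferKind.HASH_BASED
                : DataBufferKind.SORT_BASED;
    }

    public static void main(String[] args) {
        // Plenty of buffers: the hash-based buffer is chosen.
        System.out.println(select(512, 100, 2));
        // Too few buffers: fall back to the sort-based buffer instead of failing.
        System.out.println(select(64, 100, 2));
    }
}
```

The point of the sketch is that a shortage of buffers leads to the slower implementation rather than to an 'Insufficient number of network buffers' error, so giving the task manager more network memory is enough to enable the faster path.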



--
This message was sent by Atlassian Jira
(v8.20.10#820010)