Posted to issues@flink.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/07/21 08:39:00 UTC

[jira] [Updated] (FLINK-28551) Store the number of bytes instead of the number of buffers in index entry for sort-shuffle

     [ https://issues.apache.org/jira/browse/FLINK-28551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated FLINK-28551:
-----------------------------------
    Labels: pull-request-available  (was: )

> Store the number of bytes instead of the number of buffers in index entry for sort-shuffle
> ------------------------------------------------------------------------------------------
>
>                 Key: FLINK-28551
>                 URL: https://issues.apache.org/jira/browse/FLINK-28551
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Network
>            Reporter: Yingjie Cao
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.16.0
>
>
> Currently, one field in each index entry of the sort-shuffle index file is the number of buffers in the current data region. The problem is that it is hard to know the data boundary before reading the file. To solve this, we can store the number of bytes instead of the number of buffers in the index entry. Based on this change, we can make further optimizations, for example reading more data than a single buffer at a time for better sequential IO, as mentioned in FLINK-28373.
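
To illustrate the idea in the description, here is a minimal sketch of an index entry that records a byte count rather than a buffer count. It is not Flink's actual implementation: the class name IndexEntrySketch, the two-long entry layout, and the helpers writeEntry, regionBounds, and numReads are all hypothetical and chosen only for illustration. The point is that once the entry holds the region size in bytes, the region's end offset follows from the index alone, so a reader can plan reads larger than one buffer without first scanning the data.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch, not Flink code: an index entry that stores the region
// size in bytes instead of the number of buffers in the region.
final class IndexEntrySketch {
    // Assumed layout: 8-byte file offset followed by 8-byte region size in bytes.
    static final int ENTRY_SIZE = Long.BYTES + Long.BYTES;

    // Appends one entry at the buffer's current position.
    static void writeEntry(ByteBuffer index, long regionOffset, long regionBytes) {
        index.putLong(regionOffset);
        index.putLong(regionBytes);
    }

    // Returns {startOffset, endOffsetExclusive} for the given entry,
    // computed from the index alone without touching the data.
    static long[] regionBounds(ByteBuffer index, int entryIndex) {
        int pos = entryIndex * ENTRY_SIZE;
        long offset = index.getLong(pos);
        long numBytes = index.getLong(pos + Long.BYTES);
        return new long[] {offset, offset + numBytes};
    }

    // Number of sequential reads needed to cover a region when each read
    // may fetch up to maxReadBytes (which can exceed one buffer).
    static int numReads(long regionBytes, int maxReadBytes) {
        return (int) ((regionBytes + maxReadBytes - 1) / maxReadBytes);
    }

    public static void main(String[] args) {
        ByteBuffer index = ByteBuffer.allocate(2 * ENTRY_SIZE);
        writeEntry(index, 0L, 100_000L);
        writeEntry(index, 100_000L, 50_000L);

        long[] bounds = regionBounds(index, 1);
        System.out.println("region 1 spans [" + bounds[0] + ", " + bounds[1] + ")");
        System.out.println("reads needed: " + numReads(bounds[1] - bounds[0], 32 * 1024));
    }
}
```

With a buffer count in the entry, the same boundary would only be known after reading each buffer's header to learn its length, which is the difficulty the issue describes.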



--
This message was sent by Atlassian Jira
(v8.20.10#820010)