Posted to issues@flink.apache.org by "Yingjie Cao (Jira)" <ji...@apache.org> on 2022/07/25 06:40:00 UTC

[jira] [Assigned] (FLINK-28373) Read a full buffer of data per file IO read request for sort-shuffle

     [ https://issues.apache.org/jira/browse/FLINK-28373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yingjie Cao reassigned FLINK-28373:
-----------------------------------

    Assignee: Yuxin Tan

> Read a full buffer of data per file IO read request for sort-shuffle
> --------------------------------------------------------------------
>
>                 Key: FLINK-28373
>                 URL: https://issues.apache.org/jira/browse/FLINK-28373
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Network
>            Reporter: Yingjie Cao
>            Assignee: Yuxin Tan
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.16.0
>
>
> Currently, for sort blocking shuffle, the corresponding data readers read shuffle data at buffer granularity. Before compression, each buffer is 32K by default; after compression, it becomes smaller (possibly less than 10K). For file IO, this is quite small. To achieve better performance and reduce IOPS, we can read more data per IO read request and parse the buffer headers and data in memory.
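
The idea described above can be illustrated with a minimal Java sketch. It is not Flink's actual reader or file format: the class name, the method names, and the on-disk layout (a 4-byte length header followed by the payload) are all assumptions made for illustration. Each file read request fills one large in-memory chunk, and the individual buffers are then parsed out of that chunk without further IO.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

public class BatchedBufferReadSketch {

    // Hypothetical layout (not Flink's): [4-byte payload length][payload], repeated.
    static final int HEADER_SIZE = Integer.BYTES;

    /** Writes every payload to the file, each preceded by its length header. */
    static void writeBuffers(Path file, List<byte[]> payloads) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE,
                StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING)) {
            for (byte[] p : payloads) {
                ByteBuffer rec = ByteBuffer.allocate(HEADER_SIZE + p.length);
                rec.putInt(p.length);
                rec.put(p);
                rec.flip();
                while (rec.hasRemaining()) {
                    ch.write(rec);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the file in chunks of up to readBufferSize bytes and parses records in memory. */
    static List<byte[]> readAll(Path file, int readBufferSize) {
        List<byte[]> result = new ArrayList<>();
        ByteBuffer chunk = ByteBuffer.allocate(readBufferSize);
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            boolean eof = false;
            while (!eof) {
                // One IO request fills as much of the chunk as the file can supply.
                if (ch.read(chunk) < 0) {
                    eof = true;
                }
                chunk.flip();
                // Parse every complete record currently held in memory.
                while (chunk.remaining() >= HEADER_SIZE) {
                    int len = chunk.getInt(chunk.position()); // peek at the header
                    if (len < 0 || len > chunk.capacity() - HEADER_SIZE) {
                        throw new IllegalStateException("Record does not fit the read buffer: " + len);
                    }
                    if (chunk.remaining() < HEADER_SIZE + len) {
                        break; // partial record, wait for the next read
                    }
                    chunk.position(chunk.position() + HEADER_SIZE);
                    byte[] payload = new byte[len];
                    chunk.get(payload);
                    result.add(payload);
                }
                // Keep any partial record at the front of the chunk for the next read.
                chunk.compact();
            }
            if (chunk.position() > 0) {
                throw new IllegalStateException("Truncated record at end of file");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    /** Writes the payloads to a temporary file and reads them back in large chunks. */
    static List<byte[]> roundTrip(List<byte[]> payloads, int readBufferSize) {
        try {
            Path file = Files.createTempFile("shuffle-sketch", ".data");
            try {
                writeBuffers(file, payloads);
                return readAll(file, readBufferSize);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<byte[]> payloads = new ArrayList<>();
        for (int i = 0; i < 8; i++) {
            payloads.add(new byte[10 * 1024]); // stand-in for a compressed ~10K buffer
        }
        // A 32 KiB chunk holds about three such records per read request.
        List<byte[]> read = roundTrip(payloads, 32 * 1024);
        System.out.println("Buffers read back: " + read.size());
    }
}
```

In this sketch the chunk size must be at least as large as the biggest record plus its header. A real reader would also need to manage buffer ownership and recycling, which is left out here.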


