Posted to issues@flink.apache.org by "Weijie Guo (Jira)" <ji...@apache.org> on 2022/07/21 09:54:00 UTC

[jira] [Updated] (FLINK-28623) Optimize the use of off heap memory by blocking and hybrid shuffle reader

     [ https://issues.apache.org/jira/browse/FLINK-28623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Weijie Guo updated FLINK-28623:
-------------------------------
    Description: Currently, each FileReader (PartitionFileReader or HsSubpartitionFileReaderImpl) internally allocates an 8-byte header buffer. In addition, PartitionFileReader has a 12-byte indexEntryBuf. Because a FileReader is created per subpartition, when the parallelism is very high and each TM has many slots, the memory occupied by these buffers can reach the MB level. In fact, all FileReaders of the same ResultPartition read data in a single thread, so allocating one header buffer per ResultPartition is enough to avoid this overhead.

> Optimize the use of off heap memory by blocking and hybrid shuffle reader
> -------------------------------------------------------------------------
>
>                 Key: FLINK-28623
>                 URL: https://issues.apache.org/jira/browse/FLINK-28623
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Network
>            Reporter: Weijie Guo
>            Priority: Minor
>
> Currently, each FileReader (PartitionFileReader or HsSubpartitionFileReaderImpl) internally allocates an 8-byte header buffer. In addition, PartitionFileReader has a 12-byte indexEntryBuf. Because a FileReader is created per subpartition, when the parallelism is very high and each TM has many slots, the memory occupied by these buffers can reach the MB level. In fact, all FileReaders of the same ResultPartition read data in a single thread, so allocating one header buffer per ResultPartition is enough to avoid this overhead.
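
To make the proposal concrete, here is a minimal Java sketch of the idea, not Flink's actual code. The class SharedHeaderBufferSketch, the nested SubpartitionReader, and the reader counts used in main are hypothetical; only the 8-byte header size and the single-threaded read pattern come from the description above. It compares the header memory needed when every reader owns a buffer with the memory needed when all readers of one ResultPartition borrow the same buffer.

```java
import java.nio.ByteBuffer;

public class SharedHeaderBufferSketch {
    // Header size taken from the issue description.
    public static final int HEADER_LENGTH = 8;

    /** Hypothetical reader that borrows a header buffer instead of allocating its own. */
    public static final class SubpartitionReader {
        private final ByteBuffer headerBuf;

        public SubpartitionReader(ByteBuffer sharedHeaderBuf) {
            this.headerBuf = sharedHeaderBuf;
        }

        /**
         * Simulates reading a header: the buffer is cleared, filled, and decoded
         * within one call, so no state leaks to the next reader that uses it.
         */
        public long readHeader(long valueOnDisk) {
            headerBuf.clear();
            headerBuf.putLong(valueOnDisk);
            headerBuf.flip();
            return headerBuf.getLong();
        }
    }

    /** Header bytes when every reader allocates its own buffer. */
    public static long headerBytesPerReader(long readers) {
        return readers * HEADER_LENGTH;
    }

    /** Header bytes when each ResultPartition allocates a single buffer. */
    public static long headerBytesShared(long resultPartitions) {
        return resultPartitions * HEADER_LENGTH;
    }

    public static void main(String[] args) {
        // Assumed numbers for illustration: 10 result partitions on one TM,
        // each with 10,000 subpartitions.
        long partitions = 10;
        long readers = partitions * 10_000;
        System.out.println("per-reader buffers: " + headerBytesPerReader(readers) + " bytes");
        System.out.println("shared buffers:     " + headerBytesShared(partitions) + " bytes");

        // One off-heap buffer shared by all readers of a partition. This is only
        // safe because the readers are driven sequentially by a single thread.
        ByteBuffer shared = ByteBuffer.allocateDirect(HEADER_LENGTH);
        SubpartitionReader first = new SubpartitionReader(shared);
        SubpartitionReader second = new SubpartitionReader(shared);
        System.out.println(first.readHeader(1L) + ", " + second.readHeader(2L));
    }
}
```

The sharing relies entirely on the single-thread guarantee stated in the description: if readers of the same ResultPartition could run concurrently, each would need its own buffer or some form of synchronization.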


