Posted to issues@flink.apache.org by "Jiang Xin (Jira)" <ji...@apache.org> on 2024/01/02 06:12:00 UTC

[jira] [Updated] (FLINK-33954) Large record may cause the hybrid shuffle hang

     [ https://issues.apache.org/jira/browse/FLINK-33954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jiang Xin updated FLINK-33954:
------------------------------
    Description: 
In some cases, the job may hang when the local buffer pool does not have enough buffers. For instance, with a parallelism of 10, the HashBufferAccumulator is used, and the size of the local buffer pool is parallelism + 1, i.e. 11 buffers.

1. The local buffer pool can be very small when the parallelism is small. When a large record arrives that needs more buffers than the pool holds, the job hangs.

  was:The local buffer pool size can be very small when the parallelism is small. So when a large record comes and it needs more buffers than the buffer pool has, a hang would happen.


> Large record may cause the hybrid shuffle hang
> ----------------------------------------------
>
>                 Key: FLINK-33954
>                 URL: https://issues.apache.org/jira/browse/FLINK-33954
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Network
>            Reporter: Jiang Xin
>            Priority: Major
>
> In some cases, the job may hang when the local buffer pool does not have enough buffers. For instance, with a parallelism of 10, the HashBufferAccumulator is used, and the size of the local buffer pool is parallelism + 1, i.e. 11 buffers.
> 1. The local buffer pool can be very small when the parallelism is small. When a large record arrives that needs more buffers than the pool holds, the job hangs.
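
The arithmetic behind the description can be sketched as a small model. This is only an illustration, not Flink code: the function names are hypothetical, the 32 KiB buffer size is an assumption (Flink's default memory segment size), and the model ignores record headers and buffers already held by other subpartitions.

```python
# Assumption: 32 KiB per network buffer (Flink's default memory segment size).
BUFFER_SIZE_BYTES = 32 * 1024


def local_pool_size(parallelism: int) -> int:
    """Pool size as stated in the report: parallelism + 1."""
    return parallelism + 1


def buffers_needed(record_bytes: int, buffer_size: int = BUFFER_SIZE_BYTES) -> int:
    """Whole buffers needed to hold a record of this size (integer ceiling)."""
    return (record_bytes + buffer_size - 1) // buffer_size


def can_hang(record_bytes: int, parallelism: int) -> bool:
    """True when a single record needs more buffers than the pool holds."""
    return buffers_needed(record_bytes) > local_pool_size(parallelism)


if __name__ == "__main__":
    # With parallelism 10 the pool holds 11 buffers in this model.
    print(can_hang(1024 * 1024, parallelism=10))  # 1 MiB record needs 32 buffers
    print(can_hang(100 * 1024, parallelism=10))   # 100 KiB record needs 4 buffers
```

Under these assumptions, a 1 MiB record exceeds the 11-buffer pool, while a 100 KiB record fits. The real condition for the hang may also depend on how buffers are recycled while the record is being written.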



--
This message was sent by Atlassian Jira
(v8.20.10#820010)