Posted to dev@flink.apache.org by "Weijie Guo (Jira)" <ji...@apache.org> on 2022/08/09 09:50:00 UTC
[jira] [Created] (FLINK-28889) Hybrid shuffle writes multiple copies of broadcast data
Weijie Guo created FLINK-28889:
----------------------------------
Summary: Hybrid shuffle writes multiple copies of broadcast data
Key: FLINK-28889
URL: https://issues.apache.org/jira/browse/FLINK-28889
Project: Flink
Issue Type: Bug
Reporter: Weijie Guo
Hybrid shuffle writes multiple copies of broadcast data. This wastes memory and disk space and slows down the shuffle write phase. Ideally, under the full spilling strategy, any broadcast data (record or event) should be stored only once in memory and written only once to disk. Under the selective spilling strategy, when a broadcast edge is encountered, we should consider either converting it directly into a HYBRID_FULL edge or introducing a configuration option that decides whether to make this switch.
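To illustrate the "store once, read many times" idea described above, here is a minimal standalone sketch. It is not Flink code: the class BroadcastBufferSketch and its methods appendBroadcast, poll and storedCopies are hypothetical names invented for this example, and it only assumes that each consumer subpartition can track its own read position over a single shared list of buffers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names are Flink classes.
final class BroadcastBufferSketch {
    // The single shared copy of every broadcast buffer (record or event).
    private final List<byte[]> buffers = new ArrayList<>();
    // Per-subpartition index of the next buffer to read.
    private final int[] readIndex;

    BroadcastBufferSketch(int numSubpartitions) {
        if (numSubpartitions <= 0) {
            throw new IllegalArgumentException("numSubpartitions must be positive");
        }
        this.readIndex = new int[numSubpartitions];
    }

    // Stores exactly one copy, no matter how many subpartitions consume it.
    void appendBroadcast(byte[] data) {
        buffers.add(data.clone());
    }

    // Returns the next shared buffer for this subpartition, or null if it has
    // already read everything. Callers must treat the result as read-only.
    byte[] poll(int subpartition) {
        int next = readIndex[subpartition];
        if (next >= buffers.size()) {
            return null;
        }
        readIndex[subpartition] = next + 1;
        return buffers.get(next);
    }

    // Number of buffers actually held in memory.
    int storedCopies() {
        return buffers.size();
    }

    public static void main(String[] args) {
        BroadcastBufferSketch sketch = new BroadcastBufferSketch(3);
        sketch.appendBroadcast(new byte[] {1, 2, 3});
        for (int i = 0; i < 3; i++) {
            System.out.println("subpartition " + i + " read "
                    + sketch.poll(i).length + " bytes");
        }
        System.out.println("stored copies: " + sketch.storedCopies());
    }
}
```

The sketch never releases buffers. A real implementation would presumably also need to free or spill a buffer once every subpartition has consumed it, which is left out here.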
--
This message was sent by Atlassian Jira
(v8.20.10#820010)