Posted to dev@flink.apache.org by "Weijie Guo (Jira)" <ji...@apache.org> on 2022/08/11 10:08:00 UTC
[jira] [Created] (FLINK-28925) Fix the concurrency problem in hybrid shuffle
Weijie Guo created FLINK-28925:
----------------------------------
Summary: Fix the concurrency problem in hybrid shuffle
Key: FLINK-28925
URL: https://issues.apache.org/jira/browse/FLINK-28925
Project: Flink
Issue Type: Bug
Components: Runtime / Network
Affects Versions: 1.16.0
Reporter: Weijie Guo
Fix For: 1.16.0
Through TPC-DS testing and code analysis, I found some thread-safety problems in hybrid shuffle:
# HsSubpartitionMemoryDataManager#consumeBuffer should return a readOnlySlice of the buffer to downstream instead of the original buffer. If the spilling thread is processing a buffer while the downstream task is consuming the same buffer, the amount of data written to disk will be smaller than the actual amount. To solve this, the consuming thread and the spilling thread should share the same data but not the same index.
# HsSubpartitionMemoryDataManager#releaseSubpartitionBuffers should ignore the release decision if the buffer has already been removed from bufferIndexToContexts, instead of throwing an exception.
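
The two points above can be illustrated with a minimal sketch. It is not the Flink code: it uses java.nio.ByteBuffer as a stand-in for Flink's Buffer, with asReadOnlyBuffer() playing the role that readOnlySlice plays in the description. The class name HybridShuffleSketch and the helpers viewForConsumer and releaseIfPresent are hypothetical, and the type of bufferIndexToContexts (a map from buffer index to buffer) is an assumption made only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; java.nio.ByteBuffer stands in for Flink's Buffer.
class HybridShuffleSketch {

    // Point 1: give the consumer a read-only view that shares the bytes
    // but has its own position and limit, so draining the view does not
    // change how much data the spilling side still sees in the original.
    static ByteBuffer viewForConsumer(ByteBuffer original) {
        return original.asReadOnlyBuffer();
    }

    // Point 2: treat "already removed" as a no-op instead of an error.
    // Returns true if this call removed the entry, false if it was already gone.
    static boolean releaseIfPresent(Map<Integer, ByteBuffer> bufferIndexToContexts, int bufferIndex) {
        return bufferIndexToContexts.remove(bufferIndex) != null;
    }

    public static void main(String[] args) {
        ByteBuffer original = ByteBuffer.wrap("hybrid".getBytes(StandardCharsets.UTF_8));

        // The consumer reads everything through its own view.
        ByteBuffer consumerView = viewForConsumer(original);
        byte[] consumed = new byte[consumerView.remaining()];
        consumerView.get(consumed);

        // The view is drained, but the original still exposes all 6 bytes
        // to whoever spills it to disk.
        System.out.println("consumer remaining: " + consumerView.remaining()); // 0
        System.out.println("spiller remaining:  " + original.remaining());     // 6

        Map<Integer, ByteBuffer> bufferIndexToContexts = new ConcurrentHashMap<>();
        bufferIndexToContexts.put(0, original);

        // The first release removes the entry; the second finds nothing and is ignored.
        System.out.println("first release:  " + releaseIfPresent(bufferIndexToContexts, 0)); // true
        System.out.println("second release: " + releaseIfPresent(bufferIndexToContexts, 0)); // false
    }
}
```

If the consumer were handed the original buffer instead of a view, its reads would advance the shared position and the spilling side would see fewer remaining bytes, which matches the symptom described in the first point.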
--
This message was sent by Atlassian Jira
(v8.20.10#820010)