Posted to issues@flink.apache.org by "Roman Khachatryan (Jira)" <ji...@apache.org> on 2020/05/21 08:18:00 UTC

[jira] [Created] (FLINK-17861) Channel state handles, when inlined, duplicate underlying data

Roman Khachatryan created FLINK-17861:
-----------------------------------------

             Summary: Channel state handles, when inlined, duplicate underlying data
                 Key: FLINK-17861
                 URL: https://issues.apache.org/jira/browse/FLINK-17861
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Checkpointing, Runtime / Task
    Affects Versions: 1.11.0
            Reporter: Roman Khachatryan
            Assignee: Roman Khachatryan
             Fix For: 1.11.0


When a subtask snapshots its state, it creates one channelStateHandle per inputChannel/resultSubpartition. All handles of a single subtask share the same underlying streamStateHandle. This is an optimisation that avoids creating too many files.

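However, if the streamStateHandle is inlined (size < state.backend.fs.memory-threshold), the handles no longer point to a shared file: the data travels with each channelStateHandle, so most of the bytes in the underlying streamStateHandle are duplicated.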
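
To illustrate the effect, here is a minimal, self-contained sketch. It is not Flink code: InlinedStreamHandle, ChannelHandle, snapshot and serializedSize are hypothetical names, and the serialization format is an assumption made only to show how N handles that share one inlined byte array can end up writing that array N times.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class ChannelStateDuplicationSketch {

    /** Hypothetical stand-in for an inlined streamStateHandle: the data lives in the handle itself. */
    static final class InlinedStreamHandle {
        final byte[] data;

        InlinedStreamHandle(byte[] data) {
            this.data = data;
        }
    }

    /** Hypothetical stand-in for a channelStateHandle: an offset into the shared underlying handle. */
    static final class ChannelHandle {
        final int channelIndex;
        final long offset;
        final InlinedStreamHandle underlying;

        ChannelHandle(int channelIndex, long offset, InlinedStreamHandle underlying) {
            this.channelIndex = channelIndex;
            this.offset = offset;
            this.underlying = underlying;
        }
    }

    /** One handle per channel; all of them share a single underlying handle. */
    static List<ChannelHandle> snapshot(int channels, int bytesPerChannel) {
        InlinedStreamHandle shared = new InlinedStreamHandle(new byte[channels * bytesPerChannel]);
        List<ChannelHandle> handles = new ArrayList<>();
        for (int i = 0; i < channels; i++) {
            handles.add(new ChannelHandle(i, (long) i * bytesPerChannel, shared));
        }
        return handles;
    }

    /** Assumed naive format: every handle writes its metadata plus the whole underlying byte array. */
    static int serializedSize(List<ChannelHandle> handles) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (ChannelHandle h : handles) {
                out.writeInt(h.channelIndex);
                out.writeLong(h.offset);
                out.writeInt(h.underlying.data.length);
                out.write(h.underlying.data); // the shared bytes are written once per handle
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        int channels = 4;
        int bytesPerChannel = 256;
        List<ChannelHandle> handles = snapshot(channels, bytesPerChannel);
        System.out.println("payload bytes:    " + channels * bytesPerChannel);
        System.out.println("serialized bytes: " + serializedSize(handles));
    }
}
```

In this sketch the serialized size grows roughly with the number of channels times the total payload, whereas a file-backed handle would only repeat a small reference per channel.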



--
This message was sent by Atlassian Jira
(v8.3.4#803005)