Posted to dev@flink.apache.org by "Yu Li (JIRA)" <ji...@apache.org> on 2019/05/31 13:42:00 UTC
[jira] [Created] (FLINK-12699) Reduce CPU consumption when
snapshot/restore the spilled key-group
Yu Li created FLINK-12699:
-----------------------------
Summary: Reduce CPU consumption when snapshot/restore the spilled key-group
Key: FLINK-12699
URL: https://issues.apache.org/jira/browse/FLINK-12699
Project: Flink
Issue Type: Sub-task
Components: Runtime / State Backends
Reporter: Yu Li
Assignee: Yu Li
We need to avoid unnecessary serialization and deserialization when snapshotting or restoring a spilled state key-group. To achieve this, we need to:
1. Add meta information to the {{HeapKeyedStatebackend}} checkpoint on DFS that separates the on-heap and on-disk parts
2. Write the off-heap bytes directly to DFS when checkpointing and mark them as on-disk
3. When restoring data from DFS, write the bytes directly to disk if they are marked as on-disk
Note that we cannot simply copy the spill file, since we use mmap while also supporting copy-on-write.
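
The three steps above could look roughly like the following minimal Java sketch. It is not Flink code: the class name, the ON_HEAP/ON_DISK marker bytes, the section layout (key-group id, marker, length, payload) and the String-to-String entries are all assumptions made for illustration. The point is only that a spilled key-group is written and restored as raw bytes, while an on-heap key-group still goes through per-entry de/serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; none of these names are real Flink APIs. */
final class KeyGroupSnapshotSketch {

    // Assumed per-key-group markers stored as meta information (step 1).
    static final byte ON_HEAP = 0;
    static final byte ON_DISK = 1;

    /** On-heap key-group: every entry has to be serialized individually. */
    static void writeOnHeap(DataOutputStream out, int keyGroup,
                            Map<String, String> entries) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream data = new DataOutputStream(buffer);
        data.writeInt(entries.size());
        for (Map.Entry<String, String> entry : entries.entrySet()) {
            data.writeUTF(entry.getKey());
            data.writeUTF(entry.getValue());
        }
        data.flush();
        writeSection(out, keyGroup, ON_HEAP, buffer.toByteArray());
    }

    /** Spilled key-group: bytes are already serialized, so pass them through (step 2). */
    static void writeOnDisk(DataOutputStream out, int keyGroup,
                            byte[] spilledBytes) throws IOException {
        writeSection(out, keyGroup, ON_DISK, spilledBytes);
    }

    private static void writeSection(DataOutputStream out, int keyGroup,
                                     byte marker, byte[] payload) throws IOException {
        out.writeInt(keyGroup);
        out.writeByte(marker);
        out.writeInt(payload.length);
        out.write(payload);
    }

    /** One restored key-group; exactly one of heapEntries and diskBytes is non-null. */
    static final class Restored {
        final int keyGroup;
        final Map<String, String> heapEntries;
        final byte[] diskBytes;

        Restored(int keyGroup, Map<String, String> heapEntries, byte[] diskBytes) {
            this.keyGroup = keyGroup;
            this.heapEntries = heapEntries;
            this.diskBytes = diskBytes;
        }
    }

    /** Reads one section and deserializes it only if it is marked ON_HEAP (step 3). */
    static Restored readSection(DataInputStream in) throws IOException {
        int keyGroup = in.readInt();
        byte marker = in.readByte();
        byte[] payload = new byte[in.readInt()];
        in.readFully(payload);

        if (marker == ON_DISK) {
            // A real backend would write these bytes to its local spill file as-is.
            return new Restored(keyGroup, null, payload);
        }
        if (marker != ON_HEAP) {
            throw new IOException("Unknown key-group marker: " + marker);
        }
        DataInputStream data = new DataInputStream(new ByteArrayInputStream(payload));
        int size = data.readInt();
        Map<String, String> entries = new LinkedHashMap<>();
        for (int i = 0; i < size; i++) {
            String key = data.readUTF();
            String value = data.readUTF();
            entries.put(key, value);
        }
        return new Restored(keyGroup, entries, null);
    }
}
```

In this sketch the snapshot path reads the spilled bytes into an array before writing them out, rather than copying the underlying file, which is one way to respect the mmap and copy-on-write constraint noted above. How the bytes are actually obtained from the mapped region is left open here.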
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)