Posted to issues@flink.apache.org by "Yun Tang (Jira)" <ji...@apache.org> on 2022/07/27 08:47:00 UTC

[jira] [Commented] (FLINK-28699) Native rocksdb full snapshot in non-incremental checkpointing

    [ https://issues.apache.org/jira/browse/FLINK-28699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17571805#comment-17571805 ] 

Yun Tang commented on FLINK-28699:
----------------------------------

I think this is doable and would bring performance benefits for users with default configurations, especially considering that we cannot set `state.backend.incremental` to true by default in the coming flink-1.16.

[~frozen stone] Please make sure that, in the new full checkpoint strategy, all sst files uploaded in each checkpoint stay only in the exclusive-scoped checkpoint folder and that the strategy does not touch the shared state registry.
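
To illustrate the constraint, here is a toy sketch in Python. It is not Flink code: `upload_full_native_snapshot` and `demo` are hypothetical names, local directories stand in for the checkpoint storage, and the `chk-<id>` / `shared` folder names are borrowed from Flink's checkpoint directory layout. Every checkpoint copies all current sst files into its own folder, so no file is ever referenced by two checkpoints and nothing needs to be registered as shared.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def upload_full_native_snapshot(db_dir: Path, checkpoint_root: Path, checkpoint_id: int) -> List[Path]:
    """Hypothetical helper: copy every sst file into the checkpoint's own chk-<id> folder."""
    exclusive_dir = checkpoint_root / f"chk-{checkpoint_id}"
    exclusive_dir.mkdir(parents=True, exist_ok=False)
    uploaded = []
    for sst in sorted(db_dir.glob("*.sst")):
        target = exclusive_dir / sst.name
        shutil.copy2(sst, target)  # full copy, even if an earlier checkpoint already has this file
        uploaded.append(target)
    return uploaded


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        db_dir = root / "local-db"            # stand-in for the local rocksdb directory
        checkpoint_root = root / "checkpoints"
        db_dir.mkdir()
        (checkpoint_root / "shared").mkdir(parents=True)  # exists, but must stay untouched

        (db_dir / "000001.sst").write_text("a")
        chk1 = upload_full_native_snapshot(db_dir, checkpoint_root, 1)

        (db_dir / "000002.sst").write_text("b")
        chk2 = upload_full_native_snapshot(db_dir, checkpoint_root, 2)

        # Discarding checkpoint 1 is a plain folder delete; checkpoint 2 is self-contained.
        shutil.rmtree(checkpoint_root / "chk-1")
        return {
            "chk1_files": [p.name for p in chk1],
            "chk2_files": [p.name for p in chk2],
            "chk2_intact": all(p.exists() for p in chk2),
            "shared_entries": len(list((checkpoint_root / "shared").iterdir())),
        }
```

The trade-off in this sketch is that `000001.sst` is uploaded twice, once per checkpoint. In exchange, each checkpoint can be discarded independently without any reference counting.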

I think this change would not impact the current changelog state-backend design and implementations. cc [~roman] 

> Native rocksdb full snapshot in non-incremental checkpointing
> -------------------------------------------------------------
>
>                 Key: FLINK-28699
>                 URL: https://issues.apache.org/jira/browse/FLINK-28699
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / State Backends
>    Affects Versions: 1.14.5, 1.15.1
>            Reporter: Lihe Ma
>            Priority: Major
>
> When the rocksdb statebackend is used and state.backend.incremental is enabled, flink identifies the sst files newly created by rocksdb during a checkpoint, whereas during a savepoint it reads all the states from rocksdb and writes them to files [1].
> When state.backend.incremental is disabled, flink reads all the states from rocksdb and generates state files in both checkpoints and savepoints [2]. This makes sense for savepoints, because a user can take a savepoint with the rocksdb statebackend and then restore it using another statebackend, but for checkpoints, the deserialisation and serialisation of state results in a performance loss.
> If the native rocksdb snapshot is introduced for full snapshots, better performance can theoretically be achieved. At the same time, savepoints remain the same as before.
>  
> [1] https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/snapshot/RocksIncrementalSnapshotStrategy.java
> [2] https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/snapshot/RocksFullSnapshotStrategy.java



--
This message was sent by Atlassian Jira
(v8.20.10#820010)