You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/05/02 07:18:04 UTC

[jira] [Commented] (FLINK-6364) Implement incremental checkpointing in RocksDBStateBackend

    [ https://issues.apache.org/jira/browse/FLINK-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992469#comment-15992469 ] 

ASF GitHub Bot commented on FLINK-6364:
---------------------------------------

Github user shixiaogang commented on the issue:

    https://github.com/apache/flink/pull/3801
  
    Hi @gyfora I am very happy to hear from you. The following are the answers to your questions. Kindly let me know if you have any idea of them.
    
    1. The incremental checkpoints supports rescaling. It's true that the implementation checkpoints files directly for multiple key groups together. But in the cases where the degree of parallelism changes, the files will be passed to all the state backends whose key groups are in the files. Then the backends will iterate over all the key-value pairs in the files and pick up those kv pairs that belong to them.
    
    2.  In the cases we restore from a full snapshot (which is formatted as key-value pairs), the next incremental checkpoint will contain all the files. It may seem a little bit inefficient because i intend to make each checkpoint self-contained. Given that full snapshots and incremental snapshots are in different formats, we have to take a "full" incremental snapshot as the base for following checkpoints.
    
    3. That is a very good question. It will be flexible that users can choose the scheme of checkpoints (say one full checkpoint after n incremental checkpoints).  But i think making every checkpoint incremental is acceptable because incremental checkpoints are more  efficient in most cases. Those backends which do not support incremental checkpointing can still take full snapshotting.


> Implement incremental checkpointing in RocksDBStateBackend
> ----------------------------------------------------------
>
>                 Key: FLINK-6364
>                 URL: https://issues.apache.org/jira/browse/FLINK-6364
>             Project: Flink
>          Issue Type: Sub-task
>          Components: State Backends, Checkpointing
>            Reporter: Xiaogang Shi
>            Assignee: Xiaogang Shi
>
> {{RocksDBStateBackend}} is well suited for incremental checkpointing because RocksDB is base on LSM trees,  which record updates in new sst files and all sst files are immutable. By only materializing those new sst files, we can significantly improve the performance of checkpointing.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)