Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/05/24 03:18:00 UTC

[jira] [Commented] (FLINK-8790) Improve performance for recovery from incremental checkpoint

    [ https://issues.apache.org/jira/browse/FLINK-8790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488360#comment-16488360 ] 

ASF GitHub Bot commented on FLINK-8790:
---------------------------------------

Github user sihuazhou commented on the issue:

    https://github.com/apache/flink/pull/5582
  
    Unfortunately, after confirming with RocksDB, `deleteRange()` is still an experimental feature, and it may currently hurt read performance (even though we could use the ReadOption to reduce the impact).
    
    In practice, I tested the read-performance impact of `deleteRange()` in our case (we delete at most 2 ranges) and did not find any. TiKV also already uses it to delete entire shards. Still, to be on the safe side, I think the current PR should be frozen. I do think the `deleteRange()`-based implementation in this PR would be the better one once `deleteRange()` is no longer experimental, especially when the user scales up the job: in that case we only need to clip the RocksDB instance without iterating over any records, which is very fast.
    
    Anyway, even though we can't use `deleteRange()` currently, we can still improve the performance of recovery from an incremental checkpoint. One way: if the key-group range of one of the state handles is a sub-range of the target key-group range, we can open it directly and avoid the overhead of iterating over it. @StefanRRichter What do you think? If you don't object, I will update the PR to follow this approach.
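
A minimal sketch of the sub-range check proposed above, not the code from the PR. The `Range` class and the `restorePath` helper are hypothetical stand-ins (Flink has its own key-group range type), and ranges are assumed to be inclusive on both ends.

```java
public class SubRangeCheck {

    /** Inclusive key-group range; a hypothetical stand-in, not Flink's own class. */
    public static final class Range {
        public final int start;
        public final int end;

        public Range(int start, int end) {
            if (start > end) {
                throw new IllegalArgumentException("start must be <= end");
            }
            this.start = start;
            this.end = end;
        }
    }

    /** True if every key group covered by the handle also belongs to the target. */
    public static boolean isSubRange(Range handle, Range target) {
        return handle.start >= target.start && handle.end <= target.end;
    }

    /** Names the restore path this sketch would pick for one state handle. */
    public static String restorePath(Range handle, Range target) {
        // A handle fully inside the target holds no foreign keys, so nothing
        // has to be filtered out record by record.
        return isSubRange(handle, target) ? "open-directly" : "iterate-and-filter";
    }

    public static void main(String[] args) {
        Range target = new Range(0, 63);
        System.out.println(restorePath(new Range(0, 31), target));
        System.out.println(restorePath(new Range(32, 95), target));
    }
}
```

In this example the first handle lies entirely inside the target range, so it would be opened directly; the second one extends past the target, so it would still need the iterating path.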


> Improve performance for recovery from incremental checkpoint
> ------------------------------------------------------------
>
>                 Key: FLINK-8790
>                 URL: https://issues.apache.org/jira/browse/FLINK-8790
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.5.0
>            Reporter: Sihua Zhou
>            Assignee: Sihua Zhou
>            Priority: Major
>             Fix For: 1.6.0
>
>
> When there are multiple state handles to be restored, we can improve the performance as follows:
> 1. Choose the best state handle to init the target db.
> 2. Use the other state handles to create temp dbs, and clip each db according to the target key group range (via rocksdb.deleteRange()). This helps us get rid of the `key group check` in the `data insertion loop` and also of traversing useless records.
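
The two steps in the issue description can be sketched in plain Java, without calling RocksDB. The issue does not say what makes a state handle "best", so this sketch assumes it is the one with the largest key-group overlap with the target range. The class and method names are hypothetical, handles are inclusive `{start, end}` pairs, and the clip ranges are half-open `[from, to)` in key-group terms; turning them into byte keys for `deleteRange()` is left out.

```java
import java.util.ArrayList;
import java.util.List;

public class ClipPlan {

    /** Number of key groups shared by the inclusive ranges [hStart, hEnd] and [tStart, tEnd]. */
    public static int overlap(int hStart, int hEnd, int tStart, int tEnd) {
        return Math.max(0, Math.min(hEnd, tEnd) - Math.max(hStart, tStart) + 1);
    }

    /** Index of the handle with the largest overlap (assumed meaning of "best"); first one wins ties. */
    public static int chooseBest(int[][] handles, int tStart, int tEnd) {
        int best = -1;
        int bestOverlap = -1;
        for (int i = 0; i < handles.length; i++) {
            int o = overlap(handles[i][0], handles[i][1], tStart, tEnd);
            if (o > bestOverlap) {
                bestOverlap = o;
                best = i;
            }
        }
        return best;
    }

    /** Half-open key-group ranges [from, to) to delete so that only target key groups remain. */
    public static List<int[]> rangesToClip(int hStart, int hEnd, int tStart, int tEnd) {
        List<int[]> result = new ArrayList<>();
        if (hStart < tStart) {
            // Key groups below the target range.
            result.add(new int[] {hStart, Math.min(hEnd + 1, tStart)});
        }
        if (hEnd > tEnd) {
            // Key groups above the target range.
            result.add(new int[] {Math.max(hStart, tEnd + 1), hEnd + 1});
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] handles = {{0, 63}, {64, 127}};
        int tStart = 0;
        int tEnd = 84;
        System.out.println("best handle index: " + chooseBest(handles, tStart, tEnd));
        for (int[] r : rangesToClip(64, 127, tStart, tEnd)) {
            System.out.println("clip [" + r[0] + ", " + r[1] + ")");
        }
    }
}
```

Each handle yields at most two clip ranges, one below and one above the target range, which matches the "only delete 2 ranges at most" remark in the comment.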



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)