Posted to issues@flink.apache.org by sihuazhou <gi...@git.apache.org> on 2018/05/24 03:17:27 UTC

[GitHub] flink issue #5582: [FLINK-8790][State] Improve performance for recovery from...

Github user sihuazhou commented on the issue:

    https://github.com/apache/flink/pull/5582
  
    Unfortunately, after confirming with RocksDB, the `deleteRange()` is still an experimental feature; it may currently have an impact on read performance (even though we could use the ReadOptions to reduce the impact).
    
    In practice, I tested the impact of `deleteRange()` on read performance in our case (we only delete 2 ranges at most), and I didn't find any impact in fact. TiKV has also already used it to delete entire shards. But, to be on the safe side, I think the current PR should be frozen. Still, I think the implementation based on `deleteRange()` in this PR would be the better one once `deleteRange()` is no longer experimental, especially when the user scales up the job: in that case we only need to clip the RocksDB instances without iterating any records, which is super fast.
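    To make the clipping idea concrete, here is a minimal sketch (the `Range` type and method names are illustrative assumptions, not Flink's actual `KeyGroupRange` API) of computing the at-most-two key-group ranges that a `deleteRange()`-based clip would remove when a restored instance covers a wider range than the target:

    ```java
    // Hypothetical sketch: compute the key-group sub-ranges to delete when
    // clipping a restored RocksDB instance down to the target key-group range.
    // Assumes key-groups are contiguous integer ids; the real code would
    // turn each range into serialized key prefixes and pass them to
    // RocksDB's deleteRange(begin, end).
    import java.util.ArrayList;
    import java.util.List;

    final class KeyGroupClip {
        // A half-open range [start, end) of key-group ids (illustrative only).
        record Range(int start, int end) {}

        // Returns the ranges present in 'restored' but outside 'target';
        // at most two ranges, matching the "delete 2 ranges at most" case.
        static List<Range> rangesToDelete(Range restored, Range target) {
            List<Range> toDelete = new ArrayList<>();
            if (restored.start() < target.start()) {
                toDelete.add(new Range(restored.start(), Math.min(restored.end(), target.start())));
            }
            if (restored.end() > target.end()) {
                toDelete.add(new Range(Math.max(restored.start(), target.end()), restored.end()));
            }
            return toDelete;
        }
    }
    ```

    For example, clipping a restored range [0, 128) down to a target of [32, 64) yields the two delete ranges [0, 32) and [64, 128), with no iteration over individual records.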
    
    Anyway, even though we can't use `deleteRange()` currently, we can still improve the performance of recovery from incremental checkpoints somewhat, in the following way: if one of the state handles' key-group range is a sub-range of the target key-group range, we can open it directly and avoid the overhead of iterating over its records. @StefanRRichter What do you think? If you don't object, I will update the PR to follow the above approach.
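    The sub-range fast path proposed above can be sketched as follows; the `Range` type is an illustrative assumption, not Flink's actual `KeyGroupRange` API:

    ```java
    // Hypothetical sketch of the proposed fast path: a state handle whose
    // key-group range is fully contained in the target range can be opened
    // directly, since it cannot contain any foreign key-groups; otherwise
    // its records must still be iterated and filtered.
    final class RestoreDecision {
        // A half-open range [start, end) of key-group ids (illustrative only).
        record Range(int start, int end) {}

        // True if every key-group in 'handle' also belongs to 'target'.
        static boolean canOpenDirectly(Range handle, Range target) {
            return handle.start() >= target.start() && handle.end() <= target.end();
        }
    }
    ```

    For example, a handle covering [32, 48) could be opened directly for a target range of [32, 64), while a handle covering [16, 48) would still need to be iterated.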


---