Posted to issues@flink.apache.org by "Sihua Zhou (JIRA)" <ji...@apache.org> on 2018/02/27 02:27:00 UTC

[jira] [Comment Edited] (FLINK-8753) Introduce Incremental savepoint

    [ https://issues.apache.org/jira/browse/FLINK-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377908#comment-16377908 ] 

Sihua Zhou edited comment on FLINK-8753 at 2/27/18 2:26 AM:
------------------------------------------------------------

[~StephanEwen] Thanks for your reply. Indeed, what I am trying to achieve is just a faster savepoint that does not need to iterate over all records one by one (along with some condition checks that make it slow for huge data). And yes, what you described is very close to what I want. The reason I didn't use the word `checkpoint` is that a checkpoint is not guaranteed to support rescaling (this can be found in the [flink-doc|https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/state/checkpoints.html#difference-to-savepoints] and in the comments on PR [5490|https://github.com/apache/flink/pull/5490]), and rescaling is always the reason we trigger a savepoint. An interesting thing I found is that, in the implementation, checkpoints also support rescaling; I checked that both in the code and in practice ... I wonder whether the "archive checkpoint" that you mentioned is guaranteed to support rescaling?

About the implementation, I think maybe this issue's title is incorrect ... I just want to implement a savepoint that goes through the incremental checkpoint path but treats the `baseSstFile` as empty (which looks like just submitting the local RocksDB snapshot to DFS)...
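
To illustrate the idea, here is a minimal sketch. It is not Flink code: the function `files_to_upload` and the SST file names are hypothetical, and it only assumes that an incremental snapshot uploads the local SST files that are missing from the base set.

```python
def files_to_upload(local_sst_files, base_sst_files):
    """Return the local SST files that are not already in the base set.

    Hypothetical helper for illustration only; it does not mirror
    any actual Flink or RocksDB API.
    """
    return sorted(set(local_sst_files) - set(base_sst_files))


# Hypothetical SST files in the local RocksDB snapshot.
local = {"000012.sst", "000015.sst", "000021.sst"}

# Incremental checkpoint: the base holds files uploaded by the previous checkpoint.
previous = {"000012.sst", "000015.sst"}
incremental = files_to_upload(local, previous)

# Proposed savepoint: same path, but the base is treated as empty,
# so every local file is uploaded and the result is self-contained.
savepoint_like = files_to_upload(local, set())
```

With an empty base, nothing is shared with earlier checkpoints, so the uploaded snapshot does not depend on any previous one.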


was (Author: sihuazhou):
[~StephanEwen] Thanks for your reply. Indeed, what I am trying to achieve is just a faster savepoint that does not  to iterate all records one by one (along with some condition check that make it slow for huge data). And yes what you are described is very close to what I wanted but I didn't use the word `checkpoint` is that: checkpoint doesn't guarantee to support rescaling (this can be found on [flink-doc|https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/state/checkpoints.html#difference-to-savepoints] and the comment in this PR [5490|https://github.com/apache/flink/pull/5490]), which is always the purpose that we trigger a savepoint. An interesting thing I found is that, in the implementation checkpoint also support rescaling, I checked that both in code and in practice ... I wonder whether the "archive checkpoint" guarantee to support rescaling? 

At bout the implementation, I think maybe this issue's title incorrect ... I just want to implement the save point which go though the incremental checkpoint path but treat the `baseSstFile` as empty ( which is look like just submit the local RocksDB snapshot on to DFS).

> Introduce Incremental savepoint
> -------------------------------
>
>                 Key: FLINK-8753
>                 URL: https://issues.apache.org/jira/browse/FLINK-8753
>             Project: Flink
>          Issue Type: New Feature
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.5.0
>            Reporter: Sihua Zhou
>            Assignee: Sihua Zhou
>            Priority: Major
>
> Right now, a savepoint goes through the full checkpoint path, so taking a savepoint can be slow. In our production environment, for some long-running jobs it often takes more than 10 min to complete a savepoint, which is unacceptable for a real-time job, so we currently have to fall back to using externalized checkpoints instead. But with externalized checkpoints there is a time gap (up to the checkpoint interval) since the last checkpoint. So I propose to introduce an incremental savepoint that goes through the incremental checkpoint path.
> Any advice would be appreciated!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)