Posted to user@flink.apache.org by Elias Levy <fe...@gmail.com> on 2017/11/03 00:11:52 UTC

Incremental checkpointing documentation

There doesn't appear to be much in the way of documentation for incremental
checkpointing other than how to turn it on.  That leaves a lot of questions
unanswered.

What is the interaction of incremental checkpointing and external
checkpoints?

Any interaction with the state.checkpoints.num-retained config?

Does incremental checkpointing require any maintenance?

Any interaction with savepoints?

Does it perform better against certain "file systems"?  E.g. is S3 not
recommended for it?  How about EFS?

Re: Incremental checkpointing documentation

Posted by Nico Kruber <ni...@data-artisans.com>.
Hi Elias,
let me answer your questions to the best of my knowledge. In general, I
think incremental checkpointing behaves as you would expect.
(For other readers, here is a link to the docs that explain how to activate
it [1].)

On Friday, 3 November 2017 01:11:52 CET Elias Levy wrote:
> What is the interaction of incremental checkpointing and external
> checkpoints?

Externalized checkpoints may be incremental [2]. (I'll fix the formatting error
in the docs that keeps the arguments from rendering as a list, which makes them
less visible.)

> Any interaction with the state.checkpoints.num-retained config?

Yes, this remains the number of available checkpoints. There may, however, be 
more folders containing RocksDB state that was originally put into checkpoint 
X but is also still required in checkpoint X+10 or so. These files will be 
cleaned up once they are not needed anymore.
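
To make the retention behaviour above more concrete, here is a small toy model
in Java. It is only a sketch of the idea (reference counting of shared files
across retained checkpoints), not Flink's actual implementation. The class
IncrementalCheckpointModel, its methods, and the file names sst-1 to sst-4 are
all hypothetical and chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only, NOT Flink code. All names here are hypothetical.
class IncrementalCheckpointModel {
    private final int numRetained;
    // File name -> number of retained checkpoints that still reference it.
    private final Map<String, Integer> refCounts = new HashMap<>();
    // Retained checkpoints, oldest first; each entry is the set of files it references.
    private final Deque<Set<String>> retained = new ArrayDeque<>();

    IncrementalCheckpointModel(int numRetained) {
        if (numRetained < 1) {
            throw new IllegalArgumentException("numRetained must be >= 1");
        }
        this.numRetained = numRetained;
    }

    // Registers a completed checkpoint and returns the files that became unreferenced.
    Set<String> completeCheckpoint(Set<String> referencedFiles) {
        Set<String> files = new HashSet<>(referencedFiles);
        for (String f : files) {
            refCounts.merge(f, 1, Integer::sum);
        }
        retained.addLast(files);

        Set<String> deleted = new TreeSet<>();
        // Drop the oldest checkpoints until only numRetained remain.
        while (retained.size() > numRetained) {
            for (String f : retained.removeFirst()) {
                // A null result removes the entry: nobody references the file anymore.
                Integer left = refCounts.merge(f, -1, (a, b) -> a + b == 0 ? null : a + b);
                if (left == null) {
                    deleted.add(f);
                }
            }
        }
        return deleted;
    }

    // Files that must still exist on checkpoint storage.
    Set<String> filesOnStorage() {
        return new TreeSet<>(refCounts.keySet());
    }

    int retainedCheckpoints() {
        return retained.size();
    }

    public static void main(String[] args) {
        IncrementalCheckpointModel model = new IncrementalCheckpointModel(1);
        model.completeCheckpoint(new HashSet<>(Arrays.asList("sst-1", "sst-2")));
        // Checkpoint 2 reuses the files first uploaded for checkpoint 1.
        model.completeCheckpoint(new HashSet<>(Arrays.asList("sst-1", "sst-2", "sst-3")));
        System.out.println("on storage after cp 2: " + model.filesOnStorage());
        // Checkpoint 3 no longer needs sst-1 and sst-2 (e.g. after a compaction).
        Set<String> deleted =
            model.completeCheckpoint(new HashSet<>(Arrays.asList("sst-3", "sst-4")));
        System.out.println("deleted after cp 3: " + deleted);
    }
}
```

In this model only one checkpoint is ever retained, yet files first written for
checkpoint 1 stay on storage while checkpoint 2 still references them, and they
are removed as soon as no retained checkpoint needs them.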

> Does incremental checkpointing require any maintenance?

No, state is cleaned up once it is not used/referenced anymore.

> Any interaction with savepoints?

No, a savepoint uses Flink's own data format and is not incremental [3].

> Does it perform better against certain "file systems"?  E.g. is S3 not
> recommended for it?  How about EFS?

I can't think of a reason this should be any different to non-incremental 
checkpoints. Maybe Stefan (cc'd) has some more info on this.

For more details on the whole topic, I can recommend Stefan's talk at the last 
Flink Forward [4] though.


Nico


[1] https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/state/large_state_tuning.html#tuning-rocksdb
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/state/checkpoints.html#difference-to-savepoints
[3] https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/state/savepoints.html
[4] https://www.youtube.com/watch?v=dWQ24wERItM&index=36&list=PLDX4T_cnKjD0JeULl1X6iTn7VIkDeYX_X