Posted to user@flink.apache.org by Hao Sun <ha...@zendesk.com> on 2017/12/01 06:21:33 UTC

Re: [EXTERNAL] difference between checkpoints & savepoints

Hi team, I have one follow-up question on this.

There is a discussion on resuming jobs from *a saved external checkpoint*.
I feel there are two aspects to that topic.
*1. I have no changes to the job and just want to resume it after a
failure.*
I can see this happen automatically with ZK enabled. I do not have to
do anything manually.
======
2017-12-01 05:02:26,603 DEBUG
org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore -
Recovering job graph f824eabe58d180d79416d9637ac6aa32 from
fraud_prevention_service/flink/jobgraphs/f824eabe58d180d79416d9637ac6aa32.
======

*2. I want to submit a new job and resume the previous process, for whatever
reason, e.g. the JobGraph changed or I need to change parallelism.*
https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/savepoints.html#faq
I am wondering, for Flink 1.3.2, 1.4 and 1.5, is an externalized checkpoint
identical to a savepoint? Does everything in the FAQ section also
apply to externalized checkpoints? *How about allowNonRestoredState,
do we have something like this for externalized checkpoints?*

I am running Flink 1.3.2 on K8S, so I am wondering what the best
practice is for deploying new code releases. Also, Flip6 is awesome;
I can't wait to use it.

Thanks as always.


On Wed, Aug 16, 2017 at 5:23 PM Raja.Aravapalli <Ra...@target.com>
wrote:

>
>
> Thanks very much for the detailed explanation Stefan.
>
>
>
>
>
> Regards,
>
> Raja.
>
>
>
> *From: *Stefan Richter <s....@data-artisans.com>
> *Date: *Monday, August 14, 2017 at 7:47 AM
> *To: *Raja Aravapalli <Ra...@target.com>
> *Cc: *"user@flink.apache.org" <us...@flink.apache.org>
> *Subject: *Re: [EXTERNAL] difference between checkpoints & savepoints
>
>
>
> Just noticed that I forgot to also include a reference to the
> documentation about externalized checkpoints:
> https://ci.apache.org/projects/flink/flink-docs-master/ops/state/checkpoints.html
>
>
>
> Am 14.08.2017 um 14:17 schrieb Stefan Richter <s.richter@data-artisans.com
> >:
>
>
>
>
>
> Hi,
>
>
>
>
>
> Also, along the same lines, can someone detail the difference between State
> Backend & External checkpoint?
>
>
>
>
>
> Those are two very different things. When we talk about state backends in
> Flink, we mean the entity that is responsible for storing and managing the
> state inside an operator. This could, for example, be something like the
> FsStateBackend, which is based on hash maps and keeps state on the heap, or
> the RocksDBStateBackend, which uses RocksDB as a store internally and
> operates on native memory and disk.
>
>
>
> An externalized checkpoint, like a normal checkpoint, is the collection of
> all state in a job persisted to stable storage for recovery. More
> concretely, this typically means writing out the contents of the state
> backends to a safe place so that we can restore them from there.
>
>
>
> Also, through which methods of the programmatic API can we configure those?
>
>
>
> This explains how to set the backend programmatically:
>
>
>
>
> https://ci.apache.org/projects/flink/flink-docs-master/ops/state/state_backends.html
>
>
>
> To activate externalized checkpoints, you activate normal checkpoints,
> plus the following line:
>
>
>
> env.getCheckpointConfig().enableExternalizedCheckpoints(CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
>
>
>
> where env is your StreamExecutionEnvironment.
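
Putting the pieces described above together, a minimal sketch might look like the following. It assumes a Flink 1.3-era DataStream API on the classpath. The class name, the checkpoint URI, the 60-second interval and the tiny placeholder pipeline are illustrative assumptions, not taken from the thread. In these Flink versions the directory for the externalized checkpoint metadata is, as far as the linked checkpoint documentation describes, set separately through the state.checkpoints.dir configuration key.

```java
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.CheckpointConfig;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Hypothetical class name, used only for this sketch.
public class ExternalizedCheckpointSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // State backend: manages operator state while the job runs.
        // FsStateBackend keeps working state on the heap and writes
        // checkpoint data to the given URI (placeholder path, an assumption).
        env.setStateBackend(new FsStateBackend("hdfs:///flink/checkpoints"));

        // Activate normal checkpoints; the interval is an arbitrary example value.
        env.enableCheckpointing(60_000L);

        // Additionally externalize checkpoints and keep them when the job
        // is cancelled, as in the line quoted in the thread.
        env.getCheckpointConfig().enableExternalizedCheckpoints(
                CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);

        // Placeholder pipeline so that the job graph is not empty.
        env.fromElements(1, 2, 3).print();

        env.execute("externalized-checkpoint-sketch");
    }
}
```

With RETAIN_ON_CANCELLATION the retained checkpoint has to be cleaned up manually once it is no longer needed. A real job would replace the placeholder pipeline with a long-running source, since a bounded three-element input may finish before any checkpoint is triggered.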
>
>
>
> If you need an example, please take a look at
> org.apache.flink.test.checkpointing.ExternalizedCheckpointITCase. This
> class configures everything you asked about programmatically.
>
>
>
> Best,
>
> Stefan
>
>
>
>
>