Posted to user@beam.apache.org by Eleanore Jin <el...@gmail.com> on 2020/08/10 15:56:58 UTC

Cannot resume from savepoint

Hi experts,

I am using Beam 2.23.0 with the Flink runner (Flink 1.8.2). With checkpointing
and KafkaIO EOS enabled, I am exploring different scenarios for resuming a
job from a savepoint. I am running Kafka and a standalone Flink cluster
locally on my laptop.
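For context, this is roughly how I take the savepoint and resume, using the standard Flink 1.8 CLI (the job ID, savepoint paths, and jar name below are placeholders):

```shell
# Trigger a savepoint for the running job (job ID is a placeholder)
bin/flink savepoint 5e20cb6b0f357591171dfcca2eea09de /tmp/savepoints

# Or cancel the job while taking a savepoint in one step
bin/flink cancel -s /tmp/savepoints 5e20cb6b0f357591171dfcca2eea09de

# Resubmit the (modified) Beam pipeline from the savepoint
bin/flink run -s /tmp/savepoints/savepoint-5e20cb-xxxxxxxx \
    my-beam-pipeline-bundled.jar --runner=FlinkRunner
```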

Below are the scenarios that I have tried out:

1. Add a new topic as source
Before savepoint: read from input1 and write to output
Take a savepoint
After savepoint: read from input1 and input2 and write to output
Behavior: It did not output messages from input2
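For reference, the source side of the pipeline looks roughly like this; after the savepoint I simply add the second topic to the list (bootstrap servers and topic names are placeholders):

```java
import java.util.Arrays;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ReadSide {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create();
    p.apply(KafkaIO.<String, String>read()
        .withBootstrapServers("localhost:9092")              // placeholder
        .withTopics(Arrays.asList("input1", "input2"))       // "input2" added after the savepoint
        .withKeyDeserializer(StringDeserializer.class)
        .withValueDeserializer(StringDeserializer.class));
    // ... process and write to the output topic ...
    p.run();
  }
}
```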

2. Remove a topic source
Before savepoint: read from input1 and input2 and write to output
Take a savepoint
After savepoint: read from input1 and write to output
Behavior: worked as expected; only output messages from input1

3. Add a topic as sink
Before savepoint: read from input1 and write to output1
Take a savepoint
After savepoint: read from input1 and write to output1 and output2
Behavior: the pipeline failed with an exception (screenshot attached: image.png)

4. Remove a topic sink
Before savepoint: read from input1 and write to output1 and output2
Take a savepoint
After savepoint: read from input1 and write to output1
Behavior: it requires changing the sinkGroupId; otherwise I get an exception
(screenshot attached: image.png)
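For reference, the sink side uses KafkaIO's exactly-once writer. My understanding is that the sinkGroupId keys the EOS state, which may be why scenario 4 forces a new one. A rough sketch of the sink (bootstrap servers, topic, and group id are placeholders):

```java
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.beam.sdk.transforms.PTransform;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PDone;
import org.apache.kafka.common.serialization.StringSerializer;

class SinkSide {
  // Build an exactly-once Kafka sink; after removing a sink from the
  // pipeline, resuming from the savepoint required a new sinkGroupId.
  static PTransform<PCollection<KV<String, String>>, PDone> eosSink(
      String topic, String sinkGroupId) {
    return KafkaIO.<String, String>write()
        .withBootstrapServers("localhost:9092") // placeholder
        .withTopic(topic)
        .withKeySerializer(StringSerializer.class)
        .withValueSerializer(StringSerializer.class)
        .withEOS(1, sinkGroupId); // numShards, sinkGroupId
  }
}
```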

So it looks like resuming from a savepoint does not really work when the
DAG changes on the source or sink side. Is this expected behaviour? Is it
related to how KafkaIO EOS state works, or is it something specific to the
Flink runner?

Thanks a lot!
Eleanore