You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@beam.apache.org by Lukasz Cwik <lc...@google.com> on 2019/08/05 17:04:29 UTC

Re: How to deal with failed Checkpoint? What is current behavior for subsequent checkpoints?

https://s.apache.org/beam-finalizing-bundles should give you a bunch more
details but I replied inline to your questions as well.

On Fri, Jul 19, 2019 at 10:40 AM Ken Barr <Ke...@solace.com> wrote:

> Reading the below two statements I conclude that CheckpointMark.finalize
> Checkpoint() will be called in order, unless there is a failure.
>
Correct.


> What happens in a failure?
>
Since the checkpoint is not finalized, and if it is never successful it is
expected that the source will eventually replay the same output. For
example, you are reading from a message queue and ack the message ids of
read messages during finalizeCheckpoint. If finalizeCheckpoint fails or is
never called, the messages are never acked and it is up to the message
queue to redeliver them.


> What happens to subsequent checkpoints in the case of a checkpoint failure?
>
It is up to the runner to decide to try the checkpoint again or discard it.
Either way the intent is that if the checkpoint never succeeded, the source
will reproduce that output.

How do I prevent event re-ordering in the case of a checkpoint failure?
>
There is no support for ordering in Apache Beam since it is intended to be
processed at scale and hence we want to reduce the coordination required
across multiple machines.

Here are two threads about ordering, some ideas about how it could be
solved and a lot about why it is non trivial to have good performance:
https://lists.apache.org/thread.html/479e090f5a7fe8c66ba88406a61eba2968fb7f3de965451727046a0f@%3Cdev.beam.apache.org%3E
https://lists.apache.org/thread.html/6d843efbc18b226570660b83e96c36c4d38a2c63940a87d1abfaf2f7@%3Cdev.beam.apache.org%3E


>
> There is no rewind method and I cannot find a way to tell the
> UnboundedReader to rewind.   If I look at the pubsub UnboundedSource it
> seems to try and deal with the condition where the finalize itself fails,
> but not a failure elsewhere in the Checkpointing.  The only time a rewind,
> nackAll() is called is when a new reader is created.  If I have an
> UnboundedSource that is capable of rewinding to the last valid checkpoint
> and replaying events, how could I indicate to it to do so in the case of a
> Checkpoint failure that occurs outside the UnboundedSource?
>
>
>
>
> https://beam.apache.org/releases/javadoc/2.8.0/org/apache/beam/sdk/io/UnboundedSource.CheckpointMark.html
>
> “In the absence of failures, all checkpoints will be finalized and they
> will be finalized in the same order they were taken from the
> UnboundedSource.UnboundedReader
> <https://beam.apache.org/releases/javadoc/2.8.0/org/apache/beam/sdk/io/UnboundedSource.UnboundedReader.html>
> .”
>
> “It is possible for a checkpoint to be taken but this method never called.
> This method will never be called if the checkpoint could not be committed,
> and other failures may cause this method to never be called.”
>
>
>