You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Kostas Kloudas (JIRA)" <ji...@apache.org> on 2018/03/16 09:13:00 UTC

[jira] [Commented] (FLINK-5601) Window operator does not checkpoint watermarks

    [ https://issues.apache.org/jira/browse/FLINK-5601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16401627#comment-16401627 ] 

Kostas Kloudas commented on FLINK-5601:
---------------------------------------

This is an important and known issue.

It was also encountered during the implementation of the triggerDSL (which was not finally merged).

Back then we did not have union state. I think that now, the TimestampsAndPeriodicWatermarksOperator or the source (whoever has the watermark emitter) should checkpoint its watermark in a union state, and upon restoring/rescaling, its task should take its previously checkpointed watermark, and in case of scaling up, the new tasks should take the minimum of the previously checkpointed watermarks.

 

What do you think?

> Window operator does not checkpoint watermarks
> ----------------------------------------------
>
>                 Key: FLINK-5601
>                 URL: https://issues.apache.org/jira/browse/FLINK-5601
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>            Reporter: Ufuk Celebi
>            Priority: Major
>
> During release testing [~stefanrichter83@gmail.com] and I noticed that watermarks are not checkpointed in the window operator.
> This can lead to non determinism when restoring checkpoints. I was running an adjusted {{SessionWindowITCase}} via Kafka for testing migration and rescaling and ran into failures, because the data generator required determinisitic behaviour.
> What happened was that on restore it could happen that late elements were not dropped, because the watermarks needed to be re-established after restore first.
> [~aljoscha] Do you know whether there is a special reason for explicitly not checkpointing watermarks?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)