You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by Peter Schrott <pe...@bluerootlabs.io> on 2022/08/18 14:45:04 UTC

Overwriting watermarks in DataStream

Hi there,

While still struggling with events and watermarks out of order after sorting with a buffer process function (compare [1]) I tired to solve the issue by assigning a new watermark after the mentioned sorting function.

The Flink docs [2] are not very certain about the impact of assigning additional watermarks downstream: "If the original stream had timestamps and/or watermarks already, the timestamp assigner overwrites them.” 

Does it overwrite the watermark from the point in the stream where its assigned or entirely also upstream?

Thanks in advance
Peter

[1] https://lists.apache.org/thread/wwvpg2qk5v3lb5pxhn4hhkt0xkygg9f3
[2] https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/dev/datastream/event-time/generating_watermarks/#using-watermark-strategies


Re: Overwriting watermarks in DataStream

Posted by Peter Schrott <pe...@bluerootlabs.io>.
Hi David,

Thanks a lot for clarification.

Best, Peter


> On 21. Aug 2022, at 18:36, David Anderson <da...@apache.org> wrote:
> 
> If you have two watermark strategies in your job, the downstream TimestampsAndWatermarksOperator will absorb incoming watermarks and not forward them downstream, but it will have no effect upstream. 
> 
> The only exception to this is that watermarks equal to Long.MAX_VALUE are forwarded downstream, since they are used to signal the end of input.
> 
> David
> 
> [1] https://github.com/apache/flink/blob/release-1.15/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/operators/TimestampsAndWatermarksOperator.java#L120 <https://github.com/apache/flink/blob/release-1.15/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/operators/TimestampsAndWatermarksOperator.java#L120>
> On Thu, Aug 18, 2022 at 8:45 AM Peter Schrott <peter@bluerootlabs.io <ma...@bluerootlabs.io>> wrote:
> Hi there,
> 
> While still struggling with events and watermarks out of order after sorting with a buffer process function (compare [1]) I tired to solve the issue by assigning a new watermark after the mentioned sorting function.
> 
> The Flink docs [2] are not very certain about the impact of assigning additional watermarks downstream: "If the original stream had timestamps and/or watermarks already, the timestamp assigner overwrites them.” 
> 
> Does it overwrite the watermark from the point in the stream where its assigned or entirely also upstream?
> 
> Thanks in advance
> Peter
> 
> [1] https://lists.apache.org/thread/wwvpg2qk5v3lb5pxhn4hhkt0xkygg9f3 <https://lists.apache.org/thread/wwvpg2qk5v3lb5pxhn4hhkt0xkygg9f3>
> [2] https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/dev/datastream/event-time/generating_watermarks/#using-watermark-strategies <https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/dev/datastream/event-time/generating_watermarks/#using-watermark-strategies>
> 


Re: Overwriting watermarks in DataStream

Posted by David Anderson <da...@apache.org>.
If you have two watermark strategies in your job, the
downstream TimestampsAndWatermarksOperator will absorb incoming watermarks
and not forward them downstream, but it will have no effect upstream.

The only exception to this is that watermarks equal to Long.MAX_VALUE are
forwarded downstream, since they are used to signal the end of input.

David

[1]
https://github.com/apache/flink/blob/release-1.15/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/operators/TimestampsAndWatermarksOperator.java#L120

On Thu, Aug 18, 2022 at 8:45 AM Peter Schrott <pe...@bluerootlabs.io> wrote:

> Hi there,
>
> While still struggling with events and watermarks out of order after
> sorting with a buffer process function (compare [1]) I tired to solve the
> issue by assigning a new watermark after the mentioned sorting function.
>
> The Flink docs [2] are not very certain about the impact of assigning
> additional watermarks downstream: "If the original stream had timestamps
> and/or watermarks already, the timestamp assigner overwrites them.”
>
> Does it overwrite the watermark from the point in the stream where its
> assigned or entirely also upstream?
>
> Thanks in advance
> Peter
>
> [1] https://lists.apache.org/thread/wwvpg2qk5v3lb5pxhn4hhkt0xkygg9f3
> [2]
> https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/dev/datastream/event-time/generating_watermarks/#using-watermark-strategies
>
>