You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Kostas Kloudas (JIRA)" <ji...@apache.org> on 2019/01/15 13:33:00 UTC

[jira] [Closed] (FLINK-11337) Incorrect watermark in StreaminFileSink BucketAssigner.Context when used in connected stream

     [ https://issues.apache.org/jira/browse/FLINK-11337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kostas Kloudas closed FLINK-11337.
----------------------------------
    Resolution: Won't Fix

> Incorrect watermark in StreaminFileSink BucketAssigner.Context when used in connected stream
> --------------------------------------------------------------------------------------------
>
>                 Key: FLINK-11337
>                 URL: https://issues.apache.org/jira/browse/FLINK-11337
>             Project: Flink
>          Issue Type: Bug
>          Components: filesystem-connector
>    Affects Versions: 1.7.0
>            Reporter: Edward Rojas
>            Priority: Major
>
> When StreamingFileSink is used as sink of a connected stream the "invoke" method of the sink could be called before the "combinedWatermark" is updated with the timestamp of the element currently being processed, resulting on the context containing the incorrect watermark value (the Long.MIN_VALUE when using "AssignerWithPeriodicWatermarks" for the firsts events in the stream). 
> I reproduce this when using a broadcast stream connected to a data stream. The broadcast stream is using a custom timestamp extractor that always return the Watermark.MAX_VALUE as it's done in a trining example here: [https://github.com/dataArtisans/flink-training-exercises/blob/master/src/main/java/com/dataartisans/flinktraining/solutions/datastream_java/broadcast/OngoingRidesSolution.java#L143.]
> This is problematic as the watermark could not be used reliably to compute the bucket id based on event time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)