You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by Wojciech Indyk <wo...@gmail.com> on 2019/10/24 11:03:09 UTC

Guarantee of event-time order in FlinkKafkaConsumer

Hi!
I use Flink 1.8.0 with Kafka 2.2.1. I need to guarantee of correct order of
events by event timestamp. I generate periodic watermarks every 1s. I use
FlinkKafkaConsumer with AscendingTimestampExtractor.
The code (and the same question) is here:
https://stackoverflow.com/questions/58539379/guarantee-of-event-time-order-in-flinkkafkaconsumer

I realized, that for unordered events, that came in the same ms or a few ms
later, the order is not corrected by Flink. What I found in the docs: "the
watermark triggers computation of all windows where the maximum timestamp
(which is end-timestamp - 1) is smaller than the new watermark", so I added
a step of timeWindowAll with size of 100ms and inside that window I sort
messages by the event timestamp. It works, but I find this solution ugly
and it looks like a workaround. I am also concerned about per-partition
watermarks of KafkaSource.

Ideally I would like to put the guarantee of order in the KafkaSource and
keep it for each kafka partition, like per-partition watermarks. Is it
possible to do so? What is the current best solution for guarantee the
event-time order of events in Flink?

--
Kind regards/ Pozdrawiam,
Wojciech Indyk

Re: Guarantee of event-time order in FlinkKafkaConsumer

Posted by Fabian Hueske <fh...@gmail.com>.
Hi Wojciech,

I posted an answer on StackOverflow.

Best, Fabian

Am Do., 24. Okt. 2019 um 13:03 Uhr schrieb Wojciech Indyk <
wojciechindyk@gmail.com>:

> Hi!
> I use Flink 1.8.0 with Kafka 2.2.1. I need to guarantee of correct order
> of events by event timestamp. I generate periodic watermarks every 1s. I
> use FlinkKafkaConsumer with AscendingTimestampExtractor.
> The code (and the same question) is here:
> https://stackoverflow.com/questions/58539379/guarantee-of-event-time-order-in-flinkkafkaconsumer
>
> I realized, that for unordered events, that came in the same ms or a few
> ms later, the order is not corrected by Flink. What I found in the docs:
> "the watermark triggers computation of all windows where the maximum
> timestamp (which is end-timestamp - 1) is smaller than the new watermark",
> so I added a step of timeWindowAll with size of 100ms and inside that
> window I sort messages by the event timestamp. It works, but I find this
> solution ugly and it looks like a workaround. I am also concerned about
> per-partition watermarks of KafkaSource.
>
> Ideally I would like to put the guarantee of order in the KafkaSource and
> keep it for each kafka partition, like per-partition watermarks. Is it
> possible to do so? What is the current best solution for guarantee the
> event-time order of events in Flink?
>
> --
> Kind regards/ Pozdrawiam,
> Wojciech Indyk
>