Posted to user@flink.apache.org by Mazen Ezzeddine <ma...@etu.unice.fr> on 2020/09/16 08:07:26 UTC

Connecting two streams and order of their processing

Hello all,

If an event is available now to a Flink keyed co-process operator
(KeyedCoProcessFunction), and another event with the same key will become
available to that operator 1 minute later, as a result of connecting the two
streams, is it accurate that Flink does not provide any guarantee that the
event available now will be processed (processElement1) before the event
available 1 minute later (processElement2)?

And if that is the case, why would Flink behave that way? It seems
counterintuitive. Are there technical limitations that force this
out-of-order scenario?

Thanks again.



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Connecting two streams and order of their processing

Posted by David Anderson <da...@alpinegizmo.com>.
The details of what can go wrong will vary depending on the precise
scenario, but no, Flink is unable to provide any such guarantee. Doing so
would require being able to control the scheduling of various threads
running on different machines, which isn't possible.

Of course, if event A becomes available to consume from one Kafka topic a
minute before an event B becomes available on some other topic (for
example), it's very unlikely that event A will experience so much latency
that a keyedcoprocess operator will receive B before A. But nothing
guarantees this is impossible. Lots of things can cause processing hiccups:
network contention, garbage collection, CPU load, redeployments, etc.

Regards,
David


Re: Connecting two streams and order of their processing

Posted by Jaswin Shah <ja...@outlook.com>.
With keyed dual-stream processing, events with the same key from both streams are routed to the same partition, so processElement1 and processElement2 for that key run in the same operator instance. However, when you receive an event in processElement1, you should store it in Flink's state, so that if the matching event arrives late in processElement2, you can still do your computation or join the two events on the same key (and vice versa if the second stream's event arrives first). In this way, the operator can tolerate even a large delay between the two events.
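
As a rough illustration of that buffering idea, here is a minimal plain-Java sketch. It deliberately does not use the Flink API: the class name OrderAgnosticJoin, the two HashMaps standing in for keyed state, and the String event type are all assumptions made for the example. In a real job the same logic would presumably live in a KeyedCoProcessFunction with the buffered events held in keyed state (for example ValueState), and you would likely also want timers to clear state for events whose partner never arrives.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a keyed two-input operator; not a Flink class.
public class OrderAgnosticJoin {
    // Stand-ins for Flink keyed state: at most one buffered event per key and input.
    private final Map<String, String> pendingLeft = new HashMap<>();
    private final Map<String, String> pendingRight = new HashMap<>();
    private final List<String> output = new ArrayList<>();

    // Plays the role of processElement1: an event from the first stream.
    public void processElement1(String key, String left) {
        String right = pendingRight.remove(key);
        if (right != null) {
            // The partner from stream 2 arrived earlier: join now.
            output.add(key + ":" + left + "+" + right);
        } else {
            // Partner not seen yet: buffer and wait.
            pendingLeft.put(key, left);
        }
    }

    // Plays the role of processElement2: an event from the second stream.
    public void processElement2(String key, String right) {
        String left = pendingLeft.remove(key);
        if (left != null) {
            output.add(key + ":" + left + "+" + right);
        } else {
            pendingRight.put(key, right);
        }
    }

    public List<String> getOutput() {
        return output;
    }

    public static void main(String[] args) {
        OrderAgnosticJoin join = new OrderAgnosticJoin();
        // Key k1: stream 1 first, then stream 2.
        join.processElement1("k1", "A");
        join.processElement2("k1", "B");
        // Key k2: the reverse arrival order gives the same joined result.
        join.processElement2("k2", "B");
        join.processElement1("k2", "A");
        System.out.println(join.getOutput()); // prints [k1:A+B, k2:A+B]
    }
}
```

The point of the design is that neither method assumes it is called first: whichever event arrives first is buffered, and whichever arrives second completes the join, so the result does not depend on the order in which the two inputs are processed.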