Posted to user@flink.apache.org by "LINZ, Arnaud" <AL...@bouyguestelecom.fr> on 2016/03/14 13:14:28 UTC

TimeWindow not getting last elements any longer with flink 1.0 vs 0.10.1

Hello,

I’ve switched my Flink version from 0.10.1 to 1.0 and I have a regression in some of my unit tests.

To narrow down the problem, here is what I’ve figured out:


- I use a simple streaming application with a source defined as fromElements("Element 1", "Element 2", "Element 3")

- I use a simple time window of 3 seconds: timeWindowAll(Time.seconds(3))

- I use an apply() function that counts the total number of elements I get with a global counter

With the previous version, I got all three elements, not because the window fired within 3 seconds, but because the source ended.
With version 1.0, I don’t get any elements. That is annoying, because the application ends as soon as the source ends, even if I sleep 5 seconds after the execute() method.

(If I replace fromElements with fromCollection on a 10,000-element list, and Time.seconds(3) with Time.milliseconds(1), I get a random number of elements.)

Is this behavior intended? If so, how do I get my last elements now?

Best regards,
Arnaud




________________________________

The integrity of this message cannot be guaranteed on the Internet. The company that sent this message cannot therefore be held liable for its content nor attachments. Any unauthorized use or dissemination is prohibited. If you are not the intended recipient of this message, then please delete it and notify the sender.

RE: TimeWindow not getting last elements any longer with flink 1.0 vs 0.10.1

Posted by "LINZ, Arnaud" <AL...@bouyguestelecom.fr>.

Hi,

All right… I find this new behavior dangerous, since with processing-time windows you will always miss the last elements of any source that does not run forever.
I’ve created a source wrapper that sleeps after emitting the last element, so that unit tests that use processing time work.
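
For illustration only, here is a minimal plain-Java sketch of the wrapper idea. It is not the code from the message: the class name LingeringIterator and the lingerMs parameter are hypothetical, and a plain Iterator stands in for a Flink source, whose wiring is not shown. The wrapper passes elements through unchanged and pauses once, after the last element, before reporting that it is exhausted.

```java
import java.util.Arrays;
import java.util.Iterator;

/**
 * Hypothetical helper: wraps a finite iterator and pauses once after the
 * last element before reporting exhaustion. A stand-in for a source
 * wrapper, not part of any Flink API.
 */
class LingeringIterator<T> implements Iterator<T> {
    private final Iterator<T> delegate;
    private final long lingerMs;      // how long to wait after the last element
    private boolean lingered = false; // ensures the pause happens only once

    LingeringIterator(Iterator<T> delegate, long lingerMs) {
        this.delegate = delegate;
        this.lingerMs = lingerMs;
    }

    @Override
    public boolean hasNext() {
        if (delegate.hasNext()) {
            return true;
        }
        if (!lingered) {
            lingered = true;
            try {
                // Keep the consumer alive so pending timers can still fire.
                Thread.sleep(lingerMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return false;
    }

    @Override
    public T next() {
        return delegate.next();
    }

    public static void main(String[] args) {
        Iterator<String> source = new LingeringIterator<>(
                Arrays.asList("Element 1", "Element 2", "Element 3").iterator(), 100);
        while (source.hasNext()) {
            System.out.println(source.next());
        }
        System.out.println("source finished after lingering");
    }
}
```

The assumption behind this workaround is that a linger time longer than the window size gives a pending processing-time window the chance to fire before the source reports its end.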

Cheers,
Arnaud



Re: TimeWindow not getting last elements any longer with flink 1.0 vs 0.10.1

Posted by Till Rohrmann <tr...@apache.org>.

Hi Arnaud,

With version 1.0, the window-triggering behaviour for finite streams
changed slightly. If you use event time, all unfinished windows are
triggered when your stream ends. The reasoning is that the end of a stream
is equivalent to saying that no more elements will arrive until the
maximum time (infinity) has been reached. This knowledge makes it possible
to emit a Long.MaxValue watermark when an event-time stream is finished,
which triggers all lingering windows.
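
The event-time reasoning above can be sketched with a small toy model in plain Java. This is not Flink's implementation: the class EventTimeWindowModel and its methods are hypothetical names, and the firing rule (a window fires once the watermark reaches its end) is a simplification. It only shows why a final watermark of Long.MAX_VALUE flushes every window that is still open.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of event-time tumbling windows; not Flink's actual implementation. */
class EventTimeWindowModel {
    private final long windowSizeMs;
    // Buffered elements per window, keyed by the window's end timestamp.
    private final TreeMap<Long, List<String>> windows = new TreeMap<>();
    private final List<List<String>> fired = new ArrayList<>();

    EventTimeWindowModel(long windowSizeMs) {
        this.windowSizeMs = windowSizeMs;
    }

    /** Assigns an element to the tumbling window that contains its timestamp. */
    void addElement(String value, long timestampMs) {
        long windowEnd = timestampMs - Math.floorMod(timestampMs, windowSizeMs) + windowSizeMs;
        windows.computeIfAbsent(windowEnd, k -> new ArrayList<>()).add(value);
    }

    /** Fires and removes every window whose end is at or before the watermark. */
    void advanceWatermark(long watermarkMs) {
        Map<Long, List<String>> ready = windows.headMap(watermarkMs, true);
        fired.addAll(ready.values());
        ready.clear(); // the view is backed by the map, so this removes the fired windows
    }

    /** End of a finite stream: no element can arrive any more, so flush everything. */
    void endOfStream() {
        advanceWatermark(Long.MAX_VALUE);
    }

    List<List<String>> firedWindows() {
        return fired;
    }

    public static void main(String[] args) {
        EventTimeWindowModel model = new EventTimeWindowModel(3000);
        model.addElement("Element 1", 0);
        model.addElement("Element 2", 10);
        model.addElement("Element 3", 20);
        model.advanceWatermark(20);
        System.out.println("before end of stream: " + model.firedWindows());
        model.endOfStream();
        System.out.println("after end of stream:  " + model.firedWindows());
    }
}
```

In this model the three elements sit in one 3-second window that the regular watermark never reaches; only the end-of-stream watermark releases them.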

In contrast to event time, you cannot say the same about a finished
processing-time stream. There we have no logical time; we reason about
windows using the actual processing time. When a stream finishes, we
cannot fast-forward the processing time to a point where the windows
would fire. They could fire only if we kept the operators alive until the
wall clock says it is time to fire them. However, no such feature is
implemented in Flink yet.
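
The processing-time case can be sketched the same way. Again this is a toy model in plain Java, not Flink's implementation: the class ProcessingTimeWindowModel is a hypothetical name, wall-clock time is simulated with plain long values so the example is deterministic, and the 3000 ms window matches the 3-second window from the original question. The window can fire only when the clock reaches its end while the operator is still alive.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a single processing-time window; not Flink's actual implementation. */
class ProcessingTimeWindowModel {
    private final long windowEndMs; // simulated wall-clock time at which the window is due
    private final List<String> buffer = new ArrayList<>();
    private final List<String> emitted = new ArrayList<>();
    private boolean closed = false;

    ProcessingTimeWindowModel(long windowEndMs) {
        this.windowEndMs = windowEndMs;
    }

    void addElement(String value) {
        if (!closed) {
            buffer.add(value);
        }
    }

    /** The wall clock has reached nowMs: fire if the window is due and the operator is alive. */
    void onWallClock(long nowMs) {
        if (!closed && nowMs >= windowEndMs) {
            emitted.addAll(buffer);
            buffer.clear();
        }
    }

    /** The operator shuts down: buffered elements that have not fired are dropped. */
    void close() {
        closed = true;
        buffer.clear();
    }

    List<String> emitted() {
        return emitted;
    }

    /** Feeds three elements, lets the clock reach closeAtMs, then shuts the operator down. */
    static List<String> run(long closeAtMs) {
        ProcessingTimeWindowModel window = new ProcessingTimeWindowModel(3000);
        window.addElement("Element 1");
        window.addElement("Element 2");
        window.addElement("Element 3");
        window.onWallClock(closeAtMs);
        window.close();
        return window.emitted();
    }

    public static void main(String[] args) {
        System.out.println("closed right away:  " + run(10));
        System.out.println("kept alive for 3 s: " + run(3000));
    }
}
```

Closing the operator before the window is due loses the buffered elements, while keeping it alive until the wall clock reaches the window end lets all three through, which is the feature described above as missing.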

I hope this helps you to understand the failing test cases.

Cheers,
Till
