You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@beam.apache.org by Demin Alexey <di...@gmail.com> on 2016/11/06 12:31:58 UTC
Timer and Window behavior
Hi
I read Unbound stream (read from kafka) and grouped by value,
but on low-throughput streams I have strange behavior:
stream.apply(Window.into(FixedWindows.of(Duration.millis(10)))).apply(GroupByKey.create())
or
stream.apply(
Window.into(FixedWindows.of(Duration.millis(10)))
.triggering(Repeatedly.forever(AfterWatermark.pastEndOfWindow()))
.apply(GroupByKey.create())
1ms event1
3ms event2
11ms event3 - (triger window)
12ms event4
13ms event5
21ms event6 - (triger window)
22ms event7
23ms event8
<nothing next 5 min>
5m00ms event9
As result event7 and event8 stay in windows without processing next 5 min
Window and GroupBy will create only on event9
Behavior can reproduce on DirectRunner and FlinkRunner.
This is bug or incorrect using API from my side?
Re: Timer and Window behavior
Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
Hi Demin,
I remember to have seen an improvement about watermark in KafkaIO
(BEAM-591).
I advise you to take a look there.
Regards
JB
On 11/06/2016 01:31 PM, Demin Alexey wrote:
> Hi
>
> I read Unbound stream (read from kafka) and grouped by value,
> but on low-throughput streams I have strange behavior:
>
> stream.apply(Window.into(FixedWindows.of(Duration.millis(10)))).apply(GroupByKey.create())
> or
> stream.apply(
> Window.into(FixedWindows.of(Duration.millis(10)))
>
> .triggering(Repeatedly.forever(AfterWatermark.pastEndOfWindow()))
> .apply(GroupByKey.create())
>
> 1ms event1
> 3ms event2
> 11ms event3 - (triger window)
> 12ms event4
> 13ms event5
> 21ms event6 - (triger window)
> 22ms event7
> 23ms event8
>
> <nothing next 5 min>
>
> 5m00ms event9
>
> As result event7 and event8 stay in windows without processing next 5 min
> Window and GroupBy will create only on event9
>
> Behavior can reproduce on DirectRunner and FlinkRunner.
>
> This is bug or incorrect using API from my side?
>
--
Jean-Baptiste Onofr�
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com
Re: Timer and Window behavior
Posted by Raghu Angadi <ra...@google.com.INVALID>.
On Sun, Nov 6, 2016 at 4:31 AM, Demin Alexey <di...@gmail.com> wrote:
> This is bug or incorrect using API from my side?
This is a bug in KafkaIO. It should advance the watermark when there are no
messages to read. https://issues.apache.org/jira/browse/BEAM-591
I want to fix it, but may not get to it until Thanksgiving, will see.