You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@beam.apache.org by "Tang Jijun (上海_技术部_数据平台_唐觊隽)" <ta...@yhd.com> on 2017/05/08 07:08:03 UTC

Learn testLateDataAccumulating

I am looking the testLateDataAccumulating method in CreateStreamTest.But I can’t unstand the code reded .

    PCollection<Integer> windowed = p
        .apply(source)
        .apply(Window.<Integer>into(FixedWindows.of(Duration.standardMinutes(5))).triggering(
            AfterWatermark.pastEndOfWindow()
                .withEarlyFirings(AfterProcessingTime.pastFirstElementInPane()
                    .plusDelayOf(Duration.standardMinutes(2)))
                .withLateFirings(AfterPane.elementCountAtLeast(1)))
            .accumulatingFiredPanes()
            .withAllowedLateness(Duration.standardMinutes(5), Window.ClosingBehavior.FIRE_ALWAYS));

Can anyone helps unstanding these triggers?

Re: Learn testLateDataAccumulating

Posted by Lukasz Cwik <lc...@google.com>.
There are 3 groupings of firings, before the watermark has passed the end
of the window, when the watermark reaches the end of the window, and after
the watermark has passed the end of the window.
They typically represent before and after watermark represent speculative
and late data respectively.

AfterWatermark.pastEndOfWindow()
Produce a pane of output once the watermark is past the end of the window

.withEarlyFirings(AfterProcessingTime.pastFirstElementInPane().plusDelayOf(Duration.standardMinutes(2)))
Produce speculative pane of output once you see an element and at least 2
mins have passed, and then repeat

.withLateFirings(AfterPane.elementCountAtLeast(1)))
Produce late pane of output after you have seen at least one element, and
then repeat

.accumulatingFiredPanes() tells us that we want to accumulate the results
of each pane, so all future outputs will contain all prior output data.

.withAllowedLateness(Duration.standardMinutes(5),
Window.ClosingBehavior.FIRE_ALWAYS)); tells us that after the watermark has
past the end of window by 5 mins, we can start dropping data

These additional links may be of use as well:
https://beam.apache.org/documentation/programming-guide/#triggers
https://beam.apache.org/get-started/mobile-gaming-example/
https://beam.apache.org/documentation/resources/#technical-details


On Mon, May 8, 2017 at 12:08 AM, Tang Jijun(上海_技术部_数据平台_唐觊隽) <
tangjijun@yhd.com> wrote:

> I am looking the testLateDataAccumulating method in CreateStreamTest.But I
> can’t unstand the code reded .
>
>
>
>     PCollection<Integer> windowed = p
>
>         .apply(source)
>
>         .apply(Window.<Integer>into(FixedWindows.of(Duration.
> standardMinutes(5))).triggering(
>
>             AfterWatermark.pastEndOfWindow()
>
>                 .withEarlyFirings(AfterProcessingTime.
> pastFirstElementInPane()
>
>                     .plusDelayOf(Duration.standardMinutes(2)))
>
>                 .withLateFirings(AfterPane.elementCountAtLeast(1)))
>
>             .accumulatingFiredPanes()
>
>             .withAllowedLateness(Duration.standardMinutes(5),
> Window.ClosingBehavior.FIRE_ALWAYS));
>
>
>
> Can anyone helps unstanding these triggers?
>