Posted to commits@beam.apache.org by "Ismaël Mejía (JIRA)" <ji...@apache.org> on 2017/06/05 14:30:04 UTC

[jira] [Comment Edited] (BEAM-2409) Spark runner produces exactly twice the number of results in streaming mode when use triggers to re-window results on global window.

    [ https://issues.apache.org/jira/browse/BEAM-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037035#comment-16037035 ] 

Ismaël Mejía edited comment on BEAM-2409 at 6/5/17 2:29 PM:
------------------------------------------------------------

Not sure whether it is related, but a comment by [~baibaichen] on Beam's Slack channel seems relevant:

??I noticed that LateDataUtils.dropExpiredWindows() returns a FluentIterable, and the result is passed to ReduceFnRunner.processElements(). This iterable will be iterated at least twice, so it would be better to use FluentIterable#toList to avoid the extra iteration.??
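
To illustrate the concern in that comment, here is a minimal sketch of why traversing a lazily filtered view twice repeats the filtering work, while copying it into a list first does not. It uses only the JDK: lazyFilter and predicateCalls are hypothetical names made up for this example, and the lazy view is a stand-in for Guava's FluentIterable, not Beam's actual code. It only shows the extra iteration cost; whether that is connected to the duplicated results in this issue is not confirmed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

public class LazyIterationDemo {

  /** Hypothetical stand-in for a lazily filtered view: the predicate reruns on every traversal. */
  public static <T> Iterable<T> lazyFilter(List<T> source, Predicate<T> keep) {
    return () -> source.stream().filter(keep).iterator();
  }

  /** Returns how often the predicate ran when the filtered result is traversed twice. */
  public static int predicateCalls(boolean materializeFirst) {
    AtomicInteger calls = new AtomicInteger();
    List<Integer> input = Arrays.asList(1, 2, 3, 4);
    Iterable<Integer> filtered = lazyFilter(input, x -> {
      calls.incrementAndGet();
      return x % 2 == 0;
    });

    if (materializeFirst) {
      // Copy the view into a list once, analogous to calling FluentIterable#toList.
      List<Integer> copy = new ArrayList<>();
      filtered.forEach(copy::add);
      filtered = copy;
    }

    // Two consumers of the same result, e.g. one pass to inspect and one to process.
    for (int pass = 0; pass < 2; pass++) {
      for (Integer ignored : filtered) {
        // Traversal only; the element itself is not needed here.
      }
    }
    return calls.get();
  }

  public static void main(String[] args) {
    System.out.println("lazy view, two passes: " + predicateCalls(false));
    System.out.println("materialized once:     " + predicateCalls(true));
  }
}
```

With four input elements, the lazy view evaluates the predicate on each of the two passes, while the materialized copy evaluates it only during the single copy step.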



> Spark runner produces exactly twice the number of results in streaming mode when use triggers to re-window results on global window.
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-2409
>                 URL: https://issues.apache.org/jira/browse/BEAM-2409
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-spark
>    Affects Versions: 2.0.0
>            Reporter: Ismaël Mejía
>            Assignee: Aviem Zur
>
> This can be tested with Nexmark query 6. Sorry I don’t have a smaller test case than this, but I think the part of the pipeline that produces the result is this one.
> {code:java}
>         .apply(
>             Window.<KV<Long, Bid>>into(new GlobalWindows())
>                 .triggering(Repeatedly.forever(AfterPane.elementCountAtLeast(1)))
>                 .accumulatingFiredPanes()
>                 .withAllowedLateness(Duration.ZERO))
> {code}


