Posted to issues@beam.apache.org by "Zachary Houfek (Jira)" <ji...@apache.org> on 2021/10/06 18:57:00 UTC

[jira] [Comment Edited] (BEAM-9487) GBKs on unbounded pcolls with global windows and no triggers should fail

    [ https://issues.apache.org/jira/browse/BEAM-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17425155#comment-17425155 ] 

Zachary Houfek edited comment on BEAM-9487 at 10/6/21, 6:56 PM:
----------------------------------------------------------------

Reopening until [PR15603|https://github.com/apache/beam/pull/15603] is merged.


was (Author: zhoufek):
Reopening until PR15603 is merged.

> GBKs on unbounded pcolls with global windows and no triggers should fail
> ------------------------------------------------------------------------
>
>                 Key: BEAM-9487
>                 URL: https://issues.apache.org/jira/browse/BEAM-9487
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>            Reporter: Udi Meiri
>            Assignee: Zachary Houfek
>            Priority: P1
>              Labels: EaseOfUse, starter
>             Fix For: 2.31.0
>
>          Time Spent: 33h 50m
>  Remaining Estimate: 0h
>
> This is the behavior described in https://beam.apache.org/documentation/programming-guide/ under "4.2.2.1 GroupByKey and unbounded PCollections":
> bq. If you do apply GroupByKey or CoGroupByKey to a group of unbounded PCollections without setting either a non-global windowing strategy, a trigger strategy, or both for each collection, Beam generates an IllegalStateException error at pipeline construction time.
> An example where the Python SDK does not raise this error: https://stackoverflow.com/questions/60623246/merge-pcollection-with-apache-beam
> I also believe that this unit test should fail, since test_stream is unbounded, uses the global window, and has no triggers.
> {code}
>   def test_global_window_gbk_fail(self):
>     with TestPipeline() as p:
>       test_stream = TestStream()
>       _ = p | test_stream | GroupByKey()
> {code}
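
The rule quoted from the programming guide can be sketched in plain Python. This is only an illustration of the condition the issue asks the SDK to enforce, not Beam's actual implementation: PCollectionInfo and validate_group_by_key are hypothetical names, and ValueError stands in for whatever exception the SDK would raise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PCollectionInfo:
    """Hypothetical stand-in for the properties the quoted rule depends on."""
    is_bounded: bool
    uses_global_window: bool
    has_non_default_trigger: bool


def validate_group_by_key(pcoll: PCollectionInfo) -> None:
    """Raise if the quoted rule forbids GroupByKey on this input.

    Assumption: this mirrors the programming guide's wording only; it is
    not how the Beam SDK performs the check.
    """
    if pcoll.is_bounded:
        return  # The rule only concerns unbounded PCollections.
    if pcoll.uses_global_window and not pcoll.has_non_default_trigger:
        raise ValueError(
            "GroupByKey on an unbounded PCollection requires a non-global "
            "windowing strategy, a trigger, or both.")


# The case from the unit test above: unbounded, global window, no trigger.
test_stream_like = PCollectionInfo(
    is_bounded=False, uses_global_window=True, has_non_default_trigger=False)
try:
    validate_group_by_key(test_stream_like)
    rejected = False
except ValueError:
    rejected = True
```

Under this sketch, the test_stream case is rejected, while the same input with either a non-global window or a non-default trigger passes the check.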


