Posted to issues@beam.apache.org by "Beam JIRA Bot (Jira)" <ji...@apache.org> on 2022/03/15 17:26:00 UTC

[jira] [Commented] (BEAM-13217) TypeCheckError due to CoGroupByKey output mis-deduction

    [ https://issues.apache.org/jira/browse/BEAM-13217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17507095#comment-17507095 ] 

Beam JIRA Bot commented on BEAM-13217:
--------------------------------------

This issue is P2 but has been unassigned without any comment for 60 days so it has been labeled "stale-P2". If this issue is still affecting you, we care! Please comment and remove the label. Otherwise, in 14 days the issue will be moved to P3.

Please see https://beam.apache.org/contribute/jira-priorities/ for a detailed explanation of what these priorities mean.


> TypeCheckError due to CoGroupByKey output mis-deduction
> -------------------------------------------------------
>
>                 Key: BEAM-13217
>                 URL: https://issues.apache.org/jira/browse/BEAM-13217
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>    Affects Versions: 2.32.0, 2.33.0, 2.34.0, 2.35.0
>            Reporter: Willi Schinmeyer
>            Priority: P2
>              Labels: stale-P2
>
> After upgrading our Python project from 2.31.0 to 2.33.0, we started getting TypeCheckErrors such as
> {quote}apache_beam.typehints.decorators.TypeCheckError: Type hint violation for 'all_data/combine_new_and_all': requires {{Tuple[Tuple[Any, Any], Dict[str, Iterable[_CombinedEntry]]]}} but got {{Tuple[Tuple[int, int], Dict[str, List[Union[]]]]}} for element
> {quote}
> where the output value of a {{CoGroupByKey()}} is apparently incorrectly deduced to be a {{Dict[str, List[Union[]]]}}.
> I managed to build a small repro case:
> {code:python}
> import apache_beam as beam
> from typing import Dict, Iterable, Tuple
>
> # Two inputs sharing the key 42, grouped with an explicit output type hint.
> {
>     "foo": [(42, "foo")],
>     "bar": [(42, "bar")],
> } | beam.CoGroupByKey().with_output_types(Tuple[int, Dict[str, Iterable[str]]])
> {code}
> which raises
> {quote}apache_beam.typehints.decorators.TypeCheckError: Output type hint violation at CoGroupByKey: expected {{Tuple[int, Dict[str, Iterable[str]]]}}, got {{Tuple[int, Dict[str, List[Union[]]]]}}
> {quote}
> or alternatively, using a TestPipeline:
> {code:python}
> import apache_beam as beam
> from apache_beam.testing.test_pipeline import TestPipeline
> from apache_beam.testing.util import assert_that, equal_to
> from typing import Dict, Iterable, Tuple
> with TestPipeline() as p:
>     actual = {
>         "foo": p | "create_foo" >> beam.Create([(42, "foo")]),
>         "bar": p | "create_bar" >> beam.Create([(42, "bar")]),
>     } | beam.CoGroupByKey().with_output_types(Tuple[int, Dict[str, Iterable[str]]])
>     assert_that(actual, equal_to([(42, {"foo": ["foo"], "bar": ["bar"]})]))
> {code}
> The {{Any}} entries in the {{Tuple[Any, Any]}} from the original error message can be reproduced separately, using a {{NewType}} key:
> {code:python}
> import apache_beam as beam
> from typing import Dict, Iterable, NewType, Tuple
> key = NewType("key", int)
> {
>     "foo": [(key(1337), "foo")],
>     "bar": [(key(1337), "bar")],
> } | beam.CoGroupByKey().with_output_types(Tuple[key, Dict[str, Iterable[str]]])
> {code}
> which raises
> {quote}apache_beam.typehints.decorators.TypeCheckError: Output type hint violation at CoGroupByKey: expected {{Tuple[Any, Dict[str, Iterable[str]]]}}, got {{Tuple[int, Dict[str, List[Union[]]]]}}
> {quote}
> {quote}
> It looks like {{NewType}} is treated as {{Any}}? That surprised me.
> I could also reproduce the issue in 2.32.0.
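
As background for the {{NewType}} observation in the report, here is a minimal sketch of how {{typing.NewType}} behaves at runtime in plain Python, independent of Beam. It reuses the {{key}} alias from the report; the variable names {{value}}, {{runtime_type}} and {{supertype}} are illustrative only. It shows that a value built with {{key(...)}} is an ordinary {{int}}, which matches the {{int}} on the "got" side of the error. It does not show why Beam reports {{Any}} on the "expected" side; that depends on SDK internals not described in this thread.

```python
from typing import NewType

# Same alias as in the report.
key = NewType("key", int)

# Calling a NewType returns its argument unchanged; no wrapper object is created.
value = key(1337)

# At runtime the value is a plain int.
runtime_type = type(value)

# The underlying type is recorded on the alias itself.
supertype = key.__supertype__
```

Because the distinction between {{key}} and {{int}} exists only for static type checkers, any runtime inspection of the elements can only ever see {{int}}.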



--
This message was sent by Atlassian Jira
(v8.20.1#820001)