Posted to issues@beam.apache.org by "Willi Schinmeyer (Jira)" <ji...@apache.org> on 2021/11/10 15:33:00 UTC
[jira] [Created] (BEAM-13217) TypeCheckError due to CoGroupByKey output mis-deduction
Willi Schinmeyer created BEAM-13217:
---------------------------------------
Summary: TypeCheckError due to CoGroupByKey output mis-deduction
Key: BEAM-13217
URL: https://issues.apache.org/jira/browse/BEAM-13217
Project: Beam
Issue Type: Bug
Components: sdk-py-core
Affects Versions: 2.33.0
Reporter: Willi Schinmeyer
After upgrading our Python project from 2.31.0 to 2.33.0, we started getting TypeCheckErrors such as
{quote}apache_beam.typehints.decorators.TypeCheckError: Type hint violation for 'all_data/combine_new_and_all': requires {{Tuple[Tuple[Any, Any], Dict[str, Iterable[_CombinedEntry]]]}} but got {{Tuple[Tuple[int, int], Dict[str, List[Union[]]]]}} for element
{quote}
where the output value of a {{CoGroupByKey()}} is apparently incorrectly deduced to be a {{Dict[str, List[Union[]]]}}.
I managed to build a small repro case:
{code:python}
import apache_beam as beam
from typing import Dict, Iterable, Tuple
{
    "foo": [(42, "foo")],
    "bar": [(42, "bar")],
} | beam.CoGroupByKey().with_output_types(Tuple[int, Dict[str, Iterable[str]]])
{code}
which raises
{quote}apache_beam.typehints.decorators.TypeCheckError: Output type hint violation at CoGroupByKey: expected {{Tuple[int, Dict[str, Iterable[str]]]}}, got {{Tuple[int, Dict[str, List[Union[]]]]}}
{quote}
or alternatively, using a TestPipeline:
{code:python}
import apache_beam as beam
from apache_beam.testing.test_pipeline import TestPipeline
from apache_beam.testing.util import assert_that, equal_to
from typing import Dict, Iterable, Tuple
with TestPipeline() as p:
    actual = {
        "foo": p | "create_foo" >> beam.Create([(42, "foo")]),
        "bar": p | "create_bar" >> beam.Create([(42, "bar")]),
    } | beam.CoGroupByKey().with_output_types(Tuple[int, Dict[str, Iterable[str]]])
    assert_that(actual, equal_to([(42, {"foo": ["foo"], "bar": ["bar"]})]))
{code}
One more thing, regarding the {{Tuple[Any, Any]}} in the original error message above: we can reproduce the {{Any}} like this:
{code:python}
import apache_beam as beam
from typing import Dict, Iterable, NewType, Tuple
key = NewType("key", int)
{
    "foo": [(key(1337), "foo")],
    "bar": [(key(1337), "bar")],
} | beam.CoGroupByKey().with_output_types(Tuple[key, Dict[str, Iterable[str]]])
{code}
which raises
{quote}apache_beam.typehints.decorators.TypeCheckError: Output type hint violation at CoGroupByKey: expected {{Tuple[Any, Dict[str, Iterable[str]]]}}, got {{Tuple[int, Dict[str, List[Union[]]]]}}
{quote}
It looks like {{NewType}} is treated as {{Any}}? That surprised me.
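As a side note on the {{int}} that appears on the "got" side of that message: the following is a minimal sketch, using only the standard library and no Beam code, of how {{NewType}} behaves at runtime. It reuses the {{key}} alias from the repro above. It does not show how Beam converts the declared hint to {{Any}}; it only illustrates that the elements themselves are plain {{int}} values, which may be why the inferred type contains {{int}} rather than {{key}}.

```python
from typing import NewType

# Same alias as in the repro above.
key = NewType("key", int)

value = key(1337)

# At runtime NewType acts as an identity function: no wrapper object
# is created, so the value is an ordinary int.
runtime_type = type(value)

# The underlying type is still recorded on the NewType itself.
declared_supertype = key.__supertype__

print(runtime_type, declared_supertype)
```

Both printed types are {{int}}, so any check based on the runtime type of an element cannot distinguish a {{key}} from a plain {{int}}.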