Posted to issues@beam.apache.org by "Jonathan Hourany (Jira)" <ji...@apache.org> on 2021/09/16 01:00:03 UTC
[jira] [Comment Edited] (BEAM-12803) SqlTransform doesn't work on python 3.9
[ https://issues.apache.org/jira/browse/BEAM-12803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17415818#comment-17415818 ]
Jonathan Hourany edited comment on BEAM-12803 at 9/16/21, 12:59 AM:
--------------------------------------------------------------------
I just got bitten by this too. I dug in to see if I could figure out what the problem was, and I think I found it. The problematic line might be [line #160 in typehints.schemas|https://github.com/apache/beam/blob/3a3933caa1cdb11e6584bce87f8494979ab95cb1/sdks/python/apache_beam/typehints/schemas.py#L160]. The line calls
{code:python}
name=name, type=typing_to_runner_api(type_._field_types[name]))
{code}
And it might be the access to {{_field_types}} that's the culprit. According to [What’s New In Python 3.9|https://docs.python.org/3/whatsnew/3.9.html]:
{quote}The _field_types attribute of the typing.NamedTuple class has been removed. It was deprecated since Python 3.8. Use the __annotations__ attribute instead. (Contributed by Serhiy Storchaka in bpo-40182.)
{quote}
The fix may be as easy as swapping {{_field_types}} out for {{__annotations__}}. I'll see if I can test this out when I get some downtime.
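Here is a minimal sketch of the idea, not Beam's actual code. The helper name {{named_tuple_field_types}} and the {{Practice}} class are hypothetical, made up for illustration. It reads a NamedTuple's field types from {{__annotations__}} and only falls back to {{_field_types}} if that attribute is missing, which I'd only expect on old interpreters.

```python
import typing


class Practice(typing.NamedTuple):
    # Hypothetical schema type, used only for this illustration.
    id: str
    count: int


def named_tuple_field_types(type_):
    """Return a {field name: type} dict for a typing.NamedTuple class."""
    annotations = getattr(type_, "__annotations__", None)
    if annotations is None:
        # Fallback for old interpreters; _field_types was removed in Python 3.9.
        annotations = type_._field_types
    return dict(annotations)


field_types = named_tuple_field_types(Practice)
for name in Practice._fields:
    print(name, field_types[name])
```

If something like this holds up, the lookup in schemas.py could presumably index the returned dict by field name the same way it indexes {{_field_types}} today.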
> SqlTransform doesn't work on python 3.9
> ---------------------------------------
>
> Key: BEAM-12803
> URL: https://issues.apache.org/jira/browse/BEAM-12803
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: sean teeling
> Assignee: Brian Hulette
> Priority: P2
>
> Working example below -(Is there no way to paste pre-formatted code into jira?!)- (EDIT: I added the appropriate "code" block)
> {code:python}
> import itertools
> import csv
> import io
>
> import apache_beam as beam
> from apache_beam.dataframe.io import read_csv
> from apache_beam.transforms.sql import SqlTransform
>
>
> def parse_csv(val):
>     # Lower-case the header row so the DictReader keys are normalized.
>     def lower_headers(iterator):
>         return itertools.chain([next(iterator).lower()], iterator)
>     return csv.DictReader(lower_headers(io.TextIOWrapper(val.open())))
>
>
> class BeamTransformBuilder():
>     def build(self, pipeline):
>         practices = (
>             pipeline
>             | beam.io.fileio.MatchFiles("data.csv")
>             | beam.io.fileio.ReadMatches()
>             | beam.Reshuffle()
>             | beam.FlatMap(parse_csv)
>             | beam.Map(lambda x: beam.Row(id="test-id"))
>             | SqlTransform("""
>                 SELECT
>                 id
>                 FROM PCOLLECTION""")
>         )
>         practices | beam.Map(print)
>
>
> def main():
>     builder = BeamTransformBuilder()
>     with beam.Pipeline('DirectRunner') as p:
>         builder.build(p)
>
>
> if __name__ == '__main__':
>     main()
> {code}
>
> Results in the error:
>
> {code:java}
> File "/usr/local/lib/python3.9/site-packages/apache_beam/typehints/schemas.py", line 185, in typing_to_runner_api
> element_type = typing_to_runner_api(_get_args(type_)[0])
> IndexError: tuple index out of range
> {code}
>
>
> Tested on Python 3.9.6.
>
> Annoyingly, it is difficult to test this on other Python versions. There's no documentation on how to set up a Docker container that runs a pipeline locally with DirectRunner, and there's barely any documentation on which Python versions are supported. Using pyenv and {{pip install apache-beam}} requires a lot of other downloads, which conflict when other versions are already installed.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)