Posted to issues@beam.apache.org by "Rishi (Jira)" <ji...@apache.org> on 2022/03/30 21:11:00 UTC

[jira] [Updated] (BEAM-12512) Parquetio.py throws "ValueError: invalid literal" when ARROW_MAJOR_VERSION contains alpha-numeric

     [ https://issues.apache.org/jira/browse/BEAM-12512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rishi updated BEAM-12512:
-------------------------
    Fix Version/s: 2.37.0
       Resolution: Resolved
           Status: Resolved  (was: Open)

> Parquetio.py throws "ValueError: invalid literal" when ARROW_MAJOR_VERSION contains alpha-numeric
> -------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-12512
>                 URL: https://issues.apache.org/jira/browse/BEAM-12512
>             Project: Beam
>          Issue Type: Bug
>          Components: io-py-parquet
>         Environment: Ubuntu 18.04
>            Reporter: Rishi
>            Priority: P3
>             Fix For: 2.37.0
>
>
> When Apache Arrow is built from a Git branch, the resulting version string looks like this:
> /==================/
> /home/arrow/python# python3 setup.py --version
>  *2.0.0.dev0+g478286658.d20210618*
> /==================/
>  This causes an exception in apache_beam at the following [line|https://github.com/apache/beam/blob/9af555d9ccdb0d7a378dbea456cdeefe2e781d6d/sdks/python/apache_beam/io/parquetio.py#L53] because the generated version string contains non-numeric components:
> /==================/
>  # python3
>  Python 3.6.9 (default, Jan 26 2021, 15:33:00)
>  [GCC 8.4.0] on linux
>  Type "help", "copyright", "credits" or "license" for more information.
>  >>> *import apache_beam as beam*
>  Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
>  File "/usr/local/lib/python3.6/dist-packages/apache_beam-2.30.0-py3.6-linux-x86_64.egg/apache_beam/__init__.py", line 96, in <module>
>  from apache_beam import io
>  File "/usr/local/lib/python3.6/dist-packages/apache_beam-2.30.0-py3.6-linux-x86_64.egg/apache_beam/io/__init__.py", line 28, in <module>
>  from apache_beam.io.parquetio import *
>  File "/usr/local/lib/python3.6/dist-packages/apache_beam-2.30.0-py3.6-linux-x86_64.egg/apache_beam/io/parquetio.py", line 53, in <module>
>  ARROW_MAJOR_VERSION, _, _ = map(int, pa.__version__.split('.'))
>  *ValueError: invalid literal for int() with base 10: 'dev0+g478286658'*
> /==================/
> Perhaps the determination of ARROW_MAJOR_VERSION could be modified to handle version strings like this one.
>  
> Thanks.
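
One possible way to make the major-version lookup tolerant of development builds is sketched below. It reads only the leading run of digits from the version string instead of converting every dot-separated component to int. The helper name arrow_major_version is hypothetical, the version string is passed in directly so the snippet runs without pyarrow installed, and this is not necessarily the change that was made for the 2.37.0 fix.

```python
import re


def arrow_major_version(version: str) -> int:
    """Return the leading integer component of a version string.

    Hypothetical helper for illustration; not part of apache_beam or pyarrow.
    """
    match = re.match(r"\d+", version)
    if match is None:
        raise ValueError(f"cannot determine major version from {version!r}")
    return int(match.group())


# Version string reported in this issue for an Arrow build from a Git branch.
dev_version = "2.0.0.dev0+g478286658.d20210618"

# The pattern from the traceback converts every component and therefore
# fails on the non-numeric 'dev0+g478286658' part.
try:
    major, _, _ = map(int, dev_version.split("."))
except ValueError as exc:
    print("original parsing failed:", exc)

# Reading only the leading digits works for dev and release builds alike.
print(arrow_major_version(dev_version))
print(arrow_major_version("4.0.1"))
```

In parquetio.py the argument would presumably be pa.__version__. A PEP 440 aware parser, such as the one in the third-party packaging library, would be another option if adding that dependency is acceptable.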


