Posted to issues@beam.apache.org by "Lautaro Quiroz (Jira)" <ji...@apache.org> on 2021/02/17 15:24:00 UTC

[jira] [Updated] (BEAM-11826) Var

     [ https://issues.apache.org/jira/browse/BEAM-11826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lautaro Quiroz updated BEAM-11826:
----------------------------------
    Description: 
Hi, I'm getting the following error when using apache-beam 2.27.0 in a Python3 virtualenv:
{code:bash}
airflow@airflow-worker-7fb797d459-nf8gh:~$ /tmp/dataflow-venvtflya9ij/bin/python /home/airflow/gcs/dags/advisor/create_dataset/beam_pipelines/dummy.py
Traceback (most recent call last):
  File "/home/airflow/gcs/dags/advisor/create_dataset/beam_pipelines/dummy.py", line 1, in <module>
    import apache_beam
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/__init__.py", line 95, in <module>
    from apache_beam import coders
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/coders/__init__.py", line 19, in <module>
    from apache_beam.coders.coders import *
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/coders/coders.py", line 43, in <module>
    from future.moves import pickle
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/future/moves/__init__.py", line 8, in <module>
    import_top_level_modules()
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/future/standard_library/__init__.py", line 810, in import_top_level_modules
    with exclude_local_folder_imports(*TOP_LEVEL_MODULES):
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/future/standard_library/__init__.py", line 781, in __enter__
    module = __import__(m, level=0)
  File "/home/airflow/gcs/dags/advisor/create_dataset/beam_pipelines/test.py", line 3, in <module>
    from apache_beam.options.pipeline_options import PipelineOptions
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/options/pipeline_options.py", line 41, in <module>
    from apache_beam.transforms.display import HasDisplayData
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/transforms/__init__.py", line 23, in <module>
    from apache_beam.transforms import combiners
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/transforms/combiners.py", line 45, in <module>
    from apache_beam.transforms import core
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/transforms/core.py", line 40, in <module>
    from apache_beam.coders import typecoders
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/coders/typecoders.py", line 198, in <module>
    registry = CoderRegistry()
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/coders/typecoders.py", line 91, in __init__
    self.register_standard_coders(fallback_coder)
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/coders/typecoders.py", line 95, in register_standard_coders
    self._register_coder_internal(int, coders.VarIntCoder)
AttributeError: module 'apache_beam.coders.coders' has no attribute 'VarIntCoder'
{code}
My `dummy.py` file contains only:
{code:python}
import apache_beam
if __name__ == '__main__': print('MAIN')
{code}
Strangely, I do not get the error when running the venv's Python interpreter interactively and executing the same `import apache_beam` statement there:
{code:bash}
airflow@airflow-worker-7fb797d459-nf8gh:~$ /tmp/dataflow-venvtflya9ij/bin/python
Python 3.6.10 (default, Feb  1 2021, 12:07:35)
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import apache_beam
>>>
{code}
Note: in both cases the package directories on `sys.path` are exactly the same.
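For reference, a minimal way to compare the two environments (a sketch, not the exact check from my shell history):
{code:python}
# Print the module search path. Running this as a script through the venv
# interpreter, then pasting it into the same interpreter started
# interactively, lets you diff the two paths line by line.
import sys

for entry in sys.path:
    print(entry)
{code}
The site-packages entries are identical in both modes; the only entry that can differ is the first one, which Python sets to the script's directory when executing a file and to the empty string (the current directory) in an interactive session.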

Equally strangely, the import succeeds in both scenarios (script and interactive) when I use the previous version, `apache-beam==2.26.0`.

---

A bit of context:
I ran into this error because I'm using a Google Cloud Composer (Airflow) operator (`airflow.providers.google.cloud.operators.dataflow.DataflowCreatePythonJobOperator`) that creates a virtualenv, installs apache-beam into it, and executes the Beam script in order to launch the pipeline on Google Cloud Dataflow.
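For illustration, a minimal sketch of such a DAG (not my actual DAG; the DAG id, schedule, and task id are placeholders):
{code:python}
# Hypothetical minimal DAG: the operator builds a temporary virtualenv,
# pip-installs py_requirements into it, and runs py_file with that venv's
# interpreter, the same sequence that produces the traceback above.
from airflow import DAG
from airflow.providers.google.cloud.operators.dataflow import (
    DataflowCreatePythonJobOperator,
)
from airflow.utils.dates import days_ago

with DAG('beam_import_repro', start_date=days_ago(1), schedule_interval=None) as dag:
    launch_pipeline = DataflowCreatePythonJobOperator(
        task_id='launch_beam_pipeline',
        py_file='/home/airflow/gcs/dags/advisor/create_dataset/beam_pipelines/dummy.py',
        py_requirements=['apache-beam[gcp]==2.27.0'],
        py_interpreter='python3',
        py_system_site_packages=False,
    )
{code}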

I am able to reproduce this issue by connecting to the Kubernetes pod running this operator and manually executing the same steps.

---

To reproduce this issue (a scripted version of these steps follows the list):

1. Create a Python 3 virtualenv: `virtualenv /tmp/venv --python=python3`.
2. Create a `dummy.py` file containing only:
{code:python}
import apache_beam
{code}
3. Install apache-beam 2.27.0: `/tmp/venv/bin/pip install apache-beam==2.27.0`.
4. Run the script: `/tmp/venv/bin/python dummy.py`.
5. Check that the error does not occur with `apache-beam==2.26.0`.
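The same steps as a single runnable script (a sketch: it uses the stdlib `venv` module instead of the `virtualenv` tool, and works in a fresh temporary directory):
{code:python}
# Scripted version of steps 1-4 above. Expect the AttributeError with
# apache-beam==2.27.0; pinning 2.26.0 instead makes the import succeed.
import pathlib
import subprocess
import sys
import tempfile

workdir = pathlib.Path(tempfile.mkdtemp())
venv_dir = workdir / 'venv'

# 1. Create the virtualenv (stdlib venv here instead of the virtualenv tool).
subprocess.run([sys.executable, '-m', 'venv', str(venv_dir)], check=True)

# 2. Create dummy.py containing only the import.
script = workdir / 'dummy.py'
script.write_text('import apache_beam\n')

# 3. Install apache-beam 2.27.0 into the venv.
subprocess.run([str(venv_dir / 'bin' / 'pip'), 'install', 'apache-beam==2.27.0'], check=True)

# 4. Run the script with the venv's interpreter.
subprocess.run([str(venv_dir / 'bin' / 'python'), str(script)], check=True)
{code}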

---

Also reported in: [https://stackoverflow.com/questions/66243327/inconsistent-behaviour-when-importing-a-package-interactively-vs-running-as-scri]


> Var
> ---
>
>                 Key: BEAM-11826
>                 URL: https://issues.apache.org/jira/browse/BEAM-11826
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>    Affects Versions: 2.27.0
>         Environment: OS: Ubuntu 18.04.5 LTS
> Python: 3.6.10
> GCC: 7.5.0
>            Reporter: Lautaro Quiroz
>            Priority: P2
>             Fix For: 2.26.0



--
This message was sent by Atlassian Jira
(v8.3.4#803005)