Posted to issues@beam.apache.org by "Lautaro Quiroz (Jira)" <ji...@apache.org> on 2021/02/17 15:22:00 UTC

[jira] [Created] (BEAM-11826) Var

Lautaro Quiroz created BEAM-11826:
-------------------------------------

             Summary: Var
                 Key: BEAM-11826
                 URL: https://issues.apache.org/jira/browse/BEAM-11826
             Project: Beam
          Issue Type: Bug
          Components: sdk-py-core
    Affects Versions: 2.27.0
         Environment: OS: Ubuntu 18.04.5 LTS
Python: 3.6.10
GCC: 7.5.0
            Reporter: Lautaro Quiroz
             Fix For: 2.26.0


Hi, I'm getting the following error when using apache-beam 2.27.0 in a Python3 virtualenv:
{code:bash}
airflow@airflow-worker-7fb797d459-nf8gh:~$ /tmp/dataflow-venvtflya9ij/bin/python /home/airflow/gcs/dags/advisor/create_dataset/beam_pipelines/dummy.py
Traceback (most recent call last):
  File "/home/airflow/gcs/dags/advisor/create_dataset/beam_pipelines/dummy.py", line 1, in <module>
    import apache_beam
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/__init__.py", line 95, in <module>
    from apache_beam import coders
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/coders/__init__.py", line 19, in <module>
    from apache_beam.coders.coders import *
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/coders/coders.py", line 43, in <module>
    from future.moves import pickle
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/future/moves/__init__.py", line 8, in <module>
    import_top_level_modules()
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/future/standard_library/__init__.py", line 810, in import_top_level_modules
    with exclude_local_folder_imports(*TOP_LEVEL_MODULES):
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/future/standard_library/__init__.py", line 781, in __enter__
    module = __import__(m, level=0)
  File "/home/airflow/gcs/dags/advisor/create_dataset/beam_pipelines/test.py", line 3, in <module>
    from apache_beam.options.pipeline_options import PipelineOptions
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/options/pipeline_options.py", line 41, in <module>
    from apache_beam.transforms.display import HasDisplayData
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/transforms/__init__.py", line 23, in <module>
    from apache_beam.transforms import combiners
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/transforms/combiners.py", line 45, in <module>
    from apache_beam.transforms import core
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/transforms/core.py", line 40, in <module>
    from apache_beam.coders import typecoders
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/coders/typecoders.py", line 198, in <module>
    registry = CoderRegistry()
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/coders/typecoders.py", line 91, in __init__
    self.register_standard_coders(fallback_coder)
  File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/coders/typecoders.py", line 95, in register_standard_coders
    self._register_coder_internal(int, coders.VarIntCoder)
AttributeError: module 'apache_beam.coders.coders' has no attribute 'VarIntCoder'
{code}
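One detail visible mid-traceback: `future`'s `import_top_level_modules()` calls `__import__(m, level=0)` and ends up executing the local `/home/airflow/gcs/dags/advisor/create_dataset/beam_pipelines/test.py` instead of the stdlib `test` package, and that file imports apache_beam again while `apache_beam.coders.coders` is still only partially initialized. A minimal, Beam-free sketch of the shadowing mechanism (file names are illustrative only):

```python
import importlib.util
import os
import sys
import tempfile

# When a file is run as a script, its directory is prepended to sys.path,
# so a local module named after a stdlib one (here: test.py) shadows the
# real stdlib package on a plain `import test`.
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "test.py"), "w") as f:
        f.write("SHADOWED = True\n")
    sys.path.insert(0, d)
    try:
        spec = importlib.util.find_spec("test")
        # spec.origin points at the local test.py, not the stdlib package
        print(spec.origin)
    finally:
        sys.path.remove(d)
```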
My `dummy.py` file consists only of:
{code:python}
import apache_beam
if __name__ == '__main__': print('MAIN')
{code}
Strangely, the error does not occur when I run the venv's Python interactively and execute the same `import apache_beam` statement there:
{code:bash}
airflow@airflow-worker-7fb797d459-nf8gh:~$ /tmp/dataflow-venvtflya9ij/bin/python
Python 3.6.10 (default, Feb  1 2021, 12:07:35)
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import apache_beam
>>>
{code}
Note: in both cases the package directories on `sys.path` are exactly the same.
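When comparing the two environments, one subtle difference is easy to miss: running a file as a script prepends the script's directory to `sys.path`, whereas the interactive interpreter prepends `''` (the current working directory). A generic snippet to dump the entries in both modes, nothing Beam-specific:

```python
import sys

# Print every sys.path entry with repr() so the empty-string entry added
# by the interactive interpreter stays visible; run this both as a
# script and at the >>> prompt, then diff the two outputs.
for entry in sys.path:
    print(repr(entry))
```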

Equally strangely, the script runs in both scenarios (script & interactive) when I use the previous version, `apache-beam==2.26.0`.

---

A bit of context: I hit this error because I'm using a Google Cloud Composer (Airflow) operator that creates a virtualenv, installs apache-beam, and executes the Beam script in order to trigger the pipeline run on Google Cloud Dataflow.

I was able to reproduce this issue by connecting to the Kubernetes pod running this operator and executing the steps manually.

---

In order to reproduce this issue, you can:

1. Create a python3 virtualenv: `virtualenv /tmp/venv --python=python3`.
2. Create a `dummy.py` file containing only:
{code:python}
import apache_beam
{code}
3. Install apache-beam 2.27.0: `/tmp/venv/bin/pip install apache-beam==2.27.0`.
4. Run the script: `/tmp/venv/bin/python dummy.py`.
5. Check that the error does not happen with `apache-beam==2.26.0`.
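The steps above can be sketched as a single shell session (paths are illustrative; the final downgrade verifies the regression between the two versions, per the behavior described in this report):

```shell
# Reproduce in a throwaway virtualenv.
virtualenv /tmp/venv --python=python3
/tmp/venv/bin/pip install apache-beam==2.27.0
printf 'import apache_beam\n' > dummy.py
/tmp/venv/bin/python dummy.py   # reported to fail: AttributeError ... 'VarIntCoder'
/tmp/venv/bin/pip install apache-beam==2.26.0
/tmp/venv/bin/python dummy.py   # reported to import cleanly
```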

---

Also reported in: [https://stackoverflow.com/questions/66243327/inconsistent-behaviour-when-importing-a-package-interactively-vs-running-as-scri]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)