Posted to issues@beam.apache.org by "Lautaro Quiroz (Jira)" <ji...@apache.org> on 2021/02/17 15:22:00 UTC
[jira] [Created] (BEAM-11826) Var
Lautaro Quiroz created BEAM-11826:
-------------------------------------
Summary: Var
Key: BEAM-11826
URL: https://issues.apache.org/jira/browse/BEAM-11826
Project: Beam
Issue Type: Bug
Components: sdk-py-core
Affects Versions: 2.27.0
Environment: OS: Ubuntu 18.04.5 LTS
Python: 3.6.10
GCC: 7.5.0
Reporter: Lautaro Quiroz
Fix For: 2.26.0
Hi, I'm getting the following error when using apache-beam 2.27.0 in a Python3 virtualenv:
{code:bash}
airflow@airflow-worker-7fb797d459-nf8gh:~$ /tmp/dataflow-venvtflya9ij/bin/python /home/airflow/gcs/dags/advisor/create_dataset/beam_pipelines/dummy.py
Traceback (most recent call last):
File "/home/airflow/gcs/dags/advisor/create_dataset/beam_pipelines/dummy.py", line 1, in <module>
import apache_beam
File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/__init__.py", line 95, in <module>
from apache_beam import coders
File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/coders/__init__.py", line 19, in <module>
from apache_beam.coders.coders import *
File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/coders/coders.py", line 43, in <module>
from future.moves import pickle
File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/future/moves/__init__.py", line 8, in <module>
import_top_level_modules()
File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/future/standard_library/__init__.py", line 810, in import_top_level_modules
with exclude_local_folder_imports(*TOP_LEVEL_MODULES):
File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/future/standard_library/__init__.py", line 781, in __enter__
module = __import__(m, level=0)
File "/home/airflow/gcs/dags/advisor/create_dataset/beam_pipelines/test.py", line 3, in <module>
from apache_beam.options.pipeline_options import PipelineOptions
File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/options/pipeline_options.py", line 41, in <module>
from apache_beam.transforms.display import HasDisplayData
File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/transforms/__init__.py", line 23, in <module>
from apache_beam.transforms import combiners
File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/transforms/combiners.py", line 45, in <module>
from apache_beam.transforms import core
File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/transforms/core.py", line 40, in <module>
from apache_beam.coders import typecoders
File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/coders/typecoders.py", line 198, in <module>
registry = CoderRegistry()
File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/coders/typecoders.py", line 91, in __init__
self.register_standard_coders(fallback_coder)
File "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/coders/typecoders.py", line 95, in register_standard_coders
self._register_coder_internal(int, coders.VarIntCoder)
AttributeError: module 'apache_beam.coders.coders' has no attribute 'VarIntCoder'
{code}
My `dummy.py` file consists of only:
{code:python}
import apache_beam
if __name__ == '__main__': print('MAIN')
{code}
Strangely, I do not get the error when running the venv Python interactively and executing the `import apache_beam` statement there:
{code:bash}
airflow@airflow-worker-7fb797d459-nf8gh:~$ /tmp/dataflow-venvtflya9ij/bin/python
Python 3.6.10 (default, Feb 1 2021, 12:07:35)
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import apache_beam
>>>
{code}
Note: In both cases the `sys.path` packages dirs are exactly the same.
Equally strange to me, the script runs in both scenarios (script & interactive) when I use the previous version `apache-beam==2.26.0`.
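A likely explanation (my assumption from reading the traceback, not confirmed): when Python runs a script, `sys.path[0]` is the script's own directory, so the `test.py` that sits next to `dummy.py` shadows the standard-library `test` package that `future.standard_library.import_top_level_modules()` tries to import; in an interactive session `sys.path[0]` is the current working directory instead, so the shadowing does not occur. A minimal, self-contained sketch of that shadowing (file names here are illustrative):
{code:python}
import os
import subprocess
import sys
import tempfile

# Create a directory holding a local test.py that shadows the stdlib
# 'test' package, plus a main.py that imports it.
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "test.py"), "w") as f:
        f.write("raise RuntimeError('local test.py was imported')\n")
    with open(os.path.join(d, "main.py"), "w") as f:
        f.write("import test\n")
    # Running main.py as a script puts its own directory first on
    # sys.path, so 'import test' picks up the local test.py.
    result = subprocess.run(
        [sys.executable, os.path.join(d, "main.py")],
        capture_output=True, text=True,
    )

shadowed = "local test.py was imported" in result.stderr
print(shadowed)  # True: the local file won over the stdlib package
{code}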
---
A bit of context.
I got to this error because I'm using a Google Cloud Composer (Airflow) operator () that creates a virtualenv, installs apache-beam, and executes the Beam script in order to trigger the pipeline run on Google Cloud Dataflow.
I was able to reproduce this issue by connecting to the Kubernetes pod running this operator and manually executing the steps.
---
In order to reproduce this issue, you can:
1. create a python3 virtualenv: `virtualenv /tmp/venv --python=python3`.
2. create the dummy.py file with the following code inside:
{code:python}
import apache_beam
{code}
3. install apache-beam 2.27.0: `/tmp/venv/bin/pip install apache-beam==2.27.0`.
4. run the script: `/tmp/venv/bin/python dummy.py`.
5. check it does not happen with `apache-beam==2.26.0`.
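As a possible workaround (an assumption on my side, not verified against the Composer operator): renaming or moving the colliding `test.py`, or running the script with Python's `-I` isolated mode (which keeps the script's directory off `sys.path`), should avoid the shadowing. A small sketch of the difference, using a stdlib module name (`json` here, purely illustrative):
{code:python}
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as d:
    # A local json.py that shadows the stdlib json module.
    with open(os.path.join(d, "json.py"), "w") as f:
        f.write("raise RuntimeError('shadowed')\n")
    with open(os.path.join(d, "main.py"), "w") as f:
        f.write("import json\nprint('ok')\n")

    script = os.path.join(d, "main.py")
    # Normal run: the script's directory leads sys.path, so the local
    # json.py is imported and the script fails.
    normal = subprocess.run([sys.executable, script],
                            capture_output=True, text=True)
    # Isolated mode (-I): the script's directory is not added to
    # sys.path, so the real stdlib json is found.
    isolated = subprocess.run([sys.executable, "-I", script],
                              capture_output=True, text=True)

print(normal.returncode != 0, isolated.stdout.strip())
{code}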
---
Also reported in: [https://stackoverflow.com/questions/66243327/inconsistent-behaviour-when-importing-a-package-interactively-vs-running-as-scri]
--
This message was sent by Atlassian Jira
(v8.3.4#803005)