Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/03/05 09:25:43 UTC

[GitHub] [airflow] HansBambel opened a new issue #14620: Using env in BashOperator results in invalid SyntaxError

HansBambel opened a new issue #14620:
URL: https://github.com/apache/airflow/issues/14620


   **Apache Airflow version**: 2.0.1
   
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
   
   **Environment**:
   
   - **Cloud provider or hardware configuration**:
   - **OS** (e.g. from /etc/os-release): OSX11 arm64
   - **Kernel** (e.g. `uname -a`): Darwin 20.3.0
   - **Install tools**: pip
   - **Others**:
   
   **What happened**:
   
   I am running Python scripts with the `BashOperator`. For my scripts to work, I need to set some environment variables for connecting to the database. Until a few hours ago, I exported them explicitly in the same bash command in the `BashOperator`:
   ```
   my_task = BashOperator(
       task_id='my-task',
       bash_command=f"export DB_URL={db_url}; echo -e 'y' | python /path/to/my/script/my_script.py"
   )
   ```
   This works, but the password shows up in plain text in the logs. Therefore, I tried passing the variable through the `env` parameter instead:
   ```
   my_task = BashOperator(
       task_id='my-task',
       bash_command="echo -e 'y' | python /path/to/my/script/my_script.py",
       env={'DB_URL': db_url}
   )
   ```
   This results in an error claiming that there is a SyntaxError in my Python script, which runs fine without `env`:
   ```
   [2021-03-05 10:13:20,416] {bash.py:158} INFO - Running command: echo -e 'y' | python  /path/to/my/script/my_script.py
   [2021-03-05 10:13:20,424] {bash.py:169} INFO - Output:
   [2021-03-05 10:13:20,518] {bash.py:173} INFO -   File "/path/to/my/script/my_script.py", line 16
   [2021-03-05 10:13:20,530] {bash.py:173} INFO -     n_rows: int = query.count()
   [2021-03-05 10:13:20,531] {bash.py:173} INFO -           ^
   [2021-03-05 10:13:20,531] {bash.py:173} INFO - SyntaxError: invalid syntax
   [2021-03-05 10:13:20,531] {bash.py:177} INFO - Command exited with return code 1
   [2021-03-05 10:13:20,544] {taskinstance.py:1455} ERROR - Bash command failed. The command returned a non-zero exit code.
   ```
   
   **What you expected to happen**:
   The script runs without errors, with the environment variable set and not shown in the log.
   
   **How to reproduce it**:
   Create a file `hello.py`:
   ```
   import os
   
   print(f"Hello {os.environ.get('name')}")
   ```
   And create a DAG:
   ```
   from datetime import timedelta, datetime
   from airflow import DAG
   from airflow.operators.bash import BashOperator
   
   default_args = {
       'owner': 'Bilbo',
       'depends_on_past': False,
       'email': ['bilbo@the-shire.com'],
       'email_on_failure': False,
       'email_on_retry': False,
       'retries': 0,
       'retry_delay': timedelta(minutes=5),
   }
   
   with DAG(
       dag_id='test-env',
       default_args=default_args,
       description='Test env',
       schedule_interval=None,
       start_date=datetime(2021, 2, 24)
   ) as dag:
   
       my_task = BashOperator(
           task_id='my-task',
           bash_command=f"python /Users/bilbo/hello.py",
           env={'name': 'AirFlow'}
       )
   ```
   
   **Anything else we need to know**:
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj closed issue #14620: Using env in BashOperator results in invalid SyntaxError

Posted by GitBox <gi...@apache.org>.
mik-laj closed issue #14620:
URL: https://github.com/apache/airflow/issues/14620


   





[GitHub] [airflow] HansBambel commented on issue #14620: Using env in BashOperator results in invalid SyntaxError

Posted by GitBox <gi...@apache.org>.
HansBambel commented on issue #14620:
URL: https://github.com/apache/airflow/issues/14620#issuecomment-791990941


   Ok, so the `env` parameter creates a new environment and does not add environment variables. Got it, thanks!





[GitHub] [airflow] HansBambel commented on issue #14620: Using env in BashOperator results in invalid SyntaxError

Posted by GitBox <gi...@apache.org>.
HansBambel commented on issue #14620:
URL: https://github.com/apache/airflow/issues/14620#issuecomment-791940308


   Interesting... That did the trick! Thank you!
   Then I misunderstood something. I assumed that when I install and run Airflow in an environment (in my case an Anaconda environment), all subprocesses such as DAGs will inherit from that environment as well.
   
   Initially, I created BashOperators that pointed directly to the Python interpreter of the environment, such as
   `/home/Users/anaconda3/env/myEnv/bin/python`, but then I assumed that starting Airflow inside that environment would already give me that interpreter. It seems I was mistaken.
   
   Is there a possibility that my DAG is run in a specific conda environment? Or is starting Airflow in that environment and adding `env={**os.environ}` to the tasks the way to go?






[GitHub] [airflow] mik-laj edited a comment on issue #14620: Using env in BashOperator results in invalid SyntaxError

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on issue #14620:
URL: https://github.com/apache/airflow/issues/14620#issuecomment-791784644


   This doesn't look like a problem with Airflow. Your script is not compatible with Python 2, and the default system interpreter on your system is Python 2. You probably use virtual environments and didn't pass that information on to the subprocess.
   
   Can you try the code below?
   ````python
   import os

   # Merge the parent environment (PATH etc.) with the task-specific variable.
   my_task = BashOperator(
       task_id='my-task',
       bash_command="python /Users/bilbo/hello.py",
       env={**os.environ, 'name': 'AirFlow'}
   )
   ````
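
To illustrate why a missing `PATH` can change which interpreter a bare `python` refers to, here is a minimal standard-library sketch. It does not use Airflow. The temporary directories and the fake `python` file are hypothetical stand-ins for a virtual environment's `bin` directory, and the example assumes a POSIX-like system.

```python
import os
import shutil
import stat
import tempfile


def resolve_python(path_value):
    """Return the file a bare `python` resolves to for the given PATH string."""
    return shutil.which("python", path=path_value)


with tempfile.TemporaryDirectory() as env_bin, tempfile.TemporaryDirectory() as empty_dir:
    # Hypothetical stand-in for the interpreter inside a virtual environment.
    fake = os.path.join(env_bin, "python")
    with open(fake, "w") as fh:
        fh.write("#!/bin/sh\n")
    os.chmod(fake, stat.S_IRWXU)  # make it executable

    # PATH that includes the environment's bin directory: lookup succeeds.
    with_env = resolve_python(env_bin + os.pathsep + empty_dir)
    # PATH without it: the environment's interpreter is not found.
    without_env = resolve_python(empty_dir)

print(with_env)
print(without_env)
```

The same lookup happens when bash runs `python ...`: whichever directory on `PATH` provides `python` first wins, so a task started without the parent `PATH` may fall back to a different interpreter.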





[GitHub] [airflow] boring-cyborg[bot] commented on issue #14620: Using env in BashOperator results in invalid SyntaxError

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #14620:
URL: https://github.com/apache/airflow/issues/14620#issuecomment-791290371


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] mik-laj commented on issue #14620: Using env in BashOperator results in invalid SyntaxError

Posted by GitBox <gi...@apache.org>.
mik-laj commented on issue #14620:
URL: https://github.com/apache/airflow/issues/14620#issuecomment-791943024


   > all subprocesses such as DAGs will inherit from that environment as well.
   
   It does, but you created a new environment that contains only one environment variable, so the subprocess could not find the correct environment.
   
   > Is there a possibility that my DAG is run in a specific conda environment? Or is starting Airflow in that environment and adding `env={**os.environ}` to the tasks the way to go?
   
   When you run a subprocess and use the `env` parameter, you must pass all environment variables explicitly. We do not add any extra magic here, which leaves more flexibility: for example, some users do not want to pass credential information from the main process to the child processes.
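
The replace-versus-merge behaviour described above can be sketched with the standard library alone. This is not Airflow code. `PARENT_ONLY`, `run_child`, and the allowlist are hypothetical names chosen for the illustration, and the example assumes the interpreter at `sys.executable` can start with a near-empty environment, as is typical on Linux and macOS.

```python
import os
import subprocess
import sys

# Child snippet: report one variable from the parent and one task-specific variable.
CHILD = "import os; print(os.environ.get('PARENT_ONLY'), os.environ.get('name'))"


def run_child(env):
    """Run the child with exactly the given environment and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


os.environ["PARENT_ONLY"] = "visible"  # hypothetical marker set in the parent

# env replaces the whole environment: the parent's variables are gone.
replaced = run_child({"name": "AirFlow"})

# Merging with os.environ keeps the parent's variables and adds the new one.
merged = run_child({**os.environ, "name": "AirFlow"})

# Selective pass-through: forward only an allowlist, e.g. to keep credentials out.
allowed = {k: os.environ[k] for k in ("PATH", "HOME") if k in os.environ}
selective = run_child({**allowed, "name": "AirFlow"})

print(replaced)
print(merged)
print(selective)
```

The allowlist variant is one way to keep `PATH` available to the child while still withholding other variables from the parent process.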

