Posted to commits@airflow.apache.org by "Jacques Gehlen (JIRA)" <ji...@apache.org> on 2019/07/18 13:56:00 UTC

[jira] [Commented] (AIRFLOW-4986) PATH variable isn't passed down to the worker when using run_as_user in the DAG

    [ https://issues.apache.org/jira/browse/AIRFLOW-4986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888005#comment-16888005 ] 

Jacques Gehlen commented on AIRFLOW-4986:
-----------------------------------------

Same issue here with virtualenv. But I'm not changing the PATH variable for Airflow; that shouldn't even be necessary in a virtualenv.

I'm trying to migrate a non-virtualenv Airflow to a virtualenv.
Using Airflow v1.10.1 and a DAG containing:
{noformat}
run_as_user: 'root'{noformat}
And my systemd script contains:
{noformat}
EnvironmentFile=/etc/sysconfig/airflow
...
ExecStart=/usr/bin/bash -c "source /opt/python-venv/airflow/bin/activate ; /opt/python-venv/airflow/bin/airflow scheduler"{noformat}
If I remove the run_as_user setting from my DAG, the task starts correctly (but then hits permission-denied errors, which is why we need run_as_user in the first place).
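From what I can tell, the likely culprit is sudo's *secure_path* option in /etc/sudoers, which replaces PATH for the sudo'ed command no matter what the service exports. A sketch of a possible workaround, assuming a RHEL-style default secure_path and the venv path from my unit file above (edit with visudo):
{noformat}
# /etc/sudoers -- prepend the Airflow venv's bin dir to sudo's sanitized PATH
Defaults    secure_path = /opt/python-venv/airflow/bin:/sbin:/bin:/usr/sbin:/usr/bin{noformat}
With that in place, the {{sudo -E -H -u root airflow run ...}} subprocess spawned for run_as_user tasks should be able to resolve the venv's airflow binary.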

> PATH variable isn't passed down to the worker when using run_as_user in the DAG
> --------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-4986
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4986
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: worker
>    Affects Versions: 1.10.3
>            Reporter: Aneesh Joseph
>            Priority: Major
>
> I have installed airflow into a virtual environment and I start up the airflow worker using a systemd script with 
> {code:java}
> EnvironmentFile=/mnt/airflow/airflow.env
> ExecStart=/bin/bash -a -c 'export AIRFLOW_HOME && export PATH && exec /mnt/airflow/venv/bin/airflow worker --queues queue1,queue2'{code}
> /mnt/airflow/airflow.env looks like this
>  
> {code:java}
> AIRFLOW_HOME=/mnt/airflow/
> PATH=/mnt/airflow/venv/bin/:/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin{code}
>  
> The service starts up fine and the worker shows up in the Celery Flower UI. The worker correctly picks up DAGs scheduled on queue1 or queue2, but tasks with a *run_as_user* setting fail without showing much info in the Airflow UI logs. Looking at the worker log for that specific task instance shows this:
>  
> {noformat}
> [2019-07-17 17:09:54,644] {__init__.py:1374} INFO - Executing <Task(BashOperator): sample_task> on 2019-07-17T17:09:48.710579+00:00
> [2019-07-17 17:09:54,644] {base_task_runner.py:119} INFO - Running: [u'sudo', u'-E', u'-H', u'-u', 'myuser', u'airflow', u'run', 'sample-team-sample-pipeline', 'sample_task', '2019-07-17T17:09:48.710579+00:00', u'--job_id', '471', u'--raw', u'-sd', u'DAGS_FOLDER/sample_team/sample-dag.py', u'--cfg_path', '/tmp/tmpZe8CVI']
> [2019-07-17 17:09:54,662] {base_task_runner.py:101} INFO - Job 471: Subtask sample_task sudo: airflow: command not found {noformat}
> Tasks which don't have the run_as_user setting run fine. Workers where Airflow is installed globally (not in a virtualenv) also work fine, which suggests that the PATH variable is not being passed down when the worker triggers the actual task with sudo.
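> A quick way to confirm this (the reset value shown below is illustrative; it comes from the secure_path setting in /etc/sudoers, if the distribution sets one):
> {noformat}
> $ echo $PATH
> /mnt/airflow/venv/bin/:/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
> $ sudo -E -H -u myuser sh -c 'echo $PATH'
> /sbin:/bin:/usr/sbin:/usr/bin{noformat}
> Even with -E, sudo rewrites PATH whenever secure_path is set, so the venv's bin directory never reaches the child process.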
>  
> Is there a way to get around this?
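
To answer the question above: one way around it, assuming sudo's secure_path includes /usr/bin (as the stock defaults do), is to expose the venv's entrypoint on that path:
{noformat}
ln -s /mnt/airflow/venv/bin/airflow /usr/bin/airflow{noformat}
Alternatively, extend secure_path itself in /etc/sudoers (via visudo) to include /mnt/airflow/venv/bin, as sketched above.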



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)