Posted to commits@airflow.apache.org by "gky249 (via GitHub)" <gi...@apache.org> on 2023/10/16 05:26:16 UTC

[I] PythonVirtualEnv Operator expecting output dir 'script.out' causing DAG to fail [airflow]

gky249 opened a new issue, #34958:
URL: https://github.com/apache/airflow/issues/34958

   ### Apache Airflow version
   
   2.7.2
   
   ### What happened
   
   Hi Guys,
   
   All of our DAGs that use the PythonVirtualenvOperator and call a Python function via the python_callable parameter fail with a FileNotFoundError for the tmp_dir/script.out file.
   Can someone explain why this is happening and what is expected for the script.out file? Is the Python function supposed to print or return something that gets written into the script.out file?
   
   ```
   Executing cmd: /tmp/venvx9cl8fpi/bin/python /tmp/venvx9cl8fpi/script.py /tmp/venvx9cl8fpi/script.in /tmp/venvx9cl8fpi/script.out /tmp/venvx9cl8fpi/string_args.txt
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.9/site-packages/airflow/operators/python.py", line 356, in execute
       return super().execute(context=serializable_context)
     File "/home/airflow/.local/lib/python3.9/site-packages/airflow/operators/python.py", line 175, in execute
       return_value = self.execute_callable()
     File "/home/airflow/.local/lib/python3.9/site-packages/airflow/operators/python.py", line 553, in execute_callable
       return self._execute_python_callable_in_subprocess(python_path, tmp_path)
     File "/home/airflow/.local/lib/python3.9/site-packages/airflow/operators/python.py", line 421, in _execute_python_callable_in_subprocess
       return self._read_result(output_path)
     File "/home/airflow/.local/lib/python3.9/site-packages/airflow/operators/python.py", line 373, in _read_result
       if path.stat().st_size == 0:
     File "/usr/local/lib/python3.9/pathlib.py", line 1232, in stat
       return self._accessor.stat(self)
   FileNotFoundError: [Errno 2] No such file or directory: '/tmp/venvx9cl8fpi/script.out'
   ```
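
The last frames of the traceback show `_read_result` calling `path.stat()` on script.out, which raises when the file was never created. The following is a minimal sketch of that kind of check, assuming only what the two traceback lines show; `read_result` and `describe` are hypothetical names used for illustration, not Airflow's API.

```python
import pickle
import tempfile
from pathlib import Path


def read_result(path: Path):
    """Return the unpickled result, or None when the file exists but is empty."""
    # Path.stat() raises FileNotFoundError if the file does not exist,
    # which matches the last frames of the traceback above.
    if path.stat().st_size == 0:
        return None
    with path.open("rb") as file:
        return pickle.load(file)


def describe(path: Path):
    """Hypothetical wrapper that reports a missing file instead of raising."""
    try:
        return read_result(path)
    except FileNotFoundError:
        return "missing"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "script.out"
        print(describe(out))   # file was never created
        out.touch()
        print(describe(out))   # file exists but is empty
        out.write_bytes(pickle.dumps({"rows": 3}))
        print(describe(out))   # file holds a pickled value
```

In this sketch an empty file is a valid "no return value" result, so the error only appears when nothing created the file at all.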
   
   ### What you think should happen instead
   
   It seems that the PythonVirtualenvOperator class expects a mandatory argument for an output directory containing 'script.out'. This behaviour does not occur with PythonOperator, but we have to use the virtualenv operator because we also need to pass specific Python requirements. What needs to be passed here: https://github.com/apache/airflow/blob/main/airflow/operators/python.py#L426
   
   
   ### How to reproduce
   
   Create a DAG that calls a Python script using the PythonVirtualenvOperator. The DAG fails because it expects a 'script.out' file to be present in the directory.
   
   ### Operating System
   
   Debian 11 Bullseye
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] PythonVirtualEnv Operator expecting output dir 'script.out' causing DAG to fail [airflow]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] closed issue #34958: PythonVirtualEnv Operator expecting output dir 'script.out' causing DAG to fail
URL: https://github.com/apache/airflow/issues/34958


Re: [I] PythonVirtualEnv Operator expecting output dir 'script.out' causing DAG to fail [airflow]

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on issue #34958:
URL: https://github.com/apache/airflow/issues/34958#issuecomment-1764981102

   Feel free


Re: [I] PythonVirtualEnv Operator expecting output dir 'script.out' causing DAG to fail [airflow]

Posted by "2Rahul02 (via GitHub)" <gi...@apache.org>.
2Rahul02 commented on issue #34958:
URL: https://github.com/apache/airflow/issues/34958#issuecomment-1764967197

   Hey, I want to work on this. Can you please assign it to me?


Re: [I] PythonVirtualEnv Operator expecting output dir 'script.out' causing DAG to fail [airflow]

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on issue #34958:
URL: https://github.com/apache/airflow/issues/34958#issuecomment-1763936076

   You likely have a problem in your parameters that causes the script to fail before the output file is created.
   
   The script that gets executed is https://github.com/apache/airflow/blob/main/airflow/utils/python_virtualenv_script.jinja2 and, when you look at it, it should create the file automatically (as an empty file if there is nothing to return) in the "Write output" step below.
   
   Likely your Python code (args, op_kwargs or other) fails and raises an exception before the file is created. So the root cause is somewhere in the parameters that you pass to the virtualenv operator: they likely cause the generated Python code to fail.
   
   While we should probably handle this case better (hence I marked it as good-first-issue for someone to pick up), you should look for the cause in the parameters you passed. Posting the parameters you use here might help you realise how to fix your code, and it would help anyone (maybe even you) who wants to improve this to reproduce it and add better diagnostics for the situation.
   
   So please add more information about the code that triggers this situation, @gky249.
   
   -----
   
   ```python
   import {{ pickling_library }}
   import sys
   
   {% if expect_airflow %}
    {# Check whether Airflow is available in the environment.
    # If it is, we'll want to ensure that we integrate any macros that are being provided
    # by plugins prior to unpickling the task context. #}
   if sys.version_info >= (3,6):
       try:
           from airflow.plugins_manager import integrate_macros_plugins
           integrate_macros_plugins()
       except ImportError:
           {# Airflow is not available in this environment, therefore we won't
            # be able to integrate any plugin macros. #}
           pass
   {% endif %}
   
   {% if op_args or op_kwargs %}
   with open(sys.argv[1], "rb") as file:
       arg_dict = {{ pickling_library }}.load(file)
   {% else %}
   arg_dict = {"args": [], "kwargs": {}}
   {% endif %}
   
   {% if string_args_global | default(true) -%}
   # Read string args
   with open(sys.argv[3], "r") as file:
       virtualenv_string_args = list(map(lambda x: x.strip(), list(file)))
   {% endif %}
   
   # Script
   {{ python_callable_source }}
   try:
       res = {{ python_callable }}(*arg_dict["args"], **arg_dict["kwargs"])
   except Exception as e:
       with open(sys.argv[4], "w") as file:
           file.write(str(e))
       raise
   
   # Write output
   with open(sys.argv[2], "wb") as file:
       if res is not None:
           {{ pickling_library }}.dump(res, file)
   ```
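
The scenario described above (the callable raises, so the "Write output" step never runs) can be sketched with the standard library alone. This is an illustration under stated assumptions, not Airflow's own subprocess handling: `CHILD_SCRIPT`, `callable_under_test` and `run_child` are hypothetical names, and the child script is a simplified stand-in for the rendered template.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the rendered script: the callable raises,
# so the "Write output" step at the bottom is never reached.
CHILD_SCRIPT = '''
import pickle
import sys

def callable_under_test():
    raise ValueError("bad parameter")

res = callable_under_test()

# Write output (not reached when the call above raises)
with open(sys.argv[1], "wb") as file:
    if res is not None:
        pickle.dump(res, file)
'''


def run_child(tmp_dir: Path):
    """Run the child script and return (exit code, stderr, output file exists)."""
    script_path = tmp_dir / "script.py"
    output_path = tmp_dir / "script.out"
    script_path.write_text(CHILD_SCRIPT)
    # check=False (the default) so the parent keeps going after a failure
    proc = subprocess.run(
        [sys.executable, str(script_path), str(output_path)],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stderr, output_path.exists()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        code, stderr, exists = run_child(Path(tmp))
        print("exit code:", code)
        print("script.out exists:", exists)
        print("child error:", stderr.strip().splitlines()[-1])
```

In this sketch the real error is only visible in the child's stderr and exit code, so a parent that goes straight to reading script.out sees a missing file rather than the original exception.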
   
   
   


Re: [I] PythonVirtualEnv Operator expecting output dir 'script.out' causing DAG to fail [airflow]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #34958:
URL: https://github.com/apache/airflow/issues/34958#issuecomment-1797064096

   This issue has been closed because it has not received a response from the issue author.


Re: [I] PythonVirtualEnv Operator expecting output dir 'script.out' causing DAG to fail [airflow]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #34958:
URL: https://github.com/apache/airflow/issues/34958#issuecomment-1786237147

   This issue has been automatically marked as stale because it has been open for 14 days with no response from the author. It will be closed in the next 7 days if no further activity occurs from the issue author.


Re: [I] PythonVirtualEnv Operator expecting output dir 'script.out' causing DAG to fail [airflow]

Posted by "boring-cyborg[bot] (via GitHub)" <gi...@apache.org>.
boring-cyborg[bot] commented on issue #34958:
URL: https://github.com/apache/airflow/issues/34958#issuecomment-1763748353

   Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise a PR to address this issue, please do so; there is no need to wait for approval.
   

