Posted to commits@airflow.apache.org by "gky249 (via GitHub)" <gi...@apache.org> on 2023/10/16 05:26:16 UTC
[I] PythonVirtualEnv Operator expecting output dir 'script.out' causing DAG to fail [airflow]
gky249 opened a new issue, #34958:
URL: https://github.com/apache/airflow/issues/34958
### Apache Airflow version
2.7.2
### What happened
Hi all,
All our DAGs that use the PythonVirtualenvOperator and call a Python function through the `python_callable` parameter fail with a FileNotFoundError for the tmp_dir/script.out file.
Can someone explain why this is happening and what is expected of the script.out file? Is the Python function supposed to print or return something that gets written into script.out?
```
Executing cmd: /tmp/venvx9cl8fpi/bin/python /tmp/venvx9cl8fpi/script.py /tmp/venvx9cl8fpi/script.in /tmp/venvx9cl8fpi/script.out /tmp/venvx9cl8fpi/string_args.txt
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/operators/python.py", line 356, in execute
    return super().execute(context=serializable_context)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/operators/python.py", line 175, in execute
    return_value = self.execute_callable()
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/operators/python.py", line 553, in execute_callable
    return self._execute_python_callable_in_subprocess(python_path, tmp_path)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/operators/python.py", line 421, in _execute_python_callable_in_subprocess
    return self._read_result(output_path)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/operators/python.py", line 373, in _read_result
    if path.stat().st_size == 0:
  File "/usr/local/lib/python3.9/pathlib.py", line 1232, in stat
    return self._accessor.stat(self)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/venvx9cl8fpi/script.out'
```
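As a rough illustration of what script.out is for: judging from the traceback above (`path.stat().st_size == 0` in `_read_result`) and the template quoted later in this thread, the file appears to carry the pickled return value of the callable, and an empty file stands for a `None` return. The sketch below mimics that round trip outside Airflow. The helper names `write_result` and `read_result` are hypothetical, and this is not Airflow's actual implementation.

```python
import pickle
import tempfile
from pathlib import Path


def write_result(path: Path, res) -> None:
    """Child side (hypothetical helper): always create the file,
    but pickle into it only when the result is not None."""
    with open(path, "wb") as file:
        if res is not None:
            pickle.dump(res, file)


def read_result(path: Path):
    """Parent side (hypothetical helper): an empty file means None.
    path.stat() raises FileNotFoundError if the file was never created."""
    if path.stat().st_size == 0:
        return None
    with open(path, "rb") as file:
        return pickle.load(file)


with tempfile.TemporaryDirectory() as tmp_dir:
    out = Path(tmp_dir) / "script.out"

    # Case 1: the callable returned a value.
    write_result(out, {"rows": 3})
    value_result = read_result(out)

    # Case 2: the callable returned None, so the file exists but is empty.
    write_result(out, None)
    none_result = read_result(out)

    # Case 3: the write step never ran, so the file does not exist.
    out.unlink()
    try:
        read_result(out)
        missing_raises = False
    except FileNotFoundError:
        missing_raises = True
```

Under these assumptions the callable does not need to print or return anything; the error in the traceback corresponds to the third case, where the write step was never reached.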
### What you think should happen instead
It seems that the PythonVirtualenvOperator class expects a mandatory argument for the output directory containing 'script.out'. This behaviour does not occur with PythonOperator, but we have to use the virtualenv operator to pass specific Python requirements as well. What needs to be passed here? https://github.com/apache/airflow/blob/main/airflow/operators/python.py#L426
### How to reproduce
Create a DAG that calls a Python script using the PythonVirtualenvOperator. The DAG fails because it expects the 'script.out' file to be present in the temporary directory.
### Operating System
Debian 11 Bullseye
### Versions of Apache Airflow Providers
_No response_
### Deployment
Official Apache Airflow Helm Chart
### Deployment details
_No response_
### Anything else
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [I] PythonVirtualEnv Operator expecting output dir 'script.out' causing DAG to fail [airflow]
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] closed issue #34958: PythonVirtualEnv Operator expecting output dir 'script.out' causing DAG to fail
URL: https://github.com/apache/airflow/issues/34958
Re: [I] PythonVirtualEnv Operator expecting output dir 'script.out' causing DAG to fail [airflow]
Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on issue #34958:
URL: https://github.com/apache/airflow/issues/34958#issuecomment-1764981102
Feel free
Re: [I] PythonVirtualEnv Operator expecting output dir 'script.out' causing DAG to fail [airflow]
Posted by "2Rahul02 (via GitHub)" <gi...@apache.org>.
2Rahul02 commented on issue #34958:
URL: https://github.com/apache/airflow/issues/34958#issuecomment-1764967197
Hey, I'd like to work on this. Can you please assign it to me?
Re: [I] PythonVirtualEnv Operator expecting output dir 'script.out' causing DAG to fail [airflow]
Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on issue #34958:
URL: https://github.com/apache/airflow/issues/34958#issuecomment-1763936076
You likely have a problem in your parameters that causes the script to fail before the output file is created.
The script that gets executed is https://github.com/apache/airflow/blob/main/airflow/utils/python_virtualenv_script.jinja2 and, when you look at it, it should create the output file automatically (as an empty file if there is no result) in the "Write output" step below.
Likely your Python code (op_args, op_kwargs or other) fails and raises an exception before the file is created. So the root cause is somewhere in the parameters that you pass to the virtualenv operator: they likely cause the generated Python code to fail.
While we should probably handle this case better (hence I marked it as good-first-issue for someone to pick up), you should look for the cause in the parameters you passed. Posting the parameters you use here might help you realise how to fix your code, and it would help anyone (maybe even you) who wants to improve this to reproduce it and add better diagnostics for the situation.
So please add more information about the code that triggers this situation, @gky249.
-----
```python
import {{ pickling_library }}
import sys

{% if expect_airflow %}
{# Check whether Airflow is available in the environment.
 # If it is, we'll want to ensure that we integrate any macros that are being provided
 # by plugins prior to unpickling the task context. #}
if sys.version_info >= (3,6):
    try:
        from airflow.plugins_manager import integrate_macros_plugins
        integrate_macros_plugins()
    except ImportError:
        {# Airflow is not available in this environment, therefore we won't
         # be able to integrate any plugin macros. #}
        pass
{% endif %}

{% if op_args or op_kwargs %}
with open(sys.argv[1], "rb") as file:
    arg_dict = {{ pickling_library }}.load(file)
{% else %}
arg_dict = {"args": [], "kwargs": {}}
{% endif %}

{% if string_args_global | default(true) -%}
# Read string args
with open(sys.argv[3], "r") as file:
    virtualenv_string_args = list(map(lambda x: x.strip(), list(file)))
{% endif %}

# Script
{{ python_callable_source }}
try:
    res = {{ python_callable }}(*arg_dict["args"], **arg_dict["kwargs"])
except Exception as e:
    with open(sys.argv[4], "w") as file:
        file.write(str(e))
    raise

# Write output
with open(sys.argv[2], "wb") as file:
    if res is not None:
        {{ pickling_library }}.dump(res, file)
```
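To make the failure mode described above concrete, here is a minimal standalone sketch (no Airflow required) of a parent process that runs a generated script in a subprocess. The child raises before it reaches its "Write output" step, so the output file never exists. Instead of calling `stat()` on a missing file, the parent checks for the file and surfaces the child's error. `CHILD_SOURCE`, `run_child` and `callable_that_fails` are hypothetical names for illustration only, and this is one possible way to improve the diagnostics, not what Airflow does.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical stand-in for the rendered template: the callable raises,
# so the "Write output" step at the bottom is never reached.
CHILD_SOURCE = textwrap.dedent(
    """
    import pickle
    import sys

    def callable_that_fails():
        raise ValueError("bad parameter")

    res = callable_that_fails()

    # Write output (never executed in this example)
    with open(sys.argv[1], "wb") as file:
        if res is not None:
            pickle.dump(res, file)
    """
)


def run_child(tmp_dir: Path) -> str:
    """Run the child script and describe what happened to its output file."""
    script = tmp_dir / "script.py"
    output = tmp_dir / "script.out"
    script.write_text(CHILD_SOURCE)
    proc = subprocess.run(
        [sys.executable, str(script), str(output)],
        capture_output=True,
        text=True,
    )
    if not output.exists():
        # Report the child's own error rather than a bare FileNotFoundError.
        last_line = proc.stderr.strip().splitlines()[-1]
        return (
            f"child exited with code {proc.returncode} "
            f"before writing output: {last_line}"
        )
    return "output written"


with tempfile.TemporaryDirectory() as tmp:
    message = run_child(Path(tmp))
print(message)
```

With a check like this, the user would see the original exception from their callable or parameters rather than a missing script.out, which is the kind of better diagnostics suggested above.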
Re: [I] PythonVirtualEnv Operator expecting output dir 'script.out' causing DAG to fail [airflow]
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #34958:
URL: https://github.com/apache/airflow/issues/34958#issuecomment-1797064096
This issue has been closed because it has not received a response from the issue author.
Re: [I] PythonVirtualEnv Operator expecting output dir 'script.out' causing DAG to fail [airflow]
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #34958:
URL: https://github.com/apache/airflow/issues/34958#issuecomment-1786237147
This issue has been automatically marked as stale because it has been open for 14 days with no response from the author. It will be closed in the next 7 days if no further activity occurs from the issue author.
Re: [I] PythonVirtualEnv Operator expecting output dir 'script.out' causing DAG to fail [airflow]
Posted by "boring-cyborg[bot] (via GitHub)" <gi...@apache.org>.
boring-cyborg[bot] commented on issue #34958:
URL: https://github.com/apache/airflow/issues/34958#issuecomment-1763748353
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise a PR to address this issue, please do so; there is no need to wait for approval.