Posted to commits@airflow.apache.org by "mziwisky (via GitHub)" <gi...@apache.org> on 2023/09/07 05:45:16 UTC
[GitHub] [airflow] mziwisky opened a new issue, #34154: "@task.virtualenv" cannot appear in a comment inside a `@task.virtualenv` task
mziwisky opened a new issue, #34154:
URL: https://github.com/apache/airflow/issues/34154
### Apache Airflow version
Other Airflow 2 version (please specify below)
### What happened
On Airflow 2.5.1 (on AWS MWAA), I ran this DAG:
```python
from datetime import datetime, timedelta

from airflow.decorators import dag, task


@task.virtualenv(system_site_packages=True)
def test():
    print('This line gets printed')
    # @task.virtualenv
    print('This line does not')
    raise Exception("The task will succeed because this line won't run either")


@dag(
    start_date=datetime(2023, 9, 6),
    schedule="10 * * * *",
)
def bug_test():
    test()


the_dag = bug_test()
```
The `test` task runs and succeeds, but it only executes the code that precedes the line `# @task.virtualenv`. Everything after that line is ignored.
In general, as long as "@task.virtualenv" is in the comment, it will kill all lines after it. The comment could be `# we're inside an @task.virtualenv` or `# a@task.virtualenva`, same effect. If the comment does _not_ have exactly that string in it, e.g. `# we're inside a task.virtualenv` (no "@") or `# we're inside an @ task.virtualenv` then the lines after it get executed.
### What you think should happen instead
All of the code in the task should get executed. Here's a log dump for a run of that task:
```
[2023-09-07, 05:37:34 UTC] {{taskinstance.py:1083}} INFO - Dependencies all met for <TaskInstance: bug_test.test scheduled__2023-09-07T04:10:00+00:00 [queued]>
[2023-09-07, 05:37:34 UTC] {{taskinstance.py:1083}} INFO - Dependencies all met for <TaskInstance: bug_test.test scheduled__2023-09-07T04:10:00+00:00 [queued]>
[2023-09-07, 05:37:34 UTC] {{taskinstance.py:1279}} INFO -
--------------------------------------------------------------------------------
[2023-09-07, 05:37:34 UTC] {{taskinstance.py:1280}} INFO - Starting attempt 5 of 5
[2023-09-07, 05:37:34 UTC] {{taskinstance.py:1281}} INFO -
--------------------------------------------------------------------------------
[2023-09-07, 05:37:34 UTC] {{taskinstance.py:1300}} INFO - Executing <Task(_PythonVirtualenvDecoratedOperator): test> on 2023-09-07 04:10:00+00:00
[2023-09-07, 05:37:34 UTC] {{standard_task_runner.py:55}} INFO - Started process 10023 to run task
[2023-09-07, 05:37:34 UTC] {{standard_task_runner.py:82}} INFO - Running: ['airflow', 'tasks', 'run', 'bug_test', 'test', 'scheduled__2023-09-07T04:10:00+00:00', '--job-id', '2069253', '--raw', '--subdir', 'DAGS_FOLDER/bug.py', '--cfg-path', '/tmp/tmpz3vkh0z2']
[2023-09-07, 05:37:34 UTC] {{standard_task_runner.py:83}} INFO - Job 2069253: Subtask test
[2023-09-07, 05:37:34 UTC] {{task_command.py:388}} INFO - Running <TaskInstance: bug_test.test scheduled__2023-09-07T04:10:00+00:00 [running]> on host ip-10-42-8-187.us-west-2.compute.internal
[2023-09-07, 05:37:35 UTC] {{taskinstance.py:1507}} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=bug_test
AIRFLOW_CTX_TASK_ID=test
AIRFLOW_CTX_EXECUTION_DATE=2023-09-07T04:10:00+00:00
AIRFLOW_CTX_TRY_NUMBER=5
AIRFLOW_CTX_DAG_RUN_ID=scheduled__2023-09-07T04:10:00+00:00
[2023-09-07, 05:37:35 UTC] {{process_utils.py:179}} INFO - Executing cmd: /usr/bin/python3.10 -m virtualenv /tmp/venva253igp3 --system-site-packages
[2023-09-07, 05:37:35 UTC] {{process_utils.py:183}} INFO - Output:
[2023-09-07, 05:37:35 UTC] {{process_utils.py:187}} INFO - created virtual environment CPython3.10.8.final.0-64 in 387ms
[2023-09-07, 05:37:35 UTC] {{process_utils.py:187}} INFO - creator CPython3Posix(dest=/tmp/venva253igp3, clear=False, no_vcs_ignore=False, global=True)
[2023-09-07, 05:37:35 UTC] {{process_utils.py:187}} INFO - seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/usr/local/airflow/.local/share/virtualenv)
[2023-09-07, 05:37:35 UTC] {{process_utils.py:187}} INFO - added seed packages: pip==22.3.1, setuptools==65.6.3, wheel==0.38.4
[2023-09-07, 05:37:35 UTC] {{process_utils.py:187}} INFO - activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator
[2023-09-07, 05:37:36 UTC] {{process_utils.py:179}} INFO - Executing cmd: /tmp/venva253igp3/bin/pip install -r /tmp/venva253igp3/requirements.txt
[2023-09-07, 05:37:36 UTC] {{process_utils.py:183}} INFO - Output:
[2023-09-07, 05:37:43 UTC] {{logging_mixin.py:137}} WARNING - /usr/local/airflow/.local/lib/python3.10/site-packages/watchtower/__init__.py:349 WatchtowerWarning: Received empty message. Empty messages cannot be sent to CloudWatch Logs
[2023-09-07, 05:37:43 UTC] {{logging_mixin.py:137}} WARNING - Traceback (most recent call last):
[2023-09-07, 05:37:43 UTC] {{logging_mixin.py:137}} WARNING - File "/usr/local/airflow/config/cloudwatch_logging.py", line 161, in emit
self.sniff_errors(record)
[2023-09-07, 05:37:43 UTC] {{logging_mixin.py:137}} WARNING - File "/usr/local/airflow/config/cloudwatch_logging.py", line 212, in sniff_errors
if pattern.search(record.message):
[2023-09-07, 05:37:43 UTC] {{logging_mixin.py:137}} WARNING - AttributeError: 'LogRecord' object has no attribute 'message'
[2023-09-07, 05:37:43 UTC] {{process_utils.py:187}} INFO - [notice] A new release of pip available: 22.3.1 -> 23.2.1
[2023-09-07, 05:37:43 UTC] {{process_utils.py:187}} INFO - [notice] To update, run: /tmp/venva253igp3/bin/python -m pip install --upgrade pip
[2023-09-07, 05:37:43 UTC] {{process_utils.py:179}} INFO - Executing cmd: /tmp/venva253igp3/bin/python /tmp/venva253igp3/script.py /tmp/venva253igp3/script.in /tmp/venva253igp3/script.out /tmp/venva253igp3/string_args.txt
[2023-09-07, 05:37:43 UTC] {{process_utils.py:183}} INFO - Output:
[2023-09-07, 05:37:46 UTC] {{process_utils.py:187}} INFO - This line gets printed
[2023-09-07, 05:37:46 UTC] {{python.py:177}} INFO - Done. Returned value was: None
[2023-09-07, 05:37:46 UTC] {{taskinstance.py:1318}} INFO - Marking task as SUCCESS. dag_id=bug_test, task_id=test, execution_date=20230907T041000, start_date=20230907T053734, end_date=20230907T053746
[2023-09-07, 05:37:46 UTC] {{local_task_job.py:208}} INFO - Task exited with return code 0
```
### How to reproduce
Described above
### Operating System
Linux? it's AWS MWAA
### Versions of Apache Airflow Providers
_No response_
### Deployment
Amazon (AWS) MWAA
### Deployment details
_No response_
### Anything else
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [I] "@task.virtualenv" cannot appear in a comment inside a `@task.virtualenv` task [airflow]
Posted by "eladkal (via GitHub)" <gi...@apache.org>.
eladkal closed issue #34154: "@task.virtualenv" cannot appear in a comment inside a `@task.virtualenv` task
URL: https://github.com/apache/airflow/issues/34154
[GitHub] [airflow] boring-cyborg[bot] commented on issue #34154: "@task.virtualenv" cannot appear in a comment inside a `@task.virtualenv` task
Posted by "boring-cyborg[bot] (via GitHub)" <gi...@apache.org>.
boring-cyborg[bot] commented on issue #34154:
URL: https://github.com/apache/airflow/issues/34154#issuecomment-1709512256
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.
[GitHub] [airflow] SamWheating commented on issue #34154: "@task.virtualenv" cannot appear in a comment inside a `@task.virtualenv` task
Posted by "SamWheating (via GitHub)" <gi...@apache.org>.
SamWheating commented on issue #34154:
URL: https://github.com/apache/airflow/issues/34154#issuecomment-1721651606
I was able to replicate this on `main`. This is a really funny bug, and it appears to stem from this section of code:
https://github.com/apache/airflow/blob/827962878e6fb39e014639d83cff7d0881595ecb/airflow/utils/decorators.py#L59-L81
If the decorator name appears anywhere else in the task's source, the `split` operation returns more than two items, and everything after the second occurrence of the decorator name is dropped.
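To make the failure mode concrete, here is a minimal sketch of what splitting source code on a decorator name does when the name shows up a second time. `strip_decorator_naive` is a hypothetical stand-in written for illustration, not the Airflow function linked above, and it uses a bare decorator so that no parenthesis handling is needed:

```python
def strip_decorator_naive(source: str, decorator_name: str) -> str:
    """Hypothetical stand-in: drop a decorator by splitting on its name."""
    if decorator_name not in source:
        return source
    parts = source.split(decorator_name)
    # Only the first two pieces are kept; parts[2:] is silently discarded.
    before, after = parts[0], parts[1]
    if after.startswith("\n"):
        after = after[1:]
    return before + after


source = (
    "@task.virtualenv\n"
    "def test():\n"
    "    print('This line gets printed')\n"
    "    # @task.virtualenv\n"
    "    print('This line does not')\n"
)

stripped = strip_decorator_naive(source, "@task.virtualenv")
print(stripped)
```

The returned source ends at the comment marker, so the second `print` never reaches the generated script, yet what remains is still valid Python and runs without error.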
Knowing this, we can replicate the issue in other ways too - this task will also succeed:
```python
def test_virtualenv():
    print('This line gets printed')
    # @setup @setup
    print('This line does not')
    raise Exception("The task will succeed because this line won't run either")
```
We probably shouldn't be doing string manipulation on code anyways, could we maybe use some sort of actual AST parsing for this?
Anyways, I will keep investigating, feel free to assign to me
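As a rough illustration of the AST idea, the sketch below uses the standard-library `ast` module to find where a function's decorators sit and deletes only those lines, leaving comments in the body untouched. `strip_decorators_ast` is a hypothetical helper, not an existing Airflow API, and it assumes the target is a top-level function whose decorators occupy whole lines:

```python
import ast


def strip_decorators_ast(source: str, func_name: str) -> str:
    """Hypothetical helper: delete the decorator lines of one top-level function."""
    tree = ast.parse(source)
    lines = source.splitlines(keepends=True)
    drop = set()
    for node in tree.body:
        is_func = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_func and node.name == func_name:
            for dec in node.decorator_list:
                # lineno/end_lineno are 1-based and span the decorator expression.
                drop.update(range(dec.lineno, dec.end_lineno + 1))
    return "".join(
        line for number, line in enumerate(lines, start=1) if number not in drop
    )


source = (
    "@task.virtualenv(system_site_packages=True)\n"
    "def test():\n"
    "    print('This line gets printed')\n"
    "    # @task.virtualenv\n"
    "    print('This line is kept too')\n"
)

cleaned = strip_decorators_ast(source, "test")
print(cleaned)
```

Because the parser only reports real decorators, a decorator name inside a comment or string is never treated as one. Working on line ranges rather than calling `ast.unparse` also keeps the original comments and formatting in the output.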