You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/01/25 14:51:34 UTC
[GitHub] [airflow] kaxil opened a new pull request #13893: Fix race condition when using Dynamic DAGs
kaxil opened a new pull request #13893:
URL: https://github.com/apache/airflow/pull/13893
closes https://github.com/apache/airflow/issues/13504
Currently the DagFileProcessor parses the DAG files, writes it to the
`dag` table and then writes DAGs to `serialized_dag` table.
At the same time, the scheduler loop is constantly looking for the next
DAGs to process based on ``next_dagrun_create_after`` column of the DAG
table.
It might happen that as soon as the DagFileProcessor writes DAG to `dag`
table, the scheduling loop in the Scheduler picks up the DAG for preocessing.
However, as the DagFileProcessor has not written to serialized DAG table yet
the scheduler will error with "Serialized Dag not Found" error.
This would mainly happen when the DAGs are dynamic where the result of one DAG,
creates multiple DAGs.
This commit changes the order of writing DAG and Serialized DAG and hence
before a DAG is written to `dag` table it will be written to `serialized_dag` table.
<!--
Thank you for contributing! Please make sure that your code changes
are covered with tests. And in case of new features or big changes
remember to adjust the documentation.
Feel free to ping committers for the review!
In case of existing issue, reference it using one of the following:
closes: #ISSUE
related: #ISSUE
How to write a good git commit message:
http://chris.beams.io/posts/git-commit/
-->
---
**^ Add meaningful description above**
Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] kaxil commented on pull request #13893: Fix race condition when using Dynamic DAGs
Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-935740190
@AramisN There are many reasons for getting "not found in serialized_dag table" -- you might be facing a different error. The PR fixed breaking the scheduler, and from the stacktrace you are seeing this error on webserver
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] nik-davis commented on pull request #13893: Fix race condition when using Dynamic DAGs
Posted by GitBox <gi...@apache.org>.
nik-davis commented on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-767403524
Thanks @kaxil for the fix, we'll try that out today and let you know if there's any issues.
I haven't been able to reproduce the error locally yet - I imagine with a local DB connection the DAG serialization is happening too fast to trigger the race condition. I'm wondering if adding a large file import to one of the operators could artificially slow it down a bit, but I don't know enough about how the serialization works to know for sure.
I've only managed to get the error locally by removing a row from the serialized_dag table. It looks like the try/excepts you've added would handle this though as well.
Here's the script I was toying with locally anyway to try and trigger the error, in case that's of use (increase the value in the range function to generate more DAGs):
```
from airflow.decorators import dag, task
from airflow.utils.dates import days_ago
def create_dag(dag_id):
@dag(
default_args={"owner": "airflow", "start_date": days_ago(1)},
schedule_interval="@once",
dag_id=dag_id,
catchup=False,
)
def dynamic_dag():
@task()
def dynamic_task_1(**kwargs):
return 1
task_1 = dynamic_task_1()
task_2 = dynamic_task_1(task_1)
return dynamic_dag()
for i in range(2):
dag_id = f"dynamic_dag_{i}"
globals()[dag_id] = create_dag(dag_id)
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] nik-davis commented on pull request #13893: Fix race condition when using Dynamic DAGs
Posted by GitBox <gi...@apache.org>.
nik-davis commented on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-773170011
@tooptoop4 I haven't seen the issue since using the fix - looks like it has worked
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] fjmacagno edited a comment on pull request #13893: Fix race condition when using Dynamic DAGs
Posted by GitBox <gi...@apache.org>.
fjmacagno edited a comment on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-774353483
I am still seeing a lot of errors in 2.0.1rc2 like
```
airflow.exceptions.SerializedDagNotFound: DAG 'targeting_pipeline_dappack.shard_3' not found in serialized_
dag table
```
Is this not part of rc2? Or is it likely a different problem?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] tooptoop4 commented on pull request #13893: Fix race condition when using Dynamic DAGs
Posted by GitBox <gi...@apache.org>.
tooptoop4 commented on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-772811037
did it work @nik-davis ?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] github-actions[bot] commented on pull request #13893: Fix race condition when using Dynamic DAGs
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-766959587
The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest master at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] AramisN commented on pull request #13893: Fix race condition when using Dynamic DAGs
Posted by GitBox <gi...@apache.org>.
AramisN commented on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-935694176
Said its been fixed in 2.0.x but i see it again in 2.1.x :|
Something bad has happened.
Please consider letting us know by creating a bug report using GitHub.
Python version: 3.8.5
Airflow version: 2.1.1
Node: f723608ad587
-------------------------------------------------------------------------------
Traceback (most recent call last):
File "/root/miniconda3/lib/python3.8/site-packages/flask/app.py", line 2447, in wsgi_app
response = self.full_dispatch_request()
File "/root/miniconda3/lib/python3.8/site-packages/flask/app.py", line 1952, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/root/miniconda3/lib/python3.8/site-packages/flask/app.py", line 1821, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/root/miniconda3/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
raise value
File "/root/miniconda3/lib/python3.8/site-packages/flask/app.py", line 1950, in full_dispatch_request
rv = self.dispatch_request()
File "/root/miniconda3/lib/python3.8/site-packages/flask/app.py", line 1936, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/root/miniconda3/lib/python3.8/site-packages/airflow/www/auth.py", line 34, in decorated
return func(*args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/airflow/www/decorators.py", line 97, in view_func
return f(*args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/airflow/www/decorators.py", line 60, in wrapper
return f(*args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/airflow/utils/session.py", line 70, in wrapper
return func(*args, session=session, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/airflow/www/views.py", line 2170, in graph
dag = current_app.dag_bag.get_dag(dag_id)
File "/root/miniconda3/lib/python3.8/site-packages/airflow/utils/session.py", line 70, in wrapper
return func(*args, session=session, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/airflow/models/dagbag.py", line 186, in get_dag
self._add_dag_from_db(dag_id=dag_id, session=session)
File "/root/miniconda3/lib/python3.8/site-packages/airflow/models/dagbag.py", line 258, in _add_dag_from_db
raise SerializedDagNotFound(f"DAG '{dag_id}' not found in serialized_dag table")
airflow.exceptions.SerializedDagNotFound: DAG 'DLS_SYS_REGION' not found in serialized_dag table
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] nik-davis commented on pull request #13893: Fix race condition when using Dynamic DAGs
Posted by GitBox <gi...@apache.org>.
nik-davis commented on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-773170011
@tooptoop4 I haven't seen the issue since using the fix - looks like it has worked
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] fjmacagno commented on pull request #13893: Fix race condition when using Dynamic DAGs
Posted by GitBox <gi...@apache.org>.
fjmacagno commented on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-774353483
I am still seeing a lot of errors in 2.0.0rc2 like
```
airflow.exceptions.SerializedDagNotFound: DAG 'targeting_pipeline_dappack.shard_3' not found in serialized_
dag table
```
Is this not part of rc2? Or is it likely a different problem?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] github-actions[bot] commented on pull request #13893: Fix race condition when using Dynamic DAGs
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-766959587
The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest master at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] kaxil merged pull request #13893: Fix race condition when using Dynamic DAGs
Posted by GitBox <gi...@apache.org>.
kaxil merged pull request #13893:
URL: https://github.com/apache/airflow/pull/13893
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] kaxil edited a comment on pull request #13893: Fix race condition when using Dynamic DAGs
Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-774374643
> I am still seeing a lot of errors in 2.0.0rc2 like
>
> ```
> airflow.exceptions.SerializedDagNotFound: DAG 'targeting_pipeline_dappack.shard_3' not found in serialized_
> dag table
> ```
>
> Is this not part of rc2? Or is it likely a different problem?
This is included in 2.0.1rc2 (not 2.0.0rc2) -- and yeah your problem might be different, difficult to tell without knowing details
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] kaxil merged pull request #13893: Fix race condition when using Dynamic DAGs
Posted by GitBox <gi...@apache.org>.
kaxil merged pull request #13893:
URL: https://github.com/apache/airflow/pull/13893
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] fjmacagno commented on pull request #13893: Fix race condition when using Dynamic DAGs
Posted by GitBox <gi...@apache.org>.
fjmacagno commented on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-775310886
Damn, oh well. I will put in a new issue.
(I meant 2.0.1, typo)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] kaxil commented on pull request #13893: Fix race condition when using Dynamic DAGs
Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-774374643
> I am still seeing a lot of errors in 2.0.0rc2 like
>
> ```
> airflow.exceptions.SerializedDagNotFound: DAG 'targeting_pipeline_dappack.shard_3' not found in serialized_
> dag table
> ```
>
> Is this not part of rc2? Or is it likely a different problem?
This is included in 2.0.1rc2 -- and yeah your problem might be different, difficult to tell without knowing details
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] kaxil edited a comment on pull request #13893: Fix race condition when using Dynamic DAGs
Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-935740190
@AramisN There are many reasons for getting "not found in serialized_dag table" -- you might be facing a different error. The PR fixed breaking the scheduler, and from the stacktrace you are seeing this error on webserver.
Please create an new issue with steps to reproduce
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] kaxil edited a comment on pull request #13893: Fix race condition when using Dynamic DAGs
Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-935740190
@AramisN There are many reasons for getting "not found in serialized_dag table" -- you might be facing a different error. The PR fixed breaking the scheduler, and from the stacktrace you are seeing this error on webserver.
Please create a new issue with steps to reproduce
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org