You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/01/25 14:51:34 UTC

[GitHub] [airflow] kaxil opened a new pull request #13893: Fix race condition when using Dynamic DAGs

kaxil opened a new pull request #13893:
URL: https://github.com/apache/airflow/pull/13893


   closes https://github.com/apache/airflow/issues/13504
   
   Currently the DagFileProcessor parses the DAG files, writes it to the
   `dag` table and then writes DAGs to `serialized_dag` table.
   
   At the same time, the scheduler loop is constantly looking for the next
   DAGs to process based on ``next_dagrun_create_after`` column of the DAG
   table.
   
   It might happen that as soon as the DagFileProcessor writes DAG to `dag`
   table, the scheduling loop in the Scheduler picks up the DAG for preocessing.
   However, as the DagFileProcessor has not written to serialized DAG table yet
   the scheduler will error with "Serialized Dag not Found" error.
   
   This would mainly happen when the DAGs are dynamic where the result of one DAG,
   creates multiple DAGs.
   
   This commit changes the order of writing DAG and Serialized DAG and hence
   before a DAG is written to `dag` table it will be written to `serialized_dag` table.
   
   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #13893: Fix race condition when using Dynamic DAGs

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-935740190


   @AramisN There are many reasons for getting "not found in serialized_dag table" -- you might be facing a different error. The PR fixed breaking the scheduler, and from the stacktrace you are seeing this error on webserver


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] nik-davis commented on pull request #13893: Fix race condition when using Dynamic DAGs

Posted by GitBox <gi...@apache.org>.
nik-davis commented on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-767403524


   Thanks @kaxil for the fix, we'll try that out today and let you know if there's any issues.
   
   I haven't been able to reproduce the error locally yet - I imagine with a local DB connection the DAG serialization is happening too fast to trigger the race condition. I'm wondering if adding a large file import to one of the operators could artificially slow it down a bit, but I don't know enough about how the serialization works to know for sure.
   
   I've only managed to get the error locally by removing a row from the serialized_dag table. It looks like the try/excepts you've added would handle this though as well.
   
   Here's the script I was toying with locally anyway to try and trigger the error, in case that's of use (increase the value in the range function to generate more DAGs):
   
   ```
   from airflow.decorators import dag, task
   from airflow.utils.dates import days_ago
   
   
   def create_dag(dag_id):
       @dag(
           default_args={"owner": "airflow", "start_date": days_ago(1)},
           schedule_interval="@once",
           dag_id=dag_id,
           catchup=False,
       )
       def dynamic_dag():
           @task()
           def dynamic_task_1(**kwargs):
               return 1
   
           task_1 = dynamic_task_1()
           task_2 = dynamic_task_1(task_1)
   
       return dynamic_dag()
   
   
   for i in range(2):
       dag_id = f"dynamic_dag_{i}"
       globals()[dag_id] = create_dag(dag_id)
   
   ```
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] nik-davis commented on pull request #13893: Fix race condition when using Dynamic DAGs

Posted by GitBox <gi...@apache.org>.
nik-davis commented on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-773170011


   @tooptoop4 I haven't seen the issue since using the fix - looks like it has worked


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] fjmacagno edited a comment on pull request #13893: Fix race condition when using Dynamic DAGs

Posted by GitBox <gi...@apache.org>.
fjmacagno edited a comment on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-774353483


   I am still seeing a lot of errors in 2.0.1rc2 like
   ```
   airflow.exceptions.SerializedDagNotFound: DAG 'targeting_pipeline_dappack.shard_3' not found in serialized_
   dag table
   ```
   Is this not part of rc2? Or is it likely a different problem?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] tooptoop4 commented on pull request #13893: Fix race condition when using Dynamic DAGs

Posted by GitBox <gi...@apache.org>.
tooptoop4 commented on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-772811037


   did it work @nik-davis ?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #13893: Fix race condition when using Dynamic DAGs

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-766959587


   The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest master at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] AramisN commented on pull request #13893: Fix race condition when using Dynamic DAGs

Posted by GitBox <gi...@apache.org>.
AramisN commented on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-935694176


   Said its been fixed in 2.0.x but i see it again in 2.1.x :|
   
   Something bad has happened.
   Please consider letting us know by creating a bug report using GitHub.
   
   Python version: 3.8.5
   Airflow version: 2.1.1
   Node: f723608ad587
   -------------------------------------------------------------------------------
   Traceback (most recent call last):
     File "/root/miniconda3/lib/python3.8/site-packages/flask/app.py", line 2447, in wsgi_app
       response = self.full_dispatch_request()
     File "/root/miniconda3/lib/python3.8/site-packages/flask/app.py", line 1952, in full_dispatch_request
       rv = self.handle_user_exception(e)
     File "/root/miniconda3/lib/python3.8/site-packages/flask/app.py", line 1821, in handle_user_exception
       reraise(exc_type, exc_value, tb)
     File "/root/miniconda3/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
       raise value
     File "/root/miniconda3/lib/python3.8/site-packages/flask/app.py", line 1950, in full_dispatch_request
       rv = self.dispatch_request()
     File "/root/miniconda3/lib/python3.8/site-packages/flask/app.py", line 1936, in dispatch_request
       return self.view_functions[rule.endpoint](**req.view_args)
     File "/root/miniconda3/lib/python3.8/site-packages/airflow/www/auth.py", line 34, in decorated
       return func(*args, **kwargs)
     File "/root/miniconda3/lib/python3.8/site-packages/airflow/www/decorators.py", line 97, in view_func
       return f(*args, **kwargs)
     File "/root/miniconda3/lib/python3.8/site-packages/airflow/www/decorators.py", line 60, in wrapper
       return f(*args, **kwargs)
     File "/root/miniconda3/lib/python3.8/site-packages/airflow/utils/session.py", line 70, in wrapper
       return func(*args, session=session, **kwargs)
     File "/root/miniconda3/lib/python3.8/site-packages/airflow/www/views.py", line 2170, in graph
       dag = current_app.dag_bag.get_dag(dag_id)
     File "/root/miniconda3/lib/python3.8/site-packages/airflow/utils/session.py", line 70, in wrapper
       return func(*args, session=session, **kwargs)
     File "/root/miniconda3/lib/python3.8/site-packages/airflow/models/dagbag.py", line 186, in get_dag
       self._add_dag_from_db(dag_id=dag_id, session=session)
     File "/root/miniconda3/lib/python3.8/site-packages/airflow/models/dagbag.py", line 258, in _add_dag_from_db
       raise SerializedDagNotFound(f"DAG '{dag_id}' not found in serialized_dag table")
   airflow.exceptions.SerializedDagNotFound: DAG 'DLS_SYS_REGION' not found in serialized_dag table


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] nik-davis commented on pull request #13893: Fix race condition when using Dynamic DAGs

Posted by GitBox <gi...@apache.org>.
nik-davis commented on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-773170011


   @tooptoop4 I haven't seen the issue since using the fix - looks like it has worked


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] fjmacagno commented on pull request #13893: Fix race condition when using Dynamic DAGs

Posted by GitBox <gi...@apache.org>.
fjmacagno commented on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-774353483


   I am still seeing a lot of errors in 2.0.0rc2 like
   ```
   airflow.exceptions.SerializedDagNotFound: DAG 'targeting_pipeline_dappack.shard_3' not found in serialized_
   dag table
   ```
   Is this not part of rc2? Or is it likely a different problem?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #13893: Fix race condition when using Dynamic DAGs

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-766959587


   The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest master at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil merged pull request #13893: Fix race condition when using Dynamic DAGs

Posted by GitBox <gi...@apache.org>.
kaxil merged pull request #13893:
URL: https://github.com/apache/airflow/pull/13893


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil edited a comment on pull request #13893: Fix race condition when using Dynamic DAGs

Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-774374643


   > I am still seeing a lot of errors in 2.0.0rc2 like
   > 
   > ```
   > airflow.exceptions.SerializedDagNotFound: DAG 'targeting_pipeline_dappack.shard_3' not found in serialized_
   > dag table
   > ```
   > 
   > Is this not part of rc2? Or is it likely a different problem?
   
   This is included in 2.0.1rc2 (not 2.0.0rc2) -- and yeah your problem might be different, difficult to tell without knowing details


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil merged pull request #13893: Fix race condition when using Dynamic DAGs

Posted by GitBox <gi...@apache.org>.
kaxil merged pull request #13893:
URL: https://github.com/apache/airflow/pull/13893


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] fjmacagno commented on pull request #13893: Fix race condition when using Dynamic DAGs

Posted by GitBox <gi...@apache.org>.
fjmacagno commented on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-775310886


   Damn, oh well. I will put in a new issue.
   
   (I meant 2.0.1, typo)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #13893: Fix race condition when using Dynamic DAGs

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-774374643


   > I am still seeing a lot of errors in 2.0.0rc2 like
   > 
   > ```
   > airflow.exceptions.SerializedDagNotFound: DAG 'targeting_pipeline_dappack.shard_3' not found in serialized_
   > dag table
   > ```
   > 
   > Is this not part of rc2? Or is it likely a different problem?
   
   This is included in 2.0.1rc2 -- and yeah your problem might be different, difficult to tell without knowing details


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil edited a comment on pull request #13893: Fix race condition when using Dynamic DAGs

Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-935740190


   @AramisN There are many reasons for getting "not found in serialized_dag table" -- you might be facing a different error. The PR fixed breaking the scheduler, and from the stacktrace you are seeing this error on webserver.
   
   Please create an new issue with steps to reproduce


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil edited a comment on pull request #13893: Fix race condition when using Dynamic DAGs

Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on pull request #13893:
URL: https://github.com/apache/airflow/pull/13893#issuecomment-935740190


   @AramisN There are many reasons for getting "not found in serialized_dag table" -- you might be facing a different error. The PR fixed breaking the scheduler, and from the stacktrace you are seeing this error on webserver.
   
   Please create a new issue with steps to reproduce


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org