Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/01/10 10:35:43 UTC

[GitHub] [airflow] ashb commented on a change in pull request #20349: Fix Scheduler crash when executing task instances of missing DAG

ashb commented on a change in pull request #20349:
URL: https://github.com/apache/airflow/pull/20349#discussion_r781064291



##########
File path: airflow/jobs/scheduler_job.py
##########
@@ -402,6 +402,14 @@ def _executable_task_instances_to_queued(self, max_tis: int, session: Session =
                     # Many dags don't have a task_concurrency, so where we can avoid loading the full
                     # serialized DAG the better.
                     serialized_dag = self.dagbag.get_dag(dag_id, session=session)
+                    # If the dag is missing, continue to the next task.
+                    if not serialized_dag:
+                        self.log.error(
+                            "DAG '%s' for taskinstance %s not found in serialized_dag table",
+                            dag_id,
+                            task_instance,
+                        )
+                        continue

Review comment:
       This error/reproduction step is not quite right, but the same idea can trigger this behaviour -- if the DAG is deleted at the "right" time, this part of the scheduler will fail.
   
   In that case, though, I think we should fail the task instances: the DAG no longer exists and, as TP said, they can't run successfully.
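
To make the suggestion concrete, here is a minimal, self-contained sketch of "fail instead of skip". It is not the scheduler's code: `FakeTaskInstance`, `queue_or_fail`, and the plain dict standing in for the DagBag are hypothetical names used only for illustration, and the state strings are simplified placeholders.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FakeTaskInstance:
    """Hypothetical stand-in for a task instance (not Airflow's TaskInstance)."""

    dag_id: str
    task_id: str
    state: str = "scheduled"


def queue_or_fail(
    task_instances: List[FakeTaskInstance],
    known_dags: Dict[str, object],
) -> List[FakeTaskInstance]:
    """Queue task instances whose DAG exists; fail those whose DAG is gone.

    `known_dags` is an assumption standing in for the serialized DAG lookup.
    """
    queued = []
    for ti in task_instances:
        dag = known_dags.get(ti.dag_id)
        if dag is None:
            # The DAG was deleted, so this task instance can never run:
            # mark it failed rather than leaving it to be retried forever.
            ti.state = "failed"
            continue
        ti.state = "queued"
        queued.append(ti)
    return queued


tis = [FakeTaskInstance("present", "t1"), FakeTaskInstance("deleted", "t2")]
queued = queue_or_fail(tis, {"present": object()})
```

The point of the sketch is only the control flow: a task instance whose DAG is missing ends up in a terminal state instead of being logged and picked up again on the next scheduler loop. In the real scheduler the state change would presumably be a database update in the same session.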



