Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/01/05 10:57:00 UTC

[GitHub] [airflow] ashb commented on a change in pull request #13433: Schedule tasks of cleared dags

ashb commented on a change in pull request #13433:
URL: https://github.com/apache/airflow/pull/13433#discussion_r551856920



##########
File path: airflow/jobs/scheduler_job.py
##########
@@ -1483,37 +1483,46 @@ def _do_scheduling(self, session) -> int:
             # Bulk fetch the currently active dag runs for the dags we are
             # examining, rather than making one query per DagRun
 
-            # TODO: This query is probably horribly inefficient (though there is an
-            # index on (dag_id,state)). It is to deal with the case when a user
+            # This query will violate max_active_runs by exactly one if tasks in
+            # max_active_runs or more DAGs are cleared while another DAG is
+            # running. It is to deal with the case when a user
             # clears more than max_active_runs older tasks -- we don't want the
             # scheduler to suddenly go and start running tasks from all of the
             # runs. (AIRFLOW-137/GH #1442)
-            #
-            # The longer term fix would be to have `clear` do this, and put DagRuns
-            # in to the queued state, then take DRs out of queued before creating
-            # any new ones

Review comment:
       Please keep this comment



