Posted to commits@airflow.apache.org by "t oo (Jira)" <ji...@apache.org> on 2020/02/26 23:25:00 UTC
[jira] [Created] (AIRFLOW-6934) max_active_runs from different dag in dagbag stopping any task from running
t oo created AIRFLOW-6934:
-----------------------------
Summary: max_active_runs from different dag in dagbag stopping any task from running
Key: AIRFLOW-6934
URL: https://issues.apache.org/jira/browse/AIRFLOW-6934
Project: Apache Airflow
Issue Type: Bug
Components: scheduler
Affects Versions: 1.10.7
Reporter: t oo
I have one .py file that creates multiple DAG ids (it is a dynamic DAG generator, so 25 different DAG ids are created, including dagA and dagB). I have max_active_runs_per_dag = 5 in airflow.cfg. I then used the airflow CLI's trigger_dag command to trigger dagA for 7 different execution dates in parallel and dagB for 4 different execution dates in parallel.
In the UI, dagA showed red in the schedule column. There were tasks in the scheduled and queued states in both dagA and dagB, but no tasks in the running state (even over the last 3 hours!). The scheduler was still up, though, and was running tasks from dagC (which is created from a different .py file than the one that creates dagA and dagB). I noticed this message printed frequently in the scheduler logs: "Number of active dag runs reached max_active_run."
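For anyone trying to reproduce this, the trigger step above can be scripted roughly as follows. This is only a sketch: the execution dates (consecutive days starting 2020-01-01) and the helper name trigger_commands are hypothetical, since the report does not give the actual dates, and the script only prints the command lines rather than running them.

```python
from datetime import date, timedelta


def trigger_commands(dag_id, n_runs, start=date(2020, 1, 1)):
    """Build one `airflow trigger_dag` command line per execution date.

    The start date is a made-up placeholder, not taken from the report.
    """
    return [
        "airflow trigger_dag -e {} {}".format(
            (start + timedelta(days=i)).isoformat(), dag_id
        )
        for i in range(n_runs)
    ]


# 7 runs for dagA and 4 runs for dagB, as in the report.
commands = trigger_commands("dagA", 7) + trigger_commands("dagB", 4)
for cmd in commands:
    print(cmd)
```

Each printed line would then be launched as its own process to get the "in parallel" behaviour described above.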
From tracing the code I think this is what happens:
_process_file (https://github.com/apache/airflow/blob/1.10.7/airflow/jobs/scheduler_job.py#L1512-L1588) runs per .py file, so it covers many different DAG ids
it calls _process_dags
for each DAG id from that .py file, _process_dags calls _process_task_instances
_process_task_instances keeps a running collection (active_dag_runs) that grows for each dag being iterated over, and it breaks out of the loop (the loop that appends ids to a list) once the count exceeds max_active_runs_per_dag (from airflow.cfg). I couldn't see where task_instances_list gets used, though
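The loop described in the last step can be modelled roughly as below. This is a toy sketch and not the Airflow source: Run and collect_active_runs are hypothetical names, and the exact comparison used for the cap (>= here) is an assumption. It only illustrates the "append until the cap is hit, then break" pattern; whether the real scheduler carries this state across DAG ids from the same .py file is the open question in this report.

```python
from collections import namedtuple

# Hypothetical stand-in for a DAG run; not Airflow's DagRun model.
Run = namedtuple("Run", ["dag_id", "execution_date"])


def collect_active_runs(runs, max_active_runs):
    """Gather runs into a list and stop once the cap is reached."""
    active_dag_runs = []
    for run in runs:
        # Assumed comparison; the report only says the loop breaks at the cap.
        if len(active_dag_runs) >= max_active_runs:
            print("Number of active dag runs reached max_active_run.")
            break
        active_dag_runs.append(run)
    return active_dag_runs


dag_a_runs = [Run("dagA", day) for day in range(7)]
dag_b_runs = [Run("dagB", day) for day in range(4)]

picked_a = collect_active_runs(dag_a_runs, max_active_runs=5)
picked_b = collect_active_runs(dag_b_runs, max_active_runs=5)
print(len(picked_a), len(picked_b))
```

In this model the cap is applied separately on each call, so dagA's 7 runs are cut to 5 while dagB's 4 runs all pass. That is the behaviour one would expect if the limit were truly per DAG.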
I'm using LocalExecutor on v1.10.7.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)