Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/08/16 17:53:07 UTC

[GitHub] [airflow] dhuang opened a new issue #17638: Airflow stops scheduling DAG Runs after max_active_runs hit once

dhuang opened a new issue #17638:
URL: https://github.com/apache/airflow/issues/17638


   **Apache Airflow version**: 2.1.2.
   
   **OS**: Debian.
   
   **Apache Airflow Provider versions**: Probably not relevant.
   
   **Deployment**: Single scheduler instance since we're on MySQL 5.7.
   
   **What happened**: 
   Since updating from 1.10.15 to 2.1.2, we started noticing that a small subset of DAGs (roughly 30 of ~5000) would no longer get new DAG Runs scheduled, while the rest worked perfectly fine. We've been able to trigger manual runs of these DAGs with no issues and found no other errors or warnings in any logs. When restarting the scheduler, we'd sometimes see the next interval get scheduled, but the DAG would get stuck again after that first new run.
   
   After some investigation, I noticed that these stuck DAGs had several things in common: their `next_dagrun_create_after` was `NULL`, they were set to `max_active_runs=1`, and they were more often on shorter intervals (every 5-15 min, though some were daily). The DAGs are otherwise all different and are a mix of static and dynamic.
   
   **What you expected to happen**: 
   Digging into the new scheduler logic, I saw that `next_dagrun_create_after` is set to `NULL` when `max_active_runs` is reached in https://github.com/apache/airflow/blob/2.1.2/airflow/models/dag.py#L2304. I think the filter in https://github.com/apache/airflow/blob/2.1.2/airflow/models/dag.py#L2276 then prevents the DAG from being considered for scheduling until `next_dagrun_create_after` is set to a non-null value again.
   
   I think that is supposed to happen in https://github.com/apache/airflow/blob/2.1.2/airflow/models/dag.py#L229, but `next_dagrun_create_after` remains stuck at `NULL` after all pending runs are complete. By querying the database directly, I verified that `max_active_runs` is indeed not met once the prior DAG run finishes, and I saw no "DAG %s is at (or above) max_active_runs (%d of %d), not creating any more runs" log message. If I update `next_dagrun_create_after` manually, a run is scheduled right away, but the DAG then gets stuck again after that run.
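   
   To make the suspected sequence concrete, here is a minimal toy model in plain Python. It is only a sketch of the behaviour described above, not Airflow's code: `ToyDag`, `dags_needing_runs`, `create_run`, `finish_run`, and `simulate` are hypothetical names, timestamps are plain integers, and `None` stands in for SQL `NULL`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyDag:
    """Hypothetical stand-in for one DAG row; this is not Airflow's DagModel."""
    max_active_runs: int
    active_runs: int = 0
    next_dagrun_create_after: Optional[int] = 0  # None models SQL NULL


def dags_needing_runs(dags: List[ToyDag], now: int) -> List[ToyDag]:
    """Model of the filter: a NULL marker never matches, so the DAG is skipped."""
    return [
        d for d in dags
        if d.next_dagrun_create_after is not None
        and d.next_dagrun_create_after <= now
    ]


def create_run(dag: ToyDag, now: int) -> None:
    """Model of run creation: once the cap is reached, the marker is cleared."""
    dag.active_runs += 1
    if dag.active_runs >= dag.max_active_runs:
        dag.next_dagrun_create_after = None
    else:
        dag.next_dagrun_create_after = now + 1


def finish_run(dag: ToyDag, now: int, restore_marker: bool) -> None:
    """Model of run completion; restore_marker=False mimics the reported symptom."""
    dag.active_runs -= 1
    if restore_marker:
        dag.next_dagrun_create_after = now


def simulate(restore_marker: bool, ticks: int = 6) -> int:
    """Run the toy scheduler loop and return how many runs were created."""
    dag = ToyDag(max_active_runs=1)
    created = 0
    for now in range(ticks):
        # Assume every run finishes before the next scheduler tick.
        if dag.active_runs:
            finish_run(dag, now, restore_marker)
        for d in dags_needing_runs([dag], now):
            create_run(d, now)
            created += 1
    return created


if __name__ == "__main__":
    print("marker restored on completion:", simulate(restore_marker=True))
    print("marker left NULL on completion:", simulate(restore_marker=False))
```

   In this model, leaving the marker at `None` after the run completes yields exactly one run and then a permanent stall, while restoring it yields one run per tick. Setting the marker by hand between ticks corresponds to the manual update described above.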
   
   We can work around this by removing `max_active_runs`, which returns all scheduling to normal, but that obviously removes the desired cap.
   
   **How to reproduce it**: The shortest way is probably to create a DAG with `max_active_runs=1`, `schedule_interval="0 */1 * * *"`, and a `BashOperator` task that sleeps for 5 min.
   
   **Anything else we need to know**: Nothing else comes to mind.
   
   **Are you willing to submit a PR?** Yes.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil closed issue #17638: Airflow stops scheduling DAG Runs after max_active_runs hit once

Posted by GitBox <gi...@apache.org>.
kaxil closed issue #17638:
URL: https://github.com/apache/airflow/issues/17638


   





[GitHub] [airflow] uranusjr commented on issue #17638: Airflow stops scheduling DAG Runs after max_active_runs hit once

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #17638:
URL: https://github.com/apache/airflow/issues/17638#issuecomment-900891477


   Sounds similar to #14205?





[GitHub] [airflow] kaxil commented on issue #17638: Airflow stops scheduling DAG Runs after max_active_runs hit once

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #17638:
URL: https://github.com/apache/airflow/issues/17638#issuecomment-922144179


   Fixed by https://github.com/apache/airflow/pull/17945





[GitHub] [airflow] boring-cyborg[bot] commented on issue #17638: Airflow stops scheduling DAG Runs after max_active_runs hit once

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #17638:
URL: https://github.com/apache/airflow/issues/17638#issuecomment-899702015


   Thanks for opening your first issue here! Be sure to follow the issue template!
   

