Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/12/18 13:14:59 UTC

[GitHub] [airflow] RikHeijdens opened a new issue #13151: Task Instances in the "removed" state prevent the scheduler from scheduling new tasks when max_active_runs is set

RikHeijdens opened a new issue #13151:
URL: https://github.com/apache/airflow/issues/13151


   **Apache Airflow version**: 2.0.0
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
   **Environment**:
   
   - **OS** (e.g. from /etc/os-release): Debian GNU/Linux 10 (buster)
   - **Kernel** (e.g. `uname -a`): Linux 6ae65b86e112 5.4.0-52-generic #57-Ubuntu SMP Thu Oct 15 10:57:00 UTC 2020 x86_64 GNU/Linux
   - **Others**: Python 3.8
   
   **What happened**:
   
   After migrating one of our development Airflow instances from 1.10.14 to 2.0.0, the scheduler started to refuse to schedule tasks for a DAG that did not actually exceed its `max_active_runs`.
   
   When it did this, the following error was logged:
   
   ```
   DAG <dag_name> already has 577 active runs, not queuing any tasks for run 2020-12-17 08:05:00+00:00
   ```
   
   A bit of digging revealed that this DAG had associated task instances in the `removed` state. As soon as I forced those task instances into the `failed` state, the tasks were scheduled.
   
   I believe the root cause of the issue is that [this filter](https://github.com/apache/airflow/blob/master/airflow/jobs/scheduler_job.py#L1506) does not filter out tasks that are in the `removed` state.
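
The suspected failure mode can be illustrated with a small, self-contained sketch. This is not the scheduler's actual code: `TaskInstance`, `FINISHED_STATES`, and `can_queue` are hypothetical names, and the set of finished states is an assumption chosen so that `failed` counts as finished and `removed` does not, which matches the behaviour described above.

```python
from collections import namedtuple

# Hypothetical stand-in for a task instance row; not Airflow's real model.
TaskInstance = namedtuple("TaskInstance", ["execution_date", "state"])

# Assumption: only these states count as finished, so "removed" does not.
FINISHED_STATES = frozenset({"success", "failed", "skipped"})


def can_queue(run_date, task_instances, max_active_runs, finished_states):
    """Return True if tasks for run_date may be queued under max_active_runs."""
    # A run counts as active if any of its task instances is unfinished.
    active = {
        ti.execution_date
        for ti in task_instances
        if ti.state not in finished_states
    }
    return len(active) < max_active_runs or run_date in active


history = [
    TaskInstance("2020-12-15", "success"),
    TaskInstance("2020-12-16", "success"),
    TaskInstance("2020-12-16", "removed"),  # left behind by a deleted task
]

# The stale "removed" instance keeps the 2020-12-16 run counted as active.
blocked = not can_queue("2020-12-17", history, 1, FINISHED_STATES)

# Treating "removed" as finished lets the new run through.
fixed = can_queue("2020-12-17", history, 1, FINISHED_STATES | {"removed"})
```

In this toy model a single stale `removed` instance is enough to block a DAG with `max_active_runs` set to 1, and treating `removed` as a finished state lifts the block.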
   
   **What you expected to happen**:
   
   I expected the task instances in the DAG to be scheduled, because the DAG did not actually exceed the number of `max_active_runs`.
   
   **How to reproduce it**:
   
   I think the best approach to reproduce it is as follows:
   1. Create a DAG and set `max_active_runs` to 1.
   2. Ensure the DAG has run successfully a number of times, such that it has some history associated with it.
   3. Set one historical task instance to the `removed` state (either by directly updating it in the DB, or deleting a task from a DAG before it has been able to execute).
   
   **Anything else we need to know**:
   
   The Airflow instance that I ran into this issue with contains about 3 years of task history, which means that we actually had quite a few task instances that are in the `removed` state, but there is no easy way to delete those from the Web UI.
   
   A workaround is to set those task instances to `failed`, which allows the scheduler to proceed.
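
A minimal sketch of that workaround as a bulk update, run here against an in-memory SQLite table rather than a real metadata database. The table layout, the `example_dag` and `other_dag` ids, and the helper `fail_removed_task_instances` are assumptions made for illustration; check the actual schema and take a backup before running anything similar against a live instance.

```python
import sqlite3

# Toy table that only mimics the columns needed for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_instance ("
    "task_id TEXT, dag_id TEXT, execution_date TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO task_instance VALUES (?, ?, ?, ?)",
    [
        ("extract", "example_dag", "2020-12-15", "success"),
        ("old_task", "example_dag", "2020-12-15", "removed"),
        ("old_task", "example_dag", "2020-12-16", "removed"),
        ("old_task", "other_dag", "2020-12-16", "removed"),
    ],
)


def fail_removed_task_instances(connection, dag_id):
    """Set every 'removed' task instance of dag_id to 'failed'; return the count."""
    cursor = connection.execute(
        "UPDATE task_instance SET state = 'failed' "
        "WHERE dag_id = ? AND state = 'removed'",
        (dag_id,),
    )
    connection.commit()
    return cursor.rowcount


updated = fail_removed_task_instances(conn, "example_dag")
```

Scoping the update to a single `dag_id` keeps the change limited to the affected DAG and leaves `removed` task instances of other DAGs untouched.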


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on issue #13151: Task Instances in the "removed" state prevent the scheduler from scheduling new tasks when max_active_runs is set

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #13151:
URL: https://github.com/apache/airflow/issues/13151#issuecomment-748362129


   https://github.com/apache/airflow/pull/13165 Should fix it





[GitHub] [airflow] kaxil closed issue #13151: Task Instances in the "removed" state prevent the scheduler from scheduling new tasks when max_active_runs is set

Posted by GitBox <gi...@apache.org>.
kaxil closed issue #13151:
URL: https://github.com/apache/airflow/issues/13151


   

