Posted to commits@airflow.apache.org by "David Hartig (Jira)" <ji...@apache.org> on 2019/11/08 21:14:00 UTC

[jira] [Created] (AIRFLOW-5881) Dag gets stuck in "Scheduled" State when scheduling a large number of tasks

David Hartig created AIRFLOW-5881:
-------------------------------------

             Summary: Dag gets stuck in "Scheduled" State when scheduling a large number of tasks
                 Key: AIRFLOW-5881
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5881
             Project: Apache Airflow
          Issue Type: Bug
          Components: scheduler
    Affects Versions: 1.10.6
            Reporter: David Hartig
         Attachments: 2 (1).log, airflow.cnf

Running with the KubernetesExecutor in an AKS cluster, we noticed after upgrading to version 1.10.6 that all the DAGs stop making progress: their tasks start running and immediately exit with the following message:

"Instance State' FAILED: Task is in the 'scheduled' state which is not a valid state for execution. The task must be cleared in order to be run."

See attached log file for the worker. Nothing seems out of the ordinary in the Scheduler log. 

Reverting to 1.10.5 clears the problem.

Note that at the time of the failure roughly 100 tasks are in this state, with 70 coming from one highly parallelized DAG.

Also attached is a redacted Airflow config file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)