Posted to commits@airflow.apache.org by "Winston Huang (JIRA)" <ji...@apache.org> on 2018/03/29 22:42:00 UTC
[jira] [Created] (AIRFLOW-2270) Subdag backfill spins on removed tasks
Winston Huang created AIRFLOW-2270:
--------------------------------------
Summary: Subdag backfill spins on removed tasks
Key: AIRFLOW-2270
URL: https://issues.apache.org/jira/browse/AIRFLOW-2270
Project: Apache Airflow
Issue Type: Bug
Reporter: Winston Huang
My understanding is that subdag operators execute via a backfill job, which runs in a loop that maintains the state of the associated tasks and breaks only once all pending tasks have been exhausted: [https://github.com/apache/incubator-airflow/blob/64206615a790c90893d5836da8d2f7159bda23ac/airflow/jobs.py#L2159]
The issue is that the task instance status is initialized by this method [https://github.com/apache/incubator-airflow/blob/64206615a790c90893d5836da8d2f7159bda23ac/airflow/jobs.py#L2075], which may include tasks with {{state = State.REMOVED}}, i.e. tasks that were previously instantiated in the database but have since been removed from the DAG definition. Such a task is missing from this list [https://github.com/apache/incubator-airflow/blob/64206615a790c90893d5836da8d2f7159bda23ac/airflow/jobs.py#L2168] but still exists in {{ti_status.to_run}}. As a result, the backfill job loops indefinitely: it counts the removed tasks as pending but never attempts to run them.
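To make the failure mode concrete, here is a toy model of the loop in plain Python. It is not Airflow code: {{run_backfill}}, the state strings, the {{skip_removed}} flag, and the {{max_iterations}} guard are all hypothetical names introduced for illustration, and filtering out removed tasks up front is only one possible fix, not necessarily the one the project would adopt.

```python
# Toy model only: none of these names come from Airflow itself.
REMOVED = "removed"
SCHEDULED = "scheduled"


def run_backfill(to_run, dag_task_ids, skip_removed, max_iterations=5):
    """Simulate a backfill-style loop.

    to_run       -- dict of task_id -> state, as loaded from the database
    dag_task_ids -- task ids present in the current DAG definition
    skip_removed -- if True, drop REMOVED tasks before looping (assumed fix)

    Returns (iterations_used, leftover_pending_task_ids).
    """
    pending = dict(to_run)
    if skip_removed:
        # Assumed fix: never treat removed tasks as pending work.
        pending = {t: s for t, s in pending.items() if s != REMOVED}

    iterations = 0
    # The real loop has no iteration cap; the cap here only keeps the
    # demonstration from spinning forever.
    while pending and iterations < max_iterations:
        iterations += 1
        # Only tasks still in the DAG definition are ever visited.
        for task_id in dag_task_ids:
            if task_id in pending:
                pending.pop(task_id)  # pretend the task ran to completion
    return iterations, sorted(pending)


db_tasks = {"a": SCHEDULED, "b": SCHEDULED, "old": REMOVED}
current_dag = ["a", "b"]

spinning = run_backfill(db_tasks, current_dag, skip_removed=False)
fixed = run_backfill(db_tasks, current_dag, skip_removed=True)
```

In this sketch the unfiltered run uses up every allowed iteration with {{old}} still pending, while the filtered run finishes after a single pass.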
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)