Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/03/28 17:00:21 UTC

[GitHub] [airflow] SamWheating commented on a change in pull request #22410: Fixing task status for non-running and non-committed tasks

SamWheating commented on a change in pull request #22410:
URL: https://github.com/apache/airflow/pull/22410#discussion_r836651441



##########
File path: airflow/api/common/mark_tasks.py
##########
@@ -468,7 +468,22 @@ def set_dag_run_state_to_failed(
         task.dag = dag
         tasks.append(task)
 
-    return set_state(tasks=tasks, run_id=run_id, state=State.FAILED, commit=commit, session=session)
+    # Mark non-finished tasks as SKIPPED.
+    task_ids = [task.task_id for task in dag.tasks]
+    tis = session.query(TaskInstance).filter(
+        TaskInstance.dag_id == dag.dag_id,
+        TaskInstance.run_id == run_id,
+        TaskInstance.task_id.in_(task_ids),
+        TaskInstance.state.not_in(State.finished),
+        TaskInstance.state.not_in(State.running),
+    )

Review comment:
   ```suggestion
       tis = session.query(TaskInstance).filter(
           TaskInstance.dag_id == dag.dag_id,
           TaskInstance.run_id == run_id,
           TaskInstance.state.not_in(State.finished),
           TaskInstance.state.not_in(State.running),
       )
   ```
   
   Is the filter on `task_id` necessary? I suspect it's redundant, since we're already filtering on `dag_id` and `run_id`, which should return every task instance in that DagRun.
   
   I think there are also some odd race conditions here if the DAG changes while the run is in flight, so `dag.tasks` might not be fully in sync with the TaskInstances that exist in the DB.
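
The concern above can be illustrated with a minimal, self-contained sketch. It does not use Airflow or SQLAlchemy: the `task_instance` table, the `FINISHED` and `RUNNING` groupings, the `pending_task_ids` helper, and all task names are hypothetical stand-ins, built on the standard-library `sqlite3` module. It shows a run whose DB rows include a task (`old_report`) that is no longer in the DAG's current task list, and how the extra `task_id` filter would leave that row untouched.

```python
import sqlite3

# Hypothetical, simplified stand-in for Airflow's task_instance table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_instance (dag_id TEXT, run_id TEXT, task_id TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO task_instance VALUES (?, ?, ?, ?)",
    [
        ("example_dag", "run_1", "extract", "success"),
        ("example_dag", "run_1", "transform", "running"),
        ("example_dag", "run_1", "load", "scheduled"),
        # Assumed scenario: this task was removed from the DAG file
        # after run_1 started, so its row still exists in the DB.
        ("example_dag", "run_1", "old_report", "queued"),
        # A row from a different run, which must never be selected.
        ("example_dag", "run_2", "extract", "scheduled"),
    ],
)

# Assumed state groupings for illustration only; these are not
# Airflow's actual State.finished / State.running definitions.
FINISHED = ("success", "failed", "skipped")
RUNNING = ("running",)

# What dag.tasks would yield after the DAG file was edited (assumption).
current_task_ids = ["extract", "transform", "load"]


def pending_task_ids(conn, dag_id, run_id, task_ids=None):
    """Return task_ids in the run that are neither finished nor running.

    If task_ids is given, additionally restrict the query to those ids,
    mirroring the extra filter questioned in the review.
    """
    excluded = FINISHED + RUNNING
    sql = (
        "SELECT task_id FROM task_instance "
        "WHERE dag_id = ? AND run_id = ? "
        "AND state NOT IN ({})".format(", ".join("?" * len(excluded)))
    )
    params = [dag_id, run_id, *excluded]
    if task_ids is not None:
        sql += " AND task_id IN ({})".format(", ".join("?" * len(task_ids)))
        params.extend(task_ids)
    sql += " ORDER BY task_id"
    return [row[0] for row in conn.execute(sql, params)]


with_filter = pending_task_ids(conn, "example_dag", "run_1", current_task_ids)
without_filter = pending_task_ids(conn, "example_dag", "run_1")

print("with task_id filter:   ", with_filter)
print("without task_id filter:", without_filter)
```

In this sketch the query without the `task_id` filter also picks up `old_report`, while the filtered query skips it. Whether the real code path can actually reach that state depends on how Airflow handles removed tasks, which this example does not model.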



