Posted to dev@airflow.apache.org by an...@gmail.com on 2019/05/29 03:39:16 UTC
Task retries are not exhausted when using K8s executor
Hi,
I am seeing behaviour where, if a running pod terminates with a non-zero exit code, the executor marks the task as "FAILED".
In these cases, the Kubernetes job watcher receives a "Failed" event for the pod, and based on that the _change_state() method marks the task as failed, in this section of the code: https://github.com/apache/airflow/blob/a8a4d322ee960ef51a03a87db44fe352abb910e6/airflow/executors/kubernetes_executor.py#L801
There is no check on whether the task is still eligible for a retry.
I feel that just adding the (task key, state) pair to the event_buffer in this method should be all we do, like in the other executors, and let the scheduler apply the retry logic.
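To make the suggestion concrete, here is a minimal, hypothetical sketch (the class and method names only mirror the Airflow code; this is not the real implementation) of what "just buffer the event" would look like. The executor records the (key, state) pair and removes the key from its running set, but does not touch the TaskInstance itself, so the scheduler can later check try_number against retries and decide whether to re-queue:

```python
# Hypothetical sketch of the proposed _change_state() behavior.
# SketchExecutor, event_buffer, and running are simplified stand-ins
# for the corresponding structures in Airflow's BaseExecutor; none of
# this is the actual Airflow implementation.

class SketchExecutor:
    def __init__(self):
        # maps task key -> last reported state, consumed by the scheduler
        self.event_buffer = {}
        # keys of tasks the executor currently considers running
        self.running = set()

    def _change_state(self, key, state):
        # Proposed behavior: only record the event. Do NOT mark the
        # task FAILED here; the scheduler reads event_buffer and applies
        # retry logic (e.g. comparing try_number to max retries).
        if state != "running":
            self.running.discard(key)
        self.event_buffer[key] = state


executor = SketchExecutor()
executor.running.add(("example_dag", "example_task"))
executor._change_state(("example_dag", "example_task"), "failed")
print(executor.event_buffer)
print(executor.running)
```

With this shape, a pod dying with a non-zero exit code produces a "failed" event in the buffer, but the decision to retry or to finally fail the task stays with the scheduler, matching how the other executors behave.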
I want to know from the community/devs whether there is any particular reason for marking the task as "FAILED" here.
Thanks,
Anand