Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/08/03 14:11:38 UTC
[GitHub] [airflow] wojsamjan commented on pull request #17285: Add info log how to fix: More than one pod running with labels
wojsamjan commented on pull request #17285:
URL: https://github.com/apache/airflow/pull/17285#issuecomment-891880603
> The labels should already be unique:
> https://github.com/apache/airflow/blob/667a45cf86763cc954e985787bca1b46d61cb8f3/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py#L311-L316
>
> Any idea how you are getting more than 1 pod with the same `try_number`? Maybe more than 1 instance in the same namespace running the same DAG?
Hi, please take a look at:
https://github.com/apache/airflow/blob/667a45cf86763cc954e985787bca1b46d61cb8f3/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py#L347-L354
and
https://github.com/apache/airflow/blob/667a45cf86763cc954e985787bca1b46d61cb8f3/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py#L412-L416
As you can see, `try_number` is excluded from `label_selector`. That is why we end up with more than one pod running with the same labels. If it were included, every retry would create a new, uniquely labelled pod. I am not the author of the operator, and I am not sure why it works this way; it looks like a deliberate design choice. The simplest solution is to make sure the pod is deleted once it has succeeded or failed, using the `is_delete_operator_pod` flag. That also does not break the current behaviour of the **KubernetesPodOperator**.
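To illustrate the effect, here is a minimal sketch in plain Python. It is not the operator's actual code: the helper names `build_label_selector` and `matches`, and the label values, are made up for this example. It only assumes that the selector is built from the pod labels as comma-separated `key=value` terms.

```python
def build_label_selector(labels, exclude=("try_number",)):
    """Join labels into a comma-separated key=value selector, skipping excluded keys."""
    kept = {key: value for key, value in labels.items() if key not in exclude}
    return ",".join(f"{key}={value}" for key, value in sorted(kept.items()))


def matches(selector, pod_labels):
    """Return True if pod_labels satisfy every key=value term of the selector."""
    terms = (term.split("=", 1) for term in selector.split(",") if term)
    return all(pod_labels.get(key) == value for key, value in terms)


# Hypothetical labels for two attempts of the same task (values are illustrative).
first_try = {"dag_id": "example_dag", "task_id": "example_task", "try_number": "1"}
second_try = {**first_try, "try_number": "2"}
pods = [first_try, second_try]

# Selector built the way described above: try_number is left out.
selector_without_try = build_label_selector(second_try)
# Alternative: keep try_number so that each retry is identified separately.
selector_with_try = build_label_selector(second_try, exclude=())

matched_without_try = [pod for pod in pods if matches(selector_without_try, pod)]
matched_with_try = [pod for pod in pods if matches(selector_with_try, pod)]

print(selector_without_try, len(matched_without_try))
print(selector_with_try, len(matched_with_try))
```

If the pod from the first attempt is still around when the second attempt starts, the selector without `try_number` matches both pods, while the selector with `try_number` matches only the current one. Deleting the finished pod avoids the clash without changing how the selector is built.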