Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/08/03 14:11:38 UTC

[GitHub] [airflow] wojsamjan commented on pull request #17285: Add info log how to fix: More than one pod running with labels

wojsamjan commented on pull request #17285:
URL: https://github.com/apache/airflow/pull/17285#issuecomment-891880603


   > The labels should already be unique:
   > https://github.com/apache/airflow/blob/667a45cf86763cc954e985787bca1b46d61cb8f3/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py#L311-L316
   > 
   > Any idea how you are getting more than 1 pod with the same `try_number`? Maybe more than 1 instance in the same namespace running the same DAG?
   
   Hi, please take a look at:
   
   https://github.com/apache/airflow/blob/667a45cf86763cc954e985787bca1b46d61cb8f3/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py#L347-L354
   
   and
   
   https://github.com/apache/airflow/blob/667a45cf86763cc954e985787bca1b46d61cb8f3/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py#L412-L416
   
   As you can see, _try_number_ is excluded from the _label_selector_. That is why we end up with more than one pod running with the same labels. If it were included, every retry would create a new, uniquely labelled pod. I am not the author of the operator, and I am not sure why it works this way now; it looks like a deliberate design choice. The simplest solution is to make sure the pod is deleted once it has succeeded or failed, using the _is_delete_operator_pod_ flag. Moreover, this does not break the current behaviour of the **KubernetesPodOperator**.
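
To illustrate the point, here is a minimal, self-contained sketch of the effect. It does not use the operator's actual code: `build_label_selector`, `matches`, and the label values are hypothetical names chosen for illustration, and the selector matching is a simplified stand-in for what the Kubernetes API does with equality-based selectors.

```python
def build_label_selector(labels, exclude=("try_number",)):
    """Join labels into an equality-based selector string, skipping excluded keys.

    Hypothetical helper: it mimics the idea of leaving try_number out of the selector.
    """
    kept = {key: value for key, value in labels.items() if key not in exclude}
    return ",".join(f"{key}={value}" for key, value in sorted(kept.items()))


def matches(selector, pod_labels):
    """Simplified stand-in for Kubernetes equality-based label matching."""
    if not selector:
        return True
    for clause in selector.split(","):
        key, value = clause.split("=", 1)
        if pod_labels.get(key) != value:
            return False
    return True


# Two pods left behind by two attempts of the same task (illustrative values).
first_try = {"dag_id": "example_dag", "task_id": "example_task", "try_number": "1"}
second_try = {**first_try, "try_number": "2"}
pods = [first_try, second_try]

# Selector without try_number: both pods match, hence "more than one pod".
selector = build_label_selector(second_try)
matching = [pod for pod in pods if matches(selector, pod)]

# Selector with try_number: only the pod from the current attempt matches.
selector_with_try = build_label_selector(second_try, exclude=())
matching_with_try = [pod for pod in pods if matches(selector_with_try, pod)]

print(selector, len(matching))
print(selector_with_try, len(matching_with_try))
```

If the pod from the first attempt had been deleted after it finished, which is what _is_delete_operator_pod_ is meant to ensure, only one pod would match even without _try_number_ in the selector.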

