Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/01/27 19:56:11 UTC

[GitHub] [airflow] SamWheating opened a new issue #21169: `is_delete_operator_pod=True` and `random_name_suffix=False` can cause KubernetesPodOperator to delete the wrong pod

SamWheating opened a new issue #21169:
URL: https://github.com/apache/airflow/issues/21169


   ### Apache Airflow version
   
   2.2.2
   
   ### What happened
   
   When running multiple KubernetesPodOperators with `random_name_suffix=False` and `is_delete_operator_pod=True`, the following happens:
   
   1) The first task will create the Pod `my-pod`
   2) The second task will attempt to create the pod, but fail with a 409 response from the API server (this is expected)
   3) The second task will delete `my-pod`, because it has `is_delete_operator_pod=True` and the Pod name is the same for both tasks. This is unexpected and will cause the first task to fail as well.
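
The three steps above can be sketched with a small in-memory stand-in for the API server. This is an illustration only, not the operator's code: `FakeCluster`, `Conflict`, and `launch` are hypothetical names, and the cleanup is assumed to delete purely by pod name, as described in this report.

```python
class Conflict(Exception):
    """Stands in for the API server's 409 response (hypothetical)."""


class FakeCluster:
    """In-memory stand-in for a namespace; maps pod name -> owning task."""

    def __init__(self):
        self.pods = {}

    def create_pod(self, name, owner):
        if name in self.pods:
            raise Conflict(f"pod {name!r} already exists")
        self.pods[name] = owner

    def delete_pod(self, name):
        # Deletes by name only; it does not check who created the pod.
        self.pods.pop(name, None)


def launch(cluster, task_id, pod_name, is_delete_operator_pod=True):
    """Try to create the pod; on failure, clean up by name (assumed behaviour)."""
    try:
        cluster.create_pod(pod_name, owner=task_id)
    except Conflict:
        if is_delete_operator_pod:
            cluster.delete_pod(pod_name)  # removes the first task's pod
        return "failed"
    return "running"


cluster = FakeCluster()
first = launch(cluster, "task-1", "my-pod")   # step 1: pod is created
second = launch(cluster, "task-2", "my-pod")  # steps 2 and 3: 409, then delete
print(first, second, cluster.pods)
```

After both calls the pod that the first task created is gone, even though only the second task failed.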
   
   I understand that this is a rare circumstance, but I think it's still worth fixing, since anyone using `random_name_suffix=False` in an otherwise default KubernetesPodOperator may end up killing pods that belong to other tasks.
   
   As a possible fix, we could [`find_pod`](https://github.com/apache/airflow/blob/684fe46158aa3d6cb2de245d29e20e487d8f2158/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py#L322) before deleting, to ensure that the pod being deleted has the appropriate `execution_date` label:
   https://github.com/apache/airflow/blob/ad07923606262ef8a650dcead38183da6bbb5d7b/airflow/providers/cncf/kubernetes/utils/pod_launcher.py#L103-L112
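
A minimal sketch of the label check proposed above, using plain dictionaries in place of pod objects. `delete_if_owned` is a hypothetical helper, not part of the provider, and the label values are made up for illustration. The idea is to delete only when the existing pod carries the labels of the task doing the cleanup.

```python
def delete_if_owned(pods, pod_name, task_labels):
    """Delete pod_name only if its labels match the calling task's labels.

    `pods` maps pod name -> label dict (a stand-in for the cluster state).
    Returns True if the pod was deleted, False if it was left alone.
    """
    pod_labels = pods.get(pod_name)
    if pod_labels is None:
        return False  # nothing to delete
    if any(pod_labels.get(key) != value for key, value in task_labels.items()):
        return False  # pod belongs to another task instance; leave it running
    del pods[pod_name]
    return True


# Illustrative label values only.
pods = {"my-pod": {"execution_date": "2022-01-27T00:00:00"}}
other_run = {"execution_date": "2022-01-28T00:00:00"}
same_run = {"execution_date": "2022-01-27T00:00:00"}

skipped = delete_if_owned(pods, "my-pod", other_run)  # labels differ
still_there = "my-pod" in pods
deleted = delete_if_owned(pods, "my-pod", same_run)   # labels match
```

With this check, a task from a different run leaves the pod in place, while the task that created it can still clean it up.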
   
   Let me know if you have any other suggestions for how this could be fixed, or if this should just be considered expected behaviour when using fixed Kubernetes Pod IDs.
   
   ### What you expected to happen
   
   The second task should be able to fail without deleting the pod from the first task. 
   
   ### How to reproduce
   
   Create a DAG with a single KubernetesPodOperator with `random_name_suffix=False` and `is_delete_operator_pod=True` and run it twice in parallel.
   
   ### Operating System
   
   Debian GNU/Linux 10 (buster)
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-cncf-kubernetes=2.2.0
   
   ### Deployment
   
   Other 3rd-party Helm chart
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on issue #21169: `is_delete_operator_pod=True` and `random_name_suffix=False` can cause KubernetesPodOperator to delete the wrong pod

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on issue #21169:
URL: https://github.com/apache/airflow/issues/21169#issuecomment-1023626676


   Interesting combination of events. Another possible solution might be to check the failure cause before calling `delete_pod`: if it failed because the pod already existed, then don't delete it?
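
A rough sketch of that idea, assuming the creation failure surfaces as an exception with an HTTP `status` attribute (the Kubernetes Python client's `ApiException` has one). `ApiError` and `cleanup_after_failure` are hypothetical names used only for this illustration.

```python
class ApiError(Exception):
    """Hypothetical stand-in for an API error that carries an HTTP status."""

    def __init__(self, status, reason=""):
        super().__init__(f"{status} {reason}")
        self.status = status


def cleanup_after_failure(error, delete_pod):
    """Call delete_pod() unless the failure was a 409 (pod already existed).

    Returns True if the pod was deleted, False if cleanup was skipped.
    """
    if getattr(error, "status", None) == 409:
        # The pod was created by someone else, so it is not ours to delete.
        return False
    delete_pod()
    return True


calls = []
conflict_cleaned = cleanup_after_failure(
    ApiError(409, "Conflict"), lambda: calls.append("delete")
)
other_cleaned = cleanup_after_failure(
    RuntimeError("image pull failed"), lambda: calls.append("delete")
)
```

Only the non-conflict failure triggers a delete here, so a second task that hits a 409 would leave the first task's pod alone.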

