Posted to issues@spark.apache.org by "Yinan Li (JIRA)" <ji...@apache.org> on 2018/09/06 16:39:00 UTC

[jira] [Issue Comment Deleted] (SPARK-25295) Pod names conflicts in client mode, if previous submission was not a clean shutdown.

     [ https://issues.apache.org/jira/browse/SPARK-25295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yinan Li updated SPARK-25295:
-----------------------------
    Comment: was deleted

(was: We made it clear in the Kubernetes-mode documentation at [https://github.com/apache/spark/blob/master/docs/running-on-kubernetes.md#client-mode-executor-pod-garbage-collection] that, when running in client mode, executor pods may be left behind. This is by design. If you want the executor pods deleted automatically, run the driver in a pod inside the cluster and set {{spark.driver.pod.name}} to the name of the driver pod, so that an {{OwnerReference}} pointing to the driver pod is added to each executor pod. The executor pods are then garbage collected when the driver pod is gone.)
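
As a rough illustration of the setup described in the comment above, the sketch below assembles the Spark properties for a client-mode driver running inside a pod. The helper names {{client_mode_conf}} and {{to_submit_args}} are hypothetical, as is the example master URL; only the property keys are actual Spark settings. It also assumes the pod's {{HOSTNAME}} environment variable equals the pod name, which is the Kubernetes default but can be overridden in the pod spec.

```python
import os


def client_mode_conf(master_url, namespace, driver_pod_name=None):
    """Build Spark properties for a client-mode driver that runs inside a pod.

    Hypothetical helper for illustration; not part of Spark.
    """
    if driver_pod_name is None:
        # Assumption: inside a pod, HOSTNAME defaults to the pod's own name.
        driver_pod_name = os.environ["HOSTNAME"]
    return {
        "spark.master": master_url,
        "spark.submit.deployMode": "client",
        "spark.kubernetes.namespace": namespace,
        # Lets Spark attach an OwnerReference to each executor pod, so the
        # executors are garbage collected once the driver pod is gone.
        "spark.driver.pod.name": driver_pod_name,
    }


def to_submit_args(conf):
    """Flatten a property dict into repeated `--conf key=value` arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Placeholder master URL and pod name, chosen only for this example.
    conf = client_mode_conf("k8s://https://example.invalid:6443", "default", "my-driver")
    print(" ".join(to_submit_args(conf)))
```

The resulting arguments would be passed to {{spark-submit}} from within the driver pod. If the driver runs outside the cluster there is no driver pod to own the executors, so this approach does not apply.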

> Pod names conflicts in client mode, if previous submission was not a clean shutdown.
> ------------------------------------------------------------------------------------
>
>                 Key: SPARK-25295
>                 URL: https://issues.apache.org/jira/browse/SPARK-25295
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 2.4.0
>            Reporter: Prashant Sharma
>            Priority: Major
>
> If the previous job was killed, for example by disconnecting the client, it leaves behind executor pods named spark-exec-#, which cause naming conflicts and failures on the next job submission.
> io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://<ip>:6443/api/v1/namespaces/default/pods. Message: pods "spark-exec-4" already exists. Received status: Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=null, kind=pods, name=spark-exec-4, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=pods "spark-exec-4" already exists, metadata=ListMeta(resourceVersion=null, selfLink=null, additionalProperties={}), reason=AlreadyExists, status=Failure, additionalProperties={}).
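
The failure reported above can be reproduced in miniature with a toy model. This is only a sketch of the naming collision, not Spark or Kubernetes code: {{FakeNamespace}}, {{AlreadyExists}} and {{submit}} are hypothetical names, and the model assumes each submission numbers its executors from 1, which is consistent with the spark-exec-# names in the report but is not confirmed by it.

```python
class AlreadyExists(Exception):
    """Stands in for the API server's 409 AlreadyExists response."""


class FakeNamespace:
    """Toy stand-in for a Kubernetes namespace: pod names must be unique."""

    def __init__(self):
        self.pods = set()

    def create_pod(self, name):
        if name in self.pods:
            raise AlreadyExists(f'pods "{name}" already exists')
        self.pods.add(name)


def submit(namespace, num_executors):
    """Create executor pods for one job submission.

    Assumption: every submission restarts executor numbering at 1.
    """
    for i in range(1, num_executors + 1):
        namespace.create_pod(f"spark-exec-{i}")


if __name__ == "__main__":
    ns = FakeNamespace()
    submit(ns, 4)          # first job; the client is then killed, pods remain
    try:
        submit(ns, 4)      # next job reuses the same names
    except AlreadyExists as err:
        print(err)
```

In this model, any scheme that makes executor pod names unique per submission, or that removes leftover pods before resubmitting, avoids the conflict.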



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
