You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Dongjoon Hyun (Jira)" <ji...@apache.org> on 2022/12/14 05:49:00 UTC

[jira] [Updated] (SPARK-39965) Skip PVC cleanup when driver doesn't own PVCs

     [ https://issues.apache.org/jira/browse/SPARK-39965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-39965:
----------------------------------
        Parent: SPARK-41515
    Issue Type: Sub-task  (was: Bug)

> Skip PVC cleanup when driver doesn't own PVCs
> ---------------------------------------------
>
>                 Key: SPARK-39965
>                 URL: https://issues.apache.org/jira/browse/SPARK-39965
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Kubernetes
>    Affects Versions: 3.3.0
>            Reporter: Pralabh Kumar
>            Assignee: Pralabh Kumar
>            Priority: Trivial
>             Fix For: 3.3.1, 3.2.3, 3.4.0
>
>
> From Spark32 . as a part of [https://github.com/apache/spark/pull/32288] , functionality is added to delete PVC if the Spark driver died. 
> [https://github.com/apache/spark/blob/786a70e710369b195d7c117b33fe9983044014d6/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/KubernetesClusterSchedulerBackend.scala#L144]
>  
> However there are cases , where spark on K8s doesn't use PVC and use host path for storage. 
> [https://spark.apache.org/docs/latest/running-on-kubernetes.html#using-kubernetes-volumes]
>  
> Now  in those cases ,
>  * it request to delete PVC (which is not required) .
>  * It also tries to delete in the case where driver doesn't own the PV (or spark.kubernetes.driver.ownPersistentVolumeClaim is false) 
>  * Moreover in the cluster , where Spark user doesn't have access to list or delete PVC , it throws exception .  
>  
> io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: [https://kubernetes.default.svc/api/v1/namespaces/<>/persistentvolumeclaims?labelSelector=spark-app-selector%3Dspark-332bd09284b3442f8a6a214fabcd6ab1|https://kubernetes.default.svc/api/v1/namespaces/dpi-dev/persistentvolumeclaims?labelSelector=spark-app-selector%3Dspark-332bd09284b3442f8a6a214fabcd6ab1]. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. persistentvolumeclaims is forbidden: User "system:serviceaccount:dpi-dev:spark" cannot list resource "persistentvolumeclaims" in API group "" in the namespace "<>".
>  
> *Solution*
> Ideally there should be configuration spark.kubernetes.driver.pvc.deleteOnTermination or use spark.kubernetes.driver.ownPersistentVolumeClaim  which should be checked before calling to delete PVC. If user have not set up PV or if the driver doesn't own  then there is no need to call the api and delete PVC . 
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org