Posted to issues@spark.apache.org by "Petri (Jira)" <ji...@apache.org> on 2022/01/26 09:21:00 UTC

[jira] [Closed] (SPARK-37537) Spark 3.2.0 driver pod does not mount checkpoint filesystem from Kubernetes PVC

     [ https://issues.apache.org/jira/browse/SPARK-37537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Petri closed SPARK-37537.
-------------------------

> Spark 3.2.0 driver pod does not mount checkpoint filesystem from Kubernetes PVC
> -------------------------------------------------------------------------------
>
>                 Key: SPARK-37537
>                 URL: https://issues.apache.org/jira/browse/SPARK-37537
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Submit
>    Affects Versions: 3.2.0
>            Reporter: Petri
>            Priority: Major
>
> I have a Spark 3.2.0 driver running in a Kubernetes pod in client mode, and the following configs are defined in spark-submit:
> {code:java}
> --deploy-mode client
> --conf spark.kubernetes.driver.volumes.persistentVolumeClaim.glustervol.mount.path=/mnt/distributedDisk
> --conf spark.kubernetes.driver.volumes.persistentVolumeClaim.glustervol.readOnly=false
> --conf spark.kubernetes.driver.volumes.persistentVolumeClaim.glustervol.options.claimName=lolastreamingapp
> --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.glustervol.mount.path=/mnt/distributedDisk
> --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.glustervol.readOnly=false
> --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.glustervol.options.claimName=lolastreamingapp
>   {code}
> When the driver pod starts, it cannot access the filesystem mounted from the GlusterFS PVC. Describing the pod shows that the driver pod has not mounted the PVC, and describing the PVC likewise shows that it is not mounted.
> This worked with Spark 2.4.x but does not work with Spark 3.2.0.
> The only notable change between our Spark 2.4.x and 3.2.0 setups is the deploy mode: we used deploy-mode cluster with 2.4.x and use deploy-mode client with 3.2.0.
>  
> Because the filesystem used for checkpointing is not mounted, our application fails with the following error:
> {code:java}
> java.io.FileNotFoundException: File /mnt/distributedDisk/SE/LolaStreamingApp/1.0.0/1468589949 does not exist
>         at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:779) ~[hadoop-client-api-3.3.1.jar:?]
>         at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:1100) ~[hadoop-client-api-3.3.1.jar:?]
>         at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:769) ~[hadoop-client-api-3.3.1.jar:?]
>         at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:462) ~[hadoop-client-api-3.3.1.jar:?]
>         at org.apache.spark.streaming.StreamingContext.checkpoint(StreamingContext.scala:240) ~[spark-streaming_2.12-3.2.0.jar:3.2.0]
>         at org.apache.spark.streaming.api.java.JavaStreamingContext.checkpoint(JavaStreamingContext.scala:509) ~[spark-streaming_2.12-3.2.0.jar:3.2.0]
> {code}
>  
>  
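
The stack trace above goes through RawLocalFileSystem, so the checkpoint path is resolved against the driver pod's local filesystem. A driver could therefore check the mount point itself before calling checkpoint(), to tell a missing volume apart from other failures. The following is only a minimal sketch of such a check: the class name CheckpointMountCheck and the method describeProblem are hypothetical, and the only value taken from the report is the mount path /mnt/distributedDisk.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Spark: verifies that a directory intended
// for checkpointing is present and writable on the local filesystem.
public class CheckpointMountCheck {

    // Returns null when the directory looks usable, otherwise a short reason.
    static String describeProblem(Path dir) {
        if (!Files.exists(dir)) {
            return "does not exist: " + dir;
        }
        if (!Files.isDirectory(dir)) {
            return "not a directory: " + dir;
        }
        if (!Files.isWritable(dir)) {
            return "not writable: " + dir;
        }
        return null;
    }

    public static void main(String[] args) {
        // Default taken from the mount.path in the report; override via args[0].
        Path mount = Paths.get(args.length > 0 ? args[0] : "/mnt/distributedDisk");
        String problem = describeProblem(mount);
        if (problem != null) {
            System.err.println("Checkpoint volume check failed, " + problem);
            System.exit(1);
        }
        System.out.println("Checkpoint volume looks usable: " + mount);
    }
}
```

Running a check like this at driver startup would report the missing mount directly, rather than surfacing it later as a FileNotFoundException from the checkpoint call.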



