Posted to issues@spark.apache.org by "Martin Buchleitner (JIRA)" <ji...@apache.org> on 2018/12/06 10:16:00 UTC

[jira] [Created] (SPARK-26290) [K8s] Driver Pods no mounted volumes

Martin Buchleitner created SPARK-26290:
------------------------------------------

             Summary: [K8s] Driver Pods no mounted volumes
                 Key: SPARK-26290
                 URL: https://issues.apache.org/jira/browse/SPARK-26290
             Project: Spark
          Issue Type: Bug
          Components: Kubernetes
    Affects Versions: 2.4.0
         Environment: Kubernetes 1.10.6
Spark 2.4.0

Spark containers are built from the archive served by [www.apache.org/dist/spark/|http://www.apache.org/dist/spark/] 
            Reporter: Martin Buchleitner


I want to use the volume feature to mount an existing PVC as a read-only volume into both the driver and the executors.

The executors get the PVC mounted, but the driver is missing the mount:
{code:java}
/opt/spark/bin/spark-submit \
--deploy-mode cluster \
--class org.apache.spark.examples.SparkPi \
--conf spark.app.name=spark-pi \
--conf spark.executor.instances=4 \
--conf spark.kubernetes.namespace=spark-demo \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf spark.kubernetes.container.image.pullPolicy=Always \
--conf spark.kubernetes.container.image=kube-spark:2.4.0 \
--conf spark.master=k8s://https://<master-ip> \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.ddata.mount.path=/srv \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.ddata.mount.readOnly=true \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.ddata.options.claimName=nfs-pvc \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.path=/srv \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.readOnly=true \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.claimName=nfs-pvc \
/srv/spark-examples_2.11-2.4.0.jar
{code}
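For reference, the missing mount can also be confirmed directly from the pod specs (a minimal sketch; the pod names and namespace are the ones shown in the descriptions below):
{code:java}
# Sketch: list the volumes attached to the driver pod and to one executor pod
kubectl -n spark-demo get pod spark-pi-1544018157391-driver \
  -o jsonpath='{.spec.volumes[*].name}'
kubectl -n spark-demo get pod spark-pi-1544018157391-exec-2 \
  -o jsonpath='{.spec.volumes[*].name}'
{code}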
When I use the jar included in the container
{code:java}
local:///opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar
{code}
the call works, and I can inspect the pod descriptions to review the behavior.

*Driver description:*
{code:java}
Name:         spark-pi-1544018157391-driver
[...]
Containers:
  spark-kubernetes-driver:
    Container ID:   docker://3a31d867c140183247cb296e13a8b35d03835f7657dd7e625c59083024e51e28
    Image:          kube-spark:2.4.0
    Image ID:       [...]
    Port:           <none>
    Host Port:      <none>
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 05 Dec 2018 14:55:59 +0100
      Finished:     Wed, 05 Dec 2018 14:56:08 +0100
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  1408Mi
    Requests:
      cpu:     1
      memory:  1Gi
    Environment:
      SPARK_DRIVER_MEMORY:        1g
      SPARK_DRIVER_CLASS:         org.apache.spark.examples.SparkPi
      SPARK_DRIVER_ARGS:
      SPARK_DRIVER_BIND_ADDRESS:   (v1:status.podIP)
      SPARK_MOUNTED_CLASSPATH:    /opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar
      SPARK_JAVA_OPT_1:           -Dspark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.path=/srv
      SPARK_JAVA_OPT_3:           -Dspark.app.name=spark-pi
      SPARK_JAVA_OPT_4:           -Dspark.kubernetes.driver.volumes.persistentVolumeClaim.ddata.mount.path=/srv
      SPARK_JAVA_OPT_5:           -Dspark.submit.deployMode=cluster
      SPARK_JAVA_OPT_6:           -Dspark.driver.blockManager.port=7079
      SPARK_JAVA_OPT_7:           -Dspark.kubernetes.driver.volumes.persistentVolumeClaim.ddata.mount.readOnly=true
      SPARK_JAVA_OPT_8:           -Dspark.kubernetes.authenticate.driver.serviceAccountName=spark
      SPARK_JAVA_OPT_9:           -Dspark.driver.host=spark-pi-1544018157391-driver-svc.spark-demo.svc.cluster.local
      SPARK_JAVA_OPT_10:          -Dspark.kubernetes.driver.pod.name=spark-pi-1544018157391-driver
      SPARK_JAVA_OPT_11:          -Dspark.kubernetes.driver.volumes.persistentVolumeClaim.ddata.options.claimName=nfs-pvc
      SPARK_JAVA_OPT_12:          -Dspark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.readOnly=true
      SPARK_JAVA_OPT_13:          -Dspark.driver.port=7078
      SPARK_JAVA_OPT_14:          -Dspark.jars=/opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar
      SPARK_JAVA_OPT_15:          -Dspark.kubernetes.executor.podNamePrefix=spark-pi-1544018157391
      SPARK_JAVA_OPT_16:          -Dspark.local.dir=/tmp/spark-local
      SPARK_JAVA_OPT_17:          -Dspark.master=k8s://https://<master-ip>
      SPARK_JAVA_OPT_18:          -Dspark.app.id=spark-89420bd5fa8948c3aa9d14a4eb6ecfca
      SPARK_JAVA_OPT_19:          -Dspark.kubernetes.namespace=spark-demo
      SPARK_JAVA_OPT_21:          -Dspark.executor.instances=4
      SPARK_JAVA_OPT_22:          -Dspark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.claimName=nfs-pvc
      SPARK_JAVA_OPT_23:          -Dspark.kubernetes.container.image=kube-spark:2.4.0
      SPARK_JAVA_OPT_24:          -Dspark.kubernetes.container.image.pullPolicy=Always
    Mounts:
      /tmp/spark-local from spark-local-dir-0-spark-local (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from spark-token-nhcdd (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          False 
  PodScheduled   True 
Volumes:
  spark-local-dir-0-spark-local:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:  
  spark-token-nhcdd:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  spark-token-nhcdd
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>
{code}
*Executor description:*
{code:java}
Name:                      spark-pi-1544018157391-exec-2
[..]
Controlled By:             Pod/spark-pi-1544018157391-driver
Containers:
  executor:
    Container ID:  docker://053256f023805a0a2fa580815f78203d2a32b0bc4e8e17741f45d84dd20a5e44
    Image:         kube-spark:2.4.0
    Image ID:       [...]
    Port:          7079/TCP
    Host Port:     0/TCP
    Args:
      executor
    State:          Running
      Started:      Wed, 05 Dec 2018 14:56:04 +0100
    Ready:          True
    Restart Count:  0
    Limits:
      memory:  1408Mi
    Requests:
      cpu:     1
      memory:  1408Mi
    Environment:
      SPARK_DRIVER_URL:       spark://CoarseGrainedScheduler@spark-pi-1544018157391-driver-svc.spark-demo.svc.cluster.local:7078
      SPARK_EXECUTOR_CORES:   1
      SPARK_EXECUTOR_MEMORY:  1g
      SPARK_APPLICATION_ID:   spark-application-1544018162183
      SPARK_CONF_DIR:         /opt/spark/conf
      SPARK_EXECUTOR_ID:      2
      SPARK_EXECUTOR_POD_IP:   (v1:status.podIP)
      SPARK_LOCAL_DIRS:       /tmp/spark-local
    Mounts:
      /srv from data (ro)
      /tmp/spark-local from spark-local-dir-1 (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-5srsx (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          True
  PodScheduled   True
Volumes:
  spark-local-dir-1:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  nfs-pvc
    ReadOnly:   true
  default-token-5srsx:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-5srsx
    Optional:    false
{code}
I also tried to use hostPath, but it showed the same behavior :( (see the sketch below).
I also reviewed the code that performs these mounts and tried to find all available parameters, but there are none beyond the subPath options. From my point of view, the executor and driver code paths look exactly the same.
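For reference, the hostPath attempt used the analogous conf keys (a sketch following the Spark 2.4 volume conf pattern; the host directory /data/nfs is a placeholder, not the path actually used):
{code:java}
# Sketch: hostPath equivalents of the volume flags above (fragment of the spark-submit call)
--conf spark.kubernetes.driver.volumes.hostPath.ddata.mount.path=/srv \
--conf spark.kubernetes.driver.volumes.hostPath.ddata.mount.readOnly=true \
--conf spark.kubernetes.driver.volumes.hostPath.ddata.options.path=/data/nfs \
--conf spark.kubernetes.executor.volumes.hostPath.data.mount.path=/srv \
--conf spark.kubernetes.executor.volumes.hostPath.data.mount.readOnly=true \
--conf spark.kubernetes.executor.volumes.hostPath.data.options.path=/data/nfs \
{code}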


