Posted to issues@spark.apache.org by "Seth Horrigan (Jira)" <ji...@apache.org> on 2022/04/05 22:35:00 UTC

[jira] [Created] (SPARK-38794) When ConfigMap creation fails, Spark driver starts but fails to start executors

Seth Horrigan created SPARK-38794:
-------------------------------------

             Summary: When ConfigMap creation fails, Spark driver starts but fails to start executors
                 Key: SPARK-38794
                 URL: https://issues.apache.org/jira/browse/SPARK-38794
             Project: Spark
          Issue Type: Bug
          Components: Kubernetes
    Affects Versions: 3.2.1, 3.2.0, 3.1.2, 3.1.1
            Reporter: Seth Horrigan


When running Spark in Kubernetes client mode, every executor pod assumes that a ConfigMap whose name exactly matches `KubernetesClientUtils.configMapNameExecutor` already exists (see [https://github.com/apache/spark/blob/02a055a42de5597cd42c1c0d4470f0e769571dc3/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala#L98]).
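
To illustrate why a missing ConfigMap is fatal to the executor pod, here is a minimal sketch (not Spark's actual code; the volume name matches the error quoted below, and the ConfigMap name follows the pattern described above) of how the executor pod spec references the ConfigMap, using the fabric8 builders that Spark's Kubernetes module is built on:

```scala
import io.fabric8.kubernetes.api.model.{VolumeBuilder, VolumeMountBuilder}

// Hypothetical name; the real one comes from
// KubernetesClientUtils.configMapNameExecutor.
val configMapName = "spark-exec-<app-id>-conf-map"

// The executor pod spec declares a volume backed by the ConfigMap...
val confVolume = new VolumeBuilder()
  .withName("spark-conf-volume-exec")
  .withNewConfigMap()
    .withName(configMapName)
  .endConfigMap()
  .build()

// ...and mounts it where the executor expects its config files
// (mount path is illustrative).
val confVolumeMount = new VolumeMountBuilder()
  .withName("spark-conf-volume-exec")
  .withMountPath("/opt/spark/conf")
  .build()

// If the ConfigMap was never created, the kubelet cannot satisfy this
// volume, so the pod stays stuck and the executor never starts.
```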

If the ConfigMap creation fails ([https://github.com/apache/spark/blob/02a055a42de5597cd42c1c0d4470f0e769571dc3/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/KubernetesClusterSchedulerBackend.scala#L80]), for example because the Kubernetes control plane is temporarily unavailable or the service account lacks permission to create ConfigMaps, the driver still starts fully and then waits for executors that can never start, each failing with "MountVolume.SetUp failed for volume \"spark-conf-volume-exec\" : configmap \"spark-exec-...-conf-map\" not found".
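
For context, a hedged sketch of the driver-side creation step that can fail (names and ConfigMap contents are illustrative; the real call sits at the KubernetesClusterSchedulerBackend line linked above):

```scala
import io.fabric8.kubernetes.api.model.ConfigMapBuilder
import io.fabric8.kubernetes.client.DefaultKubernetesClient

val client = new DefaultKubernetesClient()

// Illustrative content; Spark serializes its conf files into the data map.
val configMap = new ConfigMapBuilder()
  .withNewMetadata()
    .withName("spark-exec-<app-id>-conf-map")
  .endMetadata()
  .addToData("spark.properties", "spark.master=k8s://...")
  .build()

// If this call throws (API server temporarily unavailable, or the service
// account may not create ConfigMaps), the affected versions still let the
// driver finish starting, leaving executors waiting on a volume that can
// never be mounted.
client.configMaps().inNamespace("spark").create(configMap)
```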


Either driver start-up should fail with an error, or the driver should retry the ConfigMap creation before giving up.
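
As a minimal sketch of the second option (createExecutorConfigMap here is a hypothetical stand-in for the creation logic linked above, and the attempt/backoff values are illustrative), the driver could bound retries for transient control-plane errors and fail start-up loudly once they are exhausted:

```scala
import io.fabric8.kubernetes.client.KubernetesClientException

// Hypothetical stand-in for the creation logic at the line linked above.
def createExecutorConfigMap(): Unit = ???

def createConfigMapWithRetry(maxAttempts: Int,
                             initialBackoffMs: Long = 1000L): Unit = {
  var attempt = 1
  var backoffMs = initialBackoffMs
  var done = false
  while (!done) {
    try {
      createExecutorConfigMap()
      done = true
    } catch {
      case _: KubernetesClientException if attempt < maxAttempts =>
        // Transient control-plane error: back off and try again.
        Thread.sleep(backoffMs)
        attempt += 1
        backoffMs *= 2
      case e: KubernetesClientException =>
        // Out of attempts (e.g. the service account lacks permission):
        // fail driver start-up loudly instead of silently waiting for
        // executors that can never mount the missing ConfigMap.
        throw new IllegalStateException(
          s"Failed to create executor ConfigMap after $maxAttempts " +
            "attempts; executors cannot start", e)
    }
  }
}

// e.g. createConfigMapWithRetry(maxAttempts = 3)
```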



