Posted to issues@spark.apache.org by "Seth Horrigan (Jira)" <ji...@apache.org> on 2022/04/05 22:35:00 UTC
[jira] [Created] (SPARK-38794) When ConfigMap creation fails, Spark driver starts but fails to start executors
Seth Horrigan created SPARK-38794:
-------------------------------------
Summary: When ConfigMap creation fails, Spark driver starts but fails to start executors
Key: SPARK-38794
URL: https://issues.apache.org/jira/browse/SPARK-38794
Project: Spark
Issue Type: Bug
Components: Kubernetes
Affects Versions: 3.2.1, 3.2.0, 3.1.2, 3.1.1
Reporter: Seth Horrigan
When running Spark in Kubernetes client mode, every executor assumes that a ConfigMap whose name exactly matches `KubernetesClientUtils.configMapNameExecutor` exists (see [https://github.com/apache/spark/blob/02a055a42de5597cd42c1c0d4470f0e769571dc3/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala#L98]).
If creation of the ConfigMap fails at [https://github.com/apache/spark/blob/02a055a42de5597cd42c1c0d4470f0e769571dc3/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/KubernetesClusterSchedulerBackend.scala#L80] (for example, because the Kubernetes control plane is temporarily unavailable or the service account lacks permission to create a ConfigMap), the driver still starts fully and then waits indefinitely for executors that never start. Each executor pod fails with "MountVolume.SetUp failed for volume \"spark-conf-volume-exec\" : configmap \"spark-exec-...-conf-map\" not found".
Either driver start-up should fail with an error, or the driver should retry creating the ConfigMap.
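The two options above can be combined: retry a few times, then fail start-up with an explicit error instead of continuing without the ConfigMap. The following is only a language-neutral sketch in Python, not Spark's code. `create_with_retry`, `ConfigMapSetupError`, and the `create` callable are hypothetical names introduced for illustration, and the attempt count and backoff values are assumptions.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class ConfigMapSetupError(RuntimeError):
    """Hypothetical error: the executor ConfigMap could not be created."""


def create_with_retry(
    create: Callable[[], T],
    max_attempts: int = 3,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `create` until it succeeds or `max_attempts` is exhausted.

    `create` stands in for whatever call creates the ConfigMap (hypothetical).
    On final failure an exception is raised so that start-up aborts instead
    of proceeding with executors that cannot mount the missing ConfigMap.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return create()
        except Exception as err:  # a real implementation would narrow this
            last_error = err
            if attempt < max_attempts:
                # Exponential backoff: base, 2*base, 4*base, ...
                sleep(base_delay_s * 2 ** (attempt - 1))
    raise ConfigMapSetupError(
        f"executor ConfigMap creation failed after {max_attempts} attempts"
    ) from last_error
```

Retrying would only help with transient failures such as a briefly unavailable control plane. A permissions failure would presumably fail on every attempt, so in that case the value of the sketch is the explicit error at start-up rather than the retry.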