Posted to issues@spark.apache.org by "Oscar Cassetti (Jira)" <ji...@apache.org> on 2020/06/13 16:34:00 UTC

[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code

    [ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17134862#comment-17134862 ] 

Oscar Cassetti commented on SPARK-26365:
----------------------------------------

I can see the same issue, and I think it is caused by this line:

[https://github.com/apache/spark/blob/f535004e14b197ceb1f2108a67b033c052d65bcb/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/KubernetesClientApplication.scala#L214]

 

and the `io.fabric8.kubernetes.client.KubernetesClient`

The watcher 

Steps to reproduce:

 
{code:java}
spark-submit \
   --master k8s://https://172.17.0.2:8443 \
   --deploy-mode cluster \
   --name ocassetti-test \
   --conf spark.executor.instances=2 \
   --conf spark.kubernetes.namespace=spark \
   --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-sa \
   --py-files https://raw.githubusercontent.com/ocassetti/spark-docker/master/samples/lib.zip \
   --conf spark.kubernetes.pyspark.pythonVersion="3" \
   --files https://raw.githubusercontent.com/ocassetti/spark-docker/master/samples/data.txt \
   --conf spark.kubernetes.container.image=gcr.io/spark-operator/spark-py:v2.4.5 \
   https://raw.githubusercontent.com/ocassetti/spark-docker/master/samples/main.py
{code}



{code:java}
Container name: spark-kubernetes-driver  
Container image: gcr.io/spark-operator/spark-py:v2.4.5  
Container state: Terminated  
Exit code: 1
20/06/14 00:29:48 INFO submit.Client: Application ocassetti-test finished.
20/06/14 00:29:48 INFO util.ShutdownHookManager: Shutdown hook called
20/06/14 00:29:48 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-3924793f-9b83-4361-9491-c858f26ae9e0
 {code}
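
Until the exit code is propagated, one possible workaround is to wrap spark-submit and recover the driver's exit code from the log output shown above. The following is a minimal Python sketch, not part of Spark. The names `driver_exit_code` and `run_spark_submit` are hypothetical, and the sketch assumes that the submission client keeps printing an `Exit code: N` line in this format and that spark-submit waits for the application to finish.

```python
import re
import subprocess
from typing import List, Optional

# Matches the "Exit code: N" line that the submission client logs for the
# driver container (format assumed from the log excerpt above).
EXIT_CODE_RE = re.compile(r"^\s*Exit code:\s*(-?\d+)\s*$", re.MULTILINE)


def driver_exit_code(log_text: str) -> Optional[int]:
    """Return the last driver exit code found in the log text, or None."""
    matches = EXIT_CODE_RE.findall(log_text)
    return int(matches[-1]) if matches else None


def run_spark_submit(args: List[str]) -> int:
    """Run spark-submit and return a code that reflects the driver's exit code."""
    proc = subprocess.run(
        ["spark-submit", *args],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # spark-submit logs mostly to stderr
        text=True,
    )
    print(proc.stdout, end="")
    if proc.returncode != 0:
        # spark-submit itself failed, so keep its own exit code.
        return proc.returncode
    code = driver_exit_code(proc.stdout)
    # Assumption: a missing "Exit code" line is treated as a failure (1).
    return 1 if code is None else code
```

A wrapper script could then end with `sys.exit(run_spark_submit(sys.argv[1:]))` so that a CI job fails when the driver container does. Parsing log text is fragile, so this is only a stopgap until spark-submit returns the driver's exit code itself.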

> spark-submit for k8s cluster doesn't propagate exit code
> --------------------------------------------------------
>
>                 Key: SPARK-26365
>                 URL: https://issues.apache.org/jira/browse/SPARK-26365
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes, Spark Core, Spark Submit
>    Affects Versions: 2.3.2, 2.4.0
>            Reporter: Oscar Bonilla
>            Priority: Minor
>
> When launching apps using spark-submit in a Kubernetes cluster, if the Spark application fails (returns exit code = 1, for example), spark-submit will still exit gracefully and return exit code = 0.
> This is problematic, since there's no way to know if there's been a problem with the Spark application.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
