Posted to issues@spark.apache.org by "Elizabeth Keddy (JIRA)" <ji...@apache.org> on 2016/06/12 18:35:21 UTC

[jira] [Commented] (SPARK-15479) Spark job doesn't shut gracefully in yarn mode.

    [ https://issues.apache.org/jira/browse/SPARK-15479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15326576#comment-15326576 ] 

Elizabeth Keddy commented on SPARK-15479:
-----------------------------------------

Are there instructions on how to gracefully shut down a Spark Streaming application in YARN mode?

If I understand correctly, even if there is a way to tell the driver directly to shut down, YARN would restart it anyway. Does "spark-submit --kill" work on a YARN cluster? The documentation does not suggest that it does.
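
For what it's worth, the closest thing to a recipe we have found is to poll for an external "stop" marker file on HDFS and call StreamingContext.stop with stopGracefully = true once it appears (there is also the spark.streaming.stopGracefullyOnShutdown property, which is supposed to trigger a graceful stop from the driver's shutdown hook). Below is a minimal sketch of the marker-file variant, untested on our cluster; the marker path, poll interval, and object name are placeholders of mine, not anything from the Spark docs:

import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object GracefulStopSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("graceful-stop-sketch")
    val ssc = new StreamingContext(conf, Seconds(5))

    // ... define input DStreams and output operations here ...

    ssc.start()

    // Hypothetical marker: an operator creates this file (for example
    // with `hdfs dfs -touchz /tmp/stop-streaming-app`) to request a
    // shutdown before an upgrade.
    val marker = new Path("/tmp/stop-streaming-app")
    val fs = FileSystem.get(ssc.sparkContext.hadoopConfiguration)

    var stopped = false
    while (!stopped) {
      // awaitTerminationOrTimeout returns true once the context has stopped.
      stopped = ssc.awaitTerminationOrTimeout(10000L)
      if (!stopped && fs.exists(marker)) {
        // Let the in-flight batches finish before tearing down the context.
        ssc.stop(stopSparkContext = true, stopGracefully = true)
        stopped = true
      }
    }
  }
}

Even then I am not sure how this interacts with the restart behavior mentioned above; presumably spark.yarn.maxAppAttempts would also have to be considered so that YARN does not simply resubmit the application.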

In our setup, we want to be able to restart the application for upgrades. In some cases, data loss is acceptable. One of the biggest issues we have, however, is that the sparkStaging directory does not get properly cleaned up, and eventually we run low on disk space on the master node.
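
For the staging directory problem we have been sketching a cron-style cleanup job. Everything below is an assumption about our own layout (the staging root, the 7-day cutoff), not a Spark default, and a real version would need a guard against deleting the directory of a still-running application:

import java.util.concurrent.TimeUnit

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object StagingDirCleanup {
  def main(args: Array[String]): Unit = {
    val fs = FileSystem.get(new Configuration())
    // Assumed location: Spark creates .sparkStaging under the home
    // directory of the submitting user; adjust for your cluster.
    val stagingRoot = new Path("/user/sparkuser/.sparkStaging")
    val cutoff = System.currentTimeMillis() - TimeUnit.DAYS.toMillis(7)

    // One subdirectory per application; drop the ones that have not
    // been modified in a week (a crude heuristic; check first that the
    // application is no longer running).
    for (status <- fs.listStatus(stagingRoot)) {
      if (status.isDirectory && status.getModificationTime < cutoff) {
        fs.delete(status.getPath, true) // recursive delete
      }
    }
  }
}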

While I don't know what the original reporter's motive was for this JIRA, there appear to be many questions out there on this very topic, and few answers or little documentation on how to achieve it. Either Spark's documentation is lacking in this area or there is truly a bug/limitation here.

Thanks


> Spark job doesn't shut gracefully in yarn mode.
> -----------------------------------------------
>
>                 Key: SPARK-15479
>                 URL: https://issues.apache.org/jira/browse/SPARK-15479
>             Project: Spark
>          Issue Type: Bug
>          Components: Streaming
>    Affects Versions: 1.5.1
>            Reporter: Rakesh
>         Attachments: driver.rtf, executor.rtf
>
>
> The issue I am having is similar to the one mentioned here:
> http://stackoverflow.com/questions/36911442/how-to-stop-gracefully-a-spark-streaming-application-on-yarn
> I am creating an RDD from the sequence 1 to 300 and creating a streaming DStream out of it:
> val rdd = ssc.sparkContext.parallelize(1 to 300)
> val dstream = new ConstantInputDStream(ssc, rdd)
> dstream.foreachRDD { rdd =>
>   rdd.foreach { x =>
>     log(x)            // log each element
>     Thread.sleep(50)  // slow processing so the shutdown race is visible
>   }
> }
> When I kill this job, I expect elements 1 to 300 to be logged before shutting down. That is indeed the case when I run it locally: it waits for the job to finish before shutting down.
> But when I launch the job on a cluster in "yarn-cluster" mode, it shuts down abruptly.
> The executor prints the following log:
> ERROR executor.CoarseGrainedExecutorBackend: Driver xx.xx.xx.xxx:yyyyy disassociated! Shutting down.
> and then it shuts down. It is not a graceful shutdown.


