Posted to issues@spark.apache.org by "Kay Ousterhout (JIRA)" <ji...@apache.org> on 2015/07/31 18:36:04 UTC

[jira] [Resolved] (SPARK-9509) AppClient.stop() may throw an exception

     [ https://issues.apache.org/jira/browse/SPARK-9509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kay Ousterhout resolved SPARK-9509.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: 1.5.0

> AppClient.stop() may throw an exception
> ---------------------------------------
>
>                 Key: SPARK-9509
>                 URL: https://issues.apache.org/jira/browse/SPARK-9509
>             Project: Spark
>          Issue Type: Bug
>            Reporter: Kay Ousterhout
>            Assignee: Shixiong Zhu
>             Fix For: 1.5.0
>
>
> AppClient.stop() calls RpcEndpointRef.askWithRetry, which throws a SparkException if it fails. stop() catches only timeout exceptions, so the SparkException propagates and can cause a failure during shutdown that prevents Spark from cleaning itself up properly. This behavior was introduced in this commit: https://github.com/apache/spark/commit/3bee0f1466ddd69f26e95297b5e0d2398b6c6268#diff-a240aa7b4630dc389590147f96cf3431R174. It appears to be the root cause of the recent DistributedSuite test failures described in SPARK-9497: the flakiness of DistributedSuite coincides with the addition of that commit to master.
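
The failure mode described above can be illustrated with a small, language-neutral sketch. It is not Spark's code and not necessarily how the issue was fixed: `SparkException`, `ask_with_retry`, `AppClientSketch`, `stop_unsafe`, and `stop_safe` are hypothetical Python stand-ins, and the retry count is an assumption. The sketch only shows why a handler that covers timeouts alone lets a retry failure escape and skip cleanup, and one possible way to guard against that.

```python
class SparkException(Exception):
    """Hypothetical stand-in for Spark's SparkException."""


def ask_with_retry(send, max_retries=3):
    """Call send() up to max_retries times (assumed retry policy).

    Every failed attempt is swallowed and retried; once the attempts are
    exhausted, the last error is wrapped in a SparkException.
    """
    last_error = None
    for _ in range(max_retries):
        try:
            return send()
        except Exception as e:
            last_error = e
    raise SparkException(
        f"Error sending message after {max_retries} attempts"
    ) from last_error


class AppClientSketch:
    """Hypothetical client whose stop() must always finish its cleanup."""

    def __init__(self, send):
        self._send = send
        self.cleaned_up = False

    def stop_unsafe(self):
        # Mirrors the reported behavior: only timeouts are handled, so the
        # SparkException from ask_with_retry escapes and cleanup is skipped.
        try:
            ask_with_retry(self._send)
        except TimeoutError:
            pass
        self.cleaned_up = True

    def stop_safe(self):
        # One possible guard: treat the retry failure like a timeout and
        # run the cleanup step unconditionally.
        try:
            ask_with_retry(self._send)
        except (TimeoutError, SparkException) as e:
            print(f"warning: failed to notify master during stop: {e}")
        finally:
            self.cleaned_up = True
```

With a `send` callable that always raises, `stop_unsafe()` propagates a `SparkException` and leaves `cleaned_up` as `False`, while `stop_safe()` logs a warning and still completes the cleanup.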



