Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2015/04/23 14:45:38 UTC

[jira] [Commented] (SPARK-6924) driver hangs when net is broken

    [ https://issues.apache.org/jira/browse/SPARK-6924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508989#comment-14508989 ] 

Apache Spark commented on SPARK-6924:
-------------------------------------

User 'SaintBacchus' has created a pull request for this issue:
https://github.com/apache/spark/pull/5663

> driver hangs when net is broken
> -------------------------------
>
>                 Key: SPARK-6924
>                 URL: https://issues.apache.org/jira/browse/SPARK-6924
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>            Reporter: xukun
>
> In yarn-client mode, the client is deployed outside the cluster. When the network between the client and the cluster breaks, the driver loses all of its executors. The expected behavior is that the client returns and the application fails. In practice, the driver hangs, and the user cannot tell whether the application is still healthy. The driver should return instead of hanging.
> Proposed solution: in the HeartbeatReceiver thread, check at a fixed rate whether any executor has sent a heartbeat to the driver. If no executor is sending heartbeats to the driver, close the SparkContext.
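
The check proposed in the description can be sketched roughly as follows. This is a standalone Python illustration, not Spark code and not the change in the pull request: the class HeartbeatMonitor, its methods, the timeout_s parameter, and the on_all_lost callback (standing in for closing the SparkContext) are all hypothetical names. It also assumes that nothing should fire before at least one executor has reported, which the description does not specify.

```python
import time
from typing import Callable, Dict


class HeartbeatMonitor:
    """Hypothetical stand-in for the driver-side check; not Spark's HeartbeatReceiver."""

    def __init__(
        self,
        timeout_s: float,
        on_all_lost: Callable[[], None],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.timeout_s = timeout_s      # assumed: max silence before an executor counts as lost
        self.on_all_lost = on_all_lost  # assumed: callback that would close the context
        self.clock = clock              # injectable so the check can be exercised without sleeping
        self.last_seen: Dict[str, float] = {}
        self.stopped = False

    def record_heartbeat(self, executor_id: str) -> None:
        """Remember when this executor last reported in."""
        self.last_seen[executor_id] = self.clock()

    def check(self) -> bool:
        """Meant to run once per fixed interval. Returns True if the callback fired."""
        # Assumption: do nothing before the first heartbeat, and fire at most once.
        if self.stopped or not self.last_seen:
            return False
        now = self.clock()
        any_alive = any(now - seen <= self.timeout_s for seen in self.last_seen.values())
        if any_alive:
            return False
        self.stopped = True
        self.on_all_lost()
        return True
```

A scheduler of some kind would call check() at the fixed rate the description mentions. Firing only when every known executor has gone silent, rather than when any single one has, keeps an ordinary executor failure from shutting down the whole application.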



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
