Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2015/04/23 17:04:39 UTC

[jira] [Reopened] (SPARK-6924) driver hangs when net is broken

     [ https://issues.apache.org/jira/browse/SPARK-6924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen reopened SPARK-6924:
------------------------------

Reopening because there is a PR now

> driver hangs when net is broken
> -------------------------------
>
>                 Key: SPARK-6924
>                 URL: https://issues.apache.org/jira/browse/SPARK-6924
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>            Reporter: xukun
>
> In yarn-client mode, the client is deployed outside the cluster. When the network between the client and the cluster is broken, the driver loses all executors. The expected behavior is that the client returns and the app fails. Instead, the driver hangs, and the user does not know whether the app is OK. The driver should return rather than hang.
> The solution: in the HeartbeatReceiver thread, check at a fixed rate whether any executor has sent a heartbeat to the driver. If no executor is sending heartbeats to the driver, close the SparkContext.
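
The check proposed in the description can be sketched roughly as below. This is an illustration only: it is not Spark's HeartbeatReceiver code and not the patch in the PR. The class name HeartbeatWatchdog, the on_all_lost callback, and the timeout value are all hypothetical, and Python is used only for brevity. The idea is to remember the last heartbeat time per executor and, on each periodic check, fire a shutdown callback once the newest heartbeat from any executor is older than the timeout.

```python
import time
from typing import Callable, Dict


class HeartbeatWatchdog:
    """Hypothetical driver-side check; not Spark's actual HeartbeatReceiver."""

    def __init__(self, timeout_s: float, on_all_lost: Callable[[], None],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s        # assumed value, chosen by the caller
        self.on_all_lost = on_all_lost    # e.g. a function that stops the context
        self.clock = clock                # injectable so the logic can be tested
        self.started_at = clock()
        self.last_seen: Dict[str, float] = {}
        self.stopped = False

    def heartbeat(self, executor_id: str) -> None:
        """Record that an executor has just reported in."""
        self.last_seen[executor_id] = self.clock()

    def check(self) -> bool:
        """Call at a fixed rate. Returns True once the callback has fired."""
        if self.stopped:
            return True
        # Newest heartbeat from any executor; use start time if none arrived yet.
        newest = max(self.last_seen.values(), default=self.started_at)
        if self.clock() - newest > self.timeout_s:
            self.stopped = True
            self.on_all_lost()            # fires at most once
        return self.stopped
```

In a real driver, on_all_lost would be whatever closes the SparkContext, and check would run on the same schedule as the existing heartbeat expiry logic. A single global "newest heartbeat" comparison is used here so that losing some executors does not trigger shutdown; only silence from all of them does.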



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org