Posted to issues@spark.apache.org by "Matt Cheah (JIRA)" <ji...@apache.org> on 2015/04/15 16:47:00 UTC

[jira] [Resolved] (SPARK-5697) Allow Spark driver to wait longer before giving up connecting to the master

     [ https://issues.apache.org/jira/browse/SPARK-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt Cheah resolved SPARK-5697.
-------------------------------
    Resolution: Won't Fix

> Allow Spark driver to wait longer before giving up connecting to the master
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-5697
>                 URL: https://issues.apache.org/jira/browse/SPARK-5697
>             Project: Spark
>          Issue Type: Improvement
>          Components: Deploy
>    Affects Versions: 1.1.1, 1.2.0
>            Reporter: Matt Cheah
>
> In the AppClient class, the driver is hard-coded to attempt connecting to the master 3 times, 20 seconds apart, before giving up and killing the job.
> In practice, some clusters see heavy traffic and resource contention, and jobs in such environments may need to wait longer before giving up. Waiting longer spares users from resubmitting jobs that failed only because registration took too long. An unreliable or busy network may also delay message propagation.
> I suggest making the timeout and the number of retries for driver registration configurable.
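
For illustration only, the proposal above can be sketched as a retry loop whose attempt count and wait time are parameters rather than constants. This is a simplified, synchronous Python model, not the AppClient code: the names register_with_master, try_register, retries and timeout_seconds are hypothetical, and only the defaults (3 attempts, 20 seconds apart) come from the description.

```python
import time
from typing import Callable

# Defaults taken from the issue description: 3 attempts, 20 seconds apart.
DEFAULT_RETRIES = 3
DEFAULT_TIMEOUT_SECONDS = 20.0


def register_with_master(
    try_register: Callable[[], bool],
    retries: int = DEFAULT_RETRIES,
    timeout_seconds: float = DEFAULT_TIMEOUT_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Hypothetical helper: try to register up to `retries` times.

    Waits `timeout_seconds` between attempts. Returns True as soon as an
    attempt succeeds, and False once every attempt has failed.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(1, retries + 1):
        if try_register():
            return True
        # Only wait if another attempt is still to come.
        if attempt < retries:
            sleep(timeout_seconds)
    return False
```

Under these assumptions, a job on a congested cluster could call register_with_master(try_register, retries=10, timeout_seconds=60.0), while callers that pass nothing keep the behaviour described in the issue.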



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
