Posted to commits@spark.apache.org by pw...@apache.org on 2014/03/09 19:39:00 UTC

git commit: Add timeout for fetch file

Repository: spark
Updated Branches:
  refs/heads/master 52834d761 -> f6f9d02e8


Add timeout for fetch file

    Currently, when fetching a file, the connection's connect and read
    timeouts are based on the default JVM settings. This change makes them
    use spark.files.fetchTimeout instead. This can be useful when the
    connection between workers is unreliable, and it prevents task sets
    from being removed prematurely.

Author: Jiacheng Guo <gu...@gmail.com>

Closes #98 from guojc/master and squashes the following commits:

abfe698 [Jiacheng Guo] add space according request
2a37c34 [Jiacheng Guo] Add timeout for fetch file


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f6f9d02e
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f6f9d02e
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f6f9d02e

Branch: refs/heads/master
Commit: f6f9d02e85d17da2f742ed0062f1648a9293e73c
Parents: 52834d7
Author: Jiacheng Guo <gu...@gmail.com>
Authored: Sun Mar 9 11:37:44 2014 -0700
Committer: Patrick Wendell <pw...@gmail.com>
Committed: Sun Mar 9 11:38:40 2014 -0700

----------------------------------------------------------------------
 core/src/main/scala/org/apache/spark/util/Utils.scala | 4 ++++
 docs/configuration.md                                 | 9 +++++++++
 2 files changed, 13 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/f6f9d02e/core/src/main/scala/org/apache/spark/util/Utils.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/util/Utils.scala b/core/src/main/scala/org/apache/spark/util/Utils.scala
index 53458b6..ac376fc 100644
--- a/core/src/main/scala/org/apache/spark/util/Utils.scala
+++ b/core/src/main/scala/org/apache/spark/util/Utils.scala
@@ -278,6 +278,10 @@ private[spark] object Utils extends Logging {
           uc = new URL(url).openConnection()
         }
 
+        val timeout = conf.getInt("spark.files.fetchTimeout", 60) * 1000
+        uc.setConnectTimeout(timeout)
+        uc.setReadTimeout(timeout)
+        uc.connect()
         val in = uc.getInputStream();
         val out = new FileOutputStream(tempFile)
         Utils.copyStream(in, out, true)
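
For illustration, the pattern in the hunk above can be tried outside Spark with a sketch like the following. The names FetchWithTimeout, fetch, and timeoutMillis are hypothetical and not part of Spark; only the java.net.URLConnection calls and the 60-second default are taken from the change itself. If either limit is exceeded, the JDK raises java.net.SocketTimeoutException.

```scala
import java.io.{File, FileOutputStream}
import java.net.{URL, URLConnection}

// Hypothetical standalone helper; it mirrors the timeout handling in the hunk above.
object FetchWithTimeout {
  /** URLConnection expects milliseconds, while the setting is expressed in seconds. */
  def timeoutMillis(seconds: Int): Int = seconds * 1000

  /** Copies the content at `url` into `target`, bounding both connect and read waits. */
  def fetch(url: String, target: File, timeoutSeconds: Int = 60): Unit = {
    val uc: URLConnection = new URL(url).openConnection()
    val timeout = timeoutMillis(timeoutSeconds)
    uc.setConnectTimeout(timeout) // upper bound on establishing the connection
    uc.setReadTimeout(timeout)    // upper bound on each blocking read
    uc.connect()
    val in = uc.getInputStream
    try {
      val out = new FileOutputStream(target)
      try {
        val buf = new Array[Byte](8192)
        var n = in.read(buf)
        while (n != -1) {
          out.write(buf, 0, n)
          n = in.read(buf)
        }
      } finally out.close()
    } finally in.close()
  }
}
```

Without explicit timeouts, URLConnection uses a value of 0 for both, which the JDK treats as an infinite wait; that is the behaviour this change replaces.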

http://git-wip-us.apache.org/repos/asf/spark/blob/f6f9d02e/docs/configuration.md
----------------------------------------------------------------------
diff --git a/docs/configuration.md b/docs/configuration.md
index 913c653..8f6cb02 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -523,6 +523,15 @@ Apart from these, the following properties are also available, and may be useful
   <td>
     Whether to overwrite files added through SparkContext.addFile() when the target file exists and its contents do not match those of the source.
   </td>
+</tr>
+<tr>
+  <td>spark.files.fetchTimeout</td>
+  <td>60</td>
+  <td>
+    Communication timeout, in seconds, to use when fetching files added through
+    SparkContext.addFile() from the driver.
+  </td>
+</tr>
 <tr>  
   <td>spark.authenticate</td>
   <td>false</td>