Posted to commits@spark.apache.org by rx...@apache.org on 2014/07/16 20:28:01 UTC

git commit: [SPARK-2522] set default broadcast factory to torrent

Repository: spark
Updated Branches:
  refs/heads/master ef48222c1 -> 96f28c972


[SPARK-2522] set default broadcast factory to torrent

HttpBroadcastFactory is the current default broadcast factory. It sends the broadcast data to each worker one by one, which is slow on large clusters. TorrentBroadcastFactory scales much better than HttpBroadcastFactory, so this change makes it the default broadcast factory.
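
Applications that rely on the HTTP-based implementation can still select it explicitly through the spark.broadcast.factory property. The following is a minimal sketch, assuming Spark core is on the classpath; the object name BroadcastFactoryOverride and the helper httpConf are hypothetical and exist only for illustration.

```scala
import org.apache.spark.SparkConf

// Hypothetical helper object, not part of Spark.
object BroadcastFactoryOverride {
  val Key = "spark.broadcast.factory"

  // Build a SparkConf that pins the broadcast factory to the HTTP implementation.
  // loadDefaults = false keeps JVM system properties out of this sketch.
  def httpConf(): SparkConf =
    new SparkConf(false)
      .set(Key, "org.apache.spark.broadcast.HttpBroadcastFactory")

  def main(args: Array[String]): Unit = {
    val conf = httpConf()
    // Same lookup shape as BroadcastManager: an explicit setting wins over the default.
    val factoryClass =
      conf.get(Key, "org.apache.spark.broadcast.TorrentBroadcastFactory")
    println(factoryClass)
  }
}
```

Because the lookup passes the new class name only as a fallback, a value set explicitly on the SparkConf takes precedence over the changed default.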

Author: Xiangrui Meng <me...@databricks.com>

Closes #1437 from mengxr/bt-broadcast and squashes the following commits:

ed492fe [Xiangrui Meng] set default broadcast factory to torrent


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/96f28c97
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/96f28c97
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/96f28c97

Branch: refs/heads/master
Commit: 96f28c9726d18f3b0d7a57b128c16ec9157f1532
Parents: ef48222
Author: Xiangrui Meng <me...@databricks.com>
Authored: Wed Jul 16 11:27:51 2014 -0700
Committer: Reynold Xin <rx...@apache.org>
Committed: Wed Jul 16 11:27:51 2014 -0700

----------------------------------------------------------------------
 .../main/scala/org/apache/spark/broadcast/BroadcastManager.scala   | 2 +-
 docs/configuration.md                                              | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/96f28c97/core/src/main/scala/org/apache/spark/broadcast/BroadcastManager.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/broadcast/BroadcastManager.scala b/core/src/main/scala/org/apache/spark/broadcast/BroadcastManager.scala
index c88be6a..8f8a0b1 100644
--- a/core/src/main/scala/org/apache/spark/broadcast/BroadcastManager.scala
+++ b/core/src/main/scala/org/apache/spark/broadcast/BroadcastManager.scala
@@ -39,7 +39,7 @@ private[spark] class BroadcastManager(
     synchronized {
       if (!initialized) {
         val broadcastFactoryClass =
-          conf.get("spark.broadcast.factory", "org.apache.spark.broadcast.HttpBroadcastFactory")
+          conf.get("spark.broadcast.factory", "org.apache.spark.broadcast.TorrentBroadcastFactory")
 
         broadcastFactory =
           Class.forName(broadcastFactoryClass).newInstance.asInstanceOf[BroadcastFactory]

http://git-wip-us.apache.org/repos/asf/spark/blob/96f28c97/docs/configuration.md
----------------------------------------------------------------------
diff --git a/docs/configuration.md b/docs/configuration.md
index 9d3fe74..a70007c 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -419,7 +419,7 @@ Apart from these, the following properties are also available, and may be useful
 </tr>
 <tr>
   <td><code>spark.broadcast.factory</code></td>
-  <td>org.apache.spark.broadcast.<br />HttpBroadcastFactory</td>
+  <td>org.apache.spark.broadcast.<br />TorrentBroadcastFactory</td>
   <td>
     Which broadcast implementation to use.
   </td>