You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Xiangrui Meng (JIRA)" <ji...@apache.org> on 2014/06/18 01:43:08 UTC

[jira] [Updated] (SPARK-2174) Implement treeReduce and treeAggregate

     [ https://issues.apache.org/jira/browse/SPARK-2174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangrui Meng updated SPARK-2174:
---------------------------------

    Description: 
In `reduce` and `aggregate`, the driver node spends linear time on the number of partitions. It becomes a bottleneck when there are many partitions and the data from each partition is big.

SPARK-1485 tracks the progress of implementing AllReduce on Spark. I did several implementations including butterfly, reduce + broadcast, and treeReduce + broadcast. treeReduce + BT broadcast seems to be right way to go for Spark. Using binary tree may introduce some overhead in communication, because the driver still need to coordinate on data shuffling. In my experiments, n -> sqrt(n) -> 1 gives the best performance in general. But it certainly needs more testing.

  was:
In `reduce` and `aggregate`, the driver node spends linear time on the number of partitions. It becomes a bottleneck when there are many partitions and the data from each partition is big.

SPARK-1485 tracks the progress of implementing AllReduce on Spark. I didn't several implementations including butterfly, reduce + broadcast, and treeReduce + broadcast. treeReduce + BT broadcast seems to be right way to go for Spark. Using binary tree may introduce some overhead in communication, because the driver still need to coordinate on data shuffling. In my experiments, n -> sqrt(n) -> 1 gives the best performance in general. But it certainly needs more testing.


> Implement treeReduce and treeAggregate
> --------------------------------------
>
>                 Key: SPARK-2174
>                 URL: https://issues.apache.org/jira/browse/SPARK-2174
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib, Spark Core
>            Reporter: Xiangrui Meng
>            Assignee: Xiangrui Meng
>
> In `reduce` and `aggregate`, the driver node spends linear time on the number of partitions. It becomes a bottleneck when there are many partitions and the data from each partition is big.
> SPARK-1485 tracks the progress of implementing AllReduce on Spark. I did several implementations including butterfly, reduce + broadcast, and treeReduce + broadcast. treeReduce + BT broadcast seems to be right way to go for Spark. Using binary tree may introduce some overhead in communication, because the driver still need to coordinate on data shuffling. In my experiments, n -> sqrt(n) -> 1 gives the best performance in general. But it certainly needs more testing.



--
This message was sent by Atlassian JIRA
(v6.2#6252)