
Posted to issues@spark.apache.org by "Xiangrui Meng (JIRA)" <ji...@apache.org> on 2014/11/07 05:37:33 UTC

[jira] [Updated] (SPARK-3920) Add option to support aggregation using treeAggregate in decision tree

     [ https://issues.apache.org/jira/browse/SPARK-3920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangrui Meng updated SPARK-3920:
---------------------------------
    Fix Version/s:     (was: 1.2.0)

> Add option to support aggregation using treeAggregate in decision tree
> ----------------------------------------------------------------------
>
>                 Key: SPARK-3920
>                 URL: https://issues.apache.org/jira/browse/SPARK-3920
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: Qiping Li
>
> In [SPARK-3366|https://issues.apache.org/jira/browse/SPARK-3366], we used distributed aggregation to aggregate node statistics, which saves computation and communication time when the shuffle size is very large. However, experiments have shown that when the shuffle size is not large enough (e.g., for shallow trees), distributed aggregation causes a performance loss (greater than 20% in some cases). We should support both aggregation options so that users can choose the one that suits their needs.
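
To make the trade-off between the two aggregation styles concrete, here is a minimal pure-Python sketch. It is not Spark's implementation and not the code proposed in this issue: the function names (`flat_aggregate`, `tree_aggregate`), the `fan_in` rule, and the toy data are assumptions made for illustration. It only shows that tree-style aggregation leaves fewer partial results for the final merge, at the cost of at least one extra combining round.

```python
import math
from functools import reduce


def partition_partials(partitions, seq_op, zero):
    """Fold each partition locally into one partial result."""
    return [reduce(seq_op, part, zero) for part in partitions]


def flat_aggregate(partitions, seq_op, comb_op, zero):
    """Merge every per-partition partial in a single final step.

    Returns (result, number of partials merged in the final step).
    """
    partials = partition_partials(partitions, seq_op, zero)
    return reduce(comb_op, partials, zero), len(partials)


def tree_aggregate(partitions, seq_op, comb_op, zero, depth=2):
    """Combine partials in intermediate rounds before the final merge.

    The fan-in rule below is a simplification chosen for this sketch.
    Returns (result, number of partials merged in the final step).
    """
    partials = partition_partials(partitions, seq_op, zero)
    fan_in = max(2, math.ceil(len(partials) ** (1.0 / depth)))
    while len(partials) > fan_in:
        # One extra combining round: merge groups of `fan_in` partials.
        partials = [
            reduce(comb_op, partials[i:i + fan_in], zero)
            for i in range(0, len(partials), fan_in)
        ]
    return reduce(comb_op, partials, zero), len(partials)


def add(a, b):
    return a + b


# Toy data: 16 partitions of 10 integers each (hypothetical stand-in for node stats).
partitions = [list(range(i * 10, (i + 1) * 10)) for i in range(16)]

flat_total, flat_merged = flat_aggregate(partitions, add, add, 0)
tree_total, tree_merged = tree_aggregate(partitions, add, add, 0, depth=2)

print(flat_total, flat_merged)
print(tree_total, tree_merged)
```

In this toy run both styles produce the same total, but the final step merges 16 partials in the flat case and 4 in the tree case. When each partial is small, as the description suggests for shallow trees, the extra combining round may cost more than it saves, which is one reason a user-selectable option could be useful.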


