Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2016/12/14 11:23:58 UTC

[jira] [Commented] (SPARK-18471) In treeAggregate, generate (big) zeros instead of sending them.

    [ https://issues.apache.org/jira/browse/SPARK-18471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15748079#comment-15748079 ] 

Apache Spark commented on SPARK-18471:
--------------------------------------

User 'AnthonyTruchet' has created a pull request for this issue:
https://github.com/apache/spark/pull/16279

> In treeAggregate, generate (big) zeros instead of sending them.
> ---------------------------------------------------------------
>
>                 Key: SPARK-18471
>                 URL: https://issues.apache.org/jira/browse/SPARK-18471
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib, Spark Core
>            Reporter: Anthony Truchet
>            Assignee: Anthony Truchet
>            Priority: Minor
>             Fix For: 2.2.0
>
>
> When using an optimization routine such as LBFGS, treeAggregate currently sends the zero vector as part of the closure. This zero can be huge (e.g. ML vectors with millions of zeros) but can easily be generated.
> Several options are possible (patches for some of them are coming soon).
> One is to provide a treeAggregateWithZeroGenerator method (either in core or in MLlib) which wraps treeAggregate with an Option and generates the zero if it is None.
> Another is to rewrite treeAggregate to wrap an underlying implementation which uses a zero generator directly.
> There might be other, better alternatives we have not spotted...
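
The first option in the description above (an Option-style accumulator whose zero is generated only when needed) can be sketched in plain Python. This is an illustration only, not the code from the pull request and not a Spark API: the partitions are ordinary lists, None plays the role of Scala's None, and the names aggregate_with_zero_generator, zero_generator and add_into are hypothetical.

```python
from functools import reduce


def aggregate_with_zero_generator(partitions, zero_generator, seq_op, comb_op):
    """Tree-style aggregation where None stands in for the (big) zero.

    Hypothetical helper: only the small zero_generator function would need
    to travel with the closure, never the materialized zero value.
    """

    def seq_op_opt(acc, item):
        # Build the zero lazily, the first time a partition sees an element.
        base = zero_generator() if acc is None else acc
        return seq_op(base, item)

    def comb_op_opt(left, right):
        # None means "this side saw no data", so the other side wins.
        if left is None:
            return right
        if right is None:
            return left
        return comb_op(left, right)

    # One partial result per partition; an empty partition stays None.
    partials = [reduce(seq_op_opt, part, None) for part in partitions]

    # Combine partial results pairwise, level by level, like a tree.
    while len(partials) > 1:
        partials = [
            comb_op_opt(partials[i], partials[i + 1])
            if i + 1 < len(partials)
            else partials[i]
            for i in range(0, len(partials), 2)
        ]

    result = partials[0] if partials else None
    # If no partition had any data, generate the zero once at the end.
    return zero_generator() if result is None else result


def add_into(acc, vec):
    """Add vec into acc element-wise, in place, and return acc."""
    for i, value in enumerate(vec):
        acc[i] += value
    return acc


# Three "partitions" of 3-dimensional vectors; the middle one is empty.
partitions = [
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    [],
    [[7.0, 8.0, 9.0]],
]
total = aggregate_with_zero_generator(
    partitions, lambda: [0.0] * 3, add_into, add_into
)
print(total)  # [12.0, 15.0, 18.0]
```

In this sketch the zero vector is only allocated inside a partition that actually has data, or once at the very end if every partition is empty. Whether the real change should live in core or in MLlib is left open, as in the description.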


