
Posted to issues@spark.apache.org by "Reza Zadeh (JIRA)" <ji...@apache.org> on 2014/11/25 07:36:12 UTC

[jira] [Commented] (SPARK-4590) Early investigation of parameter server

    [ https://issues.apache.org/jira/browse/SPARK-4590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224117#comment-14224117 ] 

Reza Zadeh commented on SPARK-4590:
-----------------------------------

Some starting points: 
- http://stanford.edu/~rezab/papers/factorbird.pdf
- http://parameterserver.org/

More detailed comparisons coming.


> Early investigation of parameter server
> ---------------------------------------
>
>                 Key: SPARK-4590
>                 URL: https://issues.apache.org/jira/browse/SPARK-4590
>             Project: Spark
>          Issue Type: Brainstorming
>          Components: ML, MLlib
>            Reporter: Xiangrui Meng
>            Assignee: Reza Zadeh
>
> In the current implementation of GLM solvers, we save intermediate models on the driver node and update them through broadcast and aggregation. Even with torrent broadcast and tree aggregation added in 1.1, it is hard to go beyond ~10 million features. This JIRA is for investigating the parameter server approach, including algorithm, infrastructure, and dependencies.
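
To make the driver-centric pattern in the description concrete, here is a minimal pure-Python sketch (no Spark involved) of one way such a loop can look: the driver holds the full weight vector, hands a copy to every partition, and sums the dense per-partition gradients level by level. The names partition_gradient, tree_aggregate, and train, the least-squares loss, and the values_moved counter are all hypothetical and chosen for illustration; this is not MLlib's actual code. The counter is only a rough tally that ignores how torrent broadcast and the aggregation tree really route data.

```python
from typing import List, Tuple

Example = Tuple[List[float], float]  # (features, label); hypothetical layout


def partition_gradient(weights: List[float], partition: List[Example]) -> List[float]:
    """Least-squares gradient summed over one partition (the 'executor' side)."""
    grad = [0.0] * len(weights)
    for features, label in partition:
        error = sum(w * x for w, x in zip(weights, features)) - label
        for j, x in enumerate(features):
            grad[j] += error * x
    return grad


def tree_aggregate(vectors: List[List[float]], fan_in: int = 2) -> List[float]:
    """Sum dense vectors level by level, combining up to fan_in at a time."""
    level = list(vectors)
    while len(level) > 1:
        level = [
            [sum(column) for column in zip(*level[i:i + fan_in])]
            for i in range(0, len(level), fan_in)
        ]
    return level[0]


def train(partitions: List[List[Example]], num_features: int,
          step_size: float = 0.5, iterations: int = 100):
    weights = [0.0] * num_features  # the full model lives on the driver
    num_examples = sum(len(p) for p in partitions)
    values_moved = 0  # rough count of floats shipped; an illustrative assumption
    for _ in range(iterations):
        # "Broadcast": every partition receives a full copy of the weights.
        grads = [partition_gradient(weights, p) for p in partitions]
        # Each partition sends back a dense gradient of the same length.
        values_moved += 2 * num_features * len(partitions)
        total = tree_aggregate(grads)
        # The update itself happens on the driver.
        weights = [w - step_size * g / num_examples
                   for w, g in zip(weights, total)]
    return weights, values_moved


if __name__ == "__main__":
    data = [
        [([1.0, 0.0], 2.0), ([0.0, 1.0], 3.0)],  # partition 0
        [([1.0, 1.0], 5.0)],                      # partition 1
    ]
    model, moved = train(data, num_features=2)
    print(model, moved)
```

In this sketch every iteration moves on the order of num_features values to and from each partition, and the driver must hold the whole vector, which is one way to see why the feature count becomes the limiting factor. A parameter server would instead shard the weights across server nodes so that workers read and update only the coordinates they need.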



