Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2014/09/22 20:11:34 UTC

[jira] [Commented] (SPARK-1655) In naive Bayes, store conditional probabilities distributively.

    [ https://issues.apache.org/jira/browse/SPARK-1655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143550#comment-14143550 ] 

Apache Spark commented on SPARK-1655:
-------------------------------------

User 'staple' has created a pull request for this issue:
https://github.com/apache/spark/pull/2491

> In naive Bayes, store conditional probabilities distributively.
> ---------------------------------------------------------------
>
>                 Key: SPARK-1655
>                 URL: https://issues.apache.org/jira/browse/SPARK-1655
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: Xiangrui Meng
>
> In the current implementation, we collect all conditional probabilities on the driver node. When there are many labels and many features, this puts a heavy load on the driver. For scalability, we should provide a way to store conditional probabilities distributively.
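
To make the idea in the description concrete, here is a minimal sketch in plain Python. It is not Spark's implementation and not necessarily the approach taken in the pull request. It assumes multinomial naive Bayes with additive smoothing, uses ordinary lists of dicts as stand-ins for cluster partitions, and all function names (`train_partitioned`, `best_in_partition`, `predict`) are hypothetical. The point it illustrates: each partition keeps only the conditional probabilities for its own labels, and at prediction time the driver receives one candidate per partition instead of the whole label-by-feature matrix.

```python
import math
from collections import defaultdict


def train_partitioned(samples, num_partitions, smoothing=1.0):
    """Build per-label log probabilities and spread them over partitions.

    samples: list of (integer label, list of non-negative feature counts).
    Returns a list of dicts; partitions[i] maps label -> (log_prior, log_theta).
    """
    counts = defaultdict(int)  # number of samples seen per label
    sums = {}                  # per-label sum of feature vectors
    for label, features in samples:
        counts[label] += 1
        row = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            row[j] += value

    n = len(samples)
    num_labels = len(counts)
    partitions = [dict() for _ in range(num_partitions)]
    for label, row in sums.items():
        log_prior = math.log(
            (counts[label] + smoothing) / (n + smoothing * num_labels)
        )
        denom = sum(row) + smoothing * len(row)
        log_theta = [math.log((v + smoothing) / denom) for v in row]
        # Hypothetical placement rule: hash the label to choose a partition.
        partitions[hash(label) % num_partitions][label] = (log_prior, log_theta)
    return partitions


def best_in_partition(partition, features):
    """Score the input against the labels held locally; return (score, label)."""
    best = None
    for label, (log_prior, log_theta) in partition.items():
        score = log_prior + sum(f * t for f, t in zip(features, log_theta))
        if best is None or score > best[0]:
            best = (score, label)
    return best


def predict(partitions, features):
    """The 'driver' sees at most one (score, label) candidate per partition."""
    candidates = [
        c for c in (best_in_partition(p, features) for p in partitions)
        if c is not None
    ]
    return max(candidates)[1]


samples = [(0, [3.0, 0.0]), (0, [2.0, 1.0]), (1, [0.0, 3.0]), (1, [1.0, 2.0])]
model = train_partitioned(samples, num_partitions=2)
print(predict(model, [4.0, 0.0]))
```

In this sketch the data returned to the driver per prediction grows with the number of partitions rather than with labels times features, which is the load the description is concerned about. A real distributed version would also need to decide how feature vectors reach each partition, which this sketch leaves out.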



