Posted to issues@spark.apache.org by "Chris Harvey (JIRA)" <ji...@apache.org> on 2015/07/02 19:54:05 UTC

[jira] [Commented] (SPARK-7210) Test matrix decompositions for speed vs. numerical stability for Gaussians

    [ https://issues.apache.org/jira/browse/SPARK-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612291#comment-14612291 ] 

Chris Harvey commented on SPARK-7210:
-------------------------------------

I am new to the Apache Spark project, but I would like to contribute to this issue.

Feynman posted an R recipe for computing the pdf using a Cholesky trick. I would like to compute the pdf by following that recipe, using the Cholesky implementation in ScalaNLP Breeze. To test speed, I would evaluate the pdf with both the original SVD-based method and the Cholesky method across a range of simulated datasets with growing n and p. To test stability, I would evaluate the pdf on simulated features with some multicollinearity.
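
A minimal sketch of the Cholesky approach, for illustration only. It is written in plain Python rather than Scala/Breeze, it is not a transcription of the R recipe mentioned above, and the function names (cholesky, forward_solve, gaussian_logpdf) are hypothetical. It assumes the covariance Sigma is symmetric positive definite, factors Sigma = L L^T, and then uses log det(Sigma) = 2 * sum(log L_ii) and (x - mu)^T Sigma^{-1} (x - mu) = ||z||^2, where L z = x - mu.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L * L^T == a (a must be symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    # A non-positive pivot means the matrix is singular or indefinite.
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def forward_solve(L, b):
    """Solve L z = b for lower-triangular L by forward substitution."""
    z = []
    for i, row in enumerate(L):
        z.append((b[i] - sum(row[k] * z[k] for k in range(i))) / row[i])
    return z


def gaussian_logpdf(x, mu, sigma):
    """Log-density of N(mu, sigma) at x, computed without forming sigma^{-1}."""
    p = len(x)
    L = cholesky(sigma)
    z = forward_solve(L, [xi - mi for xi, mi in zip(x, mu)])
    maha = sum(v * v for v in z)  # (x - mu)^T sigma^{-1} (x - mu)
    logdet = 2.0 * sum(math.log(L[i][i]) for i in range(p))  # log det(sigma)
    return -0.5 * (p * math.log(2.0 * math.pi) + logdet + maha)
```

In this sketch, a perfectly collinear covariance such as [[1, 1], [1, 1]] makes the factorization raise an error instead of returning a density, which is the kind of behavior the multicollinearity test would need to probe.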

Does this sound like a good starting point? Given that this is my first attempt at contributing to an Apache project, might it be a good idea to do this through the Mentor Programme? 

> Test matrix decompositions for speed vs. numerical stability for Gaussians
> --------------------------------------------------------------------------
>
>                 Key: SPARK-7210
>                 URL: https://issues.apache.org/jira/browse/SPARK-7210
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: Joseph K. Bradley
>            Priority: Minor
>
> We currently use SVD for inverting the Gaussian's covariance matrix and computing the determinant.  SVD is numerically stable but slow.  We could experiment with Cholesky, etc. to figure out a better option, or a better option for certain settings.


