Posted to issues@spark.apache.org by "Chris Harvey (JIRA)" <ji...@apache.org> on 2015/07/02 19:54:05 UTC
[jira] [Commented] (SPARK-7210) Test matrix decompositions for speed vs. numerical stability for Gaussians
[ https://issues.apache.org/jira/browse/SPARK-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612291#comment-14612291 ]
Chris Harvey commented on SPARK-7210:
-------------------------------------
I am new to the Apache Spark project but I would like to contribute to this issue.
Feynman posted an R recipe for computing the pdf using a Cholesky trick. I would like to follow that recipe using the Cholesky implementation in ScalaNLP Breeze. To test speed, I would compute the pdf with both the current SVD-based method and the Cholesky method across a range of simulated datasets with growing n and p. To test stability, I would compute the pdf on simulated features with some multicollinearity.
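As a rough illustration of the Cholesky approach described above, here is a minimal sketch in plain Python (standard library only). It is not a transcription of the R recipe and does not use Breeze or Spark's MLlib code; the function names `cholesky`, `forward_substitute`, and `gaussian_logpdf` are hypothetical and chosen only for this example. With Sigma = L L^T, the log-determinant is 2 * sum(log(L_ii)) and the quadratic form (x - mu)^T Sigma^{-1} (x - mu) equals ||z||^2, where z solves L z = x - mu, so no explicit inverse is needed.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L * L^T == a.

    Raises ValueError when a pivot is not strictly positive, i.e. when the
    matrix is not (numerically) positive definite.
    """
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


def forward_substitute(L, b):
    """Solve L z = b for lower-triangular L by forward substitution."""
    z = []
    for i, row in enumerate(L):
        z.append((b[i] - sum(row[k] * z[k] for k in range(i))) / row[i])
    return z


def gaussian_logpdf(x, mu, sigma):
    """Log-density of N(mu, sigma) at x, computed via a Cholesky factor."""
    L = cholesky(sigma)
    z = forward_substitute(L, [xi - mi for xi, mi in zip(x, mu)])
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(len(L)))
    quad = sum(v * v for v in z)
    return -0.5 * (len(x) * math.log(2.0 * math.pi) + log_det + quad)
```

Note that this sketch raises an error on a perfectly collinear (singular) covariance such as [[1, 1], [1, 1]] rather than returning a value, which is the kind of case the stability tests with multicollinear features would need to cover.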
Does this sound like a good starting point? Given that this is my first attempt at contributing to an Apache project, might it be a good idea to do this through the Mentor Programme?
> Test matrix decompositions for speed vs. numerical stability for Gaussians
> --------------------------------------------------------------------------
>
> Key: SPARK-7210
> URL: https://issues.apache.org/jira/browse/SPARK-7210
> Project: Spark
> Issue Type: Improvement
> Components: MLlib
> Reporter: Joseph K. Bradley
> Priority: Minor
>
> We currently use SVD for inverting the Gaussian's covariance matrix and computing the determinant. SVD is numerically stable but slow. We could experiment with Cholesky, etc. to figure out a better option, or a better option for certain settings.