Posted to issues@spark.apache.org by "Vincent (JIRA)" <ji...@apache.org> on 2017/06/10 13:47:18 UTC

[jira] [Commented] (SPARK-21049) why do we need computeGramianMatrix when computing SVD

    [ https://issues.apache.org/jira/browse/SPARK-21049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16045536#comment-16045536 ] 

Vincent commented on SPARK-21049:
---------------------------------

[~srowen] Thanks, that's right. But we quite often find that the matrix is not skinny, and a lot of time is then spent computing the Gramian matrix. In such cases, we found that computing the SVD on the original matrix directly gives at least a 5x speedup. So I wonder whether it's possible to add an option here that lets the user choose between the Gramian and the original matrix. After all, users know their data better. What do you think?
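
To make the proposal concrete, here is a rough sketch of what such a choice could look like. It is not Spark code: `gramian_cost`, `choose_svd_path`, the `"gramian"`/`"direct"` labels, and the 1 GiB default are all hypothetical names and values picked for illustration. The only facts it relies on are that forming A^T*A for a dense m x n matrix takes about m*n^2 multiply-adds and yields an n x n matrix of doubles.

```python
def gramian_cost(num_rows, num_cols):
    """Rough dense cost of forming A^T * A for an m x n matrix A."""
    multiply_adds = num_rows * num_cols * num_cols  # about m * n^2, ignoring symmetry
    result_bytes = 8 * num_cols * num_cols          # n x n matrix of 8-byte doubles
    return multiply_adds, result_bytes


def choose_svd_path(num_rows, num_cols, mode="auto", max_gramian_bytes=1 << 30):
    """Hypothetical selector: an explicit user choice wins, otherwise a simple heuristic."""
    if mode in ("gramian", "direct"):
        return mode  # the user knows their data better
    _, result_bytes = gramian_cost(num_rows, num_cols)
    # Assumed heuristic: the Gramian route only pays off for tall-and-skinny
    # matrices whose n x n Gramian fits in the memory budget.
    if num_cols <= num_rows and result_bytes <= max_gramian_bytes:
        return "gramian"
    return "direct"
```

For example, `choose_svd_path(1_000_000, 100)` picks the Gramian route, while a wide matrix such as `choose_svd_path(10_000, 50_000)` falls through to the direct one, and passing `mode="direct"` overrides the heuristic entirely.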

> why do we need computeGramianMatrix when computing SVD
> ------------------------------------------------------
>
>                 Key: SPARK-21049
>                 URL: https://issues.apache.org/jira/browse/SPARK-21049
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML, MLlib
>    Affects Versions: 2.1.1
>            Reporter: Vincent
>
> computeSVD computes the SVD of a matrix A by first computing A^T*A and then running SVD on the resulting Gramian matrix. We found that the Gramian matrix computation is the hot spot of the overall SVD computation, but, per my understanding, we could simply run SVD on the original matrix. The singular vectors of the Gramian matrix should be the same as the right singular vectors of the original matrix A, while the singular values of the Gramian matrix are the squares of those of the original matrix. Why do we run SVD on the Gramian matrix, then?
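
For reference, the relationship between the two decompositions can be checked numerically. If A = U*S*V^T, then A^T*A = V*S^2*V^T, so the Gramian shares A's right singular vectors and its singular values are the squares of A's. The sketch below is plain Python on a 2x2 example, not Spark code; the helper names and the chosen singular values (3.0 and 0.5) are arbitrary assumptions for illustration.

```python
import math


def matmul(X, Y):
    """Multiply two small dense matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def rotation(theta):
    """2x2 rotation matrix, used here as a convenient orthogonal factor."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


# Build A = U * diag(sigma) * V^T from known factors, so A's SVD is known up front.
sigma = [3.0, 0.5]
U = rotation(0.4)
V = rotation(1.1)
S = [[sigma[0], 0.0], [0.0, sigma[1]]]
A = matmul(matmul(U, S), transpose(V))

# Gramian matrix G = A^T * A.
G = matmul(transpose(A), A)


def gramian_residual(i):
    """Largest entry of G*v_i - sigma_i^2 * v_i, where v_i is column i of V."""
    v = [V[0][i], V[1][i]]
    Gv = [G[r][0] * v[0] + G[r][1] * v[1] for r in range(2)]
    return max(abs(Gv[r] - sigma[i] ** 2 * v[r]) for r in range(2))
```

Both `gramian_residual(0)` and `gramian_residual(1)` come out at rounding-error level, which confirms that each right singular vector of A is an eigenvector of the Gramian with eigenvalue sigma_i^2. One consequence worth keeping in mind is that squaring the singular values also squares the condition number, so the Gramian route can lose accuracy on ill-conditioned matrices.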


