Posted to user@spark.apache.org by wxhsdp <wx...@gmail.com> on 2014/04/12 03:12:45 UTC

SVD under spark/mllib/linalg

Hi all,
the code under
https://github.com/apache/spark/tree/master/mllib/src/main/scala/org/apache/spark/mllib/linalg
has changed. The previous matrix classes, such as MatrixEntry and
MatrixSVD, have all been removed, and Breeze matrix definitions appear
instead. Should we now use Breeze for linear algebra?

Another question: is there any optimized matrix multiplication code in
Spark? I only see the outer-product method in the removed SVD.scala:

// Compute A^T A, assuming rows are sparse enough to fit in memory
val rows = data.map(entry =>
  (entry.i, (entry.j, entry.mval))).groupByKey()
val emits = rows.flatMap { case (rowind, cols) =>
  cols.flatMap { case (colind1, mval1) =>
    // colind1: col index, colind2: row index
    cols.map { case (colind2, mval2) =>
      ((colind1, colind2), mval1 * mval2) } }
}.reduceByKey(_ + _)
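
To make the outer-product idea concrete, here is a minimal sketch of the same computation on plain Scala collections, without Spark. The `Entry` case class is a hypothetical stand-in for the removed MatrixEntry (fields i, j, mval), and `GramSketch` is a name made up for illustration. Each row contributes every pairwise product of its entries, and the products are then summed per (column, column) key.

```scala
// Hypothetical stand-in for the removed MatrixEntry: (row i, column j, value).
case class Entry(i: Int, j: Int, mval: Double)

object GramSketch {
  // Returns the entries of A^T A keyed by (column, column).
  def gram(entries: Seq[Entry]): Map[(Int, Int), Double] = {
    // Group the entries by row index, like groupByKey in the Spark version.
    val rows = entries.groupBy(_.i).values.toSeq
    // For each row, emit every pairwise product of its entries
    // (the outer product of the row with itself).
    val emits = for {
      row <- rows
      a <- row
      b <- row
    } yield ((a.j, b.j), a.mval * b.mval)
    // Sum the contributions per key, like reduceByKey(_ + _).
    emits.groupBy(_._1).map { case (key, vs) => key -> vs.map(_._2).sum }
  }

  def main(args: Array[String]): Unit = {
    // A = [[1, 2], [3, 4]]
    val a = Seq(Entry(0, 0, 1.0), Entry(0, 1, 2.0),
                Entry(1, 0, 3.0), Entry(1, 1, 4.0))
    gram(a).toSeq.sortBy(_._1).foreach(println)
  }
}
```

For A = [[1, 2], [3, 4]] this gives A^T A = [[10, 14], [14, 20]]. Note that the per-row work is quadratic in the number of nonzeros in that row, which is why the original comment assumes sparse rows.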

Thank you!



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SVD-under-spark-mllib-linalg-tp4156.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: SVD under spark/mllib/linalg

Posted by Xiangrui Meng <me...@gmail.com>.
It was moved to mllib.linalg.distributed.RowMatrix. With RowMatrix,
you can compute column summary statistics, gram matrix, covariance,
SVD, and PCA. We will provide multiplication for distributed matrices,
but not in v1.0. -Xiangrui
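
A rough sketch of what using RowMatrix for the operations listed above might look like, written against the MLlib 1.x API as I understand it. The sample data, the app name, the object name `RowMatrixSketch`, and the local master setting are assumptions for illustration only.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.RowMatrix

object RowMatrixSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local master for experimentation; use your cluster settings instead.
    val conf = new SparkConf().setAppName("RowMatrixSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Made-up 3 x 2 matrix, one local vector per row.
    val rows = sc.parallelize(Seq(
      Vectors.dense(1.0, 2.0),
      Vectors.dense(3.0, 4.0),
      Vectors.dense(5.0, 6.0)))
    val mat = new RowMatrix(rows)

    // Column summary statistics (mean, variance, and so on).
    val summary = mat.computeColumnSummaryStatistics()
    println(s"column means: ${summary.mean}")

    // Gramian matrix A^T A, returned as a local matrix.
    println(s"gramian:\n${mat.computeGramianMatrix()}")

    // Covariance matrix of the columns.
    println(s"covariance:\n${mat.computeCovariance()}")

    // SVD keeping the top 2 singular values; U is distributed, s and V are local.
    val svd = mat.computeSVD(2, computeU = true)
    println(s"singular values: ${svd.s}")

    // PCA: the top principal component as a local matrix.
    println(s"principal components:\n${mat.computePrincipalComponents(1)}")

    sc.stop()
  }
}
```

The Gramian, covariance, V, and principal components come back as local matrices, so this style fits matrices with many rows and a modest number of columns.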
