Posted to user@spark.apache.org by Martin Somers <so...@gmail.com> on 2016/07/21 22:50:08 UTC

SVD output within Spark

Just looking at a comparison between Matlab and Spark for SVD with an
input matrix N.


This is the Matlab output - yes, a very small matrix!

N =

    2.5903   -0.0416    0.6023
   -0.1236    2.5596    0.7629
    0.0148   -0.0693    0.2490



U =

   -0.3706   -0.9284    0.0273
   -0.9287    0.3708    0.0014
   -0.0114   -0.0248   -0.9996

------------------------
Spark code

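// Imports for the Spark MLlib types used below
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.RowMatrix

// Breeze to Spark: flatten the 3x3 Breeze matrix N into a 1-D array
val N1D = N.reshape(1, 9).toArray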


// Note: I had to transpose the array to get the correct values, though still with incorrect signs
val V2D = N1D.grouped(3).toArray.transpose


// Then convert the array into an RDD
// val NVecdis = Vectors.dense(N1D.map(x => x.toDouble))
// val V2D = N1D.grouped(3).toArray


val rowlocal = V2D.map{x => Vectors.dense(x)}
val rows = sc.parallelize(rowlocal)
val mat = new RowMatrix(rows)
val svd = mat.computeSVD(mat.numCols().toInt, computeU=true)

------------------------

Spark output - notice the change in sign in the 2nd and 3rd columns
compared with the Julia result below
-0.3158590633523746   0.9220516369164243   -0.22372713505049768
-0.8822050381939436   -0.3721920780944116  -0.28842213436035985
-0.34920956843045253  0.10627246051309004  0.9309988407367168



And finally some Julia code
N  = [2.59031    -0.0416335  0.602295;
-0.123584    2.55964    0.762906;
0.0148463  -0.0693119  0.249017]

svd(N, thin=true)   --- same as matlab
-0.315859  -0.922052   0.223727
-0.882205   0.372192   0.288422
-0.34921   -0.106272  -0.930999

Most likely it's an issue with my implementation rather than a bug
in SVD within the Spark environment.
My Spark instance is running locally in a Docker container.
Any suggestions?
Thanks

Re: SVD output within Spark

Posted by Yanbo Liang <yb...@gmail.com>.
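The signs of the singular vectors (the eigenvectors of N'N and NN') are
essentially arbitrary: flipping the sign of a column of U together with
the matching column of V leaves the decomposition unchanged. So the
results from both Spark and Matlab are right.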
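
One way to check this is to put both results into a common sign
convention before comparing them. The sketch below is plain Scala with no
Spark dependency; it copies the two 3x3 matrices printed in the original
post (Spark and Julia) and flips each column so that its
largest-magnitude entry is positive. The names SvdSignCheck,
normalizeSigns and maxAbsDiff, and the sign convention itself, are
illustrative assumptions rather than anything Spark, Matlab or Julia do
internally.

```scala
object SvdSignCheck {
  type Matrix = Array[Array[Double]]

  // Matrix printed by Spark in the original post.
  val sparkOut: Matrix = Array(
    Array(-0.3158590633523746, 0.9220516369164243, -0.22372713505049768),
    Array(-0.8822050381939436, -0.3721920780944116, -0.28842213436035985),
    Array(-0.34920956843045253, 0.10627246051309004, 0.9309988407367168))

  // Matrix printed by Julia in the original post.
  val juliaOut: Matrix = Array(
    Array(-0.315859, -0.922052, 0.223727),
    Array(-0.882205, 0.372192, 0.288422),
    Array(-0.34921, -0.106272, -0.930999))

  // Assumed convention: flip each column so that its entry with the
  // largest absolute value is positive.
  def normalizeSigns(m: Matrix): Matrix = {
    val signs: Array[Double] = m.transpose.map { col =>
      val dominant = col.reduce((a, b) => if (math.abs(b) > math.abs(a)) b else a)
      if (dominant < 0) -1.0 else 1.0
    }
    m.map(row => row.zip(signs).map { case (x, s) => x * s })
  }

  // Largest element-wise absolute difference between two matrices.
  def maxAbsDiff(a: Matrix, b: Matrix): Double =
    a.flatten.zip(b.flatten)
      .map { case (x, y) => math.abs(x - y) }
      .foldLeft(0.0)((acc, d) => math.max(acc, d))

  def main(args: Array[String]): Unit = {
    val raw = maxAbsDiff(sparkOut, juliaOut)
    val normalized = maxAbsDiff(normalizeSigns(sparkOut), normalizeSigns(juliaOut))
    println(f"max |Spark - Julia| as printed:           $raw%.6f")
    println(f"max |Spark - Julia| after sign alignment: $normalized%.6f")
  }
}
```

After alignment, the two matrices should agree to within the rounding of
the printed Julia values. If U is needed as well, the same per-column
signs have to be applied to U so that the product U * diag(s) * V' is
unchanged.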

Thanks
