Posted to user@spark.apache.org by Martin Somers <so...@gmail.com> on 2016/07/21 22:50:08 UTC
SVD output within Spark
I'm just looking at a comparison between Matlab and Spark for SVD with an
input matrix N.
This is the Matlab output - yes, a very small matrix!
N =
2.5903 -0.0416 0.6023
-0.1236 2.5596 0.7629
0.0148 -0.0693 0.2490
U =
-0.3706 -0.9284 0.0273
-0.9287 0.3708 0.0014
-0.0114 -0.0248 -0.9996
------------------------
Spark code
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.RowMatrix
// Breeze to Spark: flatten the 3x3 Breeze matrix N into a 1D array
val N1D = N.reshape(1, 9).toArray
// Note: I had to transpose the array to get the correct values, but the signs are still off
val V2D = N1D.grouped(3).toArray.transpose
// Earlier attempts, left commented out:
// val NVecdis = Vectors.dense(N1D.map(x => x.toDouble))
// val V2D = N1D.grouped(3).toArray
// Then convert the array into an RDD of row vectors
val rowlocal = V2D.map{x => Vectors.dense(x)}
val rows = sc.parallelize(rowlocal)
val mat = new RowMatrix(rows)
// Request all singular values and also compute U
val svd = mat.computeSVD(mat.numCols().toInt, computeU=true)
------------------------
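A note on the transpose step above: it is probably needed because Breeze stores a DenseMatrix in column-major order, so the flattened array lists N column by column and grouped(3) yields the columns rather than the rows. The sketch below shows this with plain Scala arrays only (no Breeze or Spark). The object name ColumnMajorRows and the hand-written flat array are illustrative assumptions, not part of the original code.

```scala
object ColumnMajorRows {
  // Column-major flattening of the 3x3 matrix N from the post.
  // Assumption: this is the order a Breeze DenseMatrix hands back from toArray.
  val flat: Array[Double] = Array(
    2.5903, -0.1236, 0.0148,  // column 0 of N
    -0.0416, 2.5596, -0.0693, // column 1 of N
    0.6023, 0.7629, 0.2490    // column 2 of N
  )

  // grouped(3) slices the flat array into chunks of 3, i.e. the COLUMNS of N
  val columns: Array[Array[Double]] = flat.grouped(3).toArray

  // Transposing turns the columns back into the rows of N,
  // which is what a RowMatrix expects (one vector per row)
  val rowsOfN: Array[Array[Double]] = columns.transpose

  def main(args: Array[String]): Unit =
    rowsOfN.foreach(r => println(r.mkString(" ")))
}
```

Without the transpose, the RowMatrix would hold the transpose of N, which swaps the roles of U and V in the decomposition.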
Spark output - notice the change in sign in the 2nd and 3rd columns
-0.3158590633523746 0.9220516369164243 -0.22372713505049768
-0.8822050381939436 -0.3721920780944116 -0.28842213436035985
-0.34920956843045253 0.10627246051309004 0.9309988407367168
And finally some Julia code
N = [2.59031 -0.0416335 0.602295;
-0.123584 2.55964 0.762906;
0.0148463 -0.0693119 0.249017]
svd(N, thin=true) --- same as Matlab
-0.315859 -0.922052 0.223727
-0.882205 0.372192 0.288422
-0.34921 -0.106272 -0.930999
Most likely it's an issue with my implementation rather than a bug
with SVD within the Spark environment.
My Spark instance is running locally in a Docker container.
Any suggestions?
Thanks
Re: SVD output within Spark
Posted by Yanbo Liang <yb...@gmail.com>.
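The signs of the singular vectors (the eigenvectors of N*N' and N'*N) are
essentially arbitrary: negating a column of U together with the matching
column of V leaves U*S*V' unchanged. So both the Spark and Matlab results
are right.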
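One way to compare outputs from different libraries despite the arbitrary signs is to pick a sign convention first. The sketch below flips each column so that its largest-magnitude entry is positive, then compares within a tolerance. It uses plain Scala arrays and the Spark and Julia numbers from the original post. The names SvdSignCheck, normalizeSigns and approxEqual, the choice of convention, and the 1e-5 tolerance are all illustrative assumptions, not anything Spark provides.

```scala
object SvdSignCheck {
  type Mat = Array[Array[Double]] // row-major: m(row)(col)

  // Values copied from the post
  val sparkOut: Mat = Array(
    Array(-0.3158590633523746, 0.9220516369164243, -0.22372713505049768),
    Array(-0.8822050381939436, -0.3721920780944116, -0.28842213436035985),
    Array(-0.34920956843045253, 0.10627246051309004, 0.9309988407367168)
  )
  val juliaOut: Mat = Array(
    Array(-0.315859, -0.922052, 0.223727),
    Array(-0.882205, 0.372192, 0.288422),
    Array(-0.34921, -0.106272, -0.930999)
  )

  // Flip each column so that its largest-magnitude entry is positive
  def normalizeSigns(m: Mat): Mat =
    m.transpose.map { col =>
      val pivot = col.reduce((a, b) => if (math.abs(b) > math.abs(a)) b else a)
      if (pivot < 0) col.map(x => -x) else col
    }.transpose

  // Element-wise comparison within an absolute tolerance
  def approxEqual(a: Mat, b: Mat, tol: Double): Boolean =
    a.flatten.zip(b.flatten).forall { case (x, y) => math.abs(x - y) <= tol }

  def main(args: Array[String]): Unit = {
    println(approxEqual(sparkOut, juliaOut, 1e-5))
    println(approxEqual(normalizeSigns(sparkOut), normalizeSigns(juliaOut), 1e-5))
  }
}
```

With the numbers from the post, the raw comparison fails and the sign-normalized comparison passes. This is only a comparison aid: when the factors are used to reconstruct N, any flip applied to a column of U has to be applied to the matching column of V as well.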
Thanks