Posted to user@mahout.apache.org by satish verma <sa...@gmail.com> on 2012/09/14 11:44:02 UTC

Explain SVD Output File Data

Hi

 I am using Mahout for SVD decomposition [Mat = U*Sigma*transpose(V)] as
follows:

hadoop jar $mahoutjar
org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver -i
/tmp/satish/matSVD.mat -o /tmp/satish/svdSmall4_5_5 -nr 4 -nc 5 -r 5

 Q1. It's a small matrix for test purposes, with 4 rows and 5 columns. When
I use -r 5, I get the right eigenvalues, but if I reduce it (for example,
-r 4), I get different answers. Why does this happen?

 Q2. When the program terminates, I see it outputs something like:

 EigenVector 1 with Eigen Value xyz .

  Please tell me what the output in the file rawEigenvectors contains. It
has a format like:

 0 {0:value1,1:value2,2:value3....}
 1   {0:value1,1:value2,2:value3....}
 2  {0:value1,1:value2,2:value3....}
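For inspecting such output outside of Mahout, here is a minimal Python sketch that parses one line of this text form into a row id and a map of index to value. It assumes the dump is plain text in exactly the "row {index:value,...}" layout shown above; the function names parse_vector_line and to_dense are hypothetical helpers, not part of Mahout.

```python
import re

# Matches "<row id> {<index>:<value>,<index>:<value>,...}" with flexible spacing.
LINE_RE = re.compile(r"^\s*(\d+)\s*\{(.*)\}\s*$")


def parse_vector_line(line):
    """Parse one dumped line into (row_id, {index: value})."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError("unrecognised line: %r" % line)
    row_id = int(match.group(1))
    entries = {}
    body = match.group(2).strip()
    if body:
        for pair in body.split(","):
            index, value = pair.split(":")
            entries[int(index)] = float(value)
    return row_id, entries


def to_dense(entries, size):
    """Expand a sparse {index: value} map into a dense list of length size."""
    return [entries.get(i, 0.0) for i in range(size)]


if __name__ == "__main__":
    row_id, entries = parse_vector_line("1   {0:0.5,1:-2.0,2:3e-1}")
    print(row_id, to_dense(entries, 5))
```

Each parsed row can then be handed to whatever tool is used for checking the results.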

Q3. I want to compute the SVD using Mahout and another C library, and
compare the accuracy of the Mahout results. What steps should I take?
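One possible way to frame such a comparison, sketched in plain Python: compare the leading values by relative error, and compare vectors by the absolute cosine of the angle between them, since singular vectors are only determined up to sign. The helper names relative_errors and abs_cosine are hypothetical, and the inputs are assumed to be plain lists of numbers already exported from both tools. If one tool reports eigenvalues of transpose(Mat)*Mat rather than singular values, the square roots would need to be taken first; which of the two Mahout prints is worth checking on the small test matrix.

```python
import math


def relative_errors(reference, candidate):
    """Relative error per value, after sorting both lists in descending order.

    Only the first min(len(reference), len(candidate)) values are compared.
    """
    ref = sorted(reference, reverse=True)
    cand = sorted(candidate, reverse=True)
    errors = []
    for r, c in zip(ref, cand):
        # Fall back to absolute error when the reference value is zero.
        errors.append(abs(c - r) / abs(r) if r != 0 else abs(c))
    return errors


def abs_cosine(u, v):
    """Absolute cosine between two vectors; 1.0 means equal up to sign and scale."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return abs(dot) / (norm_u * norm_v)


if __name__ == "__main__":
    # Made-up numbers purely for illustration, not real Mahout output.
    print(relative_errors([2.0, 1.0], [1.0, 2.2]))
    print(abs_cosine([1.0, 0.0], [-2.0, 0.0]))
```

Values of abs_cosine close to 1.0 for matching vectors, together with small relative errors, would indicate that the two decompositions agree.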

   My task is to apply SVD to a term-document matrix of size 6080
documents x 130000 terms.