Posted to user@spark.apache.org by ganeshkrishnan <ma...@ganeshkrishnan.com> on 2016/11/07 23:51:21 UTC

VectorUDT and ml.Vector for SVD

I am trying to run SVD on a DataFrame that was produced by the ml TF-IDF
transformers.
For Singular Value Decomposition I am trying to use RowMatrix, which
takes an RDD of mllib.Vector, so I have to convert the DataFrame column, which
I assumed holds ml.Vector values.

However the conversion

val convertedTermDocMatrix =
MLUtils.convertMatrixColumnsFromML(termDocMatrix,"features")

fails with

java.lang.IllegalArgumentException: requirement failed: Column features must
be new Matrix type to be converted to old type but got
org.apache.spark.ml.linalg.VectorUDT
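
The message says the column holds vectors (VectorUDT), not matrices, so the
vector variant of the conversion, MLUtils.convertVectorColumnsFromML, looks
like the call that applies here. Below is a minimal sketch of that route,
assuming Spark 2.x. The object name SvdSketch, the helper singularValues, and
the toy three-row DataFrame are hypothetical stand-ins for the real
termDocMatrix; only the column name "features" comes from the code above.

```scala
import org.apache.spark.ml.linalg.{Vectors => MLVectors}
import org.apache.spark.mllib.linalg.{Vector => OldVector}
import org.apache.spark.mllib.linalg.distributed.RowMatrix
import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.sql.{DataFrame, SparkSession}

object SvdSketch {
  // Hypothetical helper: converts an ml.linalg vector column to the old
  // mllib.linalg type, builds a RowMatrix, and returns the top-k singular values.
  def singularValues(df: DataFrame, col: String, k: Int): OldVector = {
    // Vector (not Matrix) conversion, matching the VectorUDT column type.
    val converted = MLUtils.convertVectorColumnsFromML(df, col)
    val rows = converted.select(col).rdd.map(_.getAs[OldVector](col))
    val mat = new RowMatrix(rows)
    // computeU = false skips materializing the left singular vectors.
    mat.computeSVD(k, computeU = false).s
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("svd-sketch")
      .master("local[*]") // assumption: local run for illustration only
      .getOrCreate()

    // Toy stand-in for the TF-IDF output; real data would come from the pipeline.
    val termDocMatrix = spark.createDataFrame(Seq(
      (0L, MLVectors.dense(3.0, 0.0, 0.0)),
      (1L, MLVectors.dense(0.0, 2.0, 0.0)),
      (2L, MLVectors.dense(0.0, 0.0, 1.0))
    )).toDF("id", "features")

    println(singularValues(termDocMatrix, "features", 2))
    spark.stop()
  }
}
```

This only addresses the type mismatch. It still goes through RowMatrix, so it
does not by itself change the memory behaviour of computeSVD on a large corpus.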


So the question is: how do I perform SVD on a DataFrame? I assume that not all
of the functionality of mllib has been ported to ml.


I tried converting my entire project to use RDDs, but computeSVD on RowMatrix
throws out-of-memory errors, and in any case I would like to stick with
DataFrames.

Our corpus is around 55 GB of text data.



Ganesh



