Posted to user@spark.apache.org by Donni Khan <pr...@googlemail.com> on 2017/10/27 07:20:28 UTC

cosine similarity between rows

I have a Spark job that computes the similarity between text documents:

RowMatrix rowMatrix = new RowMatrix(vectorsRDD.rdd());
CoordinateMatrix rowsimilarity = rowMatrix.columnSimilarities(0.5);
JavaRDD<MatrixEntry> entries = rowsimilarity.entries().toJavaRDD();
List<MatrixEntry> list = entries.collect();
for (MatrixEntry s : list) System.out.println(s);

Each MatrixEntry(i, j, value) represents the similarity between columns
i and j (let's say the features of the documents). But how can I show the
similarity between rows? Suppose I have five documents, Doc1, ..., Doc5; I
would like to show the similarity between all pairs of those documents. How
do I get that?

Any help?
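
One direction that may be worth trying: the cosine similarity between rows of a matrix is the same as the cosine similarity between columns of its transpose. In Spark this might look like building an IndexedRowMatrix from the vectors and chaining toCoordinateMatrix().transpose().toRowMatrix().columnSimilarities(0.5), but that chain is untested here and should be treated as an assumption. The plain-Java sketch below does not use Spark; it only illustrates the transpose idea on a tiny dense matrix. The class name, method names, and sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark: shows that column similarities
// of the transposed matrix are the row (document) similarities.
final class RowSimilaritySketch {

    /** Transposes a dense matrix so that former rows become columns. */
    static double[][] transpose(double[][] m) {
        int rows = m.length;
        int cols = m[0].length;
        double[][] t = new double[cols][rows];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                t[j][i] = m[i][j];
            }
        }
        return t;
    }

    /** Cosine similarity between columns a and b of m (0.0 if either is all zeros). */
    static double columnCosine(double[][] m, int a, int b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (double[] row : m) {
            dot += row[a] * row[b];
            normA += row[a] * row[a];
            normB += row[b] * row[b];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / Math.sqrt(normA * normB);
    }

    /** Returns {i, j, similarity} for every column pair with i < j. */
    static List<double[]> columnSimilarities(double[][] m) {
        List<double[]> entries = new ArrayList<>();
        int cols = m[0].length;
        for (int i = 0; i < cols; i++) {
            for (int j = i + 1; j < cols; j++) {
                entries.add(new double[] {i, j, columnCosine(m, i, j)});
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // Made-up data: each row is one document, each column one feature.
        double[][] docs = {
            {1, 0, 2},
            {2, 0, 4},
            {0, 3, 0}
        };
        // Column similarities of the transpose are similarities between documents.
        for (double[] e : columnSimilarities(transpose(docs))) {
            System.out.println("Doc" + ((int) e[0] + 1) + ", Doc" + ((int) e[1] + 1) + " -> " + e[2]);
        }
    }
}
```

After the transpose, the indices i and j in each entry refer to documents rather than features, so an entry (0, 1, value) would be the similarity between Doc1 and Doc2. Only pairs with i < j are listed, which mirrors the upper-triangular result that columnSimilarities is documented to return.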