Posted to commits@spark.apache.org by jk...@apache.org on 2016/01/06 00:33:32 UTC

spark git commit: [SPARK-12041][ML][PYSPARK] Add columnSimilarities to IndexedRowMatrix

Repository: spark
Updated Branches:
  refs/heads/master ff8997554 -> 1537e5560


[SPARK-12041][ML][PYSPARK] Add columnSimilarities to IndexedRowMatrix

Add `columnSimilarities` to IndexedRowMatrix for PySpark spark.mllib.linalg.
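
For readers unfamiliar with the operation, the following is a minimal
pure-Python sketch of what "cosine similarities between columns" means. It is
not the Spark implementation: the helper name `column_similarities` and the
dict-of-pairs return type are assumptions made for illustration, and it only
emits pairs with i < j on the assumption that the result mirrors the
upper-triangular layout documented for the Scala method. The input reuses the
two rows from the doctest in this commit.

```python
import math


def column_similarities(rows):
    """Return {(i, j): cosine similarity} for every column pair with i < j.

    Hypothetical helper for illustration only; not part of PySpark.
    """
    n_cols = len(rows[0])
    # Euclidean norm of each column.
    norms = [math.sqrt(sum(row[j] ** 2 for row in rows)) for j in range(n_cols)]
    sims = {}
    for i in range(n_cols):
        for j in range(i + 1, n_cols):
            if norms[i] == 0 or norms[j] == 0:
                # Cosine similarity is undefined for an all-zero column; skip it.
                continue
            dot = sum(row[i] * row[j] for row in rows)
            sims[(i, j)] = dot / (norms[i] * norms[j])
    return sims


# Same values as the doctest rows; the row indices (0 and 6) play no role
# in a column-wise computation, so they are omitted here.
rows = [[1, 2, 3], [4, 5, 6]]
sims = column_similarities(rows)
for (i, j), value in sorted(sims.items()):
    print(i, j, round(value, 4))
```

In the sketch a 2 x 3 input yields similarities for the three column pairs
(0, 1), (0, 2) and (1, 2), which is consistent with the doctest reporting
`numCols()` of 3 for the returned CoordinateMatrix.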

Author: Kai Jiang <ji...@gmail.com>

Closes #10158 from vectorijk/spark-12041.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1537e556
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1537e556
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/1537e556

Branch: refs/heads/master
Commit: 1537e55604cafafa49a8b7f3ce915f9745392bc0
Parents: ff89975
Author: Kai Jiang <ji...@gmail.com>
Authored: Tue Jan 5 15:33:27 2016 -0800
Committer: Joseph K. Bradley <jo...@databricks.com>
Committed: Tue Jan 5 15:33:27 2016 -0800

----------------------------------------------------------------------
 python/pyspark/mllib/linalg/distributed.py | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/1537e556/python/pyspark/mllib/linalg/distributed.py
----------------------------------------------------------------------
diff --git a/python/pyspark/mllib/linalg/distributed.py b/python/pyspark/mllib/linalg/distributed.py
index 0e76050..e1f0221 100644
--- a/python/pyspark/mllib/linalg/distributed.py
+++ b/python/pyspark/mllib/linalg/distributed.py
@@ -297,6 +297,20 @@ class IndexedRowMatrix(DistributedMatrix):
         """
         return self._java_matrix_wrapper.call("numCols")
 
+    def columnSimilarities(self):
+        """
+        Compute all cosine similarities between columns.
+
+        >>> rows = sc.parallelize([IndexedRow(0, [1, 2, 3]),
+        ...                        IndexedRow(6, [4, 5, 6])])
+        >>> mat = IndexedRowMatrix(rows)
+        >>> cs = mat.columnSimilarities()
+        >>> print(cs.numCols())
+        3
+        """
+        java_coordinate_matrix = self._java_matrix_wrapper.call("columnSimilarities")
+        return CoordinateMatrix(java_coordinate_matrix)
+
     def toRowMatrix(self):
         """
         Convert this matrix to a RowMatrix.

