Posted to commits@spark.apache.org by me...@apache.org on 2014/07/03 20:55:10 UTC

git commit: SPARK-1675. Make clear whether computePrincipalComponents requires centered data

Repository: spark
Updated Branches:
  refs/heads/master c48053773 -> 2b36344f5


SPARK-1675. Make clear whether computePrincipalComponents requires centered data

Just closing out this small JIRA, resolving with a comment change.

Author: Sean Owen <so...@cloudera.com>

Closes #1171 from srowen/SPARK-1675 and squashes the following commits:

45ee9b7 [Sean Owen] Add simple note that data need not be centered for computePrincipalComponents


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2b36344f
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2b36344f
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2b36344f

Branch: refs/heads/master
Commit: 2b36344f588d4e7357ce9921dc656e2389ba1dea
Parents: c480537
Author: Sean Owen <so...@cloudera.com>
Authored: Thu Jul 3 11:54:51 2014 -0700
Committer: Xiangrui Meng <me...@databricks.com>
Committed: Thu Jul 3 11:54:51 2014 -0700

----------------------------------------------------------------------
 .../org/apache/spark/mllib/linalg/distributed/RowMatrix.scala      | 2 ++
 1 file changed, 2 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/2b36344f/mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala b/mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala
index 1a0073c..695e03b 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala
@@ -347,6 +347,8 @@ class RowMatrix(
    * The principal components are stored a local matrix of size n-by-k.
    * Each column corresponds for one principal component,
    * and the columns are in descending order of component variance.
+   * The row data do not need to be "centered" first; it is not necessary for
+   * the mean of each column to be 0.
    *
    * @param k number of top principal components.
    * @return a matrix of size n-by-k, whose columns are principal components