You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@mahout.apache.org by ak...@apache.org on 2017/02/03 23:10:19 UTC

svn commit: r1781616 - /mahout/site/mahout_cms/trunk/content/users/algorithms/d-spca.mdtext

Author: akm
Date: Fri Feb  3 23:10:19 2017
New Revision: 1781616

URL: http://svn.apache.org/viewvc?rev=1781616&view=rev
Log:
Formatting

Modified:
    mahout/site/mahout_cms/trunk/content/users/algorithms/d-spca.mdtext

Modified: mahout/site/mahout_cms/trunk/content/users/algorithms/d-spca.mdtext
URL: http://svn.apache.org/viewvc/mahout/site/mahout_cms/trunk/content/users/algorithms/d-spca.mdtext?rev=1781616&r1=1781615&r2=1781616&view=diff
==============================================================================
--- mahout/site/mahout_cms/trunk/content/users/algorithms/d-spca.mdtext (original)
+++ mahout/site/mahout_cms/trunk/content/users/algorithms/d-spca.mdtext Fri Feb  3 23:10:19 2017
@@ -7,27 +7,27 @@ Mahout has a distributed implementation
 
 ## Algorithm
 
-Given an *m* `\(\times\)` *n* matrix `\(\mathbf{A}\)`, a target rank *k*, and an oversampling parameter *p*, this procedure computes a *k*-rank PCA by finding the unknowns in `\(\mathbf{A}−1\mu \ge U\SigmaV\)`:
-(1) Create seed for random *n* `\(\times\)` *(k+p)* matrix `\(\Omega\)`.
-(2) `\(s_\Omega \leftarrow \Omega^\top \mu\)`.
-(3) `\(\mathbf{Y_0} \leftarrow A\Omega − 1(s_\Omega)^\top, Y \in \mathbb{R}^(m\times(k+p))\)`.
-(4) Column-orthonormalize `\(\mathbf{Y_0} \rightarrow \mathbf{Q}\)` by computing thin decomposition `\(\mathbf{Y_0} = \mathbf{QR}\)`. Also, `\(\mathbf{Q}\in\mathbb{R}^(m\times(k+p)), \mathbf{R}\in\mathbb{R}^((k+p)\times(k+p))\)`.
-(5) `\(s_Q \leftarrow Q^\top 1\)`.
-(6) `\(\mathbf{B_0} \leftarrow Q^\top A: B \in \mathbb{R}^((k+p)\times n)\)`.
-(7) `\(s_B \leftarrow (B_0)^\top \mu\)`.
-(8) For *i* in 1..*q* repeat (power iterations):
-  (a) For *j* in 1..*n* `\(apply(B_(i−1))_(∗j) \leftarrow (B_(i−1))_(∗j)−\mu_j s_Q\)`.
-  (b) `\(\mathbf{Y_i) \leftarrow \mathbf{(AB_(i−1)^\top)−1(s_B−\mu^\top \mu s_Q^\top)}\)`.
-  (c) Column-orthonormalize `\(\mathbf{Y_i} \rightarrow \mathbf{Q}\)` by computing thin decomposition `\(\mathbf{Y_i = QR}\)`.
-  (d) `\(\mathbf{s_Q \leftarrow Q^\top 1}\)`.
-  (e) `\(\mathbf{B_i \leftarrow Q^\top A}\)`.
-  (f) `\(\mathbf{s_B \leftarrow (B_i)^\top \mu}\)`.
-(9) Let `\(\mathbf{C \triangleq s_Q (s_B)^\top}\)`. `\(\mathbf{M \leftarrow B_q (B_q)^\top − C − C^\top + \mu^\top \mu s_Q (s_Q)^\top}\)`.
-(10) Compute an eigensolution of the small symmetric `\(\mathbf{M = \hat{U} \Lambda \hat{U}^\top: M \in \mathbb{R}^((k+p)\times(k+p))}\)`.
-(11) The singular values `\(\Sigma = \Lambda^(\circ 0.5)\)`, or, in other words, `\(\mathbf{\sigma_i= \sqrt{\lambda_i}}\)`.
-(12) If needed, compute `\(\mathbf{U = Q\hat{U}}\)`.
-(13) If needed, compute `\(\mathbf{V = B^\top \hat{U} \Sigma^(−1)}\)`. Another way is `\(\mathbf{V = A^\top U\Sigma^(−1)}\)1.
-(14) If needed, items converted to the PCA space can be computed as `\(\mathbf{U\Sigma}\)`.
+Given an *m* `\(\times\)` *n* matrix `\(\mathbf{A}\)`, a target rank *k*, and an oversampling parameter *p*, this procedure computes a *k*-rank PCA by finding the unknowns in `\(\mathbf{A−1\mu^\top \ge U\Sigma V}\)`:
+1. Create seed for random *n* `\(\times\)` *(k+p)* matrix `\(\Omega\)`.
+2. `\(s_\Omega \leftarrow \Omega^\top \mu\)`.
+3. `\(\mathbf{Y_0} \leftarrow A\Omega − 1(s_\Omega)^\top, Y \in \mathbb{R}^(m\times(k+p))\)`.
+4. Column-orthonormalize `\(\mathbf{Y_0} \rightarrow \mathbf{Q}\)` by computing thin decomposition `\(\mathbf{Y_0} = \mathbf{QR}\)`. Also, `\(\mathbf{Q}\in\mathbb{R}^(m\times(k+p)), \mathbf{R}\in\mathbb{R}^((k+p)\times(k+p))\)`.
+5. `\(s_Q \leftarrow Q^\top 1\)`.
+6. `\(\mathbf{B_0} \leftarrow Q^\top A: B \in \mathbb{R}^((k+p)\times n)\)`.
+7. `\(s_B \leftarrow (B_0)^\top \mu\)`.
+8. For *i* in 1..*q* repeat (power iterations):
+    - For *j* in 1..*n* `\(apply(B_(i−1))_(∗j) \leftarrow (B_(i−1))_(∗j)−\mu_j s_Q\)`.
+    - `\(\mathbf{Y_i) \leftarrow \mathbf{(AB_(i−1)^\top)−1(s_B−\mu^\top \mu s_Q^\top)}\)`.
+    - Column-orthonormalize `\(\mathbf{Y_i} \rightarrow \mathbf{Q}\)` by computing thin decomposition `\(\mathbf{Y_i = QR}\)`.
+    - `\(\mathbf{s_Q \leftarrow Q^\top 1}\)`.
+    - `\(\mathbf{B_i \leftarrow Q^\top A}\)`.
+    - `\(\mathbf{s_B \leftarrow (B_i)^\top \mu}\)`.
+9. Let `\(\mathbf{C \triangleq s_Q (s_B)^\top}\)`. `\(\mathbf{M \leftarrow B_q (B_q)^\top − C − C^\top + \mu^\top \mu s_Q (s_Q)^\top}\)`.
+10. Compute an eigensolution of the small symmetric `\(\mathbf{M = \hat{U} \Lambda \hat{U}^\top: M \in \mathbb{R}^((k+p)\times(k+p))}\)`.
+11. The singular values `\(\Sigma = \Lambda^(\circ 0.5)\)`, or, in other words, `\(\mathbf{\sigma_i= \sqrt{\lambda_i}}\)`.
+12. If needed, compute `\(\mathbf{U = Q\hat{U}}\)`.
+13. If needed, compute `\(\mathbf{V = B^\top \hat{U} \Sigma^(−1)}\)`. Another way is `\(\mathbf{V = A^\top U\Sigma^(−1)}\)1.
+14. If needed, items converted to the PCA space can be computed as `\(\mathbf{U\Sigma}\)`.
 
 ## Implementation
 
@@ -165,4 +165,4 @@ Note the parameter is optional and its d
  
 ## References
 
-[1]: ["Apache Mahout: Beyond MapReduce; Distributed Algorithm Design", Lyubimov and Palumbo](https://www.amazon.com/Apache-Mahout-MapReduce-Dmitriy-Lyubimov/dp/1523775785)
+[1]: Lyubimov and Palumbo, ["Apache Mahout: Beyond MapReduce; Distributed Algorithm Design"](https://www.amazon.com/Apache-Mahout-MapReduce-Dmitriy-Lyubimov/dp/1523775785)