You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Sean Owen (Jira)" <ji...@apache.org> on 2019/09/16 18:32:00 UTC

[jira] [Resolved] (SPARK-19184) Improve numerical stability for method tallSkinnyQR.

     [ https://issues.apache.org/jira/browse/SPARK-19184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-19184.
-------------------------------
    Resolution: Won't Fix

> Improve numerical stability for method tallSkinnyQR.
> ----------------------------------------------------
>
>                 Key: SPARK-19184
>                 URL: https://issues.apache.org/jira/browse/SPARK-19184
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 2.2.0
>            Reporter: Huamin Li
>            Priority: Minor
>              Labels: None
>
> In method tallSkinnyQR, the final Q is calculated by A * inv(R) ([Github Link|https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala#L562]). When the upper triangular matrix R is ill-conditioned, computing the inverse of R can result in catastrophic cancellation. Instead, we should consider using a forward solve for solving Q such that Q * R = A.
> I first create a 4 by 4 RowMatrix A = (1,1,1,1;0,1E-5,0,0;0,0,1E-10,1;0,0,0,1E-14), and then I apply method tallSkinnyQR to A to find RowMatrix Q and Matrix R such that A = Q*R. In this case, A is ill-conditioned and so is R.
> See codes in Spark Shell:
> {code:none}
> import org.apache.spark.mllib.linalg.{Matrices, Vector, Vectors}
> import org.apache.spark.mllib.linalg.distributed.RowMatrix
> // Create RowMatrix A.
> val mat = Seq(Vectors.dense(1,1,1,1), Vectors.dense(0, 1E-5, 1,1), Vectors.dense(0,0,1E-10,1), Vectors.dense(0,0,0,1E-14))
> val denseMat = new RowMatrix(sc.parallelize(mat, 2))
> // Apply tallSkinnyQR to A.
> val result = denseMat.tallSkinnyQR(true)
> // Print the calculated Q and R.
> result.Q.rows.collect.foreach(println)
> result.R
> // Calculate Q*R. Ideally, this should be close to A.
> val reconstruct = result.Q.multiply(result.R)
> reconstruct.rows.collect.foreach(println)
> // Calculate Q'*Q. Ideally, this should be close to the identity matrix.
> result.Q.computeGramianMatrix()
> System.exit(0)
> {code}
> it will output the following results:
> {code:none}
> scala> result.Q.rows.collect.foreach(println)
> [1.0,0.0,0.0,1.5416524685312E13]
> [0.0,0.9999999999999999,0.0,8011776.0]
> [0.0,0.0,1.0,0.0]
> [0.0,0.0,0.0,1.0]
> scala> result.R
> 1.0  1.0     1.0      1.0
> 0.0  1.0E-5  1.0      1.0
> 0.0  0.0     1.0E-10  1.0
> 0.0  0.0     0.0      1.0E-14
> scala> reconstruct.rows.collect.foreach(println)
> [1.0,1.0,1.0,1.15416524685312]
> [0.0,9.999999999999999E-6,0.9999999999999999,1.00000008011776]
> [0.0,0.0,1.0E-10,1.0]
> [0.0,0.0,0.0,1.0E-14]
> scala> result.Q.computeGramianMatrix()
> 1.0                 0.0                 0.0  1.5416524685312E13
> 0.0                 0.9999999999999998  0.0  8011775.999999999
> 0.0                 0.0                 1.0  0.0
> 1.5416524685312E13  8011775.999999999   0.0  2.3766923337289844E26
> {code}
> With forward solve for solving Q such that Q * R = A rather than computing the inverse of R, it will output the following results instead:
> {code:none}
> scala> result.Q.rows.collect.foreach(println)
> [1.0,0.0,0.0,0.0]
> [0.0,1.0,0.0,0.0]
> [0.0,0.0,1.0,0.0]
> [0.0,0.0,0.0,1.0]
> scala> result.R
> 1.0  1.0     1.0      1.0
> 0.0  1.0E-5  1.0      1.0
> 0.0  0.0     1.0E-10  1.0
> 0.0  0.0     0.0      1.0E-14
> scala> reconstruct.rows.collect.foreach(println)
> [1.0,1.0,1.0,1.0]
> [0.0,1.0E-5,1.0,1.0]
> [0.0,0.0,1.0E-10,1.0]
> [0.0,0.0,0.0,1.0E-14]
> scala> result.Q.computeGramianMatrix()
> 1.0  0.0  0.0  0.0
> 0.0  1.0  0.0  0.0
> 0.0  0.0  1.0  0.0
> 0.0  0.0  0.0  1.0
> {code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org