You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2015/01/21 07:58:35 UTC

[jira] [Issue Comment Deleted] (SPARK-1262) Numerical drift in computation of matrix inverse leads to invalid results in ALS

     [ https://issues.apache.org/jira/browse/SPARK-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated SPARK-1262:
-----------------------------
    Comment: was deleted

(was: That makes sense. There may still be an interesting point here. What would be interested to know is whether the size of |Ax-B| is consistently larger or smaller for one solver or another. If you find the QR decomposition has smaller differences and gives anecdotally better results, that would be a compelling reason to consider the different decomposition. There was another issue regarding solvePositive and mysteriously near-singular A that might be related.)

> Numerical drift in computation of matrix inverse leads to invalid results in ALS
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-1262
>                 URL: https://issues.apache.org/jira/browse/SPARK-1262
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib
>            Reporter: Michael Allman
>
> In what follows I cannot offer an expert analysis and remedy, however I will describe the problem I'm seeing and the strategy I've used to mitigate it.
> The ALS {{updateBlock()}} method includes a call to {{Solve.solvePositive()}} from JBlas. Generally speaking, when we call {{solvePositive(A, B)}} for symmetric, positive definite {{A}}, this method computes {{x}} from the matrix equation {{Ax = B}}. Or, in other words, it computes {{A}}^-1^{{B}}. As mentioned, one of the preconditions on A is that it be symmetric. In ALS, we call this method on {{fullXtX}} or {{fullXtX.add(YtY.value.get)}}, both of which should be symmetric. However, for implicit ALS and rank > 1, some kind of imprecision in the computation of this inverse tends to make successive values of {{fullXtX.add(YtY.value.get)}} less and less symmetric. From my experience this can and does alter the model produced by ALS significantly, leading to very different and counterintuitive user-item recommendations compared to a solution in which this asymmetry does not exist.
> An approach I've seen taken against this problem from the Oryx codebase is to cast each value in the vector returned from {{Solve.solvePositive()}} to a float. That has worked for me.
> I've also tried alternative implementations of a linear system solver. The solver that Oryx uses, {{RRQRDecomposition}} from commons math v3, seems to lead to better solutions, but still drifts unless the results are cast to floats.
> The Colt numerical computing libraries include a solver called {{QRDecomposition}}, which may be superior to the one used in JBlas. However, I haven't tested it.
> I'm working on a unit test to exercise this bug. I discovered it by "tracing" the behavior of {{ALS.scala}} with various logging statements inserted into the code.
> If nothing else, add a line before the calls to {{Solve.solvePositive()}} in {{updateBlock()}} to validate that {{fullXtX}} or {{fullXtX.add(YtY.value.get)}} are symmetric.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org