Posted to issues@spark.apache.org by "Imran Younus (JIRA)" <ji...@apache.org> on 2016/03/09 20:05:41 UTC

[jira] [Created] (SPARK-13777) Weighted Least Squares fails when there are features with identical values.

Imran Younus created SPARK-13777:
------------------------------------

             Summary: Weighted Least Squares fails when there are features with identical values.
                 Key: SPARK-13777
                 URL: https://issues.apache.org/jira/browse/SPARK-13777
             Project: Spark
          Issue Type: Bug
          Components: ML
            Reporter: Imran Younus
            Priority: Minor


The "normal" solver in LinearRegression uses Cholesky decomposition to calculate the coefficients. If the data has features with identical values (zero variance), then the (A^T A) matrix is no longer positive definite and the Cholesky decomposition fails.
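
The failure can be reproduced outside Spark with a few lines of plain Python. This is only a sketch, not the Spark code path: it assumes the features are centered before A^T A is formed (a common way to handle the intercept), and the helper names gram, center and cholesky are made up for illustration.

```python
import math

def center(column):
    """Subtract the mean from a feature column."""
    mean = sum(column) / len(column)
    return [x - mean for x in column]

def gram(columns):
    """Return A^T A for a matrix given as a list of feature columns."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in columns]
            for ci in columns]

def cholesky(m, tol=1e-12):
    """Textbook Cholesky; raises ValueError when a pivot is not positive."""
    n = len(m)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = m[i][i] - s
                if pivot <= tol:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (m[i][j] - s) / lower[j][j]
    return lower

x1 = [1.0, 2.0, 3.0, 4.0]   # ordinary feature
x2 = [5.0, 5.0, 5.0, 5.0]   # constant feature (zero variance)

good = gram([center(x1)])              # 1x1 matrix, positive definite
bad = gram([center(x1), center(x2)])   # second row and column are all zeros

cholesky(good)                         # succeeds
try:
    cholesky(bad)
    failed = False
except ValueError:
    failed = True                      # zero pivot for the constant feature
```

The constant column centers to all zeros, so its diagonal entry in A^T A is zero and the factorization stops at that pivot.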

For the same case, the "l-bfgs" solver sets the coefficients of these constant features to zero and produces valid coefficients for the rest of the features. This behaviour is consistent with glmnet in R. The "normal" solver should do the same.
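
One possible way to get that behaviour from a normal-equation solver is sketched below in plain Python. It is an illustration under stated assumptions, not a proposed patch: all weights are taken to be 1, the intercept is handled by centering and is not returned, and fit_normal and solve_spd are hypothetical helper names. Constant features are detected by their zero variance, left out of the Cholesky solve, and given a coefficient of zero.

```python
import math

def center(column):
    """Subtract the mean from a column."""
    mean = sum(column) / len(column)
    return [x - mean for x in column]

def solve_spd(m, b):
    """Solve m x = b via Cholesky; m must be positive definite."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    # Forward substitution: low * y = b
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    # Back substitution: low^T * x = y
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x

def fit_normal(columns, labels, tol=1e-12):
    """Least squares via normal equations; constant features get coefficient 0."""
    y = center(labels)
    centered = [center(c) for c in columns]
    # Keep only features with non-zero variance.
    keep = [i for i, c in enumerate(centered) if sum(v * v for v in c) > tol]
    ata = [[sum(a * b for a, b in zip(centered[i], centered[j])) for j in keep]
           for i in keep]
    aty = [sum(a * b for a, b in zip(centered[i], y)) for i in keep]
    solved = solve_spd(ata, aty) if keep else []
    coefs = [0.0] * len(columns)
    for i, c in zip(keep, solved):
        coefs[i] = c
    return coefs

x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [5.0, 5.0, 5.0, 5.0]       # constant feature
labels = [3.0, 5.0, 7.0, 9.0]   # labels = 2 * x1 + 1
coefs = fit_normal([x1, x2], labels)
```

Here coefs has one entry per input feature, with the entry for x2 fixed at zero, which mirrors what the issue describes for "l-bfgs".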

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
