Posted to issues@spark.apache.org by "Seth Hendrickson (JIRA)" <ji...@apache.org> on 2017/07/13 17:39:00 UTC

[jira] [Commented] (SPARK-21405) Add LBFGS solver for GeneralizedLinearRegression

    [ https://issues.apache.org/jira/browse/SPARK-21405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16086071#comment-16086071 ] 

Seth Hendrickson commented on SPARK-21405:
------------------------------------------

cc [~yanboliang] [~actuaryzhang]

I'm happy to work on it, but wanted to get your opinions here. Thoughts?

> Add LBFGS solver for GeneralizedLinearRegression
> ------------------------------------------------
>
>                 Key: SPARK-21405
>                 URL: https://issues.apache.org/jira/browse/SPARK-21405
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 2.3.0
>            Reporter: Seth Hendrickson
>
> GeneralizedLinearRegression in Spark ML currently supports at most 4096 features because it uses IRLS, and hence WLS, as its optimizer, which relies on collecting the covariance matrix to the driver. GLMs can also be fit by simple gradient-based methods like LBFGS.
> The new API from [SPARK-19762|https://issues.apache.org/jira/browse/SPARK-19762] makes this easy to add. I've already prototyped it, and it works pretty well. This change would allow an arbitrary number of features (up to what can fit on a single node) as in Linear/Logistic regression.
> For reference, other GLM packages also support this - e.g. statsmodels, H2O.
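
To illustrate the point in the description, here is a minimal sketch of fitting a GLM from gradients alone. It is not the prototype mentioned in the issue and does not use any Spark API. It assumes a Poisson family with a log link, and it uses plain gradient descent as a stand-in for LBFGS to stay dependency-free. The function names (poisson_nll_and_grad, fit_by_gradient) and the toy data are hypothetical. What matters is that each iteration only needs a loss value and a gradient of length p, rather than a p x p matrix collected in one place.

```python
import math

def poisson_nll_and_grad(beta, X, y):
    """Mean negative log-likelihood (without the log(y!) constant) and its
    gradient for a Poisson GLM with log link: mu_i = exp(x_i . beta)."""
    n = len(X)
    loss = 0.0
    grad = [0.0] * len(beta)
    for xi, yi in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, xi))  # linear predictor
        mu = math.exp(eta)                          # inverse log link
        loss += mu - yi * eta
        for j, v in enumerate(xi):
            grad[j] += (mu - yi) * v
    return loss / n, [g / n for g in grad]

def fit_by_gradient(X, y, step=0.1, iters=2000):
    """Plain gradient descent (assumption: stands in for LBFGS here).
    Only O(p) state is kept per iteration."""
    beta = [0.0] * len(X[0])
    for _ in range(iters):
        _, grad = poisson_nll_and_grad(beta, X, y)
        beta = [b - step * g for b, g in zip(beta, grad)]
    return beta

# Toy data: an intercept column plus one feature, with noise-free
# responses generated from beta = (0.5, 1.0) so the optimum is known.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
X = [[1.0, x] for x in xs]
y = [math.exp(0.5 + 1.0 * x) for x in xs]

beta_hat = fit_by_gradient(X, y)
print(beta_hat)
```

In a distributed setting, the per-row loss and gradient terms would be summed across partitions, and a quasi-Newton optimizer such as LBFGS would replace the fixed-step update.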



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
