Posted to dev@spark.apache.org by Phillip Henry <lo...@gmail.com> on 2022/01/22 10:29:03 UTC

Log likelihood in GeneralizedLinearRegression

Hi,

As far as I know, there is no function to generate the log likelihood from
a GeneralizedLinearRegression model. Are there any plans to implement one?

I've coded my own in PySpark and in testing it agrees with the values we
get from the Python library StatsModels to one part in a million. It's
kinda yucky code as it relies on some inefficient UDFs but I could port it
to Scala.
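
For readers unfamiliar with the quantity in question, here is a minimal sketch of a GLM log likelihood in plain Python. It is not the PySpark code mentioned above. It assumes a Poisson family and that the fitted means are already available, and the function name `poisson_log_likelihood` is hypothetical, chosen only for illustration.

```python
import math

def poisson_log_likelihood(y, mu):
    """Log likelihood of observed counts y given fitted means mu.

    Assumes a Poisson family: each term is y*log(mu) - mu - log(y!).
    math.lgamma(y + 1) gives log(y!) without overflowing for large y.
    """
    return sum(
        yi * math.log(mi) - mi - math.lgamma(yi + 1.0)
        for yi, mi in zip(y, mu)
    )

# Hypothetical observed counts and fitted means, for illustration only.
observed = [0, 1, 2]
fitted = [1.0, 1.0, 2.0]
print(poisson_log_likelihood(observed, fitted))
```

Other families (Gaussian, binomial, gamma) would need their own per-row term, and some also need a dispersion estimate, so this should be read as one case rather than a general implementation.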

Would anybody be interested in me raising a PR and coding an efficient
Scala implementation that can be called from PySpark?

Regards,

Phillip

Re: Log likelihood in GeneralizedLinearRegression

Posted by Sean Owen <sr...@gmail.com>.
This exists in the MulticlassClassificationEvaluator evaluator instead
(which can also be used for the binary case). Does that work?
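
As a rough illustration of how an evaluator metric relates to a log likelihood, assuming the metric meant here is the evaluator's "logLoss": log loss is the mean negative log probability assigned to the true class, so for unweighted rows the log likelihood is -n times the log loss. The sketch below is plain Python using the standard definition, not Spark's implementation. The function names and the `eps` clamp value are assumptions, and it covers only the classification case, not GLM families in general.

```python
import math

def log_loss(labels, probabilities, eps=1e-15):
    """Mean negative log probability of the true class.

    labels: class indices; probabilities: one list of class
    probabilities per row. eps clamps probabilities away from 0 and 1
    so the logarithm stays finite (the value here is an assumption).
    """
    total = 0.0
    for label, probs in zip(labels, probabilities):
        p = min(max(probs[label], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(labels)

def log_likelihood_from_log_loss(loss, n):
    """Recover the unweighted log likelihood from a mean log loss."""
    return -n * loss

# Hypothetical binary example, for illustration only.
labels = [0, 1]
probabilities = [[0.5, 0.5], [0.25, 0.75]]
loss = log_loss(labels, probabilities)
print(log_likelihood_from_log_loss(loss, len(labels)))
```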
