Posted to issues@spark.apache.org by "Wayne Zhang (JIRA)" <ji...@apache.org> on 2017/01/07 05:18:58 UTC

[jira] [Closed] (SPARK-18929) Add Tweedie distribution in GLM

     [ https://issues.apache.org/jira/browse/SPARK-18929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wayne Zhang closed SPARK-18929.
-------------------------------
    Resolution: Unresolved

> Add Tweedie distribution in GLM
> -------------------------------
>
>                 Key: SPARK-18929
>                 URL: https://issues.apache.org/jira/browse/SPARK-18929
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>            Reporter: Wayne Zhang
>            Assignee: Wayne Zhang
>              Labels: features
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> I propose adding the full Tweedie family to the GeneralizedLinearRegression model. The Tweedie family is characterized by a power variance function: the variance is proportional to the mean raised to the power variancePower. The currently supported Gaussian, Poisson and Gamma families are special cases of the [Tweedie family|https://en.wikipedia.org/wiki/Tweedie_distribution] (variancePower = 0, 1 and 2, respectively).
> I propose adding support for the remaining distributions in the family:
> * compound Poisson: 1 < variancePower < 2. Widely used to model zero-inflated continuous data.
> * positive stable: variancePower > 2 and variancePower != 3. Used to model extreme values.
> * inverse Gaussian: variancePower = 3.
> The Tweedie family is supported in most statistical packages, such as R (statmod), SAS and H2O.
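
The mapping from variancePower to distribution described above can be sketched in a few lines of Python. This is only an illustration of the power variance function and the ranges listed in the issue, not the Spark implementation: the function names `tweedie_variance` and `tweedie_family_name` are hypothetical, and values of variancePower that the issue does not mention are simply rejected.

```python
def tweedie_variance(mu: float, variance_power: float) -> float:
    """Power variance function V(mu) = mu ** variance_power.

    Hypothetical helper for illustration; not part of any Spark API.
    """
    # Every case except the Gaussian one (variance_power == 0) needs a positive mean.
    if variance_power != 0 and mu <= 0:
        raise ValueError("mu must be positive when variance_power != 0")
    return mu ** variance_power


def tweedie_family_name(variance_power: float) -> str:
    """Name the family for a given variance power, following the issue text."""
    p = variance_power
    if p == 0:
        return "gaussian"
    if p == 1:
        return "poisson"
    if p == 2:
        return "gamma"
    if p == 3:
        return "inverse gaussian"
    if 1 < p < 2:
        return "compound poisson"
    if p > 2:
        return "positive stable"
    # Negative powers and the open interval (0, 1) are not discussed in the
    # issue, so this sketch rejects them instead of guessing.
    raise ValueError(f"variance power {p} is not covered by this sketch")


if __name__ == "__main__":
    for p in (0, 1, 1.5, 2, 2.5, 3):
        print(p, tweedie_family_name(p), tweedie_variance(2.0, p))
```

The exact-equality checks come first so that the named special cases (0, 1, 2, 3) take precedence over the open ranges around them.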


