Posted to issues@spark.apache.org by "Wayne Zhang (JIRA)" <ji...@apache.org> on 2017/01/07 05:18:58 UTC
[jira] [Closed] (SPARK-18929) Add Tweedie distribution in GLM
[ https://issues.apache.org/jira/browse/SPARK-18929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wayne Zhang closed SPARK-18929.
-------------------------------
Resolution: Unresolved
> Add Tweedie distribution in GLM
> -------------------------------
>
> Key: SPARK-18929
> URL: https://issues.apache.org/jira/browse/SPARK-18929
> Project: Spark
> Issue Type: New Feature
> Components: ML
> Reporter: Wayne Zhang
> Assignee: Wayne Zhang
> Labels: features
> Original Estimate: 72h
> Remaining Estimate: 72h
>
> I propose to add the full Tweedie family into the GeneralizedLinearRegression model. The Tweedie family is characterized by a power variance function: the variance is proportional to the mean raised to the power variancePower. The currently supported Gaussian, Poisson and Gamma families are special cases of the [Tweedie|https://en.wikipedia.org/wiki/Tweedie_distribution] family (variancePower = 0, 1 and 2, respectively).
> I propose to add support for the other distributions:
> * compound Poisson: 1 < variancePower < 2. Widely used to model non-negative continuous data with a point mass at zero.
> * positive stable: variancePower > 2 and variancePower != 3. Used to model extreme values.
> * inverse Gaussian: variancePower = 3.
> The Tweedie family is supported in many statistical packages, such as R (statmod), SAS and H2O.
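
For readers who want the variancePower ranges above in executable form, here is a minimal Python sketch. It is not Spark code and not the proposed implementation: the function names `tweedie_member` and `tweedie_variance` and the `dispersion` parameter are hypothetical, chosen only for illustration. It assumes the standard power variance function Var(Y) = dispersion * mu^variancePower.

```python
def tweedie_member(variance_power):
    """Name the Tweedie family member for a given variance power.

    Hypothetical helper: it only covers the cases listed in the issue
    (plus the three families already supported).
    """
    p = variance_power
    if p == 0:
        return "gaussian"
    if p == 1:
        return "poisson"
    if p == 2:
        return "gamma"
    if p == 3:
        return "inverse gaussian"
    if 1 < p < 2:
        return "compound poisson"
    if p > 2:
        return "positive stable"
    # 0 < p < 1 corresponds to no Tweedie distribution; p < 0 is outside
    # the scope of the issue, so this sketch rejects both.
    raise ValueError("variance power %r is not covered by this sketch" % (p,))


def tweedie_variance(mu, variance_power, dispersion=1.0):
    """Power variance function: Var(Y) = dispersion * mu ** variance_power."""
    if variance_power != 0 and mu <= 0:
        # Every member except the Gaussian needs a positive mean here.
        raise ValueError("mu must be positive when variance_power != 0")
    return dispersion * mu ** variance_power


if __name__ == "__main__":
    for p in (0, 1, 1.5, 2, 2.5, 3):
        print(p, tweedie_member(p), tweedie_variance(2.0, p))
```

The exact-equality checks on 0, 1, 2 and 3 are a simplification; a real implementation would likely need a tolerance or explicit special-casing for those boundary values.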
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)