Posted to issues@madlib.apache.org by "Frank McQuillan (JIRA)" <ji...@apache.org> on 2016/04/12 01:11:25 UTC

[jira] [Commented] (MADLIB-907) Prediction Metrics

    [ https://issues.apache.org/jira/browse/MADLIB-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236172#comment-15236172 ] 

Frank McQuillan commented on MADLIB-907:
----------------------------------------

1) This seems like a good set of prediction metrics to start with.  Community members who would like to add more are welcome to create a JIRA for those and work on them.

2) I suggest we include grouping as an optional parameter, since it could be very useful.  That means an output table is the way to go.  Without grouping, an output table holding a single value is not ideal but is acceptable, since a consistent output format is useful.
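
To illustrate the output shape suggested in 2), here is a minimal Python sketch, not MADlib code. The function name, the grouping_key parameter, and the column names are hypothetical and chosen only for illustration. It returns one row per group when grouping is requested, and a single-row table otherwise, so the output format stays the same in both cases.

```python
from collections import defaultdict


def mean_abs_error_table(rows, grouping_key=None):
    """Return a list of output rows (dicts), one per group.

    rows: iterable of dicts with 'actual' and 'predicted' keys.
    grouping_key: optional dict key to group by (hypothetical parameter).
    """
    # group value -> [sum of absolute errors, row count]
    sums = defaultdict(lambda: [0.0, 0])
    for row in rows:
        group = row[grouping_key] if grouping_key is not None else None
        acc = sums[group]
        acc[0] += abs(row["actual"] - row["predicted"])
        acc[1] += 1

    out = []
    for group, (total, count) in sorted(sums.items(), key=lambda kv: str(kv[0])):
        out_row = {"mean_abs_error": total / count}
        if grouping_key is not None:
            # Put the grouping column first, as a grouped table would.
            out_row = {grouping_key: group, **out_row}
        out.append(out_row)
    return out


data = [
    {"region": "east", "actual": 10.0, "predicted": 12.0},
    {"region": "east", "actual": 20.0, "predicted": 17.0},
    {"region": "west", "actual": 5.0, "predicted": 5.5},
]

ungrouped = mean_abs_error_table(data)                      # one-row table
grouped = mean_abs_error_table(data, grouping_key="region")  # one row per region
```

Callers can read the result the same way in both cases: iterate over the rows and pick out the metric column.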


> Prediction Metrics
> ------------------
>
>                 Key: MADLIB-907
>                 URL: https://issues.apache.org/jira/browse/MADLIB-907
>             Project: Apache MADlib
>          Issue Type: New Feature
>          Components: Module: Utilities
>            Reporter: Frank McQuillan
>            Assignee: Orhan Kislal
>             Fix For: v1.9.1
>
>         Attachments: interface_v1.sql, interface_v3.sql
>
>
> Story
> As a data scientist, I want to compute prediction metrics on my data, so that I can gauge model accuracy based on predicted values vs. actual values.
> 1)  The PDL Tools module "Prediction Metrics" [1] is an example of what could be ported to MADlib.  Source code is located at [2].
> 2) Here is functionality from PDL Tools to use as a starting point:
>  	mf_mae
>  	Mean Absolute Error. 
>  
>  	mf_mape
>  	Mean Absolute Percentage Error. 
>  
>  	mf_mpe
>  	Mean Percentage Error. 
>  
>  	mf_rmse
>  	Root Mean Square Error. 
>  
>  	mf_r2
>  	R-squared. 
>  
>  	mf_adjusted_r2
>  	Adjusted R-squared. 
>  
>  	mf_binary_classifier
>  	Metrics for binary classification. 
>  
>  	mf_auc
>  	Area under the ROC curve (in binary classification). 
>  
>  	mf_confusion_matrix
>  	Confusion matrix for a multi-class classifier. 
> References
> [1]  PDL Tools Prediction Metrics module
> http://pivotalsoftware.github.io/PDLTools/group__grp__prediction__metrics.html
> [2] PDL Tools source code
> https://github.com/pivotalsoftware/PDLTools
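
For reference, the regression-style metrics in the list above can be sketched in plain Python as follows. This uses the common textbook definitions and is an assumption, not taken from the PDL Tools source. In particular, the sign convention for mean percentage error (actual minus predicted, divided by actual) and the num_predictors argument used for adjusted R-squared are illustrative choices, and the function name is hypothetical.

```python
import math


def regression_metrics(actual, predicted, num_predictors=None):
    """Compute common regression metrics from paired actual/predicted values.

    Assumes no actual value is zero (needed for the percentage errors)
    and that the actual values are not all identical (needed for R-squared).
    """
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]

    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    # Percentage errors are expressed relative to the actual value.
    mpe = 100.0 / n * sum(r / a for r, a in zip(residuals, actual))
    mape = 100.0 / n * sum(abs(r / a) for r, a in zip(residuals, actual))

    mean_actual = sum(actual) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot

    metrics = {"mae": mae, "rmse": rmse, "mpe": mpe, "mape": mape, "r2": r2}
    if num_predictors is not None:
        # Adjusted R-squared penalizes the number of predictors in the model.
        metrics["adjusted_r2"] = 1.0 - (1.0 - r2) * (n - 1) / (n - num_predictors - 1)
    return metrics


result = regression_metrics(
    actual=[2.0, 4.0, 5.0, 8.0],
    predicted=[2.5, 3.5, 5.0, 7.0],
    num_predictors=1,
)
```

The classification metrics in the list (binary classifier metrics, AUC, confusion matrix) are left out of this sketch.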


