Posted to issues@commons.apache.org by "Phil Steitz (JIRA)" <ji...@apache.org> on 2012/09/08 18:00:08 UTC

[jira] [Commented] (MATH-857) Include a VIF and TOLERANCE check for a 2 dimensional double array, to determine variables that cause multi-colinearity issues and should be excluded from the models

    [ https://issues.apache.org/jira/browse/MATH-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451355#comment-13451355 ] 

Phil Steitz commented on MATH-857:
----------------------------------

This is a great start.  I suggest renaming the class to Multicollinearity and putting it in the regression package. Per Gilles' comments on the mailing list, we also need some tests.  Validating the test cases against R or some other package is also desirable; there is an R test framework in /src/test/R that can be used for this.  Ask on the mailing list or via private email if you need help getting set up to generate checkstyle reports, etc.
                
> Include a VIF and TOLERANCE check for a 2 dimensional double array, to determine variables that cause multi-colinearity issues and should be excluded from the models
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MATH-857
>                 URL: https://issues.apache.org/jira/browse/MATH-857
>             Project: Commons Math
>          Issue Type: New Feature
>    Affects Versions: 3.0
>         Environment: can apply to all operating systems
>            Reporter: Marios Michaelidis
>            Priority: Minor
>              Labels: build, test
>             Fix For: 3.1
>
>         Attachments: VIF_Tolerance.txt
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Multicollinearity is a statistical phenomenon in which two or more predictor variables in a multiple regression model are highly correlated. Tolerance and VIF (variance inflation factor) are checks that help avoid optimization failures due to "inability to converge". Most of the time, the major packages (SAS, SPSS, etc.) run such a check before fitting the model and exclude variables that might cause these kinds of problems. It would be quite a useful tool to have in Commons Math.
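
For readers unfamiliar with the two checks, here is a minimal sketch of what they compute. It is not the code in the attached VIF_Tolerance.txt (which is not reproduced in this message) and it does not use any Commons Math API; the class name VifSketch and its method names are hypothetical. It assumes the usual definitions: for predictor j, tolerance = 1 - R_j^2 and VIF = 1 / tolerance, where R_j^2 comes from regressing column j on the other columns. The sketch relies on the identity that VIF_j equals the j-th diagonal entry of the inverse of the predictors' correlation matrix.

```java
/** Hypothetical illustration only; not part of Commons Math. */
final class VifSketch {

    /** VIF per column; rows are observations, columns are predictors. */
    static double[] vif(double[][] data) {
        double[][] inv = invert(correlationMatrix(data));
        double[] result = new double[inv.length];
        for (int j = 0; j < inv.length; j++) {
            result[j] = inv[j][j]; // diagonal of the inverse correlation matrix
        }
        return result;
    }

    /** Tolerance per column, i.e. 1 / VIF = 1 - R_j^2. */
    static double[] tolerance(double[][] data) {
        double[] v = vif(data);
        double[] result = new double[v.length];
        for (int j = 0; j < v.length; j++) {
            result[j] = 1.0 / v[j];
        }
        return result;
    }

    /** Pearson correlation matrix of the columns of data. */
    static double[][] correlationMatrix(double[][] data) {
        int n = data.length;
        int p = data[0].length;
        double[] mean = new double[p];
        for (double[] row : data) {
            for (int j = 0; j < p; j++) {
                mean[j] += row[j] / n;
            }
        }
        // Sums of cross-products of deviations from the column means.
        double[][] c = new double[p][p];
        for (double[] row : data) {
            for (int j = 0; j < p; j++) {
                for (int k = 0; k < p; k++) {
                    c[j][k] += (row[j] - mean[j]) * (row[k] - mean[k]);
                }
            }
        }
        double[][] r = new double[p][p];
        for (int j = 0; j < p; j++) {
            for (int k = 0; k < p; k++) {
                r[j][k] = c[j][k] / Math.sqrt(c[j][j] * c[k][k]);
            }
        }
        return r;
    }

    /** Gauss-Jordan inversion with partial pivoting. */
    static double[][] invert(double[][] m) {
        int p = m.length;
        double[][] a = new double[p][2 * p];
        for (int i = 0; i < p; i++) {
            System.arraycopy(m[i], 0, a[i], 0, p);
            a[i][p + i] = 1.0; // augment with the identity matrix
        }
        for (int col = 0; col < p; col++) {
            int pivot = col;
            for (int r = col + 1; r < p; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) {
                    pivot = r;
                }
            }
            // The 1e-12 threshold is an arbitrary choice for this sketch.
            if (Math.abs(a[pivot][col]) < 1e-12) {
                throw new ArithmeticException("perfectly collinear predictors");
            }
            double[] tmp = a[col];
            a[col] = a[pivot];
            a[pivot] = tmp;
            double d = a[col][col];
            for (int k = 0; k < 2 * p; k++) {
                a[col][k] /= d;
            }
            for (int r = 0; r < p; r++) {
                if (r == col) {
                    continue;
                }
                double f = a[r][col];
                for (int k = 0; k < 2 * p; k++) {
                    a[r][k] -= f * a[col][k];
                }
            }
        }
        double[][] inv = new double[p][p];
        for (int i = 0; i < p; i++) {
            System.arraycopy(a[i], p, inv[i], 0, p);
        }
        return inv;
    }
}
```

A caller would pass the design matrix (without an intercept column) to VifSketch.vif and drop or review any column whose VIF exceeds a chosen cutoff. The cutoff is a modelling decision and is not fixed by this issue.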
