Posted to issues@commons.apache.org by "devl (JIRA)" <ji...@apache.org> on 2012/07/10 21:00:41 UTC

[jira] [Created] (MATH-814) Kendalls Tau Implementation

devl created MATH-814:
-------------------------

             Summary: Kendalls Tau Implementation
                 Key: MATH-814
                 URL: https://issues.apache.org/jira/browse/MATH-814
             Project: Commons Math
          Issue Type: New Feature
    Affects Versions: 4.0
         Environment: All
            Reporter: devl
             Fix For: 4.0


Implement Kendall's Tau, a measure of association (correlation) between two sets of ranked ordinal data.

A basic description is available at http://en.wikipedia.org/wiki/Kendall_tau_rank_correlation_coefficient; however, the test implementation will follow the definition given in "Handbook of Parametric and Nonparametric Statistical Procedures, Fifth Edition, Page 1393 Test 30, ISBN-10: 1439858012 | ISBN-13: 978-1439858011."

The proposed algorithm is as follows.

The input is two rankings (permutations) represented by a 2D matrix: each column holds one ranking (e.g. by an individual) and each row holds the ranks given to one observation. The algorithm counts the concordant pairs of ranks and the discordant pairs of ranks between the two columns, then calculates tau, defined as

tau = (number of concordant pairs - number of discordant pairs) / (n(n-1)/2)

where n is the number of observations and n(n-1)/2 is the total number of possible pairs of ranks.

The method then outputs a tau value between -1 and 1, where 1 signifies a "perfect" correlation between the two ranked lists and -1 signifies that one ranking is the exact reverse of the other.
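
As a rough illustration of the calculation described above, here is a minimal Java sketch that compares every pair of observations directly. The class name KendallTauSketch and the method tau are hypothetical, chosen for this example only, and are not part of Commons Math. The sketch takes the two rankings as two arrays rather than as columns of a matrix, and it leaves pairs that are tied in either ranking out of both counts.

```java
public final class KendallTauSketch {

    private KendallTauSketch() { }

    /**
     * Computes tau = (concordant - discordant) / (n(n-1)/2) by comparing
     * every pair of observations, which takes O(n^2) comparisons.
     * Hypothetical helper for illustration; not a Commons Math API.
     */
    public static double tau(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("rankings must have the same length");
        }
        final int n = x.length;
        if (n < 2) {
            throw new IllegalArgumentException("at least two observations are required");
        }
        long concordant = 0;
        long discordant = 0;
        for (int i = 0; i < n - 1; i++) {
            for (int j = i + 1; j < n; j++) {
                // Positive product: both rankings order the pair the same way.
                // Negative product: the rankings disagree on the pair.
                final int sign = Double.compare(x[i], x[j]) * Double.compare(y[i], y[j]);
                if (sign > 0) {
                    concordant++;
                } else if (sign < 0) {
                    discordant++;
                }
                // sign == 0: tied in at least one ranking, counted as neither.
            }
        }
        final double totalPairs = n * (n - 1.0) / 2.0;
        return (concordant - discordant) / totalPairs;
    }
}
```

For example, the rankings {1, 2, 3, 4, 5} and {3, 4, 1, 2, 5} have 6 concordant and 4 discordant pairs out of 10, so this sketch returns (6 - 4) / 10 = 0.2.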

Where a pair of observations is tied within a ranking, the pair is counted as neither concordant nor discordant. An optional merge sort can be used to speed up the implementation; details are on the Wikipedia page.
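
One way the merge sort speed-up could look is sketched below, under the simplifying assumption that neither ranking contains ties. The observations are ordered by the first ranking, and the inversions left in the second ranking are counted during a merge sort; each inversion is one discordant pair. The class name KendallTauMergeSketch and its methods are hypothetical, the lambda requires Java 8 or later, and a version that handles ties would need extra bookkeeping that is not shown here.

```java
import java.util.Arrays;
import java.util.Comparator;

public final class KendallTauMergeSketch {

    private KendallTauMergeSketch() { }

    /**
     * Tau for two rankings WITHOUT ties (assumption of this sketch),
     * in O(n log n) time. Not a Commons Math API.
     */
    public static double tau(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two rankings of equal length >= 2");
        }
        final int n = x.length;
        // Order the observation indices by their rank in x.
        final Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> x[i]));
        // Read y in that order; every inversion left in it is a discordant pair.
        final double[] ys = new double[n];
        for (int i = 0; i < n; i++) {
            ys[i] = y[order[i]];
        }
        final long discordant = countInversions(ys, new double[n], 0, n);
        final double totalPairs = n * (n - 1.0) / 2.0;
        // With no ties, concordant = totalPairs - discordant.
        return (totalPairs - 2.0 * discordant) / totalPairs;
    }

    /** Sorts a[lo, hi) by merge sort and returns the number of inversions found. */
    private static long countInversions(double[] a, double[] buf, int lo, int hi) {
        if (hi - lo < 2) {
            return 0;
        }
        final int mid = (lo + hi) >>> 1;
        long count = countInversions(a, buf, lo, mid) + countInversions(a, buf, mid, hi);
        int i = lo;
        int j = mid;
        int k = lo;
        while (i < mid && j < hi) {
            if (a[i] <= a[j]) {
                buf[k++] = a[i++];
            } else {
                // a[j] is smaller than every element still waiting in the left half.
                count += mid - i;
                buf[k++] = a[j++];
            }
        }
        while (i < mid) {
            buf[k++] = a[i++];
        }
        while (j < hi) {
            buf[k++] = a[j++];
        }
        System.arraycopy(buf, lo, a, lo, hi - lo);
        return count;
    }
}
```

On tie-free input this should agree with the direct pairwise count: for {1, 2, 3, 4, 5} and {3, 4, 1, 2, 5} there are 4 inversions, giving (10 - 8) / 10 = 0.2.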

Although this implementation is not particularly complex, it would be useful to have it in a consistent format in the Commons Math package alongside the existing correlation tests. Kendall's Tau is used effectively to compare product rankings, search engine rankings, or measurements from engineering equipment.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MATH-814) Kendalls Tau Implementation

Posted by "devl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13410709#comment-13410709 ] 

devl commented on MATH-814:
---------------------------

Initial feedback from Phil Steitz:

I think a Kendall's Tau implementation would make a great addition to
the correlation package (o.a.c.math3.stat.correlation).  Here is how
you can get started:

0) Get yourself set up to build commons math and run the unit
tests.  If you are familiar with maven, this should not be too
hard.  If you have any questions or run into problems checking out
the sources, building locally, etc., don't hesitate to ask.
1) Look at the Spearman's implementation and the ranking classes in
the stat.ranking package.  That might give you some ideas on how to
implement Kendall's Tau consistently.
2) Open a JIRA ticket with the info above and start attaching
patches implementing the new implementation class and associated
test class.  Run "mvn site" or checkstyle standalone to make sure
your contributed code follows the style guidelines we use.
3) Be patient but persistent and we will get Kendall's Tau into
commons math :)
                
> Kendalls Tau Implementation
> ---------------------------
>
>                 Key: MATH-814
>                 URL: https://issues.apache.org/jira/browse/MATH-814
>             Project: Commons Math
>          Issue Type: New Feature
>    Affects Versions: 4.0
>         Environment: All
>            Reporter: devl
>              Labels: correlation, rank
>             Fix For: 4.0
>
>   Original Estimate: 840h
>  Remaining Estimate: 840h
