Posted to issues@commons.apache.org by "Amar Prakash Pandey (Jira)" <ji...@apache.org> on 2021/06/01 17:58:00 UTC

[jira] [Commented] (MATH-1453) Mann-Whitney U Test returns maximum of U1 and U2

    [ https://issues.apache.org/jira/browse/MATH-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355272#comment-17355272 ] 

Amar Prakash Pandey commented on MATH-1453:
-------------------------------------------

[~erans], you have closed the issue, but in an earlier comment you asked:

> Will you look into porting the improvements referred to in an earlier comment?

Should I still work on that, or not?

> Mann-Whitney U Test returns maximum of U1 and U2
> ------------------------------------------------
>
>                 Key: MATH-1453
>                 URL: https://issues.apache.org/jira/browse/MATH-1453
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 3.6.1
>            Reporter: Nikos Katsipoulakis
>            Assignee: Amar Prakash Pandey
>            Priority: Critical
>             Fix For: 4.0
>
>         Attachments: 1453.diff
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> I need to use the Mann-Whitney U test and found that Apache Commons Math implements it. The [Wikipedia article|https://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test] linked from the Javadoc states that the U statistic of this test is the minimum of U1 and U2. However, the {{MannWhitneyUTest.mannWhitneyU()}} method in Apache Commons Math returns the maximum of U1 and U2. The code of this method is the following:
>  
> {code:java}
> public double mannWhitneyU(double[] x, double[] y) throws NullArgumentException, NoDataException {
>   this.ensureDataConformance(x, y);
>   double[] z = this.concatenateSamples(x, y);
>   double[] ranks = this.naturalRanking.rank(z);
>   double sumRankX = 0.0D;
>   for(int i = 0; i < x.length; ++i) {
>     sumRankX += ranks[i];
>   }
>   double U1 = sumRankX - (double)((long)x.length * (long)(x.length + 1) / 2L);
>   double U2 = (double)((long)x.length * (long)y.length) - U1;
>   return FastMath.max(U1, U2);
> }
> {code}
> The Javadoc also states that the maximum of U1 and U2 is returned.
>  
> My question is: why does Apache Commons Math return the maximum of those two values, whereas all other sources I found online return the minimum? If this is not wrong, shouldn't the Javadoc be updated to cite a source that justifies returning the maximum U?
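
To make the difference between the two conventions concrete, here is a minimal, self-contained sketch that computes U1 and U2 in the same way as the method quoted above and prints both the minimum and the maximum. It does not use Commons Math: the class name {{MannWhitneySketch}} and the helpers {{averageRanks}} and {{uStatistics}} are hypothetical names introduced for illustration, and ties are assumed to share their average rank. Because U1 + U2 = n1 * n2, either value can be derived from the other, so the question raised in the issue is about which convention is returned and documented.

```java
import java.util.Arrays;

public class MannWhitneySketch {

    /** 1-based ranks of z; tied values share their average rank (an assumption of this sketch). */
    public static double[] averageRanks(double[] z) {
        Integer[] order = new Integer[z.length];
        for (int i = 0; i < z.length; i++) {
            order[i] = i;
        }
        // Sort the indices by the values they point to.
        Arrays.sort(order, (a, b) -> Double.compare(z[a], z[b]));

        double[] ranks = new double[z.length];
        int i = 0;
        while (i < order.length) {
            // Find the end of the run of equal values starting at sorted position i.
            int j = i;
            while (j + 1 < order.length && z[order[j + 1]] == z[order[i]]) {
                j++;
            }
            // Sorted positions i..j (0-based) correspond to ranks i+1..j+1; use their mean.
            double avg = (i + j) / 2.0 + 1.0;
            for (int k = i; k <= j; k++) {
                ranks[order[k]] = avg;
            }
            i = j + 1;
        }
        return ranks;
    }

    /** Returns {U1, U2}, where U1 is derived from the rank sum of sample x. */
    public static double[] uStatistics(double[] x, double[] y) {
        double[] z = new double[x.length + y.length];
        System.arraycopy(x, 0, z, 0, x.length);
        System.arraycopy(y, 0, z, x.length, y.length);

        double[] ranks = averageRanks(z);
        double sumRankX = 0.0;
        for (int i = 0; i < x.length; i++) {
            sumRankX += ranks[i];
        }
        double u1 = sumRankX - x.length * (x.length + 1) / 2.0;
        double u2 = (double) x.length * y.length - u1;
        return new double[] {u1, u2};
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3};
        double[] y = {4, 5, 6, 7};
        double[] u = uStatistics(x, y);
        System.out.println("U1 = " + u[0] + ", U2 = " + u[1]);
        System.out.println("min(U1, U2) = " + Math.min(u[0], u[1]));
        System.out.println("max(U1, U2) = " + Math.max(u[0], u[1]));
    }
}
```

For these fully separated samples, U1 is 0 and U2 is 12, so the minimum convention yields 0 and the maximum convention yields 12.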


