Posted to issues@commons.apache.org by "Anders Conbere (JIRA)" <ji...@apache.org> on 2014/07/27 23:27:38 UTC
[jira] [Created] (MATH-1140) Incorrect result from
MannWhitneyUTest#mannWhitneyUTest with large datasets
Anders Conbere created MATH-1140:
------------------------------------
Summary: Incorrect result from MannWhitneyUTest#mannWhitneyUTest with large datasets
Key: MATH-1140
URL: https://issues.apache.org/jira/browse/MATH-1140
Project: Commons Math
Issue Type: Bug
Affects Versions: 3.3
Reporter: Anders Conbere
Priority: Minor
On large datasets, MannWhitneyUTest#mannWhitneyUTest returns the double value 0.0 instead of the correct p-value. I suspect this is an overflow, but I haven't been able to track it down yet.
I'm afraid I'm not very good at Java, but I'm including a link to a public repository where you can reproduce the issue. Unfortunately, my implementation is written in Clojure.
https://github.com/aconbere/apache-commons-mann-whitney-bug
In short, calling MannWhitneyUTest#mannWhitneyUTest with two randomly generated arrays (50k elements with a max value of 300) reliably returns 0.0. Reducing the size to something more modest, like 2k elements, gives correct p-values.
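To illustrate the kind of overflow I suspect, here is a minimal standalone Java sketch. It does not call Commons Math and is not taken from its source. The class and method names are hypothetical, and it only assumes that the sample-size product n1 * n2, which appears in the normal approximation for U, is computed somewhere in 32-bit int arithmetic. Under that assumption, 2k elements fit comfortably, while 50k elements wrap around to a negative number.

```java
public class UProductOverflowSketch {

    // Hypothetical helper: product of the two sample sizes in 32-bit int arithmetic.
    // Java does not raise an error here; the result silently wraps past Integer.MAX_VALUE.
    static int productAsInt(int n1, int n2) {
        return n1 * n2;
    }

    // Same product, but widened to long before multiplying, so it cannot wrap
    // for any pair of int sample sizes.
    static long productAsLong(int n1, int n2) {
        return (long) n1 * n2;
    }

    // Variance of U under the normal approximation (no tie correction):
    // n1 * n2 * (n1 + n2 + 1) / 12. The product is passed in so that the
    // effect of a wrapped value can be compared with the exact one.
    static double varianceOfU(long n1n2, int n1, int n2) {
        return n1n2 * (double) (n1 + n2 + 1) / 12.0;
    }

    public static void main(String[] args) {
        // The two dataset sizes mentioned in the report.
        for (int n : new int[] {2_000, 50_000}) {
            int wrapped = productAsInt(n, n);
            long exact = productAsLong(n, n);
            System.out.printf(
                "n=%d  int product=%d  long product=%d  var(int)=%.3e  var(long)=%.3e%n",
                n, wrapped, exact,
                varianceOfU(wrapped, n, n),
                varianceOfU(exact, n, n));
        }
    }
}
```

With 50k elements the exact product is 2,500,000,000, which exceeds Integer.MAX_VALUE (2,147,483,647), so the int version goes negative and any variance derived from it is negative as well. Whether this is what actually happens inside MannWhitneyUTest is a guess on my part.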
--
This message was sent by Atlassian JIRA
(v6.2#6252)