Posted to dev@commons.apache.org by Mikkel Meyer Andersen <mi...@mikl.dk> on 2010/11/07 16:10:31 UTC

Re: Kolmogorov distribution WASF Re: [jira] Commented: (MATH-431) New tests: Wilcoxon signed-rank test and Mann-Whitney U

2010/11/7 Phil Steitz <ph...@gmail.com>:
> Switching to the right list...
>
> -
>>>>
>>>> What we need there is a good algorithm for approximating the KS
>>>> distribution.  I have been corresponding with the author of a very good
>>>> one
>>>> with a Java implementation but have thus far failed in getting consent
>>>> to
>>>> release under ASL.  So at this point, I am looking for an alternative
>>>> good
>>>> algorithm to implement.  All suggestions / unencumbered patches welcome!
>>>>
>>>> See comments on the MATH-431 for other questions.
>>>>
>>> Just to be sure of what you mean:
>>> Do you want to have a two-sample Kolmogorov-Smirnov test for equality
>>> of distributions in addition to the Mann-Whitney? Or do you need the
>>> Kolmogorov-Smirnov distribution (as stated for example at
>>>
>>>
>>> http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test#Kolmogorov_distribution
>>> ) in regards to the MATH-428? Sorry, but I'm a bit confused :-).
>>
>> The goal is to implement the KS test for equality of distributions (or
>> homogeneity against a reference distribution).  To do that we need at
>> least
>> critical values of the Kolmogorov distribution.  The natural way for us to
>> do that would be to implement the full distribution which would be nice to
>> have in the distributions package.
>>
>> Phil
>
> Have you read "Evaluating Kolmogorov’s Distribution" by Marsaglia et
> al. available on http://www.jstatsoft.org/v08/i18/paper ? And do you
> think their approach would be the way to go?
>
> I am not sure it is best.  See the comments here:
> http://www.iro.umontreal.ca/~lecuyer/myftp/papers/ksdist.pdf
>
> Phil
Thanks. It looks quite thorough, indeed. Was it the Java
implementation you didn't get consent to release under ASL?
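
For reference, here is a minimal sketch of the limiting Kolmogorov distribution from the Wikipedia link above, K(x) = 1 - 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 x^2). It is only the asymptotic form, not the exact finite-n algorithm from the Marsaglia et al. paper, and the class and method names are made up for illustration; nothing here is existing Commons Math API.

```java
public final class KolmogorovLimitSketch {

    private KolmogorovLimitSketch() {
    }

    /**
     * Limiting Kolmogorov CDF, evaluated by direct summation of the
     * alternating series. Hypothetical helper, for illustration only.
     */
    public static double cdf(double x) {
        if (x <= 0) {
            return 0.0;
        }
        double sum = 0.0;
        double sign = 1.0;
        // The series converges slowly for small x, so the iteration cap is generous.
        for (int k = 1; k <= 100000; k++) {
            double term = Math.exp(-2.0 * k * k * x * x);
            sum += sign * term;
            if (term < 1e-16) {
                break;
            }
            sign = -sign;
        }
        // Clamp to [0, 1] to hide rounding noise from cancellation at small x.
        return Math.max(0.0, Math.min(1.0, 1.0 - 2.0 * sum));
    }
}
```

As a sanity check, cdf(1.36) comes out close to 0.95, which matches the familiar asymptotic 5% critical value of about 1.36. For small sample sizes this approximation is not accurate enough, which is where the exact algorithms discussed in the two papers come in.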
>
>
>>>>>>
>>>>>> Interesting approach for the exact algorithm for Wilcoxon.  If we stay
>>>>>> with this, we should ack the original author of the algorithm in the
>>>>>> javadoc.  Looks OK to use.
>>>>>
>>>>> Agree - both on the approach and legal part! Does the author need to
>>>>> sign anything but write a mail?
>>>>>>
>>>>>>  Regarding the difference from R, what I usually do in this case is
>>>>>> look
>>>>>> at the R sources to try to explain the difference.  Most likely in
>>>>>> this
>>>>>> case, what is going on is they are using a different estimation
>>>>>> algorithm
>>>>>> for small n or treating ties differently.  The ranking options that we
>>>>>> use
>>>>>> were largely adapted from R, so if that is the problem, it should be
>>>>>> easy to
>>>>>> test.  We need to convince ourselves that ours is better or at least a
>>>>>> legitimate alternative.  I will take a close look this evening, but it
>>>>>> looks
>>>>>> like the algorithm you are using should be exact.  If we can't
>>>>>> reconcile the
>>>>>> difference with R, it would be good to find a way to validate correct
>>>>>> functioning of the algorithm by manufacturing reference data with
>>>>>> known
>>>>>> p.
>>>>>
>>>>> I'll try to investigate the difference, hopefully tomorrow, so that
>>>>> formal tests can be written and included.
>>>>>>
>>>>>>> New tests: Wilcoxon signed-rank test and Mann-Whitney U
>>>>>>> -------------------------------------------------------
>>>>>>>
>>>>>>>                Key: MATH-431
>>>>>>>                URL: https://issues.apache.org/jira/browse/MATH-431
>>>>>>>            Project: Commons Math
>>>>>>>         Issue Type: New Feature
>>>>>>>           Reporter: Mikkel Meyer Andersen
>>>>>>>           Assignee: Mikkel Meyer Andersen
>>>>>>>           Priority: Minor
>>>>>>>        Attachments: MannWhitneyUTest.java, MannWhitneyUTestImpl.java,
>>>>>>> WilcoxonSignedRankTest.java, WilcoxonSignedRankTestImpl.java
>>>>>>>
>>>>>>>  Original Estimate: 4h
>>>>>>>  Remaining Estimate: 4h
>>>>>>>
>>>>>>> Wilcoxon signed-rank test and Mann-Whitney U are commonly used
>>>>>>> non-parametric statistical hypothesis tests (e.g. instead of various
>>>>>>> t-tests
>>>>>>> when normality is not present).
>>>>>>
>>>>>> --
>>>>>> This message is automatically generated by JIRA.
>>>>>> -
>>>>>> You can reply to this email to add a comment to the issue online.
>>>>>>
>>>>>>
>>>>
>>>>
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>



Re: Kolmogorov distribution WASF Re: [jira] Commented: (MATH-431) New tests: Wilcoxon signed-rank test and Mann-Whitney U

Posted by Mikkel Meyer Andersen <mi...@mikl.dk>.
2010/11/7 Phil Steitz <ph...@gmail.com>:
> On 11/7/10 10:10 AM, Mikkel Meyer Andersen wrote:
>>
>> 2010/11/7 Phil Steitz<ph...@gmail.com>:
>>>
>>> Switching to the right list...
>>>
>>> -
>>>>>>
>>>>>> What we need there is a good algorithm for approximating the KS
>>>>>> distribution.  I have been corresponding with the author of a very
>>>>>> good
>>>>>> one
>>>>>> with a Java implementation but have thus far failed in getting consent
>>>>>> to
>>>>>> release under ASL.  So at this point, I am looking for an alternative
>>>>>> good
>>>>>> algorithm to implement.  All suggestions / unencumbered patches
>>>>>> welcome!
>>>>>>
>>>>>> See comments on the MATH-431 for other questions.
>>>>>>
>>>>> Just to be sure of what you mean:
>>>>> Do you want to have a two-sample Kolmogorov-Smirnov test for equality
>>>>> of distributions in addition to the Mann-Whitney? Or do you need the
>>>>> Kolmogorov-Smirnov distribution (as stated for example at
>>>>>
>>>>>
>>>>>
>>>>> http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test#Kolmogorov_distribution
>>>>> ) in regards to the MATH-428? Sorry, but I'm a bit confused :-).
>>>>
>>>> The goal is to implement the KS test for equality of distributions (or
>>>> homogeneity against a reference distribution).  To do that we need at
>>>> least
>>>> critical values of the Kolmogorov distribution.  The natural way for us
>>>> to
>>>> do that would be to implement the full distribution which would be nice
>>>> to
>>>> have in the distributions package.
>>>>
>>>> Phil
>>>
>>> Have you read "Evaluating Kolmogorov’s Distribution" by Marsaglia et
>>> al. available on http://www.jstatsoft.org/v08/i18/paper ? And do you
>>> think their approach would be the way to go?
>>>
>>> I am not sure it is best.  See the comments here:
>>> http://www.iro.umontreal.ca/~lecuyer/myftp/papers/ksdist.pdf
>>>
>>> Phil
>>
>> Thanks. It looks quite thorough, indeed. Was it the Java
>> implementation you didn't get consent to release under ASL?
>>>
> Yes.  I am interested in your and others' opinions on the various algorithms
> reviewed there.  Could be the Marsaglia reference above is adequate for a
> start.
I'll try to make a short comparison of the two methods ASAP.
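
For context on what the distribution is needed for: the two-sample statistic itself is the easy part, and it is only the p-value or critical value that requires the Kolmogorov distribution. A minimal sketch of the statistic D = sup |F1(t) - F2(t)| over the two empirical CDFs, with a hypothetical class name and no claim that this is how it would end up in Commons Math:

```java
import java.util.Arrays;

public final class KsTwoSampleSketch {

    private KsTwoSampleSketch() {
    }

    /** Largest absolute difference between the two empirical CDFs. */
    static double statistic(double[] x, double[] y) {
        if (x.length == 0 || y.length == 0) {
            throw new IllegalArgumentException("samples must be non-empty");
        }
        double[] a = x.clone();
        double[] b = y.clone();
        Arrays.sort(a);
        Arrays.sort(b);
        int i = 0;
        int j = 0;
        double d = 0.0;
        while (i < a.length && j < b.length) {
            // Step both samples past the next smallest value, so tied
            // observations are consumed together before comparing the CDFs.
            double v = Math.min(a[i], b[j]);
            while (i < a.length && a[i] <= v) {
                i++;
            }
            while (j < b.length && b[j] <= v) {
                j++;
            }
            double diff = Math.abs((double) i / a.length - (double) j / b.length);
            d = Math.max(d, diff);
        }
        // Once one sample is exhausted the difference can only shrink.
        return d;
    }
}
```

Two completely separated samples give D = 1 and identical samples give D = 0. Turning D into a p-value is the step that depends on whichever algorithm for the Kolmogorov distribution is chosen.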
>
> Phil


Re: Kolmogorov distribution WASF Re: [jira] Commented: (MATH-431) New tests: Wilcoxon signed-rank test and Mann-Whitney U

Posted by Phil Steitz <ph...@gmail.com>.
On 11/7/10 10:10 AM, Mikkel Meyer Andersen wrote:
> 2010/11/7 Phil Steitz<ph...@gmail.com>:
>> Switching to the right list...
>>
>> -
>>>>>
>>>>> What we need there is a good algorithm for approximating the KS
>>>>> distribution.  I have been corresponding with the author of a very good
>>>>> one
>>>>> with a Java implementation but have thus far failed in getting consent
>>>>> to
>>>>> release under ASL.  So at this point, I am looking for an alternative
>>>>> good
>>>>> algorithm to implement.  All suggestions / unencumbered patches welcome!
>>>>>
>>>>> See comments on the MATH-431 for other questions.
>>>>>
>>>> Just to be sure of what you mean:
>>>> Do you want to have a two-sample Kolmogorov-Smirnov test for equality
>>>> of distributions in addition to the Mann-Whitney? Or do you need the
>>>> Kolmogorov-Smirnov distribution (as stated for example at
>>>>
>>>>
>>>> http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test#Kolmogorov_distribution
>>>> ) in regards to the MATH-428? Sorry, but I'm a bit confused :-).
>>>
>>> The goal is to implement the KS test for equality of distributions (or
>>> homogeneity against a reference distribution).  To do that we need at
>>> least
>>> critical values of the Kolmogorov distribution.  The natural way for us to
>>> do that would be to implement the full distribution which would be nice to
>>> have in the distributions package.
>>>
>>> Phil
>>
>> Have you read "Evaluating Kolmogorov’s Distribution" by Marsaglia et
>> al. available on http://www.jstatsoft.org/v08/i18/paper ? And do you
>> think their approach would be the way to go?
>>
>> I am not sure it is best.  See the comments here:
>> http://www.iro.umontreal.ca/~lecuyer/myftp/papers/ksdist.pdf
>>
>> Phil
> Thanks. It looks quite thorough, indeed. Was it the Java
> implementation you didn't get consent to release under ASL?
>>
Yes.  I am interested in your and others' opinions on the various 
algorithms reviewed there.  Could be the Marsaglia reference above 
is adequate for a start.

Phil