Posted to issues@commons.apache.org by "Mikkel Meyer Andersen (JIRA)" <ji...@apache.org> on 2010/11/11 16:11:14 UTC

[jira] Created: (MATH-437) Kolmogorov Smirnov Distribution

Kolmogorov Smirnov Distribution
-------------------------------

                 Key: MATH-437
                 URL: https://issues.apache.org/jira/browse/MATH-437
             Project: Commons Math
          Issue Type: New Feature
            Reporter: Mikkel Meyer Andersen
            Assignee: Mikkel Meyer Andersen
            Priority: Minor


The Kolmogorov-Smirnov test (see [1]) is used to test whether one sample comes from a known probability distribution, or whether two samples come from the same distribution. To evaluate the test statistic, the Kolmogorov-Smirnov distribution is used. Quite good asymptotics exist for the one-sided test, but it's more difficult for the two-sided test.

[1]: http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test
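
For readers unfamiliar with the statistic itself, here is a minimal sketch of the two-sided one-sample statistic D_n = sup |F_n(x) - F(x)|, whose distribution this issue is about. The class and method names are hypothetical and are not part of Commons Math or of any attachment to this issue.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

/** Illustrative sketch only; names are hypothetical, not Commons Math API. */
final class KsStatisticSketch {

    /** Two-sided one-sample statistic D_n = sup_x |F_n(x) - F(x)|. */
    static double statistic(double[] sample, DoubleUnaryOperator cdf) {
        double[] sorted = sample.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        double d = 0.0;
        for (int i = 0; i < n; i++) {
            double f = cdf.applyAsDouble(sorted[i]);
            // The empirical CDF jumps from i/n to (i+1)/n at sorted[i],
            // so both sides of the jump have to be compared with F.
            double above = (i + 1.0) / n - f;
            double below = f - (double) i / n;
            d = Math.max(d, Math.max(above, below));
        }
        return d;
    }

    public static void main(String[] args) {
        double[] sample = {0.1, 0.4, 0.7};
        // CDF of the uniform distribution on [0, 1].
        DoubleUnaryOperator uniformCdf = x -> Math.min(1.0, Math.max(0.0, x));
        System.out.println("D_n = " + statistic(sample, uniformCdf));
    }
}
```

The value returned here is the x in F(n, x) discussed in the comments on this issue, with n the sample size.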

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] [Updated] (MATH-437) Kolmogorov Smirnov Distribution

Posted by "Gilles (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gilles updated MATH-437:
------------------------

    Fix Version/s:     (was: 3.1)
                   3.2
    
> Kolmogorov Smirnov Distribution
> -------------------------------
>
>                 Key: MATH-437
>                 URL: https://issues.apache.org/jira/browse/MATH-437
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Mikkel Meyer Andersen
>            Priority: Minor
>             Fix For: 3.2
>
>         Attachments: MATH437-with-test-take-1
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h

[jira] Updated: (MATH-437) Kolmogorov Smirnov Distribution

Posted by "Mikkel Meyer Andersen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikkel Meyer Andersen updated MATH-437:
---------------------------------------

    Attachment: MATH437-with-test-take-1

A proposal based mainly on the method by Marsaglia et al., optimised for extreme values as described by Simard et al. On the test cases tried, this method gives results consistent with R.

> Kolmogorov Smirnov Distribution
> -------------------------------
>
>                 Key: MATH-437
>                 URL: https://issues.apache.org/jira/browse/MATH-437
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Mikkel Meyer Andersen
>            Assignee: Mikkel Meyer Andersen
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: MATH437-with-test-take-1
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h

[jira] Commented: (MATH-437) Kolmogorov Smirnov Distribution

Posted by "Mikkel Meyer Andersen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933161#action_12933161 ] 

Mikkel Meyer Andersen commented on MATH-437:
--------------------------------------------

Richard,
Thanks for your knowledgeable comment. To quote you:
{quote}"The argument x that you used in the Simard-L'écuyer program is not the same that you used for the other two programs."{quote}
I'm not sure what you mean by that. Which argument should I use to get the same result?

> Kolmogorov Smirnov Distribution
> -------------------------------
>
>                 Key: MATH-437
>                 URL: https://issues.apache.org/jira/browse/MATH-437
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Mikkel Meyer Andersen
>            Assignee: Mikkel Meyer Andersen
>            Priority: Minor
>         Attachments: KolmogorovSmirnovDistribution.java
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h


[jira] Updated: (MATH-437) Kolmogorov Smirnov Distribution

Posted by "Mikkel Meyer Andersen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikkel Meyer Andersen updated MATH-437:
---------------------------------------

    Attachment: KolmogorovSmirnovDistribution.java

An error has been corrected. The results are now consistent with those of R and with the implementation by Simard and L'Ecuyer.

The implementation can be made faster, but then values of n larger than approx. 140 cannot be handled.

Right now two efficient power functions are implemented, because no matrix interface dictating add/multiply etc. exists.
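
As an illustration of what an "efficient power function" for matrices can look like, here is a minimal sketch of exponentiation by repeated squaring on plain double[][] arrays. It is not the code from the attachment; the class and method names are hypothetical.

```java
/** Illustrative sketch only; not the attached implementation. */
final class MatrixPowerSketch {

    /** Product of two square matrices of the same size. */
    static double[][] multiply(double[][] a, double[][] b) {
        int n = a.length;
        double[][] c = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int k = 0; k < n; k++) {
                for (int j = 0; j < n; j++) {
                    c[i][j] += a[i][k] * b[k][j];
                }
            }
        }
        return c;
    }

    /** Computes m^p for p >= 0 using O(log p) matrix multiplications. */
    static double[][] power(double[][] m, int p) {
        if (p < 0) {
            throw new IllegalArgumentException("exponent must be non-negative");
        }
        int n = m.length;
        double[][] result = new double[n][n];
        for (int i = 0; i < n; i++) {
            result[i][i] = 1.0; // start from the identity matrix
        }
        double[][] base = new double[n][];
        for (int i = 0; i < n; i++) {
            base[i] = m[i].clone(); // do not modify the caller's matrix
        }
        while (p > 0) {
            if ((p & 1) == 1) {
                result = multiply(result, base); // this bit of p is set
            }
            p >>= 1;
            if (p > 0) {
                base = multiply(base, base); // square for the next bit
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] m = {{1, 1}, {0, 1}};
        double[][] r = power(m, 5);
        System.out.println(r[0][0] + " " + r[0][1] + " / " + r[1][0] + " " + r[1][1]);
    }
}
```

The same loop works for any element type that supports add and multiply, which is why a shared matrix interface would allow a single implementation.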

> Kolmogorov Smirnov Distribution
> -------------------------------
>
>                 Key: MATH-437
>                 URL: https://issues.apache.org/jira/browse/MATH-437
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Mikkel Meyer Andersen
>            Assignee: Mikkel Meyer Andersen
>            Priority: Minor
>         Attachments: KolmogorovSmirnovDistribution.java, KolmogorovSmirnovDistribution.java
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h


[jira] Commented: (MATH-437) Kolmogorov Smirnov Distribution

Posted by "Richard Simard (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933077#action_12933077 ] 

Richard Simard commented on MATH-437:
-------------------------------------

   http://www.mail-archive.com/issues@commons.apache.org/msg15829.html

F(n, x) = F(200, 0.031111):
                                 Lecuyer (2.0 ms.) = 0.012916146481628863
 KolmogorovSmirnovDistribution exact (51902.0 ms.) = 0.012149763742041911
    KolmogorovSmirnovDistribution !exact (9.0 ms.) = 0.012149763742041922


The argument x that you used in the Simard-L'écuyer program is not the same as the one you used for the other two programs, so of course you get very different results. Computing exactly in Mathematica, I obtain

F(200, 0.031111) = 0.0129161464816289

which is very different from your exact result above and agrees well with our program.
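
To make the size of the disagreement concrete, the following minimal sketch estimates how many significant decimal digits two results share, using the values quoted in this message. The class and method names are hypothetical.

```java
/** Illustrative sketch only; names are hypothetical. */
final class AgreementSketch {

    /** Approximate number of agreeing significant digits: -log10(relative error). */
    static double agreeingDigits(double value, double reference) {
        if (value == reference) {
            return Double.POSITIVE_INFINITY; // identical doubles
        }
        double relativeError = Math.abs(value - reference) / Math.abs(reference);
        return -Math.log10(relativeError);
    }

    public static void main(String[] args) {
        double mathematica = 0.0129161464816289;   // exact value quoted above
        double lecuyer = 0.012916146481628863;     // Simard-L'Ecuyer program
        double ksExact = 0.012149763742041911;     // "exact" result under discussion
        System.out.println("Lecuyer vs Mathematica: " + agreeingDigits(lecuyer, mathematica));
        System.out.println("exact vs Mathematica:   " + agreeingDigits(ksExact, mathematica));
    }
}
```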



=================================================
  Richard Simard    <si...@iro.umontreal.ca>
  Laboratoire de simulation et d'optimisation
  Université de Montréal, IRO



> Kolmogorov Smirnov Distribution
> -------------------------------
>
>                 Key: MATH-437
>                 URL: https://issues.apache.org/jira/browse/MATH-437
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Mikkel Meyer Andersen
>            Assignee: Mikkel Meyer Andersen
>            Priority: Minor
>         Attachments: KolmogorovSmirnovDistribution.java
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h


[jira] Updated: (MATH-437) Kolmogorov Smirnov Distribution

Posted by "Mikkel Meyer Andersen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikkel Meyer Andersen updated MATH-437:
---------------------------------------

    Attachment:     (was: KolmogorovSmirnovDistribution.java)

> Kolmogorov Smirnov Distribution
> -------------------------------
>
>                 Key: MATH-437
>                 URL: https://issues.apache.org/jira/browse/MATH-437
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Mikkel Meyer Andersen
>            Assignee: Mikkel Meyer Andersen
>            Priority: Minor
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h


[jira] Commented: (MATH-437) Kolmogorov Smirnov Distribution

Posted by "Richard Simard (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933166#action_12933166 ] 

Richard Simard commented on MATH-437:
-------------------------------------

If you used the same x in all 3 cases, I believe there is a bug in both your exact and your non-exact code, because you get only 2 decimal digits of precision.

> Kolmogorov Smirnov Distribution
> -------------------------------
>
>                 Key: MATH-437
>                 URL: https://issues.apache.org/jira/browse/MATH-437
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Mikkel Meyer Andersen
>            Assignee: Mikkel Meyer Andersen
>            Priority: Minor
>         Attachments: KolmogorovSmirnovDistribution.java
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h


[jira] Updated: (MATH-437) Kolmogorov Smirnov Distribution

Posted by "Mikkel Meyer Andersen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikkel Meyer Andersen updated MATH-437:
---------------------------------------

    Attachment: KolmogorovSmirnovDistribution.java

Implementation for evaluating the distribution of the two-sided test statistic.

> Kolmogorov Smirnov Distribution
> -------------------------------
>
>                 Key: MATH-437
>                 URL: https://issues.apache.org/jira/browse/MATH-437
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Mikkel Meyer Andersen
>            Assignee: Mikkel Meyer Andersen
>            Priority: Minor
>         Attachments: KolmogorovSmirnovDistribution.java
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h


[jira] [Updated] (MATH-437) Kolmogorov Smirnov Distribution

Posted by "Phil Steitz (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phil Steitz updated MATH-437:
-----------------------------

    Fix Version/s:     (was: 3.0)
                   3.1
         Assignee:     (was: Phil Steitz)
    
> Kolmogorov Smirnov Distribution
> -------------------------------
>
>                 Key: MATH-437
>                 URL: https://issues.apache.org/jira/browse/MATH-437
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Mikkel Meyer Andersen
>            Priority: Minor
>             Fix For: 3.1
>
>         Attachments: MATH437-with-test-take-1
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h

        

[jira] Updated: (MATH-437) Kolmogorov Smirnov Distribution

Posted by "Mikkel Meyer Andersen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikkel Meyer Andersen updated MATH-437:
---------------------------------------

    Attachment: KolmogorovSmirnovDistribution.java

Implementation for evaluating the distribution of the two-sided test statistic.

> Kolmogorov Smirnov Distribution
> -------------------------------
>
>                 Key: MATH-437
>                 URL: https://issues.apache.org/jira/browse/MATH-437
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Mikkel Meyer Andersen
>            Assignee: Mikkel Meyer Andersen
>            Priority: Minor
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h


[jira] [Reopened] (MATH-437) Kolmogorov Smirnov Distribution

Posted by "Phil Steitz (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phil Steitz reopened MATH-437:
------------------------------

      Assignee: Phil Steitz  (was: Mikkel Meyer Andersen)

The implementation class is in the distribution package, but does not implement a distribution interface.  We should either implement the missing methods or move this class (probably renamed) to .stat.inference.  In either case, we should implement methods making it easy to set up and execute K-S tests.
                
> Kolmogorov Smirnov Distribution
> -------------------------------
>
>                 Key: MATH-437
>                 URL: https://issues.apache.org/jira/browse/MATH-437
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Mikkel Meyer Andersen
>            Assignee: Phil Steitz
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: MATH437-with-test-take-1
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h

        

[jira] Commented: (MATH-437) Kolmogorov Smirnov Distribution

Posted by "Phil Steitz (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008936#comment-13008936 ] 

Phil Steitz commented on MATH-437:
----------------------------------

+1 to commit as is, adding some algorithm notes to the class javadoc and the MATH-435 power impl.  I am ambivalent on whether or not to "fix" the error in Marsaglia's code that is apparently included in R.  Having the verification tests is good, though, so I would leave it as is in the patch, since the Marsaglia C impl can be seen as a reference in this case.  I can see the other side of the argument here, though, and would be fine with just going with the fixed code, suitably documented.  What do others think about this?

It looks like you forgot to add the references to the class javadoc for the impl class.

Per comment on MATH-435, I think we should add the matrix power impl there and use it here.

> Kolmogorov Smirnov Distribution
> -------------------------------
>
>                 Key: MATH-437
>                 URL: https://issues.apache.org/jira/browse/MATH-437
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Mikkel Meyer Andersen
>            Assignee: Mikkel Meyer Andersen
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: MATH437-with-test-take-1
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h

[jira] [Resolved] (MATH-437) Kolmogorov Smirnov Distribution

Posted by "Mikkel Meyer Andersen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikkel Meyer Andersen resolved MATH-437.
----------------------------------------

    Resolution: Fixed

Fixed in revision 1083716. Impl-class now mentions {@link KolmogorovSmirnovDistribution}, if that was what you meant, Phil? Power functionality from MATH-435 now used and the private methods removed. Note that - as before - the tests are made using src/test/R/KolmogorovSmirnovDistributionTestCases.R.

> Kolmogorov Smirnov Distribution
> -------------------------------
>
>                 Key: MATH-437
>                 URL: https://issues.apache.org/jira/browse/MATH-437
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Mikkel Meyer Andersen
>            Assignee: Mikkel Meyer Andersen
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: MATH437-with-test-take-1
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h

[jira] Updated: (MATH-437) Kolmogorov Smirnov Distribution

Posted by "Mikkel Meyer Andersen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikkel Meyer Andersen updated MATH-437:
---------------------------------------

    Attachment:     (was: KolmogorovSmirnovDistribution.java)

> Kolmogorov Smirnov Distribution
> -------------------------------
>
>                 Key: MATH-437
>                 URL: https://issues.apache.org/jira/browse/MATH-437
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Mikkel Meyer Andersen
>            Assignee: Mikkel Meyer Andersen
>            Priority: Minor
>         Attachments: KolmogorovSmirnovDistribution.java
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h


[jira] Commented: (MATH-437) Kolmogorov Smirnov Distribution

Posted by "Mikkel Meyer Andersen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932361#action_12932361 ] 

Mikkel Meyer Andersen commented on MATH-437:
--------------------------------------------

The last part of roundedK can be replaced with
{{
double pFrac = Hpower.getEntry(k - 2, k - 2);

for (int i = 1; i <= n; ++i) {
        pFrac *= (double)i / (double)n;
}

return pFrac;
}}
to get even better running time and still precise results:
{{
F(n, x) = F(200, 0.02):
                                 Lecuyer (3.0 ms.) = 5.151982014280042E-6
   KolmogorovSmirnovDistribution exact (760.0 ms.) = 5.15198201428005E-6
   KolmogorovSmirnovDistribution !exact (16.0 ms.) = 5.151982014280049E-6
-------------------------


F(n, x) = F(200, 0.031111):
                                 Lecuyer (2.0 ms.) = 0.012916146481628863
 KolmogorovSmirnovDistribution exact (51902.0 ms.) = 0.012149763742041911
    KolmogorovSmirnovDistribution !exact (9.0 ms.) = 0.012149763742041922
-------------------------


F(n, x) = F(200, 0.04):
                                 Lecuyer (0.0 ms.) = 0.1067121882956352
  KolmogorovSmirnovDistribution exact (5903.0 ms.) = 0.10671370113626812
    KolmogorovSmirnovDistribution !exact (6.0 ms.) = 0.10671370113626813
-------------------------
}}
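
The loop proposed above multiplies by i/n one factor at a time, which amounts to scaling by n!/n^n without ever forming n! or n^n. A minimal standalone sketch of that idea follows; the class and method names are hypothetical and this is not the roundedK code itself.

```java
/** Illustrative sketch only; not the roundedK method from the attachment. */
final class FactorialScalingSketch {

    /**
     * Returns value * n! / n^n, applying one factor i/n per step so that
     * neither n! nor n^n is ever formed as a separate number.
     */
    static double scaleByFactorialOverPower(double value, int n) {
        double scaled = value;
        for (int i = 1; i <= n; ++i) {
            scaled *= (double) i / (double) n;
        }
        return scaled;
    }

    public static void main(String[] args) {
        // For n = 200, both 200! and 200^200 are too large for a double,
        // but the factor-by-factor product stays finite and positive.
        System.out.println("200^200 as double: " + Math.pow(200, 200));
        System.out.println("200!/200^200:      " + scaleByFactorialOverPower(1.0, 200));
    }
}
```

Each factor is at most 1, so the running product can only shrink; for n = 200 it stays well inside the range of a double.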

> Kolmogorov Smirnov Distribution
> -------------------------------
>
>                 Key: MATH-437
>                 URL: https://issues.apache.org/jira/browse/MATH-437
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Mikkel Meyer Andersen
>            Assignee: Mikkel Meyer Andersen
>            Priority: Minor
>         Attachments: KolmogorovSmirnovDistribution.java
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h


[jira] Updated: (MATH-437) Kolmogorov Smirnov Distribution

Posted by "Mikkel Meyer Andersen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikkel Meyer Andersen updated MATH-437:
---------------------------------------

    Attachment:     (was: KolmogorovSmirnovDistribution.java)

> Kolmogorov Smirnov Distribution
> -------------------------------
>
>                 Key: MATH-437
>                 URL: https://issues.apache.org/jira/browse/MATH-437
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Mikkel Meyer Andersen
>            Assignee: Mikkel Meyer Andersen
>            Priority: Minor
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h


[jira] Updated: (MATH-437) Kolmogorov Smirnov Distribution

Posted by "Phil Steitz (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phil Steitz updated MATH-437:
-----------------------------

    Fix Version/s: 3.0

> Kolmogorov Smirnov Distribution
> -------------------------------
>
>                 Key: MATH-437
>                 URL: https://issues.apache.org/jira/browse/MATH-437
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Mikkel Meyer Andersen
>            Assignee: Mikkel Meyer Andersen
>            Priority: Minor
>             Fix For: 3.0
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h


[jira] Commented: (MATH-437) Kolmogorov Smirnov Distribution

Posted by "Mikkel Meyer Andersen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12975771#action_12975771 ] 

Mikkel Meyer Andersen commented on MATH-437:
--------------------------------------------

In the past months, I've communicated with both Richard Simard and George Marsaglia regarding a small discrepancy between the theory in Marsaglia's article and the actual implementation: the article requires 0 <= h < 1, but the code uses 0 < h <= 1. I wrote to Marsaglia regarding this, and his answer was: 
{quote}
The Kolmogorov distribution comes from a piecewise polynomial in h with knots at 1/2n, 2/2n,...,(2n-1)/2n,  with each segment assumed to start with h=0. Although I emphasized that 0<= h <1 in the article,  I overlooked the need for ensuring that in the C code, and apparently so did my colleagues. Sorry about that.
{quote}
This means that his code has to be changed slightly to ensure that 0 <= h < 1. Simard argues that this shouldn't make any difference because the KS distribution is continuous, but if one wants to correct it, one should
{quote}
Instead of taking the floor(n*d + 1) and making this correction for h = 1, take the ceiling (n*d).
{quote}

I would prefer using ceiling(n*d) instead of the originally (wrongly) proposed floor(n*d + 1), despite the continuity argument. So my plan is to do this (I still have my implementation, which seems to work quite okay). The only problem is that R seems to use Marsaglia's code, and I don't have access to e.g. Mathematica, which should implement several algorithms, so I might run into problems when I have to perform tests.
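
A minimal sketch of the two choices of k discussed above, assuming h = k - n*d (the article's parametrisation d = (k - h)/n). The class and method names are hypothetical. The two variants only differ when n*d is an integer, which is exactly where floor(n*d + 1) gives h = 1.

```java
/** Illustrative sketch only; names are hypothetical. */
final class KnotIndexSketch {

    /** Returns {k, h} with k = floor(n*d + 1), so that 0 < h <= 1. */
    static double[] floorVariant(int n, double d) {
        int k = (int) Math.floor(n * d + 1);
        return new double[] {k, k - n * d};
    }

    /** Returns {k, h} with k = ceiling(n*d), so that 0 <= h < 1. */
    static double[] ceilingVariant(int n, double d) {
        int k = (int) Math.ceil(n * d);
        return new double[] {k, k - n * d};
    }

    public static void main(String[] args) {
        // n*d = 2 is an integer here, so the two variants disagree.
        double[] f = floorVariant(4, 0.5);
        double[] c = ceilingVariant(4, 0.5);
        System.out.println("floor:   k = " + (int) f[0] + ", h = " + f[1]);
        System.out.println("ceiling: k = " + (int) c[0] + ", h = " + c[1]);
    }
}
```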

> Kolmogorov Smirnov Distribution
> -------------------------------
>
>                 Key: MATH-437
>                 URL: https://issues.apache.org/jira/browse/MATH-437
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Mikkel Meyer Andersen
>            Assignee: Mikkel Meyer Andersen
>            Priority: Minor
>             Fix For: 3.0
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h