Posted to issues@commons.apache.org by "Otmar Ertl (JIRA)" <ji...@apache.org> on 2014/10/05 21:41:33 UTC

[jira] [Updated] (MATH-1154) Statistical tests in stat.inference package are very slow due to implicit RandomGenerator initialization

     [ https://issues.apache.org/jira/browse/MATH-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Otmar Ertl updated MATH-1154:
-----------------------------
    Attachment: math3.patch

This patch demonstrates a fix that lazily initializes the default random number generator instances. It also includes a performance test, which gave the following results before
{noformat}
statistical tests performance test (calls per timed block: 100000, timed blocks: 100, time unit: ms)
           name      time/call      std error total time      ratio      difference
binomial test 1 1.38289492e-02 1.71975630e-04 1.3829e+05 1.0000e+00  0.00000000e+00
binomial test 2 1.38270752e-02 1.61613547e-04 1.3827e+05 9.9986e-01 -1.87395300e+01
chi square test 2.67553017e-02 2.29903602e-04 2.6755e+05 1.9347e+00  1.29263525e+05
{noformat}
and after
{noformat}
statistical tests performance test (calls per timed block: 100000, timed blocks: 100, time unit: ms)
           name      time/call      std error total time      ratio      difference
binomial test 1 7.26630369e-04 5.87472596e-05 7.2663e+03 1.0000e+00  0.00000000e+00
binomial test 2 7.27780967e-04 2.44728991e-05 7.2778e+03 1.0016e+00  1.15059780e+01
chi square test 5.21210430e-04 3.14044354e-05 5.2121e+03 7.1730e-01 -2.05419939e+03
{noformat}
the fix. The binomial tests show a speedup of about a factor of 19, and the chi-square test of about a factor of 51.
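
For readers without the patch at hand, the lazy-initialization idea can be sketched roughly as follows. This is only an illustration, not the code in math3.patch: the class name LazyRandom and its methods are hypothetical, and java.util.Random stands in for the expensive Well19937c generator so that the sketch has no external dependencies.

```java
import java.util.Random;
import java.util.function.Supplier;

/**
 * Hypothetical wrapper (not part of Commons Math): defers construction of an
 * expensive generator until a random value is actually requested.
 */
final class LazyRandom {
    private final Supplier<Random> factory; // knows how to build the real generator
    private Random delegate;                // stays null until first use

    LazyRandom(Supplier<Random> factory) {
        this.factory = factory;
    }

    /** Builds the underlying generator on first access, then reuses it. */
    private Random delegate() {
        if (delegate == null) {
            delegate = factory.get();
        }
        return delegate;
    }

    /** Forwards to the underlying generator, creating it if necessary. */
    double nextDouble() {
        return delegate().nextDouble();
    }

    /** Reports whether the expensive generator has been created yet. */
    boolean isInitialized() {
        return delegate != null;
    }
}
```

Code that only computes probabilities never calls nextDouble(), so under this scheme it never pays for the generator setup. The sketch is not thread-safe; a shared instance would need synchronization around the first access.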

> Statistical tests in stat.inference package are very slow due to implicit RandomGenerator initialization
> --------------------------------------------------------------------------------------------------------
>
>                 Key: MATH-1154
>                 URL: https://issues.apache.org/jira/browse/MATH-1154
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 3.3
>            Reporter: Otmar Ertl
>         Attachments: math3.patch
>
>
> Some statistical tests defined in the stat.inference package (e.g. BinomialTest or ChiSquareTest) are unnecessarily slow (up to a factor of 20 slower than necessary). The reason is the slow implicit initialization of a default random generator instance (Well19937c) each time a test is performed. The affected tests create a distribution instance only to call methods defined on it; they never use any of its random generation methods. Nevertheless, creating a distribution instance automatically creates a random number generator instance, which causes the serious slowdown. The problem is related to MATH-1124.
> There are two possible solutions:
> 1) Fix the affected statistical tests by passing a lightweight RandomGenerator implementation (or even null) to the constructor of the distribution.
> 2) Or use, for all distributions, a RandomGenerator implementation that lazily creates the Well19937c instance as late as possible. This would also solve MATH-1124.
> I will attach a patch proposal together with a performance test that demonstrates the speedup after the fix.
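
As an illustration of solution 1 in the description above, here is a minimal sketch of a distribution whose constructor accepts a null generator. The class SketchBinomial is a hypothetical stand-in, not the Commons Math BinomialDistribution, and java.util.Random is used in place of RandomGenerator to keep the example self-contained.

```java
import java.util.Random;

/** Hypothetical stand-in for a distribution class; not the Commons Math API. */
final class SketchBinomial {
    private final Random rng; // may be null when only probabilities are needed
    private final int trials;
    private final double p;

    SketchBinomial(Random rng, int trials, double p) {
        this.rng = rng;
        this.trials = trials;
        this.p = p;
    }

    /** P(X = k); never touches the generator. */
    double probability(int k) {
        if (k < 0 || k > trials) {
            return 0.0;
        }
        double coeff = 1.0; // accumulates the binomial coefficient C(trials, k)
        for (int i = 1; i <= k; i++) {
            coeff *= (double) (trials - k + i) / i;
        }
        return coeff * Math.pow(p, k) * Math.pow(1.0 - p, trials - k);
    }

    /** Draws one sample; the only method that needs the generator. */
    int sample() {
        if (rng == null) {
            throw new IllegalStateException("no random generator supplied");
        }
        int successes = 0;
        for (int i = 0; i < trials; i++) {
            if (rng.nextDouble() < p) {
                successes++;
            }
        }
        return successes;
    }
}
```

A statistical test that only needs probabilities could then construct new SketchBinomial(null, n, p) and skip generator setup entirely. The trade-off is that sampling from such an instance fails at call time rather than at construction.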


