Posted to issues@commons.apache.org by "Thomas Neidhart (JIRA)" <ji...@apache.org> on 2014/10/05 22:06:34 UTC

[jira] [Commented] (MATH-1154) Statistical tests in stat.inference package are very slow due to implicit RandomGenerator initialization

    [ https://issues.apache.org/jira/browse/MATH-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159667#comment-14159667 ] 

Thomas Neidhart commented on MATH-1154:
---------------------------------------

The lazy initialization of the random generator makes sense imho.
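
For illustration, a minimal sketch of what lazy initialization could look like. All names here (SimpleGenerator, ExpensiveGenerator, LazyGenerator) are hypothetical stand-ins and not the Commons Math API; the "expensive" generator is a toy, not Well19937c, and the wrapper as written is not thread-safe.

```java
// Hypothetical stand-in for a random generator interface (not the Commons Math one).
interface SimpleGenerator {
    int nextInt();
}

// Toy generator whose construction is deliberately costly: it fills a state array.
final class ExpensiveGenerator implements SimpleGenerator {
    static int constructed = 0;               // counts constructions, for demonstration only
    private final int[] state = new int[624];
    private int index = 0;

    ExpensiveGenerator(long seed) {
        constructed++;
        long s = seed;
        for (int i = 0; i < state.length; i++) {
            // simple 64-bit LCG step, used only to fill the state deterministically
            s = s * 6364136223846793005L + 1442695040888963407L;
            state[i] = (int) (s >>> 32);
        }
    }

    @Override
    public int nextInt() {
        int value = state[index];
        state[index] = value * 1664525 + 1013904223;  // toy state update
        index = (index + 1) % state.length;
        return value;
    }
}

// Wrapper that stores only the seed and builds the real generator on first use.
final class LazyGenerator implements SimpleGenerator {
    private final long seed;
    private SimpleGenerator delegate;          // stays null until a number is requested

    LazyGenerator(long seed) {
        this.seed = seed;
    }

    boolean isInitialized() {
        return delegate != null;
    }

    @Override
    public int nextInt() {
        if (delegate == null) {
            delegate = new ExpensiveGenerator(seed);  // cost is paid here, not in the constructor
        }
        return delegate.nextInt();
    }
}
```

With such a wrapper, code that creates a distribution only to call non-random methods never pays for the generator setup, because the delegate is built only when nextInt() is first called.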

I wonder whether it would also be a good idea to refactor the WellXXX random generators. Right now, every time we instantiate one of them, a lot of computation is performed, although most of it is the same regardless of the chosen seed. I think it would be better to have a static data object for each WellXXX generator type containing the fields of the abstract base class, which would have to be initialized only once. This would also save quite a bit of memory. The only fields that need to be stored for each instance are the index and v fields.
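
A rough sketch of that idea, under the assumption that the seed-independent fields are index tables of the form (j + offset) % r. The class names (WellTables, Well512Sketch), the parameter values and the seeding are illustrative only; this does not implement the actual Well algorithm and is not the existing class layout.

```java
// Immutable index tables that depend only on the generator parameters, never on the seed.
final class WellTables {
    final int[] iRm1;
    final int[] iRm2;
    final int[] i1;
    final int[] i2;
    final int[] i3;

    WellTables(int k, int m1, int m2, int m3) {
        int r = (k + 31) / 32;                 // number of 32-bit words in the state
        iRm1 = new int[r];
        iRm2 = new int[r];
        i1 = new int[r];
        i2 = new int[r];
        i3 = new int[r];
        for (int j = 0; j < r; j++) {
            iRm1[j] = (j + r - 1) % r;
            iRm2[j] = (j + r - 2) % r;
            i1[j] = (j + m1) % r;
            i2[j] = (j + m2) % r;
            i3[j] = (j + m3) % r;
        }
    }
}

// One generator type: the tables are built once when the class is loaded
// and shared by every instance.
final class Well512Sketch {
    private static final WellTables TABLES = new WellTables(512, 13, 9, 5);

    // The only per-instance state: the current index and the state words.
    private int index;
    private final int[] v;

    Well512Sketch(int seed) {
        v = new int[TABLES.iRm1.length];
        for (int i = 0; i < v.length; i++) {
            v[i] = seed + i;                   // toy seeding, for illustration only
        }
        index = 0;
    }

    WellTables sharedTables() {
        return TABLES;
    }

    int stateSize() {
        return v.length;
    }

    // Example of a table lookup: index of the previous state word, without a modulo.
    int previousIndex() {
        return TABLES.iRm1[index];
    }
}
```

Constructing a new instance then only allocates and seeds v; the five tables are neither recomputed nor duplicated per instance.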

> Statistical tests in stat.inference package are very slow due to implicit RandomGenerator initialization
> --------------------------------------------------------------------------------------------------------
>
>                 Key: MATH-1154
>                 URL: https://issues.apache.org/jira/browse/MATH-1154
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 3.3
>            Reporter: Otmar Ertl
>         Attachments: math3.patch
>
>
> Some statistical tests defined in the stat.inference package (e.g. BinomialTest or ChiSquareTest) are unnecessarily very slow (up to a factor 20 slower than necessary). The reason is the implicit slow initialization of a default (Well19937c) random generator instance each time a test is performed. The affected tests create some distribution instance in order to use some methods defined therein. However, they do not use any method for random generation. Nevertheless a random number generator instance is automatically created when creating a distribution instance, which is the reason for the serious slowdown. The problem is related to MATH-1124.
> There are the following solutions:
> 1) Fix the affected statistical tests by passing a light-weight RandomGenerator implementation (or even null) to the constructor of the distribution.
> 2) Or use for all distributions a RandomGenerator implementation that uses lazy initialization to generate the Well19937c instance as late as possible. This would also solve MATH-1124.
> I will attach a patch proposal together with a performance test that demonstrates the speedup after the fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)