Posted to issues@commons.apache.org by "Thomas Neidhart (JIRA)" <ji...@apache.org> on 2015/04/30 22:53:06 UTC
[jira] [Comment Edited] (MATH-1153) Sampling from a 'BetaDistribution' is slow
[ https://issues.apache.org/jira/browse/MATH-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520038#comment-14520038 ]
Thomas Neidhart edited comment on MATH-1153 at 4/30/15 8:52 PM:
----------------------------------------------------------------
After fixing the KS inference tests, the respective test failures disappeared, as expected.
The remaining test failure in testNextInversionDeviate occurs because the Cheng sampler uses a kind of rejection sampling and therefore consumes more randomness from the provided RandomGenerator than inversion does.
This is a recurring issue: for other distributions, too, there are improved sampling methods that consume more randomness (see MATH-1220 for the Zipf distribution).
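To illustrate the difference in consumption, here is a minimal sketch (not Commons Math code) that counts the uniform deviates drawn by an inversion sampler and by a rejection sampler for the same toy density f(x) = 2x on [0, 1]. The class names CountingRandom and ConsumptionDemo are hypothetical, and java.util.Random stands in for RandomGenerator.

```java
import java.util.Random;

/** Hypothetical helper: counts how many uniform deviates are requested. */
final class CountingRandom extends Random {
    private long calls;

    CountingRandom(long seed) {
        super(seed);
    }

    @Override
    public double nextDouble() {
        calls++;
        return super.nextDouble();
    }

    long calls() {
        return calls;
    }
}

final class ConsumptionDemo {
    private ConsumptionDemo() { }

    /** Inversion for f(x) = 2x on [0, 1]: F^-1(u) = sqrt(u), one uniform per sample. */
    static double sampleByInversion(Random rng) {
        return Math.sqrt(rng.nextDouble());
    }

    /** Rejection for the same density: two uniforms per attempt, attempts vary. */
    static double sampleByRejection(Random rng) {
        while (true) {
            double x = rng.nextDouble(); // candidate from the uniform proposal
            double u = rng.nextDouble(); // acceptance variate
            if (u <= x) {                // accept with probability f(x) / 2 = x
                return x;
            }
        }
    }

    public static void main(String[] args) {
        CountingRandom inv = new CountingRandom(42L);
        CountingRandom rej = new CountingRandom(42L);
        for (int i = 0; i < 1000; i++) {
            sampleByInversion(inv);
            sampleByRejection(rej);
        }
        System.out.println("inversion draws: " + inv.calls());
        System.out.println("rejection draws: " + rej.calls());
    }
}
```

Inversion draws exactly one uniform per sample, while rejection draws two per attempt and needs a varying number of attempts. Starting from the same seed, the two generators therefore end up in different states, which is the kind of mismatch a test written around inversion-style consumption would be expected to trip over.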
This also relates to MATH-1158, which proposes a different way to create a sampler for a distribution. That approach would probably also make it possible to provide different samplers behind a common interface: the default one could use the inverse transform method, while more optimized ones could be offered that make different assumptions, e.g. about the RandomGenerator.
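As a rough sketch of what "different samplers behind a common interface" could look like: the interface RealSampler and the factory class ToySamplers below are hypothetical names chosen for illustration, not the API proposed in MATH-1158, and java.util.Random again stands in for RandomGenerator. The toy density is f(x) = 2x on [0, 1]; the rejection variant only shows the shape of an alternative strategy and is not claimed to be faster.

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical common interface; not the API proposed in MATH-1158. */
interface RealSampler {
    double sample();
}

final class ToySamplers {
    private ToySamplers() { }

    /** Default strategy: inverse transform, one uniform deviate per sample. */
    static RealSampler inversion(DoubleUnaryOperator inverseCdf, Random rng) {
        return () -> inverseCdf.applyAsDouble(rng.nextDouble());
    }

    /** Alternative strategy for f(x) = 2x on [0, 1]: rejection from a uniform proposal. */
    static RealSampler rejection(Random rng) {
        return () -> {
            while (true) {
                double x = rng.nextDouble();
                if (rng.nextDouble() <= x) {
                    return x;
                }
            }
        };
    }

    public static void main(String[] args) {
        // Callers depend only on RealSampler, not on the strategy behind it.
        RealSampler byInversion = inversion(Math::sqrt, new Random(1L));
        RealSampler byRejection = rejection(new Random(1L));
        System.out.println(byInversion.sample());
        System.out.println(byRejection.sample());
    }
}
```

With this split, code that only needs draws from the distribution is written against the interface, and only tests that depend on how the generator stream is consumed need to pin a particular strategy.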
was (Author: tn):
After fixing the KS inference tests, the respective test failures disappeared, as expected.
The remaining test failure in testNextInversionDeviate occurs because the Cheng sampler uses a kind of rejection sampling and therefore consumes more randomness from the provided RandomGenerator than inversion does.
This is a recurring issue: for other distributions, too, there are improved sampling methods that consume more randomness (see MATH-1220 for the Zipf distribution).
This also relates to MATH-1153, which proposes a different way to create a sampler for a distribution. That approach would probably also make it possible to provide different samplers behind a common interface: the default one could use the inverse transform method, while more optimized ones could be offered that make different assumptions, e.g. about the RandomGenerator.
> Sampling from a 'BetaDistribution' is slow
> ------------------------------------------
>
> Key: MATH-1153
> URL: https://issues.apache.org/jira/browse/MATH-1153
> Project: Commons Math
> Issue Type: Improvement
> Reporter: Sergei Lebedev
> Priority: Minor
> Fix For: 4.0
>
> Attachments: ChengBetaSampler.java, ChengBetaSampler.java, ChengBetaSamplerTest.java
>
>
> Currently, `BetaDistribution#sample` uses the inverse CDF method, which is quite slow for sampling-intensive computations. I've implemented a method from the R. C. H. Cheng paper, and it seems to be much faster. Here's a simple microbenchmark:
> {code}
> o.j.b.s.SamplingBenchmark.algorithmBCorBB  1e-3  1000  thrpt  5  2592200.015  14391.520  ops/s
> o.j.b.s.SamplingBenchmark.algorithmBCorBB  1000  1000  thrpt  5  3210800.292  33330.791  ops/s
> o.j.b.s.SamplingBenchmark.commonsVersion   1e-3  1000  thrpt  5    31034.225    438.273  ops/s
> o.j.b.s.SamplingBenchmark.commonsVersion   1000  1000  thrpt  5    21834.010    433.324  ops/s
> {code}
> Should I submit a patch?
> R. C. H. Cheng (1978). Generating beta variates with nonintegral shape parameters. Communications of the ACM, 21, 317–322.
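
For readers unfamiliar with the method, the following is a minimal sketch of Cheng's basic rejection algorithm (usually labelled BA) as commonly stated from the 1978 paper. It is not the attached ChengBetaSampler.java, which, judging by the benchmark name, uses the BB and BC variants. The class name ChengBaSketch is hypothetical, and java.util.Random stands in for RandomGenerator. The sketch assumes moderate shape parameters: it does not guard against overflow of exp(v) for very small shapes such as the 1e-3 used in the benchmark.

```java
import java.util.Random;

/** Hypothetical sketch of Cheng's basic algorithm (BA) for Beta(a, b). */
final class ChengBaSketch {
    private static final double LOG_4 = Math.log(4.0);

    private ChengBaSketch() { }

    static double sample(double a, double b, Random rng) {
        if (!(a > 0.0 && b > 0.0)) {
            throw new IllegalArgumentException("shape parameters must be positive");
        }
        double alpha = a + b;
        // Scale of the logistic proposal, chosen as in Cheng's algorithm BA.
        double beta = Math.min(a, b) <= 1.0
                ? Math.max(1.0 / a, 1.0 / b)
                : Math.sqrt((alpha - 2.0) / (2.0 * a * b - alpha));
        double gamma = a + 1.0 / beta;

        while (true) {
            double u1 = rng.nextDouble();
            double u2 = rng.nextDouble();
            if (u1 == 0.0 || u2 == 0.0) {
                continue; // avoid log(0); extremely unlikely
            }
            double v = beta * Math.log(u1 / (1.0 - u1));
            double w = a * Math.exp(v);
            // Accept when log(u1^2 * u2) does not exceed the log acceptance bound.
            double bound = alpha * Math.log(alpha / (b + w)) + gamma * v - LOG_4;
            if (bound >= Math.log(u1 * u1 * u2)) {
                return w / (b + w);
            }
        }
    }

    public static void main(String[] args) {
        Random rng = new Random(123L);
        double sum = 0.0;
        int n = 100_000;
        for (int i = 0; i < n; i++) {
            sum += sample(2.0, 3.0, rng);
        }
        // The mean of Beta(2, 3) is 2 / (2 + 3) = 0.4; the sample mean should be close.
        System.out.println("sample mean: " + sum / n);
    }
}
```

Each attempt draws two uniforms and the number of attempts is random, which is why a sampler of this kind consumes the generator stream differently from inversion.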