You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by "Alex D Herbert (JIRA)" <ji...@apache.org> on 2019/07/16 23:43:00 UTC

[jira] [Comment Edited] (RNG-95) DiscreteUniformSampler

    [ https://issues.apache.org/jira/browse/RNG-95?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886537#comment-16886537 ] 

Alex D Herbert edited comment on RNG-95 at 7/16/19 11:42 PM:
-------------------------------------------------------------

This method has been documented by Daniel Lemire:

[Fast Random Integer Generation in an Interval|https://arxiv.org/abs/1805.10941]


was (Author: alexherbert):
This method has been documented by Danial Lamire:

[Fast Random Integer Generation in an Interval|https://arxiv.org/abs/1805.10941]

> DiscreteUniformSampler
> ----------------------
>
>                 Key: RNG-95
>                 URL: https://issues.apache.org/jira/browse/RNG-95
>             Project: Commons RNG
>          Issue Type: Improvement
>          Components: sampling
>    Affects Versions: 1.3
>            Reporter: Alex D Herbert
>            Assignee: Alex D Herbert
>            Priority: Minor
>
> The {{DiscreteUniformSampler}} delegates the creation of an integer in the range {{[0, n)}} to the {{UniformRandomProvider}}.
> This sampler will be repeatedly used to sample the same range. The default method in {{BaseProvider}} uses a dynamic algorithm that handles {{n}} differently when a power of 2.
> When the range is a power of 2 the method can use a series of bits from a random integer to generate a uniform integer in the range. This is fast.
> When the range is not a power of 2 the algorithm must reject samples when the sample would result in an over-representation of a particular value in the uniform range. This is necessary as {{n}} does not exactly fit into the number of possible values {{[0, 2^31)}} that can be produced by the generator (when using 31-bit signed integers). The rejection method uses integer arithmetic to determine the number of samples that fit into the range: {{samples = 2^31 / n}}. Extra samples that lead to over-representation are rejected: {{extra = 2^31 % n}}.
> Since {{n}} will not change a pre-computation step is possible to select the best algorithm. 
> n is a power of 2:
> {code:java}
> // Favouring the least significant bits
> // Pre-compute
> int mask = n - 1;
> return nextInt() & mask;
> // Or favouring the most significant bits
> // Pre-compute
> int shift = Integer.numberOfLeadingZeros(n) + 1;
> return nextInt() >>> shift;
> {code}
> n is not a power of 2:
> {code:java}
> // Sample using modulus
> // Pre-compute
> final int fence = (int)(0x80000000L - 0x80000000L % n - 1);
> int bits;
> do {
>     bits = rng.nextInt() >>> 1;
> } while (bits > fence);
> return bits % n;
> // Or using 32-bit unsigned arithmetic avoiding modulus
> // Pre-compute
> final long fence = (1L << 32) % n;
> long result;
> do {
>     // Compute 64-bit unsigned product of n * [0, 2^32 - 1)
>     result = n * (rng.nextInt() & 0xffffffffL);
>     // Test the sample uniformity.
> } while ((result & 0xffffffffL) < fence);
> // Divide by 2^32 to get the sample
> return (int)(result >>> 32);
> {code}
> The second method uses a range of 2^32 instead of 2^31 so reducing the rejection probability and avoids the modulus operator; these both increase speed.
> Note algorithm 1 returns sample values in a repeat cycle from all values in the range {{[0, 2^31)}} due to the use of modulus, e.g.
> {noformat}
> 0, 1, 2, ..., 0, 1, 2, ...
> {noformat}
> Algorithm 2 returns sample values in a linear order, e.g.
> {noformat}
> 0, 0, 1, 1, 2, 2, ...
> {noformat}
> The suggested change is to implement smart pre-computation in the {{DiscreteUniformSampler}} based on the range and use the algorithms that favour the most significant bits from the generator.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)