You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by "Alex Herbert (Jira)" <ji...@apache.org> on 2021/06/29 13:47:00 UTC

[jira] [Commented] (RNG-147) LevySampler

    [ https://issues.apache.org/jira/browse/RNG-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17371381#comment-17371381 ] 

Alex Herbert commented on RNG-147:
----------------------------------

I created a LevySampler using a ZigguratNormalizedGaussianSampler to generate the normalised Gaussian deviate.

Parameter order is (location, scale), or (mu, c). This matches the nomenclature Levy(mu, c) on Wikipedia, several other sources and in Commons Math and Commons Statistics for the LevyDistribution.
{code:java}
UniformRandomProvider rng = ...;
double location = 1.23;
double scale = 4.56;
LevySampler sampler = LevySampler.of(rng, location, scale);
{code}
As discussed recently the API is fluent for the SharedStateSampler returning the LevyDistribution:
{code:java}
LevySampler sampler1 = LevySampler.of(...);
UniformRandomProvider rng = ...;
LevySampler sampler2 = sampler1.withUniformRandomProvider(rng);
{code}
h2. Performance

The sampler was compared using JMH to an inverse transform sampler using the LevyDistribution from Commons Math to compute the inverse CDF. Times have been adjusted for the baseline in the normalised column:
||RandomSource||Sampler||Score||Normalised||Relative||
|baseline| | 2.468418| | |
|SPLIT_MIX_64|InverseTransformDiscreteSampler|105.03|102.56| |
|SPLIT_MIX_64|LevySampler|11.26|8.79|0.09|
|MWC_256|InverseTransformDiscreteSampler|104.72|102.25| |
|MWC_256|LevySampler|13.68|11.21|0.11|
|JDK|InverseTransformDiscreteSampler|119.44|116.97| |
|JDK|LevySampler|27.89|25.42|0.22|
 * The new sampler is about 5-10x faster.
 * The LevySampler slows down as the RNG gets slower. This is because a fair amount of the sampling time is spent creating random variates for the underlying fast normalised Gaussian sampler.
 * The inverse transform method does not slow down much with the RNG. Most of the time is spent in computing the inverse CDF of the Levy distribution using the error function.

h2. Distribution Support

The sample bounds for a zero location and unit scale Levy distribution should be [0, inf).

The ZigguratNormalizedGaussianSampler will output a value of zero if the underlying RNG creates a long value of 0. This will hit the upper bound of infinity: 1/ (0*0) = inf. 

The ZigguratNormalizedGaussianSampler will output a value of +/-infinity if the underlying RNG produces a long value close to the maximum positive value (the lowest 7 bits must be zero) and then the next double should be zero. This will hit the lower bound of 0: 1/ (inf*inf) = 0.

A check for this has been added to the LevySamplerTest.

 

> LevySampler
> -----------
>
>                 Key: RNG-147
>                 URL: https://issues.apache.org/jira/browse/RNG-147
>             Project: Commons RNG
>          Issue Type: New Feature
>          Components: sampling
>    Affects Versions: 1.3
>            Reporter: Alex Herbert
>            Priority: Minor
>
> [Sampling from a Levy distribution|https://en.wikipedia.org/wiki/L%C3%A9vy_distribution#Random_sample_generation] is done using an inverse transform of the cumulative distribution function of the standard normal distribution.
> {noformat}
> Levy(Z) =          1 
>           -------------------
>           (inv CDF_norm(u))^2
> {noformat}
> With u a uniform deviate in [0, 1). An alternative is direct generation of a uniform normal variate with mean 0 and standard deviation 1: N(0, 1):
> {noformat}
> Levy(Z) =    1 
>           --------
>           N(0,1)^2
> {noformat}
> This should be faster than inverse transform sampling if generation of the normal distribution sample is faster than computation of the inverse cumulative probability function.
> This sampler can be used in Commons Statistics for the Levy distribution.
> The extremes of the support should be investigated, i.e. what is the maximum value for a sample from a standard normal distribution such as the ZigguratNormalizedGaussianSampler vs the maximum value of the inverse CDF of the normal distribution when the uniform deviate is at the upper limit of 1 - 2^-53.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)