Posted to issues@commons.apache.org by "Gilles (JIRA)" <ji...@apache.org> on 2018/11/15 16:38:00 UTC

[jira] [Resolved] (RNG-52) PoissonSampler allows mean above Integer.MAX_VALUE

     [ https://issues.apache.org/jira/browse/RNG-52?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gilles resolved RNG-52.
-----------------------
    Resolution: Fixed

Conservative limit set in commit 8c927dc65dc5d86daa16304a83b033b25c43fd28 ("master" branch).

> PoissonSampler allows mean above Integer.MAX_VALUE
> --------------------------------------------------
>
>                 Key: RNG-52
>                 URL: https://issues.apache.org/jira/browse/RNG-52
>             Project: Commons RNG
>          Issue Type: Bug
>          Components: sampling
>    Affects Versions: 1.1
>            Reporter: Alex D Herbert
>            Priority: Major
>             Fix For: 1.2
>
>
> The {{PoissonSampler}} can only return an {{int}} because it implements the {{DiscreteSampler}} interface. As it stands, a mean above {{Integer.MAX_VALUE}} is accepted, although it makes no sense: the sampled Poisson distribution would be significantly truncated.
> The algorithm of the {{SmallMeanPoissonSampler}} caps the returned sample at {{Integer.MAX_VALUE}}. It remains valid for such a mean, although the run-time would be impractical due to the nature of the algorithm. At high mean (>40), however, the end user is expected to use either the {{LargeMeanPoissonSampler}} directly or the {{PoissonSampler}}, which chooses the appropriate large mean algorithm.
> However, the current {{LargeMeanPoissonSampler}} uses {{(int)Math.floor(mean)}} during initialisation, so any mean above {{Integer.MAX_VALUE}} is unsupported.
> I propose to add this to the constructor of each Poisson sampler:
> {code:java}
> if (mean > Integer.MAX_VALUE) {
>     throw new IllegalArgumentException(mean + " > " + Integer.MAX_VALUE);
> }
> {code}
> with documentation
> {code:java}
>  * @throws IllegalArgumentException if {@code mean <= 0} or {@code mean > }{@link Integer#MAX_VALUE}.
> {code}
> Note that a limit of {{Integer.MAX_VALUE}} lets the samples follow the Poisson distribution below that value, while the remaining upper tail of the distribution is collapsed onto the single point {{Integer.MAX_VALUE}}. This keeps the sampler within the contract of the integer value returned by {{DiscreteSampler}}.
> In practice, the Poisson distribution is unlikely to be used at such a high mean; in that case it is appropriate to use a Gaussian approximation to the Poisson.
> Note: Currently there is no code coverage from tests for the {{LargeMeanPoissonSampler}} check that the mean is <= 0. Tests should be added to verify that the constructor throws when a bad mean is used.
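
For reference, the effect of the {{(int)Math.floor(mean)}} expression quoted in the description can be checked in isolation. The sketch below is not taken from the Commons RNG source; the class and method names are hypothetical. It only illustrates that a Java narrowing cast of a too-large double saturates at Integer.MAX_VALUE rather than throwing.

```java
final class FloorCastDemo {
    /** Same expression as quoted in the issue; the method name is hypothetical. */
    static int flooredMean(double mean) {
        return (int) Math.floor(mean);
    }

    public static void main(String[] args) {
        // A double above Integer.MAX_VALUE is clamped by the narrowing cast,
        // so neither an exception nor a wrap-around signals the loss.
        System.out.println(flooredMean(1e10) == Integer.MAX_VALUE); // prints "true"
        // An in-range mean is floored as expected.
        System.out.println(flooredMean(45.7)); // prints "45"
    }
}
```

Because the cast fails silently, an explicit range check in the constructor appears to be the only way to report an unsupported mean to the caller.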

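The constructor tests requested in the note could look roughly like the following. This is a minimal sketch using a hypothetical stand-in class (StubPoissonSampler) rather than the real samplers, and it uses the Integer.MAX_VALUE limit proposed in the description; the limit actually committed is described only as "conservative" and may differ.

```java
final class MeanGuardCheck {
    /** Hypothetical stand-in for a sampler constructor; not the Commons RNG class. */
    static final class StubPoissonSampler {
        private final double mean;

        StubPoissonSampler(double mean) {
            if (mean <= 0) {
                throw new IllegalArgumentException(mean + " <= " + 0);
            }
            // Upper bound as proposed in the issue description.
            if (mean > Integer.MAX_VALUE) {
                throw new IllegalArgumentException(mean + " > " + Integer.MAX_VALUE);
            }
            this.mean = mean;
        }

        double getMean() {
            return mean;
        }
    }

    /** Returns true if constructing the stub with this mean throws IllegalArgumentException. */
    static boolean rejects(double mean) {
        try {
            new StubPoissonSampler(mean);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejects(0));                       // prints "true"
        System.out.println(rejects(-1.5));                    // prints "true"
        System.out.println(rejects(Integer.MAX_VALUE + 1.0)); // prints "true"
        System.out.println(rejects(Integer.MAX_VALUE));       // prints "false"
    }
}
```

In a real test suite the same four cases would presumably be expressed with the project's test framework against each Poisson sampler constructor.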

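The Gaussian approximation mentioned in the description can be sketched as follows. This is an illustration only, not code from Commons RNG: it assumes java.util.Random as the source of normal deviates (the library uses its own generator interface), the class name is hypothetical, and it clamps the result to the int range in the same spirit as the truncation discussed above.

```java
import java.util.Random;

final class GaussianPoissonApprox {
    /**
     * Approximates a Poisson(mean) sample by rounding a draw from a normal
     * distribution with mean {@code mean} and variance {@code mean}.
     * The result is clamped to [0, Integer.MAX_VALUE].
     */
    static int sample(Random rng, double mean) {
        double x = mean + Math.sqrt(mean) * rng.nextGaussian();
        long rounded = Math.round(x);
        if (rounded < 0) {
            return 0;
        }
        if (rounded > Integer.MAX_VALUE) {
            return Integer.MAX_VALUE;
        }
        return (int) rounded;
    }

    public static void main(String[] args) {
        Random rng = new Random(42L);
        // With a mean of one million the standard deviation is 1000,
        // so samples are expected to fall close to the mean.
        for (int i = 0; i < 5; i++) {
            System.out.println(sample(rng, 1e6));
        }
    }
}
```

The approximation is only reasonable when the mean is large, which is the regime the description refers to.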
