You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by "Alex Herbert (Jira)" <ji...@apache.org> on 2022/09/19 19:14:00 UTC

[jira] [Resolved] (RNG-173) BaseProvider state filling procedure can be improved

     [ https://issues.apache.org/jira/browse/RNG-173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Herbert resolved RNG-173.
------------------------------
    Fix Version/s: 1.5
       Resolution: Implemented

> BaseProvider state filling procedure can be improved
> ----------------------------------------------------
>
>                 Key: RNG-173
>                 URL: https://issues.apache.org/jira/browse/RNG-173
>             Project: Commons RNG
>          Issue Type: Improvement
>            Reporter: Alex Herbert
>            Priority: Trivial
>             Fix For: 1.5
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The BaseProvider has a method to fill in remaining state if the input seed is too short. The fill uses existing seed values to fill the remaining.
> The next state is created using:
> {code:java}
> long n = state[i - seed.length];
> state[i] = 1812433253L * (n ^ (n >> 30)) + i{code}
> If the existing state is zero then the new state is i. When the input seed has no length then the filled state is a natural sequence.
> Here is a state of 10 filled from empty seeds of length 0 to 5:
> {noformat}
> 0: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
> 1: [0, 1, 1812433255, 3284914298392595265, 6102061520201954364, -3308799481182342998, -3869692221293809580, -7101959917617921332, 7986832403292652032, 8936067391732911773]
> 2: [0, 0, 2, 3, 3624866510, 5437299764, 6569828598597623783, -8592001180344199076, 1136775338421644002, 8717367692712810396]
> 3: [0, 0, 0, 3, 4, 5, 5437299765, 7249733019, 9062166273, -8592001182156632327]
> 4: [0, 0, 0, 0, 4, 5, 6, 7, 7249733020, 9062166274]
> 5: [0, 0, 0, 0, 0, 5, 6, 7, 8, 9]
> {noformat}
> When the seed is zero length or close to half the length of the desired state and all zeros then the output state has a low number of non-zero bits.
> Note:
> This has little impact when using the Commons RNG simple module to create a generator. The seed is produced to the correct length using a high quality random source.
> A second issue is that the method to fill the state is an instance method. Since it uses no state it could be a static method. I would suggest a method to convert a seed to the correct length:
> {code:java}
> protected static long[] ensureSeedLength(long[] seed, int length); {code}
> This would allow classes that implement the following pattern:
> {code:java}
> MyRNG(long[] seed) {
>     if (seed.length < SEED_SIZE) {
>         final long[] state = new long[SEED_SIZE];
>         fillState(state, seed);
>         setState(state);
>     } else {
>         setState(seed);
>     }
> } {code}
> To simplify to:
> {code:java}
> MyRNG(long[] seed) {
>     setState(ensureSeedLength(seed, SEED_SIZE));
> }{code}
> h2. Compatibility
> The user guide states:
> {noformat}
> upon initialization, the underlying generation algorithm
> - may not use all the information contents of the seed,
> - may use a procedure (using the given seed as input) for further filling its internal state (in order to avoid a too uniform initial state).
> In both cases, the behavior is not standard but should not change between releases of the library (bugs notwithstanding).{noformat}
> Since behaviour *should not change* it would rule out changes for existing classes. New classes could use the new static version to fill state.
> I would suggest providing a new method to ensure the input seed is a minimum length. If the method seeds a SplitMix64 style generator with the first value of the input seed (or zero if the seed length is zero) then the filled state will be high quality. This type of generator only outputs zero once during the period and so any seed length can be ensured to be non zero when it has been expanded. An input seed of entirely zero values would be passed through unchanged. This is the default *user beware* behaviour for full length zero seeds.
> A 32-bit variant can be created using a similar hashing function that outputs only a single 0 in the period, for example MurmurHash3's 32-bit finaliser function.
> An example implementation for long values is:
> {code:java}
> private static final long GOLDEN_RATIO = 0x9e3779b97f4a7c15L
> protected static long[] ensureSeedLength(long[] seed, int length) {
>     if (seed.length < length) {
>         final long[] s = Arrays.copyOf(seed, length);
>         // Fill the rest as if using a SplitMix64 RNG
>         long x = s[0];
>         for (int i = seed.length; i < length; i++) {
>             s[i] = stafford13(x += GOLDEN_RATIO);
>         }
>         return s;
>     }
>     return seed;
> }
> private static long stafford13(long x) {
>     x = (x ^ (x >>> 30)) * 0xbf58476d1ce4e5b9L;
>     x = (x ^ (x >>> 27)) * 0x94d049bb133111ebL;
>     return x ^ (x >>> 31);
> }
> {code}
> A 32-bit mix function for Murmur32 is:
> {code:java}
> private static int murmur32(int x) {
>     x = (x ^ (x >>> 16)) * 0x85ebca6b;
>     x = (x ^ (x >>> 13)) * 0xc2b2ae35;
>     return x ^ (x >>> 16);
> }{code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)