You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@commons.apache.org by Alex Herbert <al...@gmail.com> on 2019/03/08 17:12:33 UTC

[Rng] Merge outstanding PRs for construction speed baseline

Hi,

I'd like to start on improving the methods in rng-simple to create the 
RandomSource.

There are a few PRs outstanding but I think the discussion on this list 
was that the new XoShiRo generators are OK. So I would like to merge the 
following PRs:

RNG-82: Add XorShift1024StarPhi generator
RNG-70: Add new XoShiRo generators

Then use master to set a baseline for the construction speed. This can 
be used to compare against the latest code to see how much the changes 
have improved construction speed. Given the work will have a big effect 
on generators with small state it would be good to have the new 
generators in the baseline.

I note that the only tags in git are for releases. So the baseline can 
be just a commit Id and I'll add it to the benchmark Jira RNG-72 for 
reference.

There are also some other PRs to be discussed:

RNG-81: NumberFactory to sample all rationals between 0 and 1.

This one changes to the implementation in Open JDK for float and double 
generation. I think this is OK to merge. It won't effect the baseline 
but is good to get it into the next release.


RNG-78: Added a ThreadLocalRandomSource.

This works as a proof of concept. But it is all new files and so I'm 
happy to leave that until it is decided exactly if and where to put it.


RNG-76: Added primitive long constructor to SplitMix64

This is something that will effect the benchmark so I'll merge it after. 
It is a trivial change with a noticeable performance gain by avoiding 
auto-boxing of the long seed.


Regards,

Alex



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [Rng] Merge outstanding PRs for construction speed baseline

Posted by Alex Herbert <al...@gmail.com>.
I think the summary of this thread on outstanding PRs is:

> RNG-82: Add XorShift1024StarPhi generator
> RNG-70: Add new XoShiRo generators

Merge but:

- Ensure those generators with more than one constructor build the same sequence if the seed data is the same. This is to satisfy package requirement that "The "provider" must specify one way for setting the seed. For a given seed, the generated sequence must always be the same.”

The new generators have utility constructors where for example an int[] seed of length 2 can be specified as 'int seed0, int seed1'. This is just a different way to specify and set the same exact seed. So this is OK. However using 'long seed0, long seed1' would not be OK as this is not the same seed data.


- Test the correlation between XorShift1024Star and XorShift1024StarPhi and possibly mark XOR_SHIFT_1024_S deprecated

The test for this is outstanding. A composite fails DieHarder. It passes BigCrush but there is a possible issue with passing the int sequence to the TestU01 BigCrush test suite that is under investigation.

> 
> RNG-81: NumberFactory to sample all rationals between 0 and 1.

Merge.

> 
> RNG-78: Added a ThreadLocalRandomSource.

Merge.


Then update all the benchmarks to include the new providers.

Note: A separate thread is discussing the order for benchmark reports. This is whether the order of the generators in the quality benchmarks should match the order of the RandomSource enum. Currently it does not (this requires some simple file rearrangement) but all new generators should be added to the end of the benchmark list so the existing report output files are not affected.


Save the commit tag for a speed benchmark reference.

> RNG-76: Added primitive long constructor to SplitMix64

Merge as the first commit for improving the construction speed benchmarking.

Then update the ProviderBuilder with ideas to improve RNG construction… (Note: with no breaking changes ;) )

Alex


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [Rng] Merge outstanding PRs for construction speed baseline

Posted by Gilles Sadowski <gi...@gmail.com>.
Le mer. 13 mars 2019 à 00:19, Alex Herbert <al...@gmail.com> a écrit :
>
>
>
> > On 12 Mar 2019, at 16:58, Gilles Sadowski <gi...@gmail.com> wrote:
> >
> > [Replying to the ML.]
> >
> > Le mar. 12 mars 2019 à 13:00, Alex Herbert <alex.d.herbert@gmail.com <ma...@gmail.com>> a écrit :
> >>
> >>
> >> On 11/03/2019 23:54, Gilles Sadowski wrote:
> >>> Le ven. 8 mars 2019 à 18:12, Alex Herbert <al...@gmail.com> a écrit :
> >>>> Hi,
> >>>>
> >>>> I'd like to start on improving the methods in rng-simple to create the
> >>>> RandomSource.
> >>>>
> >>>> There are a few PRs outstanding but I think the discussion on this list
> >>>> was that the new XoShiRo generators are OK. So I would like to merge the
> >>>> following PRs:
> >>>>
> >>>> RNG-82: Add XorShift1024StarPhi generator
> >>> See other thread.
> >>>
> >>>> RNG-70: Add new XoShiRo generators
> >>> +1
> >>
> >> Ah. Most of these generators have more than one constructor (see below
> >> on the discussion for the change to the SplitMix64 generator). These are
> >> utility constructors given the small size of the seed. e.g. for a seed
> >> of length 4:
> >>
> >> public XoShiRo256StarStar(long[] seed)
> >>
> >> public XoShiRo256StarStar(long seed0, long seed1, long seed2, long seed3)
> >>
> >> I did not see this as an issue. I added unit tests to show that the same
> >> input data in either the array or the primitives produces the same
> >> generator sequence. The primitive constructor does not have to do array
> >> checking and filling when the array is not the correct length. The input
> >> is directly used to set the state. It is a cleaner constructor, but it
> >> does not fit the model required by the ProviderBuilder which would like
> >> an object seed.
> >>
> >> I am going to rewrite the ProviderBuilder internals to allow the size of
> >> the seed to be known and to remove the use of reflection to create a
> >> source. In this instance the primitive constructor would be better. That
> >> is a motivating reason for the change to the SplitMix64 too.
> >
> > I did not look again at the code yet; but if you can work around
> > the issue (as I recalled it) without impacting the public API, then
> > fine I guess. :-)
> >
> >>
> >>>
> >>>> Then use master to set a baseline for the construction speed. This can
> >>>> be used to compare against the latest code to see how much the changes
> >>>> have improved construction speed. Given the work will have a big effect
> >>>> on generators with small state it would be good to have the new
> >>>> generators in the baseline.
> >>> IIUC, "big effect" only when constructing and throwing quickly very
> >>> many objects.
> >>> The doc should make it clear that it's not the "nominal" use-case.
> >>
> >> Yes. There is not a document for this at the moment anyway (i.e. the
> >> results of the construction benchmark). Should a section be added to the
> >> user guide?
> >
> > Benchmarks can certainly go in the "Performance" section.
> >
> >> This can outline that a generator should be picked for its
> >> intended purpose: output datatype required (int, double, etc) and number
> >> of samples (period). For short lived generators with low samples then
> >> construction speed is a factor.
> >
> > All good as long as the entry point for application developers to
> > create RNGs is "RandomSource".
> >
> >>
> >> This is a feature that will probably most benefit the
> >> ThreadLocalRandomSource where short-lifetime generators are likely to be
> >> required in a multi-threaded environment.
> >
> > ???
> > IIUC, "ThreadLocalRandomSource" will store the longest-living RNG
> > instances!
>
> But only per thread. A new thread will get a new RNG which has to be seeded again. But if you have an application with a few long term threads then the ThreadLocalRandomSource can be used to store and retrieve any RNG of choice without having to write extra code to do that yourself.

The latter is what I think is the "normal" usage.

>
> >
> >>
> >> However the benchmark shows that once you are generating 1,000,000
> >> samples then the construction cost of even the TwoCmres is much less
> >> than the sampling time.
> >>
> >>>
> >>>> I note that the only tags in git are for releases. So the baseline can
> >>>> be just a commit Id and I'll add it to the benchmark Jira RNG-72 for
> >>>> reference.
> >
> > I'm still wondering about applications that require short-lived
> > generators...
> > It doesn't seem to ever be a good idea to do that.  There was
> > an issue caused by that in [Digester], fixed only in the latest
> > release…
>
> Short lived generators for example for array shuffling of small arrays that will be done only once.

If it's only *once*, then the absolute time of the whole operation
(creation + sorting)
will be swamped in the total execution time of the application. ;-)
And if it's done repeatedly, I still think that reusing the RNG is a
better approach (i.e.
define a "Shuffler" class that will store a long-lived RNG instance).

>
> ThreadLocalRandomSource solves this by just giving the thread the current one, or a new one if the thread has never used one before. In the later case a low construction cost is good, especially if the thread is also short lived.

AFAIK, a short-lived thread is a waste of resources; better use an
executor, as you
indicate below.

> For a shuffle of 100 items then a SplitMix64 will do a good enough job and can be built very fast.
>
> For simulations then you should be using longer period generators. If doing a multithreaded simulation I use a new RNG per Runnable put into an ExecutorService. However the Runnable can now use ThreadLocalRandomSource.current(…) to get the RNG. So for a fixed thread pool there potentially can be less generators as they get reused, and the Runnable does not have to be written to do lots of work in a loop with the same RNG.

+1 :-)

>
> Parallel simulations is a case where the split() and jump() functionality would be better still to avoid sequence overlap.

Or use different generators, if the number of threads is small.
[Hence the question with "XorShift1024Star".]

Gilles

> >
> >>>>
> >>>> There are also some other PRs to be discussed:
> >>>>
> >>>> RNG-81: NumberFactory to sample all rationals between 0 and 1.
> >>>>
> >>>> This one changes to the implementation in Open JDK for float and double
> >>>> generation. I think this is OK to merge.
> >>> +1
> >>> [No promise of reproducibility for "derived" types.]
> >> OK. I'll put that in.
> >>>
> >>>> It won't effect the baseline
> >>>> but is good to get it into the next release.
> >>>>
> >>>>
> >>>> RNG-78: Added a ThreadLocalRandomSource.
> >>>>
> >>>> This works as a proof of concept. But it is all new files and so I'm
> >>>> happy to leave that until it is decided exactly if and where to put it.
> >>> Would there be a better place than in package "o.a.c.rng.simple"?
> >> It was either in simple or examples. But I would like to see it as a
> >> feature in use so I would leave it in simple
> >
> > Sure, it's a useful feature to help users avoid bad usage.
> >
> >>>> RNG-76: Added primitive long constructor to SplitMix64
> >>>>
> >>>> This is something that will effect the benchmark so I'll merge it after.
> >>>> It is a trivial change with a noticeable performance gain by avoiding
> >>>> auto-boxing of the long seed.
> >>> IIRC, there is some developer doc that refer to having a single
> >>> constructor.  But I do not recall the reason.
> >>
> >> I did not know. I cannot find that explained in the project docs. I
> >> looked in site/xdoc/developers.xml and a poor regex search through the rest.
> >
> > I think I meant this[1]:
> > The "provider" must specify one way for setting the seed. For a given
> > seed, the generated sequence must always be the same.
> >
> >> Maybe the reason was to prevent ambiguous construction, e.g. with a
> >> non-parameterised constructor.
> >>
> >> An exception was made for TwoCmres since it needs more arguments. That
> >> is understandable.
> >>
> >> Could this rule be relaxed if the same data passed to different
> >> constructors results in the same generated sequence?
> >
> > I guess so.
> > IIRC, the limitation was due to using commons code in order to
> > instantiate the appropriate RNG (core) class.
> >
> >> E.g. the unit tests
> >> I added for the new XoShiRo generators test this.
> >
> > Thanks for all this work,
> > Gilles
>
> I’m happy to do it. I am just adding to the library from the perspective I have as end user. Hopefully the changes will benefit other users as well as me.
>
> >
> > [1] http://commons.apache.org/proper/commons-rng/commons-rng-core/apidocs/org/apache/commons/rng/core/source32/package-summary.html <http://commons.apache.org/proper/commons-rng/commons-rng-core/apidocs/org/apache/commons/rng/core/source32/package-summary.html>
> >

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [Rng] Merge outstanding PRs for construction speed baseline

Posted by Alex Herbert <al...@gmail.com>.

> On 12 Mar 2019, at 16:58, Gilles Sadowski <gi...@gmail.com> wrote:
> 
> [Replying to the ML.]
> 
> Le mar. 12 mars 2019 à 13:00, Alex Herbert <alex.d.herbert@gmail.com <ma...@gmail.com>> a écrit :
>> 
>> 
>> On 11/03/2019 23:54, Gilles Sadowski wrote:
>>> Le ven. 8 mars 2019 à 18:12, Alex Herbert <al...@gmail.com> a écrit :
>>>> Hi,
>>>> 
>>>> I'd like to start on improving the methods in rng-simple to create the
>>>> RandomSource.
>>>> 
>>>> There are a few PRs outstanding but I think the discussion on this list
>>>> was that the new XoShiRo generators are OK. So I would like to merge the
>>>> following PRs:
>>>> 
>>>> RNG-82: Add XorShift1024StarPhi generator
>>> See other thread.
>>> 
>>>> RNG-70: Add new XoShiRo generators
>>> +1
>> 
>> Ah. Most of these generators have more than one constructor (see below
>> on the discussion for the change to the SplitMix64 generator). These are
>> utility constructors given the small size of the seed. e.g. for a seed
>> of length 4:
>> 
>> public XoShiRo256StarStar(long[] seed)
>> 
>> public XoShiRo256StarStar(long seed0, long seed1, long seed2, long seed3)
>> 
>> I did not see this as an issue. I added unit tests to show that the same
>> input data in either the array or the primitives produces the same
>> generator sequence. The primitive constructor does not have to do array
>> checking and filling when the array is not the correct length. The input
>> is directly used to set the state. It is a cleaner constructor, but it
>> does not fit the model required by the ProviderBuilder which would like
>> an object seed.
>> 
>> I am going to rewrite the ProviderBuilder internals to allow the size of
>> the seed to be known and to remove the use of reflection to create a
>> source. In this instance the primitive constructor would be better. That
>> is a motivating reason for the change to the SplitMix64 too.
> 
> I did not look again at the code yet; but if you can work around
> the issue (as I recalled it) without impacting the public API, then
> fine I guess. :-)
> 
>> 
>>> 
>>>> Then use master to set a baseline for the construction speed. This can
>>>> be used to compare against the latest code to see how much the changes
>>>> have improved construction speed. Given the work will have a big effect
>>>> on generators with small state it would be good to have the new
>>>> generators in the baseline.
>>> IIUC, "big effect" only when constructing and throwing quickly very
>>> many objects.
>>> The doc should make it clear that it's not the "nominal" use-case.
>> 
>> Yes. There is not a document for this at the moment anyway (i.e. the
>> results of the construction benchmark). Should a section be added to the
>> user guide?
> 
> Benchmarks can certainly go in the "Performance" section.
> 
>> This can outline that a generator should be picked for its
>> intended purpose: output datatype required (int, double, etc) and number
>> of samples (period). For short lived generators with low samples then
>> construction speed is a factor.
> 
> All good as long as the entry point for application developers to
> create RNGs is "RandomSource".
> 
>> 
>> This is a feature that will probably most benefit the
>> ThreadLocalRandomSource where short-lifetime generators are likely to be
>> required in a multi-threaded environment.
> 
> ???
> IIUC, "ThreadLocalRandomSource" will store the longest-living RNG
> instances!

But only per thread. A new thread will get a new RNG which has to be seeded again. But if you have an application with a few long term threads then the ThreadLocalRandomSource can be used to store and retrieve any RNG of choice without having to write extra code to do that yourself.

> 
>> 
>> However the benchmark shows that once you are generating 1,000,000
>> samples then the construction cost of even the TwoCmres is much less
>> than the sampling time.
>> 
>>> 
>>>> I note that the only tags in git are for releases. So the baseline can
>>>> be just a commit Id and I'll add it to the benchmark Jira RNG-72 for
>>>> reference.
> 
> I'm still wondering about applications that require short-lived
> generators...
> It doesn't seem to ever be a good idea to do that.  There was
> an issue caused by that in [Digester], fixed only in the latest
> release…

Short lived generators for example for array shuffling of small arrays that will be done only once.

ThreadLocalRandomSource solves this by just giving the thread the current one, or a new one if the thread has never used one before. In the later case a low construction cost is good, especially if the thread is also short lived. For a shuffle of 100 items then a SplitMix64 will do a good enough job and can be built very fast. 

For simulations then you should be using longer period generators. If doing a multithreaded simulation I use a new RNG per Runnable put into an ExecutorService. However the Runnable can now use ThreadLocalRandomSource.current(…) to get the RNG. So for a fixed thread pool there potentially can be less generators as they get reused, and the Runnable does not have to be written to do lots of work in a loop with the same RNG.

Parallel simulations is a case where the split() and jump() functionality would be better still to avoid sequence overlap.

> 
>>>> 
>>>> There are also some other PRs to be discussed:
>>>> 
>>>> RNG-81: NumberFactory to sample all rationals between 0 and 1.
>>>> 
>>>> This one changes to the implementation in Open JDK for float and double
>>>> generation. I think this is OK to merge.
>>> +1
>>> [No promise of reproducibility for "derived" types.]
>> OK. I'll put that in.
>>> 
>>>> It won't effect the baseline
>>>> but is good to get it into the next release.
>>>> 
>>>> 
>>>> RNG-78: Added a ThreadLocalRandomSource.
>>>> 
>>>> This works as a proof of concept. But it is all new files and so I'm
>>>> happy to leave that until it is decided exactly if and where to put it.
>>> Would there be a better place than in package "o.a.c.rng.simple"?
>> It was either in simple or examples. But I would like to see it as a
>> feature in use so I would leave it in simple
> 
> Sure, it's a useful feature to help users avoid bad usage.
> 
>>>> RNG-76: Added primitive long constructor to SplitMix64
>>>> 
>>>> This is something that will effect the benchmark so I'll merge it after.
>>>> It is a trivial change with a noticeable performance gain by avoiding
>>>> auto-boxing of the long seed.
>>> IIRC, there is some developer doc that refer to having a single
>>> constructor.  But I do not recall the reason.
>> 
>> I did not know. I cannot find that explained in the project docs. I
>> looked in site/xdoc/developers.xml and a poor regex search through the rest.
> 
> I think I meant this[1]:
> The "provider" must specify one way for setting the seed. For a given
> seed, the generated sequence must always be the same.
> 
>> Maybe the reason was to prevent ambiguous construction, e.g. with a
>> non-parameterised constructor.
>> 
>> An exception was made for TwoCmres since it needs more arguments. That
>> is understandable.
>> 
>> Could this rule be relaxed if the same data passed to different
>> constructors results in the same generated sequence?
> 
> I guess so.
> IIRC, the limitation was due to using commons code in order to
> instantiate the appropriate RNG (core) class.
> 
>> E.g. the unit tests
>> I added for the new XoShiRo generators test this.
> 
> Thanks for all this work,
> Gilles

I’m happy to do it. I am just adding to the library from the perspective I have as end user. Hopefully the changes will benefit other users as well as me.

> 
> [1] http://commons.apache.org/proper/commons-rng/commons-rng-core/apidocs/org/apache/commons/rng/core/source32/package-summary.html <http://commons.apache.org/proper/commons-rng/commons-rng-core/apidocs/org/apache/commons/rng/core/source32/package-summary.html>
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org <ma...@commons.apache.org>
> For additional commands, e-mail: dev-help@commons.apache.org <ma...@commons.apache.org>

Re: [Rng] Merge outstanding PRs for construction speed baseline

Posted by Gilles Sadowski <gi...@gmail.com>.
[Replying to the ML.]

Le mar. 12 mars 2019 à 13:00, Alex Herbert <al...@gmail.com> a écrit :
>
>
> On 11/03/2019 23:54, Gilles Sadowski wrote:
> > Le ven. 8 mars 2019 à 18:12, Alex Herbert <al...@gmail.com> a écrit :
> >> Hi,
> >>
> >> I'd like to start on improving the methods in rng-simple to create the
> >> RandomSource.
> >>
> >> There are a few PRs outstanding but I think the discussion on this list
> >> was that the new XoShiRo generators are OK. So I would like to merge the
> >> following PRs:
> >>
> >> RNG-82: Add XorShift1024StarPhi generator
> > See other thread.
> >
> >> RNG-70: Add new XoShiRo generators
> > +1
>
> Ah. Most of these generators have more than one constructor (see below
> on the discussion for the change to the SplitMix64 generator). These are
> utility constructors given the small size of the seed. e.g. for a seed
> of length 4:
>
> public XoShiRo256StarStar(long[] seed)
>
> public XoShiRo256StarStar(long seed0, long seed1, long seed2, long seed3)
>
> I did not see this as an issue. I added unit tests to show that the same
> input data in either the array or the primitives produces the same
> generator sequence. The primitive constructor does not have to do array
> checking and filling when the array is not the correct length. The input
> is directly used to set the state. It is a cleaner constructor, but it
> does not fit the model required by the ProviderBuilder which would like
> an object seed.
>
> I am going to rewrite the ProviderBuilder internals to allow the size of
> the seed to be known and to remove the use of reflection to create a
> source. In this instance the primitive constructor would be better. That
> is a motivating reason for the change to the SplitMix64 too.

I did not look again at the code yet; but if you can work around
the issue (as I recalled it) without impacting the public API, then
fine I guess. :-)

>
> >
> >> Then use master to set a baseline for the construction speed. This can
> >> be used to compare against the latest code to see how much the changes
> >> have improved construction speed. Given the work will have a big effect
> >> on generators with small state it would be good to have the new
> >> generators in the baseline.
> > IIUC, "big effect" only when constructing and throwing quickly very
> > many objects.
> > The doc should make it clear that it's not the "nominal" use-case.
>
> Yes. There is not a document for this at the moment anyway (i.e. the
> results of the construction benchmark). Should a section be added to the
> user guide?

Benchmarks can certainly go in the "Performance" section.

> This can outline that a generator should be picked for its
> intended purpose: output datatype required (int, double, etc) and number
> of samples (period). For short lived generators with low samples then
> construction speed is a factor.

All good as long as the entry point for application developers to
create RNGs is "RandomSource".

>
> This is a feature that will probably most benefit the
> ThreadLocalRandomSource where short-lifetime generators are likely to be
> required in a multi-threaded environment.

???
IIUC, "ThreadLocalRandomSource" will store the longest-living RNG
instances!

>
> However the benchmark shows that once you are generating 1,000,000
> samples then the construction cost of even the TwoCmres is much less
> than the sampling time.
>
> >
> >> I note that the only tags in git are for releases. So the baseline can
> >> be just a commit Id and I'll add it to the benchmark Jira RNG-72 for
> >> reference.

I'm still wondering about applications that require short-lived
generators...
It doesn't seem to ever be a good idea to do that.  There was
an issue caused by that in [Digester], fixed only in the latest
release...

> >>
> >> There are also some other PRs to be discussed:
> >>
> >> RNG-81: NumberFactory to sample all rationals between 0 and 1.
> >>
> >> This one changes to the implementation in Open JDK for float and double
> >> generation. I think this is OK to merge.
> > +1
> > [No promise of reproducibility for "derived" types.]
> OK. I'll put that in.
> >
> >> It won't effect the baseline
> >> but is good to get it into the next release.
> >>
> >>
> >> RNG-78: Added a ThreadLocalRandomSource.
> >>
> >> This works as a proof of concept. But it is all new files and so I'm
> >> happy to leave that until it is decided exactly if and where to put it.
> > Would there be a better place than in package "o.a.c.rng.simple"?
> It was either in simple or examples. But I would like to see it as a
> feature in use so I would leave it in simple

Sure, it's a useful feature to help users avoid bad usage.

> >> RNG-76: Added primitive long constructor to SplitMix64
> >>
> >> This is something that will effect the benchmark so I'll merge it after.
> >> It is a trivial change with a noticeable performance gain by avoiding
> >> auto-boxing of the long seed.
> > IIRC, there is some developer doc that refer to having a single
> > constructor.  But I do not recall the reason.
>
> I did not know. I cannot find that explained in the project docs. I
> looked in site/xdoc/developers.xml and a poor regex search through the rest.

I think I meant this[1]:
The "provider" must specify one way for setting the seed. For a given
seed, the generated sequence must always be the same.

> Maybe the reason was to prevent ambiguous construction, e.g. with a
> non-parameterised constructor.
>
> An exception was made for TwoCmres since it needs more arguments. That
> is understandable.
>
> Could this rule be relaxed if the same data passed to different
> constructors results in the same generated sequence?

I guess so.
IIRC, the limitation was due to using commons code in order to
instantiate the appropriate RNG (core) class.

> E.g. the unit tests
> I added for the new XoShiRo generators test this.

Thanks for all this work,
Gilles

[1] http://commons.apache.org/proper/commons-rng/commons-rng-core/apidocs/org/apache/commons/rng/core/source32/package-summary.html

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [Rng] Merge outstanding PRs for construction speed baseline

Posted by Gilles Sadowski <gi...@gmail.com>.
Le ven. 8 mars 2019 à 18:12, Alex Herbert <al...@gmail.com> a écrit :
>
> Hi,
>
> I'd like to start on improving the methods in rng-simple to create the
> RandomSource.
>
> There are a few PRs outstanding but I think the discussion on this list
> was that the new XoShiRo generators are OK. So I would like to merge the
> following PRs:
>
> RNG-82: Add XorShift1024StarPhi generator

See other thread.

> RNG-70: Add new XoShiRo generators

+1

>
> Then use master to set a baseline for the construction speed. This can
> be used to compare against the latest code to see how much the changes
> have improved construction speed. Given the work will have a big effect
> on generators with small state it would be good to have the new
> generators in the baseline.

IIUC, "big effect" only when constructing and throwing quickly very
many objects.
The doc should make it clear that it's not the "nominal" use-case.

>
> I note that the only tags in git are for releases. So the baseline can
> be just a commit Id and I'll add it to the benchmark Jira RNG-72 for
> reference.
>
> There are also some other PRs to be discussed:
>
> RNG-81: NumberFactory to sample all rationals between 0 and 1.
>
> This one changes to the implementation in Open JDK for float and double
> generation. I think this is OK to merge.

+1
[No promise of reproducibility for "derived" types.]

> It won't effect the baseline
> but is good to get it into the next release.
>
>
> RNG-78: Added a ThreadLocalRandomSource.
>
> This works as a proof of concept. But it is all new files and so I'm
> happy to leave that until it is decided exactly if and where to put it.

Would there be a better place than in package "o.a.c.rng.simple"?

>
> RNG-76: Added primitive long constructor to SplitMix64
>
> This is something that will effect the benchmark so I'll merge it after.
> It is a trivial change with a noticeable performance gain by avoiding
> auto-boxing of the long seed.

IIRC, there is some developer doc that refer to having a single
constructor.  But I do not recall the reason.

Best,
Gilles

>
>
> Regards,
>
> Alex
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org