You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by "Alex D Herbert (JIRA)" <ji...@apache.org> on 2019/03/01 10:47:00 UTC

[jira] [Issue Comment Deleted] (RNG-72) Create a RandomSource.create benchmark

     [ https://issues.apache.org/jira/browse/RNG-72?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex D Herbert updated RNG-72:
------------------------------
    Comment: was deleted

(was: I have compiled a new listing using the output of {{ConstructionPerformance}} and {{GenerationPerformance}}.

For reference the construction benchmark creates 500 RNGs, using pre-built seeds where appropriate. Test the following construction methods:
 * Create using the native constructor with the *new* keyword
 * Create using a cached *{{Constructor<Object>}}* with *{{newInstance}}*
 * Create using a cached *{{Class<?>}}*, lookup the *{{Constructor<Object>}}* and use *{{newInstance}}*
 * Create using *{{RandomSource}}* with the *native* seed
 * Create using *{{RandomSource}}* with a *{{null}}* seed
 * Create using *{{RandomSource}}* with a *{{Long}}* seed
 * Create using *{{RandomSource}}* with a *truncated native* seed forcing self seeding
 * Create using *{{RandomSource}}* with a *{{byte[]}}* of the appropriate length

Generation time is shown for the given number of values. The time uses the native method of the generator (either {{nextInt}} or {{nextLong}}) so *is not comparable between generators of different types*. It is indicative of the speed it will take to create a generator and output values.

The table is constructed by adding the time for construction to the generation time. Only the total time is shown for the construction using *new*. All other times are relative to this. The table is sorted by speed of the *new* construction and grouped by the number of values generated.

 

*Note: This listing involves adding of times from two benchmarks run on the same computer, using the same JMH benchmark settings for average time in microseconds. An independent test using construction and generation together should be conducted to indicate this is a valid approach.*

 

 
||Values||Generation||RandomSource||new||newInstance||lookupNewInstance||createNativeSeed||createNullSeed||createLongSeed||createSelfSeed||createByteArray||
|0|0|SPLIT_MIX_64|4.55|1.89|5.63|8.62|13.48|9.11|0|12.11|
|0|0|KISS|10.66|1.37|2.82|4.07|67.38|10.79|4.53|6.33|
|0|0|WELL_512_A|12.68|1.36|2.59|4.14|64.37|14.00|4.88|10.26|
|0|0|WELL_1024_A|17.31|1.36|2.37|3.39|45.20|8.31|5.66|13.54|
|0|0|JDK|17.49|0.79|1.93|2.34|4.54|2.82|0|3.68|
|0|0|XOR_SHIFT_1024_S|19.81|1.28|2.10|2.60|75.25|12.26|3.55|7.24|
|0|0|MWC_256|234.62|0.97|1.28|1.18|4.50|2.09|2.37|6.70|
|0|0|WELL_19937_C|247.23|1.10|0.99|1.20|4.58|2.29|4.51|14.01|
|0|0|WELL_19937_A|267.32|0.91|1.17|1.11|4.89|2.13|4.65|12.97|
|0|0|WELL_44497_B|550.72|1.16|1.03|0.97|2.93|2.41|4.50|16.66|
|0|0|WELL_44497_A|594.5|0.98|0.89|1.06|3.06|1.87|5.32|12.29|
|0|0|ISAAC|1230.51|0.97|0.91|0.86|1.44|1.03|1.19|2.29|
|0|0|MT_64|3568.38|1.05|0.95|1.00|0.79|0.54|0.46|0.98|
|0|0|MT|5501.32|1.00|1.00|1.06|0.80|0.68|0.62|1.38|
|0|0|TWO_CMRES|66868.49|1.01|0.95|1.12|1.10|1.09|0|1.08|
|1|0.01|SPLIT_MIX_64|4.56|1.88|5.62|8.60|13.45|9.10|0|12.09|
|1|0.01|KISS|10.67|1.37|2.81|4.07|67.32|10.78|4.53|6.33|
|1|0.01|WELL_512_A|12.69|1.36|2.58|4.14|64.32|13.99|4.88|10.25|
|1|0.01|WELL_1024_A|17.32|1.36|2.37|3.39|45.17|8.31|5.66|13.54|
|1|0.02|JDK|17.51|0.79|1.93|2.34|4.53|2.82|0|3.68|
|1|0.01|XOR_SHIFT_1024_S|19.82|1.28|2.10|2.60|75.21|12.25|3.55|7.24|
|1|0.01|MWC_256|234.63|0.97|1.28|1.18|4.50|2.09|2.37|6.70|
|1|0.01|WELL_19937_C|247.24|1.10|0.99|1.20|4.58|2.29|4.51|14.01|
|1|0.01|WELL_19937_A|267.33|0.91|1.17|1.11|4.89|2.13|4.65|12.97|
|1|0.01|WELL_44497_B|550.73|1.16|1.03|0.97|2.93|2.41|4.50|16.66|
|1|0.01|WELL_44497_A|594.51|0.98|0.89|1.06|3.06|1.87|5.32|12.29|
|1|0.01|ISAAC|1230.52|0.97|0.91|0.86|1.44|1.03|1.19|2.29|
|1|0.01|MT_64|3568.39|1.05|0.95|1.00|0.79|0.54|0.46|0.98|
|1|0.01|MT|5501.33|1.00|1.00|1.06|0.80|0.68|0.62|1.38|
|1|0.01|TWO_CMRES|66868.5|1.01|0.95|1.12|1.10|1.09|0|1.08|
|100|0.44|SPLIT_MIX_64|4.99|1.81|5.22|7.95|12.38|8.40|0|11.13|
|100|0.61|KISS|11.27|1.35|2.72|3.91|63.79|10.26|4.34|6.04|
|100|0.79|WELL_512_A|13.47|1.34|2.49|3.96|60.65|13.24|4.65|9.72|
|100|0.86|WELL_1024_A|18.17|1.34|2.31|3.28|43.11|7.97|5.44|12.95|
|100|1.91|JDK|19.4|0.81|1.84|2.21|4.19|2.64|0|3.42|
|100|0.54|XOR_SHIFT_1024_S|20.35|1.27|2.07|2.56|73.28|11.96|3.48|7.08|
|100|0.45|MWC_256|235.07|0.97|1.28|1.18|4.49|2.09|2.37|6.69|
|100|1.06|WELL_19937_C|248.29|1.09|0.99|1.20|4.57|2.28|4.50|13.95|
|100|0.98|WELL_19937_A|268.3|0.91|1.17|1.11|4.87|2.12|4.64|12.93|
|100|1.09|WELL_44497_B|551.81|1.16|1.03|0.97|2.93|2.41|4.49|16.63|
|100|1.05|WELL_44497_A|595.55|0.98|0.89|1.06|3.06|1.86|5.31|12.27|
|100|0.62|ISAAC|1231.13|0.97|0.91|0.86|1.44|1.03|1.19|2.29|
|100|0.67|MT_64|3569.05|1.05|0.95|1.00|0.79|0.54|0.46|0.98|
|100|0.66|MT|5501.98|1.00|1.00|1.06|0.80|0.68|0.62|1.38|
|100|0.54|TWO_CMRES|66869.03|1.01|0.95|1.12|1.10|1.09|0|1.08|
|10000|42.74|SPLIT_MIX_64|47.29|1.09|1.45|1.73|2.20|1.78|0|2.07|
|10000|64.41|KISS|75.07|1.05|1.26|1.44|10.43|2.39|1.50|1.76|
|10000|74.64|WELL_512_A|87.32|1.05|1.23|1.46|10.20|2.89|1.56|2.34|
|10000|76.65|WELL_1024_A|93.96|1.07|1.25|1.44|9.14|2.35|1.86|3.31|
|10000|225.95|JDK|243.44|0.99|1.07|1.10|1.25|1.13|0|1.19|
|10000|55.29|XOR_SHIFT_1024_S|75.1|1.07|1.29|1.42|20.59|3.97|1.67|2.65|
|10000|43.09|MWC_256|277.71|0.98|1.24|1.16|3.96|1.92|2.16|5.82|
|10000|105.91|WELL_19937_C|353.14|1.07|0.99|1.14|3.51|1.90|3.46|10.11|
|10000|100.37|WELL_19937_A|367.69|0.94|1.12|1.08|3.83|1.82|3.66|9.70|
|10000|111.56|WELL_44497_B|662.28|1.14|1.02|0.97|2.60|2.18|3.91|14.02|
|10000|104.18|WELL_44497_A|698.68|0.98|0.91|1.05|2.76|1.74|4.68|10.61|
|10000|61.19|ISAAC|1291.7|0.97|0.92|0.86|1.42|1.03|1.18|2.23|
|10000|66.40|MT_64|3634.78|1.05|0.95|1.00|0.79|0.55|0.47|0.98|
|10000|64.42|MT|5565.74|1.00|1.00|1.06|0.80|0.68|0.62|1.38|
|10000|49.90|TWO_CMRES|66918.39|1.01|0.95|1.12|1.10|1.09|0|1.08|
|1000000|3984.56|SPLIT_MIX_64|3989.11|1.00|1.01|1.01|1.01|1.01|0|1.01|
|1000000|6032.31|KISS|6042.97|1.00|1.00|1.01|1.12|1.02|1.01|1.01|
|1000000|7356.91|WELL_512_A|7369.59|1.00|1.00|1.01|1.11|1.02|1.01|1.02|
|1000000|7444.91|WELL_1024_A|7462.22|1.00|1.00|1.01|1.10|1.02|1.01|1.03|
|1000000|19155.09|JDK|19172.58|1.00|1.00|1.00|1.00|1.00|0|1.00|
|1000000|4889.82|XOR_SHIFT_1024_S|4909.63|1.00|1.00|1.01|1.30|1.05|1.01|1.03|
|1000000|4133.11|MWC_256|4367.73|1.00|1.02|1.01|1.19|1.06|1.07|1.31|
|1000000|10572.00|WELL_19937_C|10819.23|1.00|1.00|1.00|1.08|1.03|1.08|1.30|
|1000000|10005.49|WELL_19937_A|10272.81|1.00|1.00|1.00|1.10|1.03|1.10|1.31|
|1000000|12121.27|WELL_44497_B|12671.99|1.01|1.00|1.00|1.08|1.06|1.15|1.68|
|1000000|11065.87|WELL_44497_A|11660.37|1.00|0.99|1.00|1.11|1.04|1.22|1.58|
|1000000|5976.62|ISAAC|7207.13|0.99|0.99|0.98|1.08|1.00|1.03|1.22|
|1000000|6190.41|MT_64|9758.79|1.02|0.98|1.00|0.92|0.83|0.80|0.99|
|1000000|6635.87|MT|12137.19|1.00|1.00|1.03|0.91|0.86|0.83|1.17|
|1000000|4580.57|TWO_CMRES|71449.06|1.01|0.96|1.11|1.10|1.08|0|1.07|

The effect of construction and generation can be more easily observed using a table of the relative contribution of the generation time to the total time. Here the total time is made using construction with the *new* keyword or via the default *{{RandomSource.create}}* method with a *native* seed or *null* seed. Note: Times for the *create* method are relative to *new* (as above) but the generating % is computed using the total time.
||Values||Generation||RandomSource||new||Generating (%)||createNativeSeed||Generating (%)||createNullSeed||Generating (%)||
|0|0|SPLIT_MIX_64|4.55|0.00|8.62|0.00|13.48|0.00|
|0|0|KISS|10.66|0.00|4.07|0.00|67.38|0.00|
|0|0|WELL_512_A|12.68|0.00|4.14|0.00|64.37|0.00|
|0|0|WELL_1024_A|17.31|0.00|3.39|0.00|45.2|0.00|
|0|0|JDK|17.49|0.00|2.34|0.00|4.54|0.00|
|0|0|XOR_SHIFT_1024_S|19.81|0.00|2.6|0.00|75.25|0.00|
|0|0|MWC_256|234.62|0.00|1.18|0.00|4.5|0.00|
|0|0|WELL_19937_C|247.23|0.00|1.2|0.00|4.58|0.00|
|0|0|WELL_19937_A|267.32|0.00|1.11|0.00|4.89|0.00|
|0|0|WELL_44497_B|550.72|0.00|0.97|0.00|2.93|0.00|
|0|0|WELL_44497_A|594.5|0.00|1.06|0.00|3.06|0.00|
|0|0|ISAAC|1230.51|0.00|0.86|0.00|1.44|0.00|
|0|0|MT_64|3568.38|0.00|1|0.00|0.79|0.00|
|0|0|MT|5501.32|0.00|1.06|0.00|0.8|0.00|
|0|0|TWO_CMRES|66868.49|0.00|1.12|0.00|1.1|0.00|
|1|0.01|SPLIT_MIX_64|4.56|0.22|8.6|0.03|13.45|0.02|
|1|0.01|KISS|10.67|0.09|4.07|0.02|67.32|0.00|
|1|0.01|WELL_512_A|12.69|0.08|4.14|0.02|64.32|0.00|
|1|0.01|WELL_1024_A|17.32|0.06|3.39|0.02|45.17|0.00|
|1|0.02|JDK|17.51|0.11|2.34|0.05|4.53|0.03|
|1|0.01|XOR_SHIFT_1024_S|19.82|0.05|2.6|0.02|75.21|0.00|
|1|0.01|MWC_256|234.63|0.00|1.18|0.00|4.5|0.00|
|1|0.01|WELL_19937_C|247.24|0.00|1.2|0.00|4.58|0.00|
|1|0.01|WELL_19937_A|267.33|0.00|1.11|0.00|4.89|0.00|
|1|0.01|WELL_44497_B|550.73|0.00|0.97|0.00|2.93|0.00|
|1|0.01|WELL_44497_A|594.51|0.00|1.06|0.00|3.06|0.00|
|1|0.01|ISAAC|1230.52|0.00|0.86|0.00|1.44|0.00|
|1|0.01|MT_64|3568.39|0.00|1|0.00|0.79|0.00|
|1|0.01|MT|5501.33|0.00|1.06|0.00|0.8|0.00|
|1|0.01|TWO_CMRES|66868.5|0.00|1.12|0.00|1.1|0.00|
|100|0.44|SPLIT_MIX_64|4.99|8.82|7.95|1.11|12.38|0.71|
|100|0.61|KISS|11.27|5.41|3.91|1.38|63.79|0.08|
|100|0.79|WELL_512_A|13.47|5.86|3.96|1.48|60.65|0.10|
|100|0.86|WELL_1024_A|18.17|4.73|3.28|1.44|43.11|0.11|
|100|1.91|JDK|19.4|9.85|2.21|4.45|4.19|2.35|
|100|0.54|XOR_SHIFT_1024_S|20.35|2.65|2.56|1.04|73.28|0.04|
|100|0.45|MWC_256|235.07|0.19|1.18|0.16|4.49|0.04|
|100|1.06|WELL_19937_C|248.29|0.43|1.2|0.36|4.57|0.09|
|100|0.98|WELL_19937_A|268.3|0.37|1.11|0.33|4.87|0.08|
|100|1.09|WELL_44497_B|551.81|0.20|0.97|0.20|2.93|0.07|
|100|1.05|WELL_44497_A|595.55|0.18|1.06|0.17|3.06|0.06|
|100|0.62|ISAAC|1231.13|0.05|0.86|0.06|1.44|0.03|
|100|0.67|MT_64|3569.05|0.02|1|0.02|0.79|0.02|
|100|0.66|MT|5501.98|0.01|1.06|0.01|0.8|0.01|
|100|0.54|TWO_CMRES|66869.03|0.00|1.12|0.00|1.1|0.00|
|10000|42.74|SPLIT_MIX_64|47.29|90.38|1.73|52.24|2.2|41.08|
|10000|64.41|KISS|75.07|85.80|1.44|59.58|10.43|8.23|
|10000|74.64|WELL_512_A|87.32|85.48|1.46|58.55|10.2|8.38|
|10000|76.65|WELL_1024_A|93.96|81.58|1.44|56.65|9.14|8.93|
|10000|225.95|JDK|243.44|92.82|1.1|84.38|1.25|74.25|
|10000|55.29|XOR_SHIFT_1024_S|75.1|73.62|1.42|51.85|20.59|3.58|
|10000|43.09|MWC_256|277.71|15.52|1.16|13.38|3.96|3.92|
|10000|105.91|WELL_19937_C|353.14|29.99|1.14|26.31|3.51|8.54|
|10000|100.37|WELL_19937_A|367.69|27.30|1.08|25.28|3.83|7.13|
|10000|111.56|WELL_44497_B|662.28|16.84|0.97|17.37|2.6|6.48|
|10000|104.18|WELL_44497_A|698.68|14.91|1.05|14.20|2.76|5.40|
|10000|61.19|ISAAC|1291.7|4.74|0.86|5.51|1.42|3.34|
|10000|66.4|MT_64|3634.78|1.83|1|1.83|0.79|2.31|
|10000|64.42|MT|5565.74|1.16|1.06|1.09|0.8|1.45|
|10000|49.9|TWO_CMRES|66918.39|0.07|1.12|0.07|1.1|0.07|
|1000000|3984.56|SPLIT_MIX_64|3989.11|99.89|1.01|98.90|1.01|98.90|
|1000000|6032.31|KISS|6042.97|99.82|1.01|98.84|1.12|89.13|
|1000000|7356.91|WELL_512_A|7369.59|99.83|1.01|98.84|1.11|89.94|
|1000000|7444.91|WELL_1024_A|7462.22|99.77|1.01|98.78|1.1|90.70|
|1000000|19155.09|JDK|19172.58|99.91|1|99.91|1|99.91|
|1000000|4889.82|XOR_SHIFT_1024_S|4909.63|99.60|1.01|98.61|1.3|76.61|
|1000000|4133.11|MWC_256|4367.73|94.63|1.01|93.69|1.19|79.52|
|1000000|10572|WELL_19937_C|10819.23|97.71|1|97.71|1.08|90.48|
|1000000|10005.49|WELL_19937_A|10272.81|97.40|1|97.40|1.1|88.54|
|1000000|12121.27|WELL_44497_B|12671.99|95.65|1|95.65|1.08|88.57|
|1000000|11065.87|WELL_44497_A|11660.37|94.90|1|94.90|1.11|85.50|
|1000000|5976.62|ISAAC|7207.13|82.93|0.98|84.62|1.08|76.78|
|1000000|6190.41|MT_64|9758.79|63.43|1|63.43|0.92|68.95|
|1000000|6635.87|MT|12137.19|54.67|1.03|53.08|0.91|60.08|
|1000000|4580.57|TWO_CMRES|71449.06|6.41|1.11|5.78|1.1|5.83|

 

Creating 0, 1, or 100 values.

The construction time is the dominant component of the total time. In this use case it will be valid to optimise construction speed.

The construction without a seed highlights the slow speed of the seed generation by the factory method. To generate 100 values using a SplitMix64 without a seed takes over 99% of the time constructing the generator.

 

Creating 10000 values.

For the fast to construct generators the generation time is the major component of the run-time. All of these generators have either a primitive seed or a seed array with a size less than 32. The difference of using the {{RandomSource.create}} is noticeable and optimisation of this will be beneficial.

Compared to the above generating 10000 values using a SplitMix64 without a seed takes 59% of the time constructing the generator, with a seed takes 48% of the time but a call to *new* would take 10% of the time constructing the generator. Using reflection to create the object is slow.

The slow to construct generators (MT, MT_64, TWO_CMRES) have internal seeding routines that dominate total time.

 

Creating 1000000 values.

Only the slow to construct generators (MT, MT_64, TWO_CMRES) with internal seeding routines part of the algorithm are still dominated by construction cost.

 

Improvements ideas:
* Generators with an array seed construct a new array (zero filled) and call the method to fill the array using the input seed. In the case where the input seed is the correct size then seed filling can be skipped and the seed copied using a call to {{Arrays.copyOf}} which should be faster as it is JVM intrinsic and avoids zero fill.
* The MWC_256 is interesting as it has a seed size of 257. However the construction creates a temp array which is filled and then later copied to the state which has a size of 256. If the input seed is the correct size then this allocation for seed filling is redundant. This provides an optimisation.

)

> Create a RandomSource.create benchmark
> --------------------------------------
>
>                 Key: RNG-72
>                 URL: https://issues.apache.org/jira/browse/RNG-72
>             Project: Commons RNG
>          Issue Type: New Feature
>          Components: simple
>    Affects Versions: 1.3
>            Reporter: Alex D Herbert
>            Assignee: Alex D Herbert
>            Priority: Minor
>              Labels: performance-benchmark
>             Fix For: 1.3
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The recommended method to construct a {{UniformRandomProvider}} is to use, e.g.:
> {code:java}
> import org.apache.commons.rng.UniformRandomProvider;
> import org.apache.commons.rng.simple.RandomSource;
> UniformRandomProvider rng = RandomSource.create(RandomSource.MWC_256);
> {code}
> The factory method knows the type of seed required for the constructor and generates one as appropriate.
> This factory method could be made more efficient, in particular:
>  * Reducing synchronisation around the single source of random seed data
>  * Adding knowledge of the required seed size for arrays
>  * Changing internal data structures, e.g. {{Map<Class<?>, SeedConverter<?,?>>}} can be changed to {{Map<SeedType, SeedConverter<?,?>>}} using an {{EnumMap}} if a new enum {{SeedType}} was created for all the supported seeds (currently 4 types).
>  * Add a new interface to replace {{SeedConverter<?,?>.convert()}} with a {{.convert(int outputArraySize)}} method to allow conversions to generate appropriately sized arrays. The parameter can be ignored for non-array conversions but could optimise array conversions.
> This ticket is to add a JMH benchmark to compare the speed of construction of all the providers using:
>  * Their native constructor
>  * {{RandomSource}} using the native seed of the correct size (calls a constructor using reflection)
>  * {{RandomSource}} using a non native seed (requires seed conversion)
>  * {{RandomSource}} using no seed (requires seed generation)
> The report will be posted here. It could be added to the user guide for reference.
> This work is motivated by the new {{XorShiRo}} generators in version 1.3 that have a native array seed size of 2, 4, or 8. The current {{RandomSource}} create method will generate a fixed seed of length 128 for seeding.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)