You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@commons.apache.org by Gilles <gi...@harfang.homelinux.org> on 2016/09/15 14:37:10 UTC

[RNG] About RNG-6 (Was: [VOTE][RC1] Release Commons Rng 1.0)

Hi.

Moving to another thread, as your comments do not belong in
a release vote.

Many of them could lead to interesting discussions that should
(and could) have happened earlier.

Indeed, for the record, see
   https://issues.apache.org/jira/browse/RNG-6
and the numerous posts which I sent to this list
  * while ironing out the code of this new component in its own
    repository, and doing the various chores (site, userguide,
    quality testing, Github, Travis, Coveralls, etc) for the last
    6 weeks,
  * while developing that code in a new Commons Math package
    "o.a.c.math4.rng" for the last 6 months,
  * while fixing the code in Commons Math "o.a.c.m.random"
    package for the last 9 months.

For a number of issues which you now raise, I've explicitly
asked for input.

I find it a bit strange that, at this precise point, you start
to question the choices which I've made in the absence of any
prior remarks, suggestions, or warnings.

On Wed, 14 Sep 2016 17:04:53 +0200, Emmanuel Bourg wrote:
> Hi all,
>
> RNG is moving fast, that's nice. I got a quick look at the API and 
> the
> site, here are my comments:
>
> - The main page (Overview) could be enhanced with a description of 
> the
> API longer than a single sentence. It could mention the motivations 
> for
> the API, the implementations supported, a sample code snippet and a 
> link
> to the user guide.

I don't understand "motivations for the API".

Do you mean something like my comment in
   https://issues.apache.org/jira/browse/RNG-6
?

[My motivation for proposing this code can be gathered from
the archives and the bugs and shortcomings reported in the
JIRA MATH project, and related posts here). They do not
belong in that "overview" page.]

> - In the Performance section of the User Guide, I'd suggest grouping 
> the
> 3 tables into a single one. It might also be interesting to put the
> quality results in the same table, so we can see the trade-off 
> between
> performance and quality.

I won't do that.
It was already a pain to create those tables as they are (see the
doc source).

> - The License section of the user guide could be removed, I guess
> everyone knows that Apache projects use the Apache license.

No problem.
I thought that it was mandatory.

All the pages were copied from the corresponding Commons Math ones.

> - The internal packages should be hidden from the Javadoc,

I do not know how do that.

And I don't see the point, since from the "binary" POV, they
are "public".

> otherwise
> users may be tempted to directly use the classes there.

You must have seen that there are prominent warnings against
doing that.

Users will do what they want anyways.
What is not supported is not supported.  We agreed here that it
was enough to tag a class as "internal" in order to fulfill our
obligation of least surprise.

> This would imply
> copying the documentation of the generators into the RandomSource 
> class.

I do not agree.

Users can select a generator without knowing the gory details.
If they want to know more, they go to the internals.
It they really want to know, they go to the references.

Internal should not be hidden: contributors who'd fancy to provide
additional generators do not have to go through any hoops.

Or do you suggest that this component should supply two JARs:
   * commons-rng-api
      Contents of "o.a.c.rng"
   * commons-rng-core
      Contents of "o.a.c.rng.internal"
?

And two sets of Javadoc?

> - Why isn't the UniformRandomProvider interface simply named
> RandomProvider?

Maybe because we want to be transparent on what the code does (?).

With this name, there is no ambiguity: no more, no less than what
the name implies.

> Is there any plan to implement non uniform random
> generators in the future? Will they have a different interface?

This point has been discussed on this list (although, again, there
was a severe lack of feedback...).

Executive summary:
   No strong coupling between generators (of uniform sequence of
   pseudo-random numbers) and utilities on top of those generators
   (e.g. generators producing non-uniform sequences).
   Utilities should go in another component/package/module (see
   e.g. Commons Math's packages "o.a.c.math4.distribution" and
   "o.a.c.math4.random" for candidate code).

> - UniformRandomProvider mimics the java.util.Random methods which is
> nice. Is there any value in also implementing the nextGaussian() 
> method
> such that our API is a perfect drop-in replacement for the JDK class?

No. [IMHO.]
This was discussed on this list (and cf. above point).

> - For code relying on the java.util.Random type, it might be useful 
> to
> provide an adapter turning an UniformRandomProvider into a
> java.util.Random implementation (Hipparchus has one).

This was discussed on this list.

And see Commons Math's classes "o.a.c.math4.random.RandomUtils" and
"o.a.c.math4.random.RngAdaptor" classes.

In the latter (being deprecated as of its creation), I indicated why
it is a bad idea.

Again, if we really want to support the use-case you proposed, it's
a utility that should go in another component.

> - The choice of an enum for the factory class RandomSource is a bit
> unusual. It is probably ok for simple RNG with a simple seed, but 
> with
> more complex implementations I find it less intuitive to use.

Less intuitive than what?

> For example the create(RandomSource, Object Object...) method has 
> this
> snippet in its documentation:
>
>   RandomSource.create(RandomSource.TWO_CMRES_SELECT, 26219, 6, 9)
>
> When a user types this in its IDE, there is no type information nor
> contextual documentation for the parameters expected. The user has to
> know the type of the seed and the meaning of the parameters expected,
> the IDE won't be able to help here (with CTRL+P and CTRL+Q in 
> IntelliJ).

Honestly, I don't care.
To put it less bluntly, the choice of RNG to instantiate is not a
matter of IDE.

Practically, the lack of type information (which I don't like either)
was a trade-off to have a dead-simple API.

As a user, one doesn't need an intelligent IDE to know all there is
to know about Commons RNG, and it will take less than 10 minutes!

> This design induces some complexity, the loss of the seed type leads 
> to
> the seed conversion mechanism (most of the util package if I 
> understand
> well).
>
> I'd feel more comfortable with a more classic factory design, 
> something
> like:
>
>   RandomSource.createTwoCmres()
>   RandomSource.createTwoCmres(Integer seed)
>   RandomSource.createTwoCmres(Integer seed, int i, int j)
>
> or a shorter form like:
>
>   RNG.newTwoCmres()
>   RNG.newTwoCmres(int seed)
>   RNG.newTwoCmres(Integer seed, int i, int j)
>
> That will add more methods to the factory, but it'll be much more 
> usable
> in my opinion.

Not in mine.

This design came after 9 months of sustained work.
Nothing to be proud of in terms of LOC/day...
But what you propose to drop is the only non trivial part of
this work.

As I said above, this is a trade-off, and it looks like a fault
only because Java is strongly typed.

The automatic seed conversion is a huge advantage for casual
users; they can still seed them with a single "long" (and under
the hood get a good "native" seed).
But the doc is there to warn them that RNGs usually have a larger
state.

> - I'm not convinced by the need for the convenience static methods
> RandomSource.createInt/Long. If the user doesn't want to specify the
> seed, he could simply use the seed-less create method. And if the 
> random
> generator requires extra parameters, he could use a null value for 
> the
> seed as a way to mean "generate one for me please":
>
>   RandomSource.create(RandomSource.TWO_CMRES_SELECT, null, 6, 9)

These methods take care of a common use-case, where the actual
seed does not matter (hence the caller might like to not have to
code similar methods), but the _same_ seed must be used in another
execution of the same application to ensure reproducibility (e.g.
for bug tracking).

See also:
   
https://docs.oracle.com/javase/7/docs/api/java/security/SecureRandom.html#generateSeed%28int%29

> - RandomSource.saveState/restoreState allows one to convert a random
> generator to/from a byte array. This looks a lot like Java
> serialization. Maybe the implementations supporting this feature 
> could
> be marked as Serializable, thus it would be possible to call
> RandomSource.saveState only if the instance is Serializable instead 
> of
> catching the UnsupportedOperationException thrown when the feature 
> isn't
> supported.

No, no, no.
Avoiding the "Serializable" clutter was a _requirement_.

Serialization is not part of this component, but is supported by all
implementations.

It's a feature that users can _choose_ the persistence framework
they want and not be forced to consider the pitfalls of "Serializable".
[See "Effective Java".]

It's also a feature that this component will never have to deal
with CVE related to exploitation of something "Serializable" that
should not have been...

> - Why is the byte[] state encapsulated into the RandomSource.State
> class? Why not using a byte array directly?

Because encapsulation is the hallmark of good OO programming?

If some application only needs to save/restore without actual
persistence, it can do without dealing with the "ugliness" of
"byte[]".
And it will continue to work even if the internal representation
is changed.

Exposing the internals was unavoidable for the serialization
use-cases.

>
> So far I'm +1 for a beta release aimed at getting more feedback on 
> the
> API,

There were a lot of discussions here about what "beta release"
means and how to implement it in practice (package naming,
versioning scheme).

They never led anywhere.

Snapshot JARs have been available since the Jenkins task has been
up and running.

I've asked for feedback on this list, on the site since it's been
up, and on "helpwanted.apache.org".

I seriously doubt that having code tagged "beta" would be more
successful at getting users to provide feedback.

It will more likely turn them away, thinking that the generators
might be buggy.  Which is not the case.

As much as I advocated it in order to have more timely releases
of Commons Math, I think that in this case, a "beta" release
(if such a thing existed in Commons) is nothing short of
counter-productive.

The API is what it is.  As for any other component; and, quite
frankly, in proportion of the lines of code, much more thought
was put into it than for anything in Commons Math, and probably
many other Commons components).

I do not deny that it might be wrong (or I wouldn't have filed
"RNG-6") but at this point it is usage that should demonstrate
it.  Not doubt based on... I don't know what.

> but I'm -1 for releasing the current code as 1.0.

Any technical reasons?


Gilles
[Truly sorry that this discussion did not happen at the proper
time. And hoping that constructive feedback like this one will
come when I start talking about "commons-rng-tools".]


>
> Emmanuel Bourg
>
>
> Le 11/09/2016 � 16:55, Gilles a �crit :
>> Hi.
>>
>> This is a [VOTE] for releasing Apache Commons Rng 1.0 (from RC1).
>>
>> Tag name:
>>   RNG_1_0_RC1 (signature can be checked from git using 'git tag -v')
>>
>> Tag URL:
>>
>> 
>> https://git-wip-us.apache.org/repos/asf?p=commons-rng.git;a=commit;h=f8d23082607b9f2c7be7f489eb09627722440ee5
>>
>>
>> Commit ID the tag points at:
>>   f8d23082607b9f2c7be7f489eb09627722440ee5
>>
>> Site:
>>   http://home.apache.org/~erans/commons-rng-1.0-RC1-site
>>
>> Distribution files:
>>   https://dist.apache.org/repos/dist/dev/commons/rng/
>>
>> Distribution files hashes (SHA1):
>>   a221e862c8ff970a9ca3e7fbd86c3200d1f8780a 
>> commons-rng-1.0-bin.tar.gz
>>   689b2bfbdb1856d4f47851d75762aab42057805a commons-rng-1.0-bin.zip
>>   40b7b1639eedf91b5fad5d38e6ebec01e659048f 
>> commons-rng-1.0-src.tar.gz
>>   6296dbabde10169d6365bda99f2af6dcc191e515 commons-rng-1.0-src.zip
>>
>> KEYS file to check signatures:
>>   http://www.apache.org/dist/commons/KEYS
>>
>> Maven artifacts:
>>
>> 
>> https://repository.apache.org/content/repositories/orgapachecommons-1199/org/apache/commons/commons-rng/1.0/
>>
>>
>> [ ] +1 Release it.
>> [ ] +0 Go ahead; I don't care.
>> [ ] -0 There are a few minor glitches: ...
>> [ ] -1 No, do not release it because ...
>>
>> This vote will close in 72 hours, at 2016-09-14T15:10:00Z (this is 
>> UTC
>> time).
>> ----------
>>
>> Thanks,
>> Gilles
>>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [RNG] About RNG-6 (Was: [VOTE][RC1] Release Commons Rng 1.0)

Posted by Gary Gregory <ga...@gmail.com>.
After scanning this, I am wondering if a smoother path to 1.0 would be
really be through a beta or alpha release. The code is all there, any one
can look at it.

The advantage of a beta release is that it gets pushed out as a ANNOUNCE
email and might get more notice. YMMV.

Gary

On Thu, Sep 15, 2016 at 7:37 AM, Gilles <gi...@harfang.homelinux.org>
wrote:

> Hi.
>
> Moving to another thread, as your comments do not belong in
> a release vote.
>
> Many of them could lead to interesting discussions that should
> (and could) have happened earlier.
>
> Indeed, for the record, see
>   https://issues.apache.org/jira/browse/RNG-6
> and the numerous posts which I sent to this list
>  * while ironing out the code of this new component in its own
>    repository, and doing the various chores (site, userguide,
>    quality testing, Github, Travis, Coveralls, etc) for the last
>    6 weeks,
>  * while developing that code in a new Commons Math package
>    "o.a.c.math4.rng" for the last 6 months,
>  * while fixing the code in Commons Math "o.a.c.m.random"
>    package for the last 9 months.
>
> For a number of issues which you now raise, I've explicitly
> asked for input.
>
> I find it a bit strange that, at this precise point, you start
> to question the choices which I've made in the absence of any
> prior remarks, suggestions, or warnings.
>
> On Wed, 14 Sep 2016 17:04:53 +0200, Emmanuel Bourg wrote:
>
>> Hi all,
>>
>> RNG is moving fast, that's nice. I got a quick look at the API and the
>> site, here are my comments:
>>
>> - The main page (Overview) could be enhanced with a description of the
>> API longer than a single sentence. It could mention the motivations for
>> the API, the implementations supported, a sample code snippet and a link
>> to the user guide.
>>
>
> I don't understand "motivations for the API".
>
> Do you mean something like my comment in
>   https://issues.apache.org/jira/browse/RNG-6
> ?
>
> [My motivation for proposing this code can be gathered from
> the archives and the bugs and shortcomings reported in the
> JIRA MATH project, and related posts here). They do not
> belong in that "overview" page.]
>
> - In the Performance section of the User Guide, I'd suggest grouping the
>> 3 tables into a single one. It might also be interesting to put the
>> quality results in the same table, so we can see the trade-off between
>> performance and quality.
>>
>
> I won't do that.
> It was already a pain to create those tables as they are (see the
> doc source).
>
> - The License section of the user guide could be removed, I guess
>> everyone knows that Apache projects use the Apache license.
>>
>
> No problem.
> I thought that it was mandatory.
>
> All the pages were copied from the corresponding Commons Math ones.
>
> - The internal packages should be hidden from the Javadoc,
>>
>
> I do not know how do that.
>
> And I don't see the point, since from the "binary" POV, they
> are "public".
>
> otherwise
>> users may be tempted to directly use the classes there.
>>
>
> You must have seen that there are prominent warnings against
> doing that.
>
> Users will do what they want anyways.
> What is not supported is not supported.  We agreed here that it
> was enough to tag a class as "internal" in order to fulfill our
> obligation of least surprise.
>
> This would imply
>> copying the documentation of the generators into the RandomSource class.
>>
>
> I do not agree.
>
> Users can select a generator without knowing the gory details.
> If they want to know more, they go to the internals.
> It they really want to know, they go to the references.
>
> Internal should not be hidden: contributors who'd fancy to provide
> additional generators do not have to go through any hoops.
>
> Or do you suggest that this component should supply two JARs:
>   * commons-rng-api
>      Contents of "o.a.c.rng"
>   * commons-rng-core
>      Contents of "o.a.c.rng.internal"
> ?
>
> And two sets of Javadoc?
>
> - Why isn't the UniformRandomProvider interface simply named
>> RandomProvider?
>>
>
> Maybe because we want to be transparent on what the code does (?).
>
> With this name, there is no ambiguity: no more, no less than what
> the name implies.
>
> Is there any plan to implement non uniform random
>> generators in the future? Will they have a different interface?
>>
>
> This point has been discussed on this list (although, again, there
> was a severe lack of feedback...).
>
> Executive summary:
>   No strong coupling between generators (of uniform sequence of
>   pseudo-random numbers) and utilities on top of those generators
>   (e.g. generators producing non-uniform sequences).
>   Utilities should go in another component/package/module (see
>   e.g. Commons Math's packages "o.a.c.math4.distribution" and
>   "o.a.c.math4.random" for candidate code).
>
> - UniformRandomProvider mimics the java.util.Random methods which is
>> nice. Is there any value in also implementing the nextGaussian() method
>> such that our API is a perfect drop-in replacement for the JDK class?
>>
>
> No. [IMHO.]
> This was discussed on this list (and cf. above point).
>
> - For code relying on the java.util.Random type, it might be useful to
>> provide an adapter turning an UniformRandomProvider into a
>> java.util.Random implementation (Hipparchus has one).
>>
>
> This was discussed on this list.
>
> And see Commons Math's classes "o.a.c.math4.random.RandomUtils" and
> "o.a.c.math4.random.RngAdaptor" classes.
>
> In the latter (being deprecated as of its creation), I indicated why
> it is a bad idea.
>
> Again, if we really want to support the use-case you proposed, it's
> a utility that should go in another component.
>
> - The choice of an enum for the factory class RandomSource is a bit
>> unusual. It is probably ok for simple RNG with a simple seed, but with
>> more complex implementations I find it less intuitive to use.
>>
>
> Less intuitive than what?
>
> For example the create(RandomSource, Object Object...) method has this
>> snippet in its documentation:
>>
>>   RandomSource.create(RandomSource.TWO_CMRES_SELECT, 26219, 6, 9)
>>
>> When a user types this in its IDE, there is no type information nor
>> contextual documentation for the parameters expected. The user has to
>> know the type of the seed and the meaning of the parameters expected,
>> the IDE won't be able to help here (with CTRL+P and CTRL+Q in IntelliJ).
>>
>
> Honestly, I don't care.
> To put it less bluntly, the choice of RNG to instantiate is not a
> matter of IDE.
>
> Practically, the lack of type information (which I don't like either)
> was a trade-off to have a dead-simple API.
>
> As a user, one doesn't need an intelligent IDE to know all there is
> to know about Commons RNG, and it will take less than 10 minutes!
>
> This design induces some complexity, the loss of the seed type leads to
>> the seed conversion mechanism (most of the util package if I understand
>> well).
>>
>> I'd feel more comfortable with a more classic factory design, something
>> like:
>>
>>   RandomSource.createTwoCmres()
>>   RandomSource.createTwoCmres(Integer seed)
>>   RandomSource.createTwoCmres(Integer seed, int i, int j)
>>
>> or a shorter form like:
>>
>>   RNG.newTwoCmres()
>>   RNG.newTwoCmres(int seed)
>>   RNG.newTwoCmres(Integer seed, int i, int j)
>>
>> That will add more methods to the factory, but it'll be much more usable
>> in my opinion.
>>
>
> Not in mine.
>
> This design came after 9 months of sustained work.
> Nothing to be proud of in terms of LOC/day...
> But what you propose to drop is the only non trivial part of
> this work.
>
> As I said above, this is a trade-off, and it looks like a fault
> only because Java is strongly typed.
>
> The automatic seed conversion is a huge advantage for casual
> users; they can still seed them with a single "long" (and under
> the hood get a good "native" seed).
> But the doc is there to warn them that RNGs usually have a larger
> state.
>
> - I'm not convinced by the need for the convenience static methods
>> RandomSource.createInt/Long. If the user doesn't want to specify the
>> seed, he could simply use the seed-less create method. And if the random
>> generator requires extra parameters, he could use a null value for the
>> seed as a way to mean "generate one for me please":
>>
>>   RandomSource.create(RandomSource.TWO_CMRES_SELECT, null, 6, 9)
>>
>
> These methods take care of a common use-case, where the actual
> seed does not matter (hence the caller might like to not have to
> code similar methods), but the _same_ seed must be used in another
> execution of the same application to ensure reproducibility (e.g.
> for bug tracking).
>
> See also:
>   https://docs.oracle.com/javase/7/docs/api/java/security/
> SecureRandom.html#generateSeed%28int%29
>
> - RandomSource.saveState/restoreState allows one to convert a random
>> generator to/from a byte array. This looks a lot like Java
>> serialization. Maybe the implementations supporting this feature could
>> be marked as Serializable, thus it would be possible to call
>> RandomSource.saveState only if the instance is Serializable instead of
>> catching the UnsupportedOperationException thrown when the feature isn't
>> supported.
>>
>
> No, no, no.
> Avoiding the "Serializable" clutter was a _requirement_.
>
> Serialization is not part of this component, but is supported by all
> implementations.
>
> It's a feature that users can _choose_ the persistence framework
> they want and not be forced to consider the pitfalls of "Serializable".
> [See "Effective Java".]
>
> It's also a feature that this component will never have to deal
> with CVE related to exploitation of something "Serializable" that
> should not have been...
>
> - Why is the byte[] state encapsulated into the RandomSource.State
>> class? Why not using a byte array directly?
>>
>
> Because encapsulation is the hallmark of good OO programming?
>
> If some application only needs to save/restore without actual
> persistence, it can do without dealing with the "ugliness" of
> "byte[]".
> And it will continue to work even if the internal representation
> is changed.
>
> Exposing the internals was unavoidable for the serialization
> use-cases.
>
>
>> So far I'm +1 for a beta release aimed at getting more feedback on the
>> API,
>>
>
> There were a lot of discussions here about what "beta release"
> means and how to implement it in practice (package naming,
> versioning scheme).
>
> They never led anywhere.
>
> Snapshot JARs have been available since the Jenkins task has been
> up and running.
>
> I've asked for feedback on this list, on the site since it's been
> up, and on "helpwanted.apache.org".
>
> I seriously doubt that having code tagged "beta" would be more
> successful at getting users to provide feedback.
>
> It will more likely turn them away, thinking that the generators
> might be buggy.  Which is not the case.
>
> As much as I advocated it in order to have more timely releases
> of Commons Math, I think that in this case, a "beta" release
> (if such a thing existed in Commons) is nothing short of
> counter-productive.
>
> The API is what it is.  As for any other component; and, quite
> frankly, in proportion of the lines of code, much more thought
> was put into it than for anything in Commons Math, and probably
> many other Commons components).
>
> I do not deny that it might be wrong (or I wouldn't have filed
> "RNG-6") but at this point it is usage that should demonstrate
> it.  Not doubt based on... I don't know what.
>
> but I'm -1 for releasing the current code as 1.0.
>>
>
> Any technical reasons?
>
>
> Gilles
> [Truly sorry that this discussion did not happen at the proper
> time. And hoping that constructive feedback like this one will
> come when I start talking about "commons-rng-tools".]
>
>
>
>> Emmanuel Bourg
>>
>>
>> Le 11/09/2016 à 16:55, Gilles a écrit :
>>
>>> Hi.
>>>
>>> This is a [VOTE] for releasing Apache Commons Rng 1.0 (from RC1).
>>>
>>> Tag name:
>>>   RNG_1_0_RC1 (signature can be checked from git using 'git tag -v')
>>>
>>> Tag URL:
>>>
>>>
>>> https://git-wip-us.apache.org/repos/asf?p=commons-rng.git;a=
>>> commit;h=f8d23082607b9f2c7be7f489eb09627722440ee5
>>>
>>>
>>> Commit ID the tag points at:
>>>   f8d23082607b9f2c7be7f489eb09627722440ee5
>>>
>>> Site:
>>>   http://home.apache.org/~erans/commons-rng-1.0-RC1-site
>>>
>>> Distribution files:
>>>   https://dist.apache.org/repos/dist/dev/commons/rng/
>>>
>>> Distribution files hashes (SHA1):
>>>   a221e862c8ff970a9ca3e7fbd86c3200d1f8780a commons-rng-1.0-bin.tar.gz
>>>   689b2bfbdb1856d4f47851d75762aab42057805a commons-rng-1.0-bin.zip
>>>   40b7b1639eedf91b5fad5d38e6ebec01e659048f commons-rng-1.0-src.tar.gz
>>>   6296dbabde10169d6365bda99f2af6dcc191e515 commons-rng-1.0-src.zip
>>>
>>> KEYS file to check signatures:
>>>   http://www.apache.org/dist/commons/KEYS
>>>
>>> Maven artifacts:
>>>
>>>
>>> https://repository.apache.org/content/repositories/orgapache
>>> commons-1199/org/apache/commons/commons-rng/1.0/
>>>
>>>
>>> [ ] +1 Release it.
>>> [ ] +0 Go ahead; I don't care.
>>> [ ] -0 There are a few minor glitches: ...
>>> [ ] -1 No, do not release it because ...
>>>
>>> This vote will close in 72 hours, at 2016-09-14T15:10:00Z (this is UTC
>>> time).
>>> ----------
>>>
>>> Thanks,
>>> Gilles
>>>
>>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>


-- 
E-Mail: garydgregory@gmail.com | ggregory@apache.org
Java Persistence with Hibernate, Second Edition
<http://www.manning.com/bauer3/>
JUnit in Action, Second Edition <http://www.manning.com/tahchiev/>
Spring Batch in Action <http://www.manning.com/templier/>
Blog: http://garygregory.wordpress.com
Home: http://garygregory.com/
Tweet! http://twitter.com/GaryGregory

Re: [RNG] About RNG-6

Posted by Gilles <gi...@harfang.homelinux.org>.
Hi.

On Sat, 17 Sep 2016 01:38:16 +0200, Emmanuel Bourg wrote:
> Le 15/09/2016 � 16:37, Gilles a �crit :
>
>> I don't understand "motivations for the API".
>
> I just mean a short introduction to the component, something 
> explaining
> why it exists, its usefulness (performance, randomness quality), the
> implementations supported, etc.

I took care of this by expanding the "Overview" page.

I/you/anyone can add links to interesting references about the
subject.

>> I won't do that.
>> It was already a pain to create those tables as they are (see the
>> doc source).
>
> I understand you don't want to do it, editing apt files isn't fun. 
> But
> are you opposed to the idea?

Personally, I prefer separate tables.

Something useful would be to have a script for creating them.

[I have one for the benchmark table (takes the output of the
JMH benchmark include in "src/test").  Parsing the output of
"TestU01" is slightly more challenging...]

>>> - The internal packages should be hidden from the Javadoc,
>>
>> I do not know how do that.
>
> Packages can be excluded from the Javadoc by configuring the
> maven-javadoc-plugin with the <excludePackageNames> element:
>
> 
> https://maven.apache.org/plugins/maven-javadoc-plugin/examples/exclude-package-names.html
>
>
>> And I don't see the point, since from the "binary" POV, they
>> are "public".
>
> That's purely from the documentation point of view here. If we don't
> want users to use these packages, let's not mention them in the 
> Javadoc.
> That's an extra security with the "internal" package name.

(false) Security through obscurity...

Users who don't read the docs deserve what they get. ;-)
More probably such users will only copy/paste from the examples.
And be safe.

>> Users can select a generator without knowing the gory details.
>> If they want to know more, they go to the internals.
>> It they really want to know, they go to the references.
>>
>> Internal should not be hidden: contributors who'd fancy to provide
>> additional generators do not have to go through any hoops.
>>
>> Or do you suggest that this component should supply two JARs:
>>   * commons-rng-api
>>      Contents of "o.a.c.rng"
>>   * commons-rng-core
>>      Contents of "o.a.c.rng.internal"
>> ?
>>
>> And two sets of Javadoc?
>
> No thanks, too complicated. Contributors will just checkout the code 
> and
> quickly figure out how it's organized and how it's meant to be 
> extended,
> I don't worry about that. But I worry about mere users getting 
> confused
> about the API.

I worry about the reverse.

The lack of regular contributors is rather more actual than the
occasional "lost newbie".

>>> - Why isn't the UniformRandomProvider interface simply named
>>> RandomProvider?
>>
>> Maybe because we want to be transparent on what the code does (?).
>>
>> With this name, there is no ambiguity: no more, no less than what
>> the name implies.
>
> I don't feel there is an ambiguity here, probably because I'm not an
> expert in random number generators.

Then, please consider that there may be an ambiguity.

> I just see an excessively long class
> name that could be shortened. CM4/Hipparchus just use 
> 'RandomGenerator'
> for this interface, that's in line with the name of the component and 
> a
> better choice in my opinion.

No. It's ambiguous.

Shortening is also not a concern because you don't often instantiate
it.  Usually, it would be done _once_ per program.

>>> - UniformRandomProvider mimics the java.util.Random methods which 
>>> is
>>> nice. Is there any value in also implementing the nextGaussian() 
>>> method
>>> such that our API is a perfect drop-in replacement for the JDK 
>>> class?
>>
>> No. [IMHO.]
>> This was discussed on this list (and cf. above point).
>
> That would be convenient for migrating code using java.util.Random to
> commons-rng though. CM has it and the implementation doesn't look so
> complicated, I'm not sure to understand why you think it should go
> elsewhere. Even if more elaborated implementations could be latter
> imagined there is no harm providing a default one as a convenience 
> for
> migrating legacy code.

There is harm in that it creates confusion.

Please look beyond "java.util.Random" for reference about what it
means to generate random numbers.

People who design generators mostly deal with how to make them
good at generating uniform sequences (see the "Quality" section
of the userguide).

Generating non-uniform sequences is another business.

 From a programming POV, there is nothing for singling out
the Gaussian distribution.

In a library for scientific use, the Normal distribution will
have a higher usage count thna other distributions, but all
of them may have their uses, and thus should be provided.
This is not the business of this low level component.

>>> - For code relying on the java.util.Random type, it might be useful 
>>> to
>>> provide an adapter turning an UniformRandomProvider into a
>>> java.util.Random implementation (Hipparchus has one).
>>
>> This was discussed on this list.
>>
>> And see Commons Math's classes "o.a.c.math4.random.RandomUtils" and
>> "o.a.c.math4.random.RngAdaptor" classes.
>>
>> In the latter (being deprecated as of its creation), I indicated why
>> it is a bad idea.
>>
>> Again, if we really want to support the use-case you proposed, it's
>> a utility that should go in another component.
>
> The component already converts a java.util.Random into our
> RandomProvider,

It is purely for comparison purpose; it should be obvious that
using "RandomSource.JDK" is not recommended beyond illustrative
purpose (such as the performance ratio).

> I tend to think it's obvious to support the reverse
> operation in the same component.

Not the business of a library focused on implementations.

Bridge, wrapper, whatever of the sort is a utility.
Let's talk about a higher-level utility that depend on the core
(this component).

>>> - The choice of an enum for the factory class RandomSource is a bit
>>> unusual. It is probably ok for simple RNG with a simple seed, but 
>>> with
>>> more complex implementations I find it less intuitive to use.
>>
>> Less intuitive than what?
>
> Less intuitive than other designs, such as the one I proposed.

As you've guessed, I don't agree.

Similar to the "long" name rationale, the factory is a marginal
part of usage.

>>> When a user types this in its IDE, there is no type information nor
>>> contextual documentation for the parameters expected. The user has 
>>> to
>>> know the type of the seed and the meaning of the parameters 
>>> expected,
>>> the IDE won't be able to help here (with CTRL+P and CTRL+Q in 
>>> IntelliJ).
>>
>> Honestly, I don't care.
>> To put it less bluntly, the choice of RNG to instantiate is not a
>> matter of IDE.
>
> I don't think you understand my point. I don't expect the IDE to pick
> the right RNG for my use case of course, that's impossible. But I 
> expect
> the IDE to help me using the API by hinting the type expected and
> providing contextual documentation about the method used. The current
> design makes this impossible. The users are forced to open the 
> Javadoc
> in their browser when they want to instantiate a generator they don't
> know or remember about. I don't think it's a good idea.

I did understand your point. And I didn't agree.

Mine is that don't need an intelligent IDE to use Commons RNG.
You just need to read the docs _once_ in your life.

To be honest, you took the sole example ("TWO_CMRES_SELECT") among
15 implementations where additional parameters (beyond source type
and seed value) are used.

This was my design choice in order to provide a family of generators
(of which "TWO_CMRES" is one member).

It could have been equally possible to make the 2 ints part of the
seed (as it is done in other families like "MWC_256").
Here I hesitated because of the risk to have too many seemingly
different seeds that would produce the same generator (and sequence).

But I doubt that there will be other more "complex" uses of the
additional parameters than as indices in a table.
So your point about the IDE is just moot.

>> Practically, the lack of type information (which I don't like 
>> either)
>> was a trade-off to have a dead-simple API.
>
> I'd say it's certainly a concise API, but its simplicity is 
> debatable.

Then anything is.
And I don't agree.

>> This design came after 9 months of sustained work.
>> Nothing to be proud of in terms of LOC/day...
>> But what you propose to drop is the only non trivial part of
>> this work.
>>
>> As I said above, this is a trade-off, and it looks like a fault
>> only because Java is strongly typed.
>>
>> The automatic seed conversion is a huge advantage for casual
>> users; they can still seed them with a single "long" (and under
>> the hood get a good "native" seed).
>> But the doc is there to warn them that RNGs usually have a larger
>> state.
>
> What's the use case for accepting a long[] if the API expects a 
> int[]?
> That's a bit odd.

It's not; that's all the simplicity of the factory API: a user can
instantiate any generator with the same seed.

> At least, instead of using an Object in the create method, we could 
> use
> a dedicated Seed class containing factory methods and the conversion
> logic. Its API could look like this:
>
>   Seed.of(int...)
>   Seed.of(long...)
>   Seed.toInt()
>   Seed.toLong()
>   Seed.toIntArray()
>   Seed.toLongArray()
>   Seed.createIntArray(n)
>   Seed.createLongArray(n)


"SeedFactory" (internal) and API methods in "RandomSource"
do that (IIUC).

As an aside, I share your instinctive reluctance to deal
with "Object" argument.
But after _much_ pondering, this is surely one case where
it is undoubtedly adequate: the seed is _really_ not of a
definite type, it only needs to be a sequence of bits that
can be transformed in a good initial state.

Hence having several methods, like there were in CM, one
taking a "long", another an "int" and yet another an "int[]"
was plainly misleading.

That was confusing: why several seeding methods?

This is all mentioned in the Javadoc.

>> No, no, no.
>> Avoiding the "Serializable" clutter was a _requirement_.
>>
>> [...]
>>
>> It's also a feature that this component will never have to deal
>> with CVE related to exploitation of something "Serializable" that
>> should not have been...
>
> The random generators just hold a few primitive variables, there is 
> no
> risk here.

"Serializable" is to be avoided (even if you like to skip the
reference I put there to back the claim, which is not mine).

>>> - Why is the byte[] state encapsulated into the RandomSource.State
>>> class? Why not using a byte array directly?
>>
>> Because encapsulation is the hallmark of good OO programming?
>
> True but the RandomSource.State class isn't encapsulating the actual
> state of the generators here. It encapsulates an image of the 
> generator
> state. Even if the content of the byte array was changed it would not
> impact the state of the generator. This extra layer is really 
> unnecessary.

On the one hand you want to hide the Javadoc, but on the other, when
there is a little measure of "Don't fiddle with the representation
of the state", you want to expose it fully.

I repeat, there is the use-case of save/restore without persistence
where it is nicer to just deal with a "State" object rather than an
array object.

You are of course right that it is purely cosmetic.  I repeat that
I had to do it for the sake of the persistence use-case.

>> If some application only needs to save/restore without actual
>> persistence, it can do without dealing with the "ugliness" of
>> "byte[]".
>
> byte[] is ugly???

Yes.
Why would the "state" be an array.
In fact it is not in many implementations, but they provide a
transformation into "byte[]" in order to allow easy serialization,
without "Serializable"

>> And it will continue to work even if the internal representation
>> is changed.
>
> Why would we change that?


This is far-fetched, I know.
I was thinking of additional measures to check the integrity
of the state...

>> There were a lot of discussions here about what "beta release"
>> means and how to implement it in practice (package naming,
>> versioning scheme).
>>
>> They never led anywhere.
>
> Did you forget my suggestion to use a different package name and
> artifactId for the beta releases?

I don't remember exactly who said what.

I do remember though that I was in favour of something that would
speed up releases held back because of the monstrous nature of
Commons Math.

This problem has another solution: small components such as
Commons RNG.

This code which I proposed to release has been developed in
the same arena as anything else here in plain sight.

Nobody had anything to say about it for months.

I don't think it's serious to consider that potential users who
might have a need for a "java random generator" will be those
suscribed to Commons' "announce" ML.

Much more likely they'll do a search for what's available on
repositories.

>> I seriously doubt that having code tagged "beta" would be more
>> successful at getting users to provide feedback.
>
> A beta release will get more exposure than a snapshot, because it'll 
> be
> announced on the mailing lists, on Twitter and available on Maven 
> Central.
>
>> It will more likely turn them away, thinking that the generators
>> might be buggy.  Which is not the case.
>
> Not if we explain that the code is inherited from CM and very mature 
> but
> moved to a new component with a reworked API.

Who will do that dissertation?

The code is not good because it is inherited from CM.

The code was written because of the many defects (IMHO) of what
is/was in CM and in "java.util.Random".
It was written to be a repository of alternative generators with most
features needed for simple usage and none of the misfeatures which
casual users might be tempted to abuse (such as a "reseed" method,
see the Javadoc).

The code is good primarily because the uniformity was tested with
"TestU01" (and users can run it themselves).
This test was not available in released version of CM.

That code also provides (much) faster implementation than were
available in CM.
Although the CM "team" did all it could to prevent me from looking
in that direction.

And it feels a bit like the same right now, by attempting to
thwart the release of a code that is certainly not worse than
what was released for years without such a last-minute fuss.

>>> but I'm -1 for releasing the current code as 1.0.
>>
>> Any technical reasons?
>
> Simply the reasons I mentioned that can't be addressed in a later
> release without breaking the binary compatibility:
> - UniformRandomProvider is too long, rename back to RandomGenerator

No.

> - Add nextGaussian() to the UniformRandomProvider interface

No.

> - RandomSource hides type information and parameters' semantic

True but "not a problem".

> - RandomSource.State isn't necessary

True but "not a problem".

> Emmanuel Bourg
>
> (My apologies for the late feedback, I wish I had more time to do it
> earlier)

We are all free to set our priorities however we want.

I respect yours (of not having had the time to help with
the development of this component).

Please respect mine now.
I "did the work" and I think it's far from being so bad as to
deserve a -1.

You cannot reasonably sustain in the same breath that it was
impossible to look at 92 lines of code (the size of the public
API) for the last 3 to 6 months, but that you can valuably
question months of work, in a single mail, precisely when the
release vote is started.

People will use this code, or not, but nobody here can claim
to know better about this subject so as to come up with a
viable alternative.
And if they did now better, then they should have said so at
the proper time.

Flaws are quite possible of course.
But actual use is to uncover them (and we'll fix them when
they are reported), not mileage talk that only serves to
further delay the release.

There are missing features, for sure, such as "jump ahead",
but enabling this would require a lot of additional commitment,
and when done, it will certainly be worth a new release.


Regards,
Gilles

[1] I recall that the development version of CM depends on
     Commons RNG (~170 uses of "UniforRandomProvider" and
     "RandomSource"), and it all still works...


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [RNG] About RNG-6 (Was: [VOTE][RC1] Release Commons Rng 1.0)

Posted by Emmanuel Bourg <eb...@apache.org>.
Le 15/09/2016 � 16:37, Gilles a �crit :

> I don't understand "motivations for the API".

I just mean a short introduction to the component, something explaining
why it exists, its usefulness (performance, randomness quality), the
implementations supported, etc.


> I won't do that.
> It was already a pain to create those tables as they are (see the
> doc source).

I understand you don't want to do it, editing apt files isn't fun. But
are you opposed to the idea?


>> - The internal packages should be hidden from the Javadoc,
> 
> I do not know how do that.

Packages can be excluded from the Javadoc by configuring the
maven-javadoc-plugin with the <excludePackageNames> element:

https://maven.apache.org/plugins/maven-javadoc-plugin/examples/exclude-package-names.html


> And I don't see the point, since from the "binary" POV, they
> are "public".

That's purely from the documentation point of view here. If we don't
want users to use these packages, let's not mention them in the Javadoc.
That's an extra security with the "internal" package name.


> Users can select a generator without knowing the gory details.
> If they want to know more, they go to the internals.
> It they really want to know, they go to the references.
> 
> Internal should not be hidden: contributors who'd fancy to provide
> additional generators do not have to go through any hoops.
> 
> Or do you suggest that this component should supply two JARs:
>   * commons-rng-api
>      Contents of "o.a.c.rng"
>   * commons-rng-core
>      Contents of "o.a.c.rng.internal"
> ?
> 
> And two sets of Javadoc?

No thanks, too complicated. Contributors will just checkout the code and
quickly figure out how it's organized and how it's meant to be extended,
I don't worry about that. But I worry about mere users getting confused
about the API.


>> - Why isn't the UniformRandomProvider interface simply named
>> RandomProvider?
> 
> Maybe because we want to be transparent on what the code does (?).
> 
> With this name, there is no ambiguity: no more, no less than what
> the name implies.

I don't feel there is an ambiguity here, probably because I'm not an
expert in random number generators. I just see an excessively long class
name that could be shortened. CM4/Hipparchus just use 'RandomGenerator'
for this interface, that's in line with the name of the component and a
better choice in my opinion.


>> - UniformRandomProvider mimics the java.util.Random methods which is
>> nice. Is there any value in also implementing the nextGaussian() method
>> such that our API is a perfect drop-in replacement for the JDK class?
> 
> No. [IMHO.]
> This was discussed on this list (and cf. above point).

That would be convenient for migrating code using java.util.Random to
commons-rng though. CM has it and the implementation doesn't look so
complicated, I'm not sure to understand why you think it should go
elsewhere. Even if more elaborated implementations could be latter
imagined there is no harm providing a default one as a convenience for
migrating legacy code.


>> - For code relying on the java.util.Random type, it might be useful to
>> provide an adapter turning an UniformRandomProvider into a
>> java.util.Random implementation (Hipparchus has one).
> 
> This was discussed on this list.
> 
> And see Commons Math's classes "o.a.c.math4.random.RandomUtils" and
> "o.a.c.math4.random.RngAdaptor" classes.
> 
> In the latter (being deprecated as of its creation), I indicated why
> it is a bad idea.
> 
> Again, if we really want to support the use-case you proposed, it's
> a utility that should go in another component.

The component already converts a java.util.Random into our
RandomProvider, I tend to think it's obvious to support the reverse
operation in the same component.


>> - The choice of an enum for the factory class RandomSource is a bit
>> unusual. It is probably ok for simple RNG with a simple seed, but with
>> more complex implementations I find it less intuitive to use.
> 
> Less intuitive than what?

Less intuitive than other designs, such as the one I proposed.


>> When a user types this in its IDE, there is no type information nor
>> contextual documentation for the parameters expected. The user has to
>> know the type of the seed and the meaning of the parameters expected,
>> the IDE won't be able to help here (with CTRL+P and CTRL+Q in IntelliJ).
> 
> Honestly, I don't care.
> To put it less bluntly, the choice of RNG to instantiate is not a
> matter of IDE.

I don't think you understand my point. I don't expect the IDE to pick
the right RNG for my use case of course, that's impossible. But I expect
the IDE to help me using the API by hinting the type expected and
providing contextual documentation about the method used. The current
design makes this impossible. The users are forced to open the Javadoc
in their browser when they want to instantiate a generator they don't
know or remember about. I don't think it's a good idea.


> Practically, the lack of type information (which I don't like either)
> was a trade-off to have a dead-simple API.

I'd say it's certainly a concise API, but its simplicity is debatable.


> This design came after 9 months of sustained work.
> Nothing to be proud of in terms of LOC/day...
> But what you propose to drop is the only non trivial part of
> this work.
> 
> As I said above, this is a trade-off, and it looks like a fault
> only because Java is strongly typed.
> 
> The automatic seed conversion is a huge advantage for casual
> users; they can still seed them with a single "long" (and under
> the hood get a good "native" seed).
> But the doc is there to warn them that RNGs usually have a larger
> state.

What's the use case for accepting a long[] if the API expects a int[]?
That's a bit odd.

At least, instead of using an Object in the create method, we could use
a dedicated Seed class containing factory methods and the conversion
logic. Its API could look like this:

  Seed.of(int...)
  Seed.of(long...)
  Seed.toInt()
  Seed.toLong()
  Seed.toIntArray()
  Seed.toLongArray()
  Seed.createIntArray(n)
  Seed.createLongArray(n)


> No, no, no.
> Avoiding the "Serializable" clutter was a _requirement_.
> 
> [...]
> 
> It's also a feature that this component will never have to deal
> with CVE related to exploitation of something "Serializable" that
> should not have been...

The random generators just hold a few primitive variables, there is no
risk here.


>> - Why is the byte[] state encapsulated into the RandomSource.State
>> class? Why not using a byte array directly?
> 
> Because encapsulation is the hallmark of good OO programming?

True but the RandomSource.State class isn't encapsulating the actual
state of the generators here. It encapsulates an image of the generator
state. Even if the content of the byte array was changed it would not
impact the state of the generator. This extra layer is really unnecessary.


> If some application only needs to save/restore without actual
> persistence, it can do without dealing with the "ugliness" of
> "byte[]".

byte[] is ugly???

> And it will continue to work even if the internal representation
> is changed.

Why would we change that?


> There were a lot of discussions here about what "beta release"
> means and how to implement it in practice (package naming,
> versioning scheme).
> 
> They never led anywhere.

Did you forget my suggestion to use a different package name and
artifactId for the beta releases?


> I seriously doubt that having code tagged "beta" would be more
> successful at getting users to provide feedback.

A beta release will get more exposure than a snapshot, because it'll be
announced on the mailing lists, on Twitter and available on Maven Central.

> It will more likely turn them away, thinking that the generators
> might be buggy.  Which is not the case.

Not if we explain that the code is inherited from CM and very mature but
moved to a new component with a reworked API.

>> but I'm -1 for releasing the current code as 1.0.
> 
> Any technical reasons?

Simply the reasons I mentioned that can't be addressed in a later
release without breaking the binary compatibility:
- UniformRandomProvider is too long, rename back to RandomGenerator
- Add nextGaussian() to the UniformRandomProvider interface
- RandomSource hides type information and parameters' semantic
- RandomSource.State isn't necessary

Emmanuel Bourg

(My apologies for the late feedback, I wish I had more time to do it
earlier)


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [RNG] About RNG-6 (Was: [VOTE][RC1] Release Commons Rng 1.0)

Posted by Gary Gregory <ga...@gmail.com>.
On Thu, Sep 15, 2016 at 7:37 AM, Gilles <gi...@harfang.homelinux.org>
wrote:

> Hi.
>
> Moving to another thread, as your comments do not belong in
> a release vote.
>
> Many of them could lead to interesting discussions that should
> (and could) have happened earlier.
>
> Indeed, for the record, see
>   https://issues.apache.org/jira/browse/RNG-6
> and the numerous posts which I sent to this list
>  * while ironing out the code of this new component in its own
>    repository, and doing the various chores (site, userguide,
>    quality testing, Github, Travis, Coveralls, etc) for the last
>    6 weeks,
>  * while developing that code in a new Commons Math package
>    "o.a.c.math4.rng" for the last 6 months,
>  * while fixing the code in Commons Math "o.a.c.m.random"
>    package for the last 9 months.
>
> For a number of issues which you now raise, I've explicitly
> asked for input.
>
> I find it a bit strange that, at this precise point, you start
> to question the choices which I've made in the absence of any
> prior remarks, suggestions, or warnings.
>
> On Wed, 14 Sep 2016 17:04:53 +0200, Emmanuel Bourg wrote:
>
>> Hi all,
>>
>> RNG is moving fast, that's nice. I got a quick look at the API and the
>> site, here are my comments:
>>
>> - The main page (Overview) could be enhanced with a description of the
>> API longer than a single sentence. It could mention the motivations for
>> the API, the implementations supported, a sample code snippet and a link
>> to the user guide.
>>
>
> I don't understand "motivations for the API".
>
> Do you mean something like my comment in
>   https://issues.apache.org/jira/browse/RNG-6
> ?
>
> [My motivation for proposing this code can be gathered from
> the archives and the bugs and shortcomings reported in the
> JIRA MATH project, and related posts here). They do not
> belong in that "overview" page.]
>
> - In the Performance section of the User Guide, I'd suggest grouping the
>> 3 tables into a single one. It might also be interesting to put the
>> quality results in the same table, so we can see the trade-off between
>> performance and quality.
>>
>
> I won't do that.
> It was already a pain to create those tables as they are (see the
> doc source).
>
> - The License section of the user guide could be removed, I guess
>> everyone knows that Apache projects use the Apache license.
>>
>
> No problem.
> I thought that it was mandatory.
>
> All the pages were copied from the corresponding Commons Math ones.
>
> - The internal packages should be hidden from the Javadoc,
>>
>
> I do not know how do that.
>

Please see:

https://maven.apache.org/plugins/maven-javadoc-plugin/javadoc-mojo.html#excludePackageNames

Gary

>
> And I don't see the point, since from the "binary" POV, they
> are "public".
>
> otherwise
>> users may be tempted to directly use the classes there.
>>
>
> You must have seen that there are prominent warnings against
> doing that.
>
> Users will do what they want anyways.
> What is not supported is not supported.  We agreed here that it
> was enough to tag a class as "internal" in order to fulfill our
> obligation of least surprise.
>
> This would imply
>> copying the documentation of the generators into the RandomSource class.
>>
>
> I do not agree.
>
> Users can select a generator without knowing the gory details.
> If they want to know more, they go to the internals.
> It they really want to know, they go to the references.
>
> Internal should not be hidden: contributors who'd fancy to provide
> additional generators do not have to go through any hoops.
>
> Or do you suggest that this component should supply two JARs:
>   * commons-rng-api
>      Contents of "o.a.c.rng"
>   * commons-rng-core
>      Contents of "o.a.c.rng.internal"
> ?
>
> And two sets of Javadoc?
>
> - Why isn't the UniformRandomProvider interface simply named
>> RandomProvider?
>>
>
> Maybe because we want to be transparent on what the code does (?).
>
> With this name, there is no ambiguity: no more, no less than what
> the name implies.
>
> Is there any plan to implement non uniform random
>> generators in the future? Will they have a different interface?
>>
>
> This point has been discussed on this list (although, again, there
> was a severe lack of feedback...).
>
> Executive summary:
>   No strong coupling between generators (of uniform sequence of
>   pseudo-random numbers) and utilities on top of those generators
>   (e.g. generators producing non-uniform sequences).
>   Utilities should go in another component/package/module (see
>   e.g. Commons Math's packages "o.a.c.math4.distribution" and
>   "o.a.c.math4.random" for candidate code).
>
> - UniformRandomProvider mimics the java.util.Random methods which is
>> nice. Is there any value in also implementing the nextGaussian() method
>> such that our API is a perfect drop-in replacement for the JDK class?
>>
>
> No. [IMHO.]
> This was discussed on this list (and cf. above point).
>
> - For code relying on the java.util.Random type, it might be useful to
>> provide an adapter turning an UniformRandomProvider into a
>> java.util.Random implementation (Hipparchus has one).
>>
>
> This was discussed on this list.
>
> And see Commons Math's classes "o.a.c.math4.random.RandomUtils" and
> "o.a.c.math4.random.RngAdaptor" classes.
>
> In the latter (being deprecated as of its creation), I indicated why
> it is a bad idea.
>
> Again, if we really want to support the use-case you proposed, it's
> a utility that should go in another component.
>
> - The choice of an enum for the factory class RandomSource is a bit
>> unusual. It is probably ok for simple RNG with a simple seed, but with
>> more complex implementations I find it less intuitive to use.
>>
>
> Less intuitive than what?
>
> For example the create(RandomSource, Object Object...) method has this
>> snippet in its documentation:
>>
>>   RandomSource.create(RandomSource.TWO_CMRES_SELECT, 26219, 6, 9)
>>
>> When a user types this in its IDE, there is no type information nor
>> contextual documentation for the parameters expected. The user has to
>> know the type of the seed and the meaning of the parameters expected,
>> the IDE won't be able to help here (with CTRL+P and CTRL+Q in IntelliJ).
>>
>
> Honestly, I don't care.
> To put it less bluntly, the choice of RNG to instantiate is not a
> matter of IDE.
>
> Practically, the lack of type information (which I don't like either)
> was a trade-off to have a dead-simple API.
>
> As a user, one doesn't need an intelligent IDE to know all there is
> to know about Commons RNG, and it will take less than 10 minutes!
>
> This design induces some complexity, the loss of the seed type leads to
>> the seed conversion mechanism (most of the util package if I understand
>> well).
>>
>> I'd feel more comfortable with a more classic factory design, something
>> like:
>>
>>   RandomSource.createTwoCmres()
>>   RandomSource.createTwoCmres(Integer seed)
>>   RandomSource.createTwoCmres(Integer seed, int i, int j)
>>
>> or a shorter form like:
>>
>>   RNG.newTwoCmres()
>>   RNG.newTwoCmres(int seed)
>>   RNG.newTwoCmres(Integer seed, int i, int j)
>>
>> That will add more methods to the factory, but it'll be much more usable
>> in my opinion.
>>
>
> Not in mine.
>
> This design came after 9 months of sustained work.
> Nothing to be proud of in terms of LOC/day...
> But what you propose to drop is the only non trivial part of
> this work.
>
> As I said above, this is a trade-off, and it looks like a fault
> only because Java is strongly typed.
>
> The automatic seed conversion is a huge advantage for casual
> users; they can still seed them with a single "long" (and under
> the hood get a good "native" seed).
> But the doc is there to warn them that RNGs usually have a larger
> state.
>
> - I'm not convinced by the need for the convenience static methods
>> RandomSource.createInt/Long. If the user doesn't want to specify the
>> seed, he could simply use the seed-less create method. And if the random
>> generator requires extra parameters, he could use a null value for the
>> seed as a way to mean "generate one for me please":
>>
>>   RandomSource.create(RandomSource.TWO_CMRES_SELECT, null, 6, 9)
>>
>
> These methods take care of a common use-case, where the actual
> seed does not matter (hence the caller might like to not have to
> code similar methods), but the _same_ seed must be used in another
> execution of the same application to ensure reproducibility (e.g.
> for bug tracking).
>
> See also:
>   https://docs.oracle.com/javase/7/docs/api/java/security/
> SecureRandom.html#generateSeed%28int%29
>
> - RandomSource.saveState/restoreState allows one to convert a random
>> generator to/from a byte array. This looks a lot like Java
>> serialization. Maybe the implementations supporting this feature could
>> be marked as Serializable, thus it would be possible to call
>> RandomSource.saveState only if the instance is Serializable instead of
>> catching the UnsupportedOperationException thrown when the feature isn't
>> supported.
>>
>
> No, no, no.
> Avoiding the "Serializable" clutter was a _requirement_.
>
> Serialization is not part of this component, but is supported by all
> implementations.
>
> It's a feature that users can _choose_ the persistence framework
> they want and not be forced to consider the pitfalls of "Serializable".
> [See "Effective Java".]
>
> It's also a feature that this component will never have to deal
> with CVE related to exploitation of something "Serializable" that
> should not have been...
>
> - Why is the byte[] state encapsulated into the RandomSource.State
>> class? Why not using a byte array directly?
>>
>
> Because encapsulation is the hallmark of good OO programming?
>
> If some application only needs to save/restore without actual
> persistence, it can do without dealing with the "ugliness" of
> "byte[]".
> And it will continue to work even if the internal representation
> is changed.
>
> Exposing the internals was unavoidable for the serialization
> use-cases.
>
>
>> So far I'm +1 for a beta release aimed at getting more feedback on the
>> API,
>>
>
> There were a lot of discussions here about what "beta release"
> means and how to implement it in practice (package naming,
> versioning scheme).
>
> They never led anywhere.
>
> Snapshot JARs have been available since the Jenkins task has been
> up and running.
>
> I've asked for feedback on this list, on the site since it's been
> up, and on "helpwanted.apache.org".
>
> I seriously doubt that having code tagged "beta" would be more
> successful at getting users to provide feedback.
>
> It will more likely turn them away, thinking that the generators
> might be buggy.  Which is not the case.
>
> As much as I advocated it in order to have more timely releases
> of Commons Math, I think that in this case, a "beta" release
> (if such a thing existed in Commons) is nothing short of
> counter-productive.
>
> The API is what it is.  As for any other component; and, quite
> frankly, in proportion of the lines of code, much more thought
> was put into it than for anything in Commons Math, and probably
> many other Commons components).
>
> I do not deny that it might be wrong (or I wouldn't have filed
> "RNG-6") but at this point it is usage that should demonstrate
> it.  Not doubt based on... I don't know what.
>
> but I'm -1 for releasing the current code as 1.0.
>>
>
> Any technical reasons?
>
>
> Gilles
> [Truly sorry that this discussion did not happen at the proper
> time. And hoping that constructive feedback like this one will
> come when I start talking about "commons-rng-tools".]
>
>
>
>> Emmanuel Bourg
>>
>>
>> Le 11/09/2016 à 16:55, Gilles a écrit :
>>
>>> Hi.
>>>
>>> This is a [VOTE] for releasing Apache Commons Rng 1.0 (from RC1).
>>>
>>> Tag name:
>>>   RNG_1_0_RC1 (signature can be checked from git using 'git tag -v')
>>>
>>> Tag URL:
>>>
>>>
>>> https://git-wip-us.apache.org/repos/asf?p=commons-rng.git;a=
>>> commit;h=f8d23082607b9f2c7be7f489eb09627722440ee5
>>>
>>>
>>> Commit ID the tag points at:
>>>   f8d23082607b9f2c7be7f489eb09627722440ee5
>>>
>>> Site:
>>>   http://home.apache.org/~erans/commons-rng-1.0-RC1-site
>>>
>>> Distribution files:
>>>   https://dist.apache.org/repos/dist/dev/commons/rng/
>>>
>>> Distribution files hashes (SHA1):
>>>   a221e862c8ff970a9ca3e7fbd86c3200d1f8780a commons-rng-1.0-bin.tar.gz
>>>   689b2bfbdb1856d4f47851d75762aab42057805a commons-rng-1.0-bin.zip
>>>   40b7b1639eedf91b5fad5d38e6ebec01e659048f commons-rng-1.0-src.tar.gz
>>>   6296dbabde10169d6365bda99f2af6dcc191e515 commons-rng-1.0-src.zip
>>>
>>> KEYS file to check signatures:
>>>   http://www.apache.org/dist/commons/KEYS
>>>
>>> Maven artifacts:
>>>
>>>
>>> https://repository.apache.org/content/repositories/orgapache
>>> commons-1199/org/apache/commons/commons-rng/1.0/
>>>
>>>
>>> [ ] +1 Release it.
>>> [ ] +0 Go ahead; I don't care.
>>> [ ] -0 There are a few minor glitches: ...
>>> [ ] -1 No, do not release it because ...
>>>
>>> This vote will close in 72 hours, at 2016-09-14T15:10:00Z (this is UTC
>>> time).
>>> ----------
>>>
>>> Thanks,
>>> Gilles
>>>
>>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>


-- 
E-Mail: garydgregory@gmail.com | ggregory@apache.org
Java Persistence with Hibernate, Second Edition
<http://www.manning.com/bauer3/>
JUnit in Action, Second Edition <http://www.manning.com/tahchiev/>
Spring Batch in Action <http://www.manning.com/templier/>
Blog: http://garygregory.wordpress.com
Home: http://garygregory.com/
Tweet! http://twitter.com/GaryGregory