You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@commons.apache.org by Gilles Sadowski <gi...@gmail.com> on 2019/07/30 09:56:13 UTC

Re: [rng] Releasing 1.3

Hello.

Le lun. 10 juin 2019 à 17:17, Alex Herbert <al...@gmail.com> a écrit :
>
>
> On 10/06/2019 15:31, Gilles Sadowski wrote:
> >>> P.S. Thinking of releasing 1.3?
> >> Not yet. I think there are a few outstanding items [...]

Anything missing?

Regards,
Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [rng] Releasing 1.3

Posted by Gilles Sadowski <gi...@gmail.com>.
> > [...]
> > [...] is that check
> > ---CUT---
> >          if (!(source.normalized instanceof SharedStateSampler<?>)) {
> >              throw new UnsupportedOperationException("The underlying
> > sampler is not a SharedStateSampler");
> >          }
> > ---CUT---
> > necessary?
>
> That was already in the codebase. I added it to support
> SharedStateSampler in RNG-102.
>
> The withUniformRandomProvider method is documented to:
>
> @throws UnsupportedOperationException if the underlying sampler is not
> a {@link SharedStateSampler}.
> @throws ClassCastException if the underlying {@link SharedStateSampler}
> does not return a {@link NormalizedGaussianSampler}.
>
> So currently the two cases can be distinguished. If we drop the
>
> source.normalized instanceof SharedStateSampler<?>
>
> check then the next part will throw a ClassCastException if either case
> is true.
>
> So the question is really what exception should be raised in the rare
> case that the operation cannot be performed (because the user created
> with a NormalizedGaussianSampler not from the library). The
> ClassCastException (which indicates the executing code did not check
> what it can do) could be caught and rethrown as an
> UnsupportedOperationException. Or the object could be tested with
> instanceof before the cast. This would at least be consistent and show
> the code knows what it is doing. So if a user calls:
>
> GaussianSampler.withUniformRandomProvider(rng)
>
> The only exception they have to handle is an
> UnsupportedOperationException. An alternative would be an
> IllegalStateException but that would indicate that the method would be
> possible but not at this time. I think this would be better than a
> ClassCastException.
>
> My vote is to make UnsupportedOperationException the only exception to
> be thrown. The ClassCastException to allow the cases to be separated is
> unnecessary.

I'm fine with that.

Regards,
Gilles

>
> I can change this in master. It is only in GaussianSampler and
> LogNormalSampler. There are already unit tests for the edge cases so
> these just have to be updated to expect the appropriate documented
> exception.
>
>
> > [...]

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [rng] Releasing 1.3

Posted by Alex Herbert <al...@gmail.com>.
On 31/07/2019 00:49, Gilles Sadowski wrote:
> Le mar. 30 juil. 2019 à 22:07, Alex Herbert <al...@gmail.com> a écrit :
>>
>>
>>> On 30 Jul 2019, at 19:28, Gilles Sadowski <gi...@gmail.com> wrote:
>>>
>>> Hi.
>>>
>>> Le mar. 30 juil. 2019 à 15:38, Alex Herbert <alex.d.herbert@gmail.com <ma...@gmail.com>> a écrit :
>>>> On 30/07/2019 10:56, Gilles Sadowski wrote:
>>>>> Hello.
>>>>>
>>>>> Le lun. 10 juin 2019 à 17:17, Alex Herbert <al...@gmail.com> a écrit :
>>>>>> On 10/06/2019 15:31, Gilles Sadowski wrote:
>>>>>>>>> P.S. Thinking of releasing 1.3?
>>>>>>>> Not yet. I think there are a few outstanding items [...]
>>>>> Anything missing?
>>>> - RNG-110: The PR for SharedSharedDiscrete/ContinuousSampler should have
>>>> a review [1]. I've left this while we finished GSoC phase 2 but it is ready.
>>>>
>>>> I added factory methods for all samplers. For existing samplers this is
>>>> just for consistency. Some however use internal delegates and the
>>>> factory method can return the delegate directly which is an advantage.
>>>>
>>>> One issue to look at is how I handled GaussianSampler and
>>>> LogNormalSampler. The samplers can only be shared state samplers if the
>>>> input NormalizedGaussianSampler is a shared state sampler. I handled
>>>> this with documentation. But this means a downstream user may be passed
>>>> a SharedStateContinuousSampler, use it as such and receive an exception
>>>> if it was created incorrectly.
>>> Could the library create it "incorrectly”?
>> No. All the NormalizedGaussianSamplers in the library are suitable to pass to the GaussianSampler and LogNormalSampler.
>>
>>>> The alternative is two factory methods which must have different names
>>>> due to type erasure:
>>>>
>>>> public static ContinuousSampler of(NormalizedGaussianSampler gaussian,
>>>> double scale, double shape);
>>>>
>>>> public static
>>>>      <T extends NormalizedGaussianSampler &
>>>> SharedStateSampler<ContinuousSampler>>
>>>>      SharedStateContinuousSampler
>>>>      ofSharedState(T normalized,
>>>>                    double mean,
>>>>                    double standardDeviation) {
>>> Not nice, at first sight.
>>>
>>>> So the options are:
>>>>
>>>> - As current but has the pitfall of throwing exceptions if you do create
>>>> a one with something that does not share state (i.e. not a sampler in
>>>> the library).
>>> IIUC, it answers my question above.
>>> I would not consider too much that the interfaces defined in our "client API"
>>> module could be implemented by external codes.
>>> Even within "Commons", we do not use other components' API…
>> OK. User beware documentation. If using only our library it will not be an issue.
>>
>> So I take it that you are fine with the PR?
> Yes.
> I didn't look at all the details but is that check
> ---CUT---
>          if (!(source.normalized instanceof SharedStateSampler<?>)) {
>              throw new UnsupportedOperationException("The underlying
> sampler is not a SharedStateSampler");
>          }
> ---CUT---
> necessary?

That was already in the codebase. I added it to support 
SharedStateSampler in RNG-102.

The withUniformRandomProvider method is documented to:

@throws UnsupportedOperationException if the underlying sampler is not
a {@link SharedStateSampler}.
@throws ClassCastException if the underlying {@link SharedStateSampler}
does not return a {@link NormalizedGaussianSampler}.

So currently the two cases can be distinguished. If we drop the

source.normalized instanceof SharedStateSampler<?>

check then the next part will throw a ClassCastException if either case 
is true.

So the question is really what exception should be raised in the rare 
case that the operation cannot be performed (because the user created 
with a NormalizedGaussianSampler not from the library). The 
ClassCastException (which indicates the executing code did not check 
what it can do) could be caught and rethrown as an 
UnsupportedOperationException. Or the object could be tested with 
instanceof before the cast. This would at least be consistent and show 
the code knows what it is doing. So if a user calls:

GaussianSampler.withUniformRandomProvider(rng)

The only exception they have to handle is an 
UnsupportedOperationException. An alternative would be an 
IllegalStateException but that would indicate that the method would be 
possible but not at this time. I think this would be better than a 
ClassCastException.

My vote is to make UnsupportedOperationException the only exception to 
be thrown. The ClassCastException to allow the cases to be separated is 
unnecessary.

I can change this in master. It is only in GaussianSampler and 
LogNormalSampler. There are already unit tests for the edge cases so 
these just have to be updated to expect the appropriate documented 
exception.


>
>> Just rereading it now and I’m not totally sold on the factory constructors using .of(…). Here’s Joshua Bloch on the matter [1]:
>>
>> —
>> Here are some common names for static factory methods:
>>
>>          • valueOf—Returns an instance that has, loosely speaking, the same value as its parameters. Such static factories are effectively type-conversion methods.
>>
>>          • of—A concise alternative to valueOf, popularized by EnumSet (Item 32).
>>
>>          • getInstance—Returns an instance that is described by the parameters but cannot be said to have the same value. In the case of a singleton, getInstance takes no parameters and returns the sole instance.
>>
>>          • newInstance—Like getInstance, except that newInstance guarantees that each instance returned is distinct from all others.
>>
>>          • getType—Like getInstance, but used when the factory method is in a different class. Type indicates the type of object returned by the factory method.
>>
>>          • newType—Like newInstance, but used when the factory method is in a different class. Type indicates the type of object returned by the factory method.
>> —
>>
>> The ’newType’ may be appropriate as the factory method is returning an instance of an interface not defined in the class. This is required for some implementations such as the PoissonSampler which returns either a SmallMean or LargeMeanPoissonSampler. So the Bloch naming could be ’newSampler’. But then we have over verbose code using ‘Sampler' twice:
>>
>> new PoissonSampler(rng, mean)
>>
>> PoissonSampler.of(rng, mean)
>>
>> PoissonSampler.newSampler(rng, mean)
>>
>> IMO the later is too verbose. If we state that the sampler is entirely defined by the input arguments to ‘of’ then it does satisfy "has, loosely speaking, the same value as its parameters”.
>>
>> So ‘of’ is OK. The only other words I like are ‘from’ or ‘with’:
>>
>> PoissonSampler.of(rng, mean)
>> PoissonSampler.from(rng, mean)
>> PoissonSampler.with(rng, mean)
>>
> "with" seems the most appropriate but has the disadvantage of not
> being in Bloch's list.
> I would then stick with "of" because "from" looks like the arguments
> will be transformed
> (and it also is not in the list...).
>
> Gilles
>
>> WDYT?
>>
>> [1] http://www.informit.com/articles/article.aspx?p=1216151 <http://www.informit.com/articles/article.aspx?p=1216151>
>>>> - Another factory method to explicitly create a SharedStateSampler using
>>>> a normalised Gaussian SharedStateSampler.
>>>>
>>>>
>>>> A few things that are 90% done:
>>>>
>>>> - RNG-85: MiddleSquareWeylSequence generator
>>>>
>>>> This is simple code and now the modifications have been made to the
>>>> ProviderBuilder it is possible to pass in a good quality increment for
>>>> the Weyl sequence. I have code to build the increment that can be added
>>>> to the SeedFactory. I did this months ago so will have to find it and
>>>> create the PR.
>>>>
>>>> - RNG-95: DiscreteUniformSampler
>>>>
>>>> I now have a reference for the alternative algorithm for choosing int
>>>> values from an interval. The code is done but should go after RNG-110 as
>>>> the code uses 5 internal delegates for different algorithms. This would
>>>> be optimised by the changes in RNG-110.
>>>>
>>>> - RNG-109: DiscreteProbabilityCollectionSampler to use an internal
>>>> DiscreteSampler
>>>>
>>>> I have to create a benchmark to compare the AliasMethodSampler against
>>>> the GuideTableSampler to see which is more suitable for a generic
>>>> probability distribution. This should not take long.
>>>>
>>>> - RNG-94: RotateRotateMultiplyXorMultiplyXor
>>>>
>>>> Simple code that is based on the same idea of using an output hash
>>>> function on a Weyl sequence like SplitMix. It is slightly slower but the
>>>> hash function is better and more robust to low complexity increments. So
>>>> we can add it using a seeded increment for the Weyl sequence. This would
>>>> take a day to add the two hash function variants.
>>>>
>>> Everything ready is fine to add before the next release, but equally
>>> fine to add after it (and do another release in 2 months if wanted)
>>> given the host of new features already implemented. :-)
>> I’ll put what I have together in the coming week or so.
>>
>>> Best,
>>> Gilles
>>>
>>>> Maybe for later:
>>>>
>>>> - RNG-90: Improve nextInt(int)
>>>>
>>>> This could use the same algorithm as RNG-95. I have not done the testing
>>>> yet. It also can be done for nextLong(long) which requires a 64-bit
>>>> product multiplication to be computed as a 128-bit result. I have code
>>>> for this but no performance tests.
>>>>
>>>> Not done but...
>>>>
>>>> The PCG family has extended generators: K-dimensionally equidistributed
>>>> or Cryptographic. These have a much larger period and the
>>>> equidistributed ones can be Jumpable.
>>>>
>>>>
>>>> [1] https://github.com/apache/commons-rng/pull/58
>>>>
>>>>
>>>>
>>>>> Regards,
>>>>> Gilles
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [rng] Releasing 1.3

Posted by Gilles Sadowski <gi...@gmail.com>.
Le mar. 30 juil. 2019 à 22:07, Alex Herbert <al...@gmail.com> a écrit :
>
>
>
> > On 30 Jul 2019, at 19:28, Gilles Sadowski <gi...@gmail.com> wrote:
> >
> > Hi.
> >
> > Le mar. 30 juil. 2019 à 15:38, Alex Herbert <alex.d.herbert@gmail.com <ma...@gmail.com>> a écrit :
> >>
> >> On 30/07/2019 10:56, Gilles Sadowski wrote:
> >>> Hello.
> >>>
> >>> Le lun. 10 juin 2019 à 17:17, Alex Herbert <al...@gmail.com> a écrit :
> >>>>
> >>>> On 10/06/2019 15:31, Gilles Sadowski wrote:
> >>>>>>> P.S. Thinking of releasing 1.3?
> >>>>>> Not yet. I think there are a few outstanding items [...]
> >>> Anything missing?
> >>
> >> - RNG-110: The PR for SharedSharedDiscrete/ContinuousSampler should have
> >> a review [1]. I've left this while we finished GSoC phase 2 but it is ready.
> >>
> >> I added factory methods for all samplers. For existing samplers this is
> >> just for consistency. Some however use internal delegates and the
> >> factory method can return the delegate directly which is an advantage.
> >>
> >> One issue to look at is how I handled GaussianSampler and
> >> LogNormalSampler. The samplers can only be shared state samplers if the
> >> input NormalizedGaussianSampler is a shared state sampler. I handled
> >> this with documentation. But this means a downstream user may be passed
> >> a SharedStateContinuousSampler, use it as such and receive an exception
> >> if it was created incorrectly.
> >
> > Could the library create it "incorrectly”?
>
> No. All the NormalizedGaussianSamplers in the library are suitable to pass to the GaussianSampler and LogNormalSampler.
>
> >
> >>
> >> The alternative is two factory methods which must have different names
> >> due to type erasure:
> >>
> >> public static ContinuousSampler of(NormalizedGaussianSampler gaussian,
> >> double scale, double shape);
> >>
> >> public static
> >>     <T extends NormalizedGaussianSampler &
> >> SharedStateSampler<ContinuousSampler>>
> >>     SharedStateContinuousSampler
> >>     ofSharedState(T normalized,
> >>                   double mean,
> >>                   double standardDeviation) {
> >
> > Not nice, at first sight.
> >
> >> So the options are:
> >>
> >> - As current but has the pitfall of throwing exceptions if you do create
> >> a one with something that does not share state (i.e. not a sampler in
> >> the library).
> >
> > IIUC, it answers my question above.
> > I would not consider too much that the interfaces defined in our "client API"
> > module could be implemented by external codes.
> > Even within "Commons", we do not use other components' API…
>
> OK. User beware documentation. If using only our library it will not be an issue.
>
> So I take it that you are fine with the PR?

Yes.
I didn't look at all the details but is that check
---CUT---
        if (!(source.normalized instanceof SharedStateSampler<?>)) {
            throw new UnsupportedOperationException("The underlying
sampler is not a SharedStateSampler");
        }
---CUT---
necessary?

> Just rereading it now and I’m not totally sold on the factory constructors using .of(…). Here’s Joshua Bloch on the matter [1]:
>
> —
> Here are some common names for static factory methods:
>
>         • valueOf—Returns an instance that has, loosely speaking, the same value as its parameters. Such static factories are effectively type-conversion methods.
>
>         • of—A concise alternative to valueOf, popularized by EnumSet (Item 32).
>
>         • getInstance—Returns an instance that is described by the parameters but cannot be said to have the same value. In the case of a singleton, getInstance takes no parameters and returns the sole instance.
>
>         • newInstance—Like getInstance, except that newInstance guarantees that each instance returned is distinct from all others.
>
>         • getType—Like getInstance, but used when the factory method is in a different class. Type indicates the type of object returned by the factory method.
>
>         • newType—Like newInstance, but used when the factory method is in a different class. Type indicates the type of object returned by the factory method.
> —
>
> The ’newType’ may be appropriate as the factory method is returning an instance of an interface not defined in the class. This is required for some implementations such as the PoissonSampler which returns either a SmallMean or LargeMeanPoissonSampler. So the Bloch naming could be ’newSampler’. But then we have over verbose code using ‘Sampler' twice:
>
> new PoissonSampler(rng, mean)
>
> PoissonSampler.of(rng, mean)
>
> PoissonSampler.newSampler(rng, mean)
>
> IMO the later is too verbose. If we state that the sampler is entirely defined by the input arguments to ‘of’ then it does satisfy "has, loosely speaking, the same value as its parameters”.
>
> So ‘of’ is OK. The only other words I like are ‘from’ or ‘with’:
>
> PoissonSampler.of(rng, mean)
> PoissonSampler.from(rng, mean)
> PoissonSampler.with(rng, mean)
>

"with" seems the most appropriate but has the disadvantage of not
being in Bloch's list.
I would then stick with "of" because "from" looks like the arguments
will be transformed
(and it also is not in the list...).

Gilles

> WDYT?
>
> [1] http://www.informit.com/articles/article.aspx?p=1216151 <http://www.informit.com/articles/article.aspx?p=1216151>
> >
> >> - Another factory method to explicitly create a SharedStateSampler using
> >> a normalised Gaussian SharedStateSampler.
> >>
> >>
> >> A few things that are 90% done:
> >>
> >> - RNG-85: MiddleSquareWeylSequence generator
> >>
> >> This is simple code and now the modifications have been made to the
> >> ProviderBuilder it is possible to pass in a good quality increment for
> >> the Weyl sequence. I have code to build the increment that can be added
> >> to the SeedFactory. I did this months ago so will have to find it and
> >> create the PR.
> >>
> >> - RNG-95: DiscreteUniformSampler
> >>
> >> I now have a reference for the alternative algorithm for choosing int
> >> values from an interval. The code is done but should go after RNG-110 as
> >> the code uses 5 internal delegates for different algorithms. This would
> >> be optimised by the changes in RNG-110.
> >>
> >> - RNG-109: DiscreteProbabilityCollectionSampler to use an internal
> >> DiscreteSampler
> >>
> >> I have to create a benchmark to compare the AliasMethodSampler against
> >> the GuideTableSampler to see which is more suitable for a generic
> >> probability distribution. This should not take long.
> >>
> >> - RNG-94: RotateRotateMultiplyXorMultiplyXor
> >>
> >> Simple code that is based on the same idea of using an output hash
> >> function on a Weyl sequence like SplitMix. It is slightly slower but the
> >> hash function is better and more robust to low complexity increments. So
> >> we can add it using a seeded increment for the Weyl sequence. This would
> >> take a day to add the two hash function variants.
> >>
> >
> > Everything ready is fine to add before the next release, but equally
> > fine to add after it (and do another release in 2 months if wanted)
> > given the host of new features already implemented. :-)
>
> I’ll put what I have together in the coming week or so.
>
> >
> > Best,
> > Gilles
> >
> >> Maybe for later:
> >>
> >> - RNG-90: Improve nextInt(int)
> >>
> >> This could use the same algorithm as RNG-95. I have not done the testing
> >> yet. It also can be done for nextLong(long) which requires a 64-bit
> >> product multiplication to be computed as a 128-bit result. I have code
> >> for this but no performance tests.
> >>
> >> Not done but...
> >>
> >> The PCG family has extended generators: K-dimensionally equidistributed
> >> or Cryptographic. These have a much larger period and the
> >> equidistributed ones can be Jumpable.
> >>
> >>
> >> [1] https://github.com/apache/commons-rng/pull/58
> >>
> >>
> >>
> >>>
> >>> Regards,
> >>> Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [rng] Releasing 1.3

Posted by Alex Herbert <al...@gmail.com>.

> On 30 Jul 2019, at 19:28, Gilles Sadowski <gi...@gmail.com> wrote:
> 
> Hi.
> 
> Le mar. 30 juil. 2019 à 15:38, Alex Herbert <alex.d.herbert@gmail.com <ma...@gmail.com>> a écrit :
>> 
>> On 30/07/2019 10:56, Gilles Sadowski wrote:
>>> Hello.
>>> 
>>> Le lun. 10 juin 2019 à 17:17, Alex Herbert <al...@gmail.com> a écrit :
>>>> 
>>>> On 10/06/2019 15:31, Gilles Sadowski wrote:
>>>>>>> P.S. Thinking of releasing 1.3?
>>>>>> Not yet. I think there are a few outstanding items [...]
>>> Anything missing?
>> 
>> - RNG-110: The PR for SharedSharedDiscrete/ContinuousSampler should have
>> a review [1]. I've left this while we finished GSoC phase 2 but it is ready.
>> 
>> I added factory methods for all samplers. For existing samplers this is
>> just for consistency. Some however use internal delegates and the
>> factory method can return the delegate directly which is an advantage.
>> 
>> One issue to look at is how I handled GaussianSampler and
>> LogNormalSampler. The samplers can only be shared state samplers if the
>> input NormalizedGaussianSampler is a shared state sampler. I handled
>> this with documentation. But this means a downstream user may be passed
>> a SharedStateContinuousSampler, use it as such and receive an exception
>> if it was created incorrectly.
> 
> Could the library create it "incorrectly”?

No. All the NormalizedGaussianSamplers in the library are suitable to pass to the GaussianSampler and LogNormalSampler.

> 
>> 
>> The alternative is two factory methods which must have different names
>> due to type erasure:
>> 
>> public static ContinuousSampler of(NormalizedGaussianSampler gaussian,
>> double scale, double shape);
>> 
>> public static
>>     <T extends NormalizedGaussianSampler &
>> SharedStateSampler<ContinuousSampler>>
>>     SharedStateContinuousSampler
>>     ofSharedState(T normalized,
>>                   double mean,
>>                   double standardDeviation) {
> 
> Not nice, at first sight.
> 
>> So the options are:
>> 
>> - As current but has the pitfall of throwing exceptions if you do create
>> a one with something that does not share state (i.e. not a sampler in
>> the library).
> 
> IIUC, it answers my question above.
> I would not consider too much that the interfaces defined in our "client API"
> module could be implemented by external codes.
> Even within "Commons", we do not use other components' API…

OK. User beware documentation. If using only our library it will not be an issue.

So I take it that you are fine with the PR?

Just rereading it now and I’m not totally sold on the factory constructors using .of(…). Here’s Joshua Bloch on the matter [1]:

—
Here are some common names for static factory methods:

	• valueOf—Returns an instance that has, loosely speaking, the same value as its parameters. Such static factories are effectively type-conversion methods.

	• of—A concise alternative to valueOf, popularized by EnumSet (Item 32).

	• getInstance—Returns an instance that is described by the parameters but cannot be said to have the same value. In the case of a singleton, getInstance takes no parameters and returns the sole instance.

	• newInstance—Like getInstance, except that newInstance guarantees that each instance returned is distinct from all others.

	• getType—Like getInstance, but used when the factory method is in a different class. Type indicates the type of object returned by the factory method.

	• newType—Like newInstance, but used when the factory method is in a different class. Type indicates the type of object returned by the factory method.
—

The ’newType’ may be appropriate as the factory method is returning an instance of an interface not defined in the class. This is required for some implementations such as the PoissonSampler which returns either a SmallMean or LargeMeanPoissonSampler. So the Bloch naming could be ’newSampler’. But then we have over verbose code using ‘Sampler' twice:

new PoissonSampler(rng, mean)

PoissonSampler.of(rng, mean)

PoissonSampler.newSampler(rng, mean)

IMO the later is too verbose. If we state that the sampler is entirely defined by the input arguments to ‘of’ then it does satisfy "has, loosely speaking, the same value as its parameters”.

So ‘of’ is OK. The only other words I like are ‘from’ or ‘with’:

PoissonSampler.of(rng, mean)
PoissonSampler.from(rng, mean)
PoissonSampler.with(rng, mean)

WDYT?

[1] http://www.informit.com/articles/article.aspx?p=1216151 <http://www.informit.com/articles/article.aspx?p=1216151>
> 
>> - Another factory method to explicitly create a SharedStateSampler using
>> a normalised Gaussian SharedStateSampler.
>> 
>> 
>> A few things that are 90% done:
>> 
>> - RNG-85: MiddleSquareWeylSequence generator
>> 
>> This is simple code and now the modifications have been made to the
>> ProviderBuilder it is possible to pass in a good quality increment for
>> the Weyl sequence. I have code to build the increment that can be added
>> to the SeedFactory. I did this months ago so will have to find it and
>> create the PR.
>> 
>> - RNG-95: DiscreteUniformSampler
>> 
>> I now have a reference for the alternative algorithm for choosing int
>> values from an interval. The code is done but should go after RNG-110 as
>> the code uses 5 internal delegates for different algorithms. This would
>> be optimised by the changes in RNG-110.
>> 
>> - RNG-109: DiscreteProbabilityCollectionSampler to use an internal
>> DiscreteSampler
>> 
>> I have to create a benchmark to compare the AliasMethodSampler against
>> the GuideTableSampler to see which is more suitable for a generic
>> probability distribution. This should not take long.
>> 
>> - RNG-94: RotateRotateMultiplyXorMultiplyXor
>> 
>> Simple code that is based on the same idea of using an output hash
>> function on a Weyl sequence like SplitMix. It is slightly slower but the
>> hash function is better and more robust to low complexity increments. So
>> we can add it using a seeded increment for the Weyl sequence. This would
>> take a day to add the two hash function variants.
>> 
> 
> Everything ready is fine to add before the next release, but equally
> fine to add after it (and do another release in 2 months if wanted)
> given the host of new features already implemented. :-)

I’ll put what I have together in the coming week or so.

> 
> Best,
> Gilles
> 
>> Maybe for later:
>> 
>> - RNG-90: Improve nextInt(int)
>> 
>> This could use the same algorithm as RNG-95. I have not done the testing
>> yet. It also can be done for nextLong(long) which requires a 64-bit
>> product multiplication to be computed as a 128-bit result. I have code
>> for this but no performance tests.
>> 
>> Not done but...
>> 
>> The PCG family has extended generators: K-dimensionally equidistributed
>> or Cryptographic. These have a much larger period and the
>> equidistributed ones can be Jumpable.
>> 
>> 
>> [1] https://github.com/apache/commons-rng/pull/58
>> 
>> 
>> 
>>> 
>>> Regards,
>>> Gilles
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org


Re: [rng] Releasing 1.3

Posted by Gilles Sadowski <gi...@gmail.com>.
Hi.

Le mar. 30 juil. 2019 à 15:38, Alex Herbert <al...@gmail.com> a écrit :
>
> On 30/07/2019 10:56, Gilles Sadowski wrote:
> > Hello.
> >
> > Le lun. 10 juin 2019 à 17:17, Alex Herbert <al...@gmail.com> a écrit :
> >>
> >> On 10/06/2019 15:31, Gilles Sadowski wrote:
> >>>>> P.S. Thinking of releasing 1.3?
> >>>> Not yet. I think there are a few outstanding items [...]
> > Anything missing?
>
> - RNG-110: The PR for SharedSharedDiscrete/ContinuousSampler should have
> a review [1]. I've left this while we finished GSoC phase 2 but it is ready.
>
> I added factory methods for all samplers. For existing samplers this is
> just for consistency. Some however use internal delegates and the
> factory method can return the delegate directly which is an advantage.
>
> One issue to look at is how I handled GaussianSampler and
> LogNormalSampler. The samplers can only be shared state samplers if the
> input NormalizedGaussianSampler is a shared state sampler. I handled
> this with documentation. But this means a downstream user may be passed
> a SharedStateContinuousSampler, use it as such and receive an exception
> if it was created incorrectly.

Could the library create it "incorrectly"?

>
> The alternative is two factory methods which must have different names
> due to type erasure:
>
> public static ContinuousSampler of(NormalizedGaussianSampler gaussian,
> double scale, double shape);
>
> public static
>      <T extends NormalizedGaussianSampler &
> SharedStateSampler<ContinuousSampler>>
>      SharedStateContinuousSampler
>      ofSharedState(T normalized,
>                    double mean,
>                    double standardDeviation) {

Not nice, at first sight.

> So the options are:
>
> - As current but has the pitfall of throwing exceptions if you do create
> a one with something that does not share state (i.e. not a sampler in
> the library).

IIUC, it answers my question above.
I would not consider too much that the interfaces defined in our "client API"
module could be implemented by external codes.
Even within "Commons", we do not use other components' API...

> - Another factory method to explicitly create a SharedStateSampler using
> a normalised Gaussian SharedStateSampler.
>
>
> A few things that are 90% done:
>
> - RNG-85: MiddleSquareWeylSequence generator
>
> This is simple code and now the modifications have been made to the
> ProviderBuilder it is possible to pass in a good quality increment for
> the Weyl sequence. I have code to build the increment that can be added
> to the SeedFactory. I did this months ago so will have to find it and
> create the PR.
>
> - RNG-95: DiscreteUniformSampler
>
> I now have a reference for the alternative algorithm for choosing int
> values from an interval. The code is done but should go after RNG-110 as
> the code uses 5 internal delegates for different algorithms. This would
> be optimised by the changes in RNG-110.
>
> - RNG-109: DiscreteProbabilityCollectionSampler to use an internal
> DiscreteSampler
>
> I have to create a benchmark to compare the AliasMethodSampler against
> the GuideTableSampler to see which is more suitable for a generic
> probability distribution. This should not take long.
>
> - RNG-94: RotateRotateMultiplyXorMultiplyXor
>
> Simple code that is based on the same idea of using an output hash
> function on a Weyl sequence like SplitMix. It is slightly slower but the
> hash function is better and more robust to low complexity increments. So
> we can add it using a seeded increment for the Weyl sequence. This would
> take a day to add the two hash function variants.
>

Everything ready is fine to add before the next release, but equally
fine to add after it (and do another release in 2 months if wanted)
given the host of new features already implemented. :-)

Best,
Gilles

> Maybe for later:
>
> - RNG-90: Improve nextInt(int)
>
> This could use the same algorithm as RNG-95. I have not done the testing
> yet. It also can be done for nextLong(long) which requires a 64-bit
> product multiplication to be computed as a 128-bit result. I have code
> for this but no performance tests.
>
> Not done but...
>
> The PCG family has extended generators: K-dimensionally equidistributed
> or Cryptographic. These have a much larger period and the
> equidistributed ones can be Jumpable.
>
>
> [1] https://github.com/apache/commons-rng/pull/58
>
>
>
> >
> > Regards,
> > Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [rng] Releasing 1.3

Posted by Alex Herbert <al...@gmail.com>.
On 30/07/2019 10:56, Gilles Sadowski wrote:
> Hello.
>
> Le lun. 10 juin 2019 à 17:17, Alex Herbert <al...@gmail.com> a écrit :
>>
>> On 10/06/2019 15:31, Gilles Sadowski wrote:
>>>>> P.S. Thinking of releasing 1.3?
>>>> Not yet. I think there are a few outstanding items [...]
> Anything missing?

- RNG-110: The PR for SharedSharedDiscrete/ContinuousSampler should have 
a review [1]. I've left this while we finished GSoC phase 2 but it is ready.

I added factory methods for all samplers. For existing samplers this is 
just for consistency. Some however use internal delegates and the 
factory method can return the delegate directly which is an advantage.

One issue to look at is how I handled GaussianSampler and 
LogNormalSampler. The samplers can only be shared state samplers if the 
input NormalizedGaussianSampler is a shared state sampler. I handled 
this with documentation. But this means a downstream user may be passed 
a SharedStateContinuousSampler, use it as such and receive an exception 
if it was created incorrectly.

The alternative is two factory methods which must have different names 
due to type erasure:

public static ContinuousSampler of(NormalizedGaussianSampler gaussian, 
double scale, double shape);

public static
     <T extends NormalizedGaussianSampler & 
SharedStateSampler<ContinuousSampler>>
     SharedStateContinuousSampler
     ofSharedState(T normalized,
                   double mean,
                   double standardDeviation) {

So the options are:

- As current but has the pitfall of throwing exceptions if you do create 
a one with something that does not share state (i.e. not a sampler in 
the library).
- Another factory method to explicitly create a SharedStateSampler using 
a normalised Gaussian SharedStateSampler.


A few things that are 90% done:

- RNG-85: MiddleSquareWeylSequence generator

This is simple code and now the modifications have been made to the 
ProviderBuilder it is possible to pass in a good quality increment for 
the Weyl sequence. I have code to build the increment that can be added 
to the SeedFactory. I did this months ago so will have to find it and 
create the PR.

- RNG-95: DiscreteUniformSampler

I now have a reference for the alternative algorithm for choosing int 
values from an interval. The code is done but should go after RNG-110 as 
the code uses 5 internal delegates for different algorithms. This would 
be optimised by the changes in RNG-110.

- RNG-109: DiscreteProbabilityCollectionSampler to use an internal 
DiscreteSampler

I have to create a benchmark to compare the AliasMethodSampler against 
the GuideTableSampler to see which is more suitable for a generic 
probability distribution. This should not take long.

- RNG-94: RotateRotateMultiplyXorMultiplyXor

Simple code that is based on the same idea of using an output hash 
function on a Weyl sequence like SplitMix. It is slightly slower but the 
hash function is better and more robust to low complexity increments. So 
we can add it using a seeded increment for the Weyl sequence. This would 
take a day to add the two hash function variants.


Maybe for later:

- RNG-90: Improve nextInt(int)

This could use the same algorithm as RNG-95. I have not done the testing 
yet. It also can be done for nextLong(long) which requires a 64-bit 
product multiplication to be computed as a 128-bit result. I have code 
for this but no performance tests.

Not done but...

The PCG family has extended generators: K-dimensionally equidistributed 
or Cryptographic. These have a much larger period and the 
equidistributed ones can be Jumpable.


[1] https://github.com/apache/commons-rng/pull/58



>
> Regards,
> Gilles
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org