You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@commons.apache.org by Gilles Sadowski <gi...@harfang.homelinux.org> on 2012/08/04 19:57:12 UTC

[MATH] Test failure in Continuum

Hello.

Referring to this failed test (cf. messages from Continuum):
---CUT---
org.apache.commons.math3.exception.NumberIsTooLargeException: lower bound (65) must be strictly less than upper bound (65)
	at org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:73)
	at org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:53)
	at org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.generatePartition(AggregateSummaryStatisticsTest.java:275)
	at org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.testAggregationConsistency(AggregateSummaryStatisticsTest.java:89)


It is due to a precondition check while creating the
"UniformIntegerDistribution" instance:
---CUT---
if (lower >= upper) {
    throw new NumberIsTooLargeException(
                       LocalizedFormats.LOWER_BOUND_NOT_BELOW_UPPER_BOUND,
                       lower, upper, false);
}
---CUT---

The test referred to above was using this code (before I changed it use a
"UniformIntegerDistribution" instance):
---CUT---
final int next = (i == 4 || cur == length - 1) ? length - 1 : randomData.nextInt(cur, length - 1);
---CUT---

It is now (after the change):
---CUT---
final IntegerDistribution partitionPoint = new UniformIntegerDistribution(cur, length - 1);
final int next = (i == 4 || cur == length - 1) ? length - 1 : partitionPoint.sample();
---CUT---

Thus, AFAIK, the failure did not appear before because there was no
precondition enforcement in "nextInt".

The question is: Was the code in the test correct (in allowing the same
value for both bounds?
 * In the negative, how to change it?
 * The affirmative would mean that the precondition check in
   "UniformIntegerDistribution" should be relaxed to allow equal bounds.
   Does this make sense?
   If so, can we change it now, or is it forbidden in order to stay
   backwards compatible?


Regards,
Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [MATH] Test failure in Continuum

Posted by Dennis Hendriks <D....@tue.nl>.
 > What could be done is
 > 1. create a static method in the distribution class
 > 2. have the "sample()" method call that one

+1. I like that the sample implementation for a distribution is in the 
actual distribution class.

Denis


On 08/06/2012 02:30 AM, Gilles Sadowski wrote:
>>>> [...]
>>>> The original code above illustrates the convenience of being able to
>>>> just make direct calls to randomData.nextXxx, which is why this
>>>> class exists ;)
>>> As I wrote in another post, I'm not against the convenience methods. But
>>> IMO, they should be located in a new "DistributionUtils" class.
>>
>> Why?  RandomData is pretty descriptive and exactly what these
>> methods do.  They generate random data.
>
> Fine...
> But can we plan to merge "RandomData" and "RandomDataImpl"?
>
>>> And we should also find a way to remove the code duplication (in the
>>> distribution's "sample()" method and in the corresponding "next..." method).
>>
>> +1 - the implementations can be moved.   When I last looked
>> carefully at this, the distribution methods were delegating to impls
>> in RandomDataImpl.  What we have agreed is to move the impls into
>> the distribution classes for the basic sampling methods.  That
>> should not be too hard to do.  I will do that if no one beats me to it.
>
> I did it in r1363604.
> What still needs to be done is redirect the "next..." to the "sample()"
> method of the appropriate distribution.
> But I had already raised the issue of efficiency: each call to e.g.
>    nextInt(p, q)
> will entail the instantiation of a UniformRealDistribution object.
>
> What could be done is
> 1. create a static method in the distribution class
> 2. have the "sample()" method call that one
>
> ---
> public class UniformRealDistribution extends ... {
>    // ...
>
>    public static int nextInt(RandomGenerator rng,
>                              int a,
>                              int b) {
>        final double u = rng.nextDouble();
>        return u * b + (1 - u) * a;
>    }
>
>    public int sample() {
>      // Here "random", "lower" and "upper" are instance variables.
>      return nextInt(random, lower, upper);
>    }
> }
> ---
>
> And "nextInt" from "RandomDataImpl" would also be redirected to the static
> method in the distribution class:
>
> ---
> import org.apache.commons.math3.distribution.UniformRealDistribution;
>
> public class RandomDataImpl ... {
>    // ...
>
>    public int nextInt(int lower, int upper) {
>      return UniformRealDistribution.nextInt(getRan(), lower, upper);
>    }
> }
> ---
>
> OK?
>
>
> Gilles
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [MATH] Test failure in Continuum

Posted by Phil Steitz <ph...@gmail.com>.
On 8/5/12 5:30 PM, Gilles Sadowski wrote:
>>>> [...]
>>>> The original code above illustrates the convenience of being able to
>>>> just make direct calls to randomData.nextXxx, which is why this
>>>> class exists ;)
>>> As I wrote in another post, I'm not against the convenience methods. But
>>> IMO, they should be located in a new "DistributionUtils" class.
>> Why?  RandomData is pretty descriptive and exactly what these
>> methods do.  They generate random data.
> Fine...
> But can we plan to merge "RandomData" and "RandomDataImpl"?

Definitely want to do this.  Unfortunately, it is incompatible
change.  I guess we could create a new class named something else,
deprecate both of the above and have RandomDataImpl delegate to the
new class.  How about RandomDataGenerator in the random package as
the new class?
>
>>> And we should also find a way to remove the code duplication (in the
>>> distribution's "sample()" method and in the corresponding "next..." method).
>> +1 - the implementations can be moved.   When I last looked
>> carefully at this, the distribution methods were delegating to impls
>> in RandomDataImpl.  What we have agreed is to move the impls into
>> the distribution classes for the basic sampling methods.  That
>> should not be too hard to do.  I will do that if no one beats me to it.
> I did it in r1363604.
> What still needs to be done is redirect the "next..." to the "sample()"
> method of the appropriate distribution.
> But I had already raised the issue of efficiency: each call to e.g.
>   nextInt(p, q)
> will entail the instantiation of a UniformRealDistribution object.
>
> What could be done is 
> 1. create a static method in the distribution class
> 2. have the "sample()" method call that one
>
> ---
> public class UniformRealDistribution extends ... {
>   // ...
>
>   public static int nextInt(RandomGenerator rng,
>                             int a,
>                             int b) {
>       final double u = rng.nextDouble();
>       return u * b + (1 - u) * a;
>   }
>
>   public int sample() {
>     // Here "random", "lower" and "upper" are instance variables.
>     return nextInt(random, lower, upper);
>   }
> }
> ---
>
> And "nextInt" from "RandomDataImpl" would also be redirected to the static
> method in the distribution class:
>
> ---
> import org.apache.commons.math3.distribution.UniformRealDistribution;
>
> public class RandomDataImpl ... {
>   // ...
>
>   public int nextInt(int lower, int upper) {
>     return UniformRealDistribution.nextInt(getRan(), lower, upper);
>   }
> }
> ---
>
> OK?
>
>
> Gilles
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [MATH] Test failure in Continuum

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
> >> [...]
> >> The original code above illustrates the convenience of being able to
> >> just make direct calls to randomData.nextXxx, which is why this
> >> class exists ;)
> > As I wrote in another post, I'm not against the convenience methods. But
> > IMO, they should be located in a new "DistributionUtils" class.
> 
> Why?  RandomData is pretty descriptive and exactly what these
> methods do.  They generate random data.

Fine...
But can we plan to merge "RandomData" and "RandomDataImpl"?

> > And we should also find a way to remove the code duplication (in the
> > distribution's "sample()" method and in the corresponding "next..." method).
> 
> +1 - the implementations can be moved.   When I last looked
> carefully at this, the distribution methods were delegating to impls
> in RandomDataImpl.  What we have agreed is to move the impls into
> the distribution classes for the basic sampling methods.  That
> should not be too hard to do.  I will do that if no one beats me to it.

I did it in r1363604.
What still needs to be done is redirect the "next..." to the "sample()"
method of the appropriate distribution.
But I had already raised the issue of efficiency: each call to e.g.
  nextInt(p, q)
will entail the instantiation of a UniformRealDistribution object.

What could be done is 
1. create a static method in the distribution class
2. have the "sample()" method call that one

---
public class UniformRealDistribution extends ... {
  // ...

  public static int nextInt(RandomGenerator rng,
                            int a,
                            int b) {
      final double u = rng.nextDouble();
      return u * b + (1 - u) * a;
  }

  public int sample() {
    // Here "random", "lower" and "upper" are instance variables.
    return nextInt(random, lower, upper);
  }
}
---

And "nextInt" from "RandomDataImpl" would also be redirected to the static
method in the distribution class:

---
import org.apache.commons.math3.distribution.UniformRealDistribution;

public class RandomDataImpl ... {
  // ...

  public int nextInt(int lower, int upper) {
    return UniformRealDistribution.nextInt(getRan(), lower, upper);
  }
}
---

OK?


Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [MATH] Test failure in Continuum

Posted by Phil Steitz <ph...@gmail.com>.
On 8/5/12 3:49 PM, Gilles Sadowski wrote:
> On Sun, Aug 05, 2012 at 12:54:11PM -0700, Phil Steitz wrote:
>> On 8/4/12 10:57 AM, Gilles Sadowski wrote:
>>> Hello.
>>>
>>> Referring to this failed test (cf. messages from Continuum):
>>> ---CUT---
>>> org.apache.commons.math3.exception.NumberIsTooLargeException: lower bound (65) must be strictly less than upper bound (65)
>>> 	at org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:73)
>>> 	at org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:53)
>>> 	at org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.generatePartition(AggregateSummaryStatisticsTest.java:275)
>>> 	at org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.testAggregationConsistency(AggregateSummaryStatisticsTest.java:89)
>>>
>>>
>>> It is due to a precondition check while creating the
>>> "UniformIntegerDistribution" instance:
>>> ---CUT---
>>> if (lower >= upper) {
>>>     throw new NumberIsTooLargeException(
>>>                        LocalizedFormats.LOWER_BOUND_NOT_BELOW_UPPER_BOUND,
>>>                        lower, upper, false);
>>> }
>>> ---CUT---
>>>
>>> The test referred to above was using this code (before I changed it use a
>>> "UniformIntegerDistribution" instance):
>>> ---CUT---
>>> final int next = (i == 4 || cur == length - 1) ? length - 1 : randomData.nextInt(cur, length - 1);
>>> ---CUT---
>>>
>>> It is now (after the change):
>>> ---CUT---
>>> final IntegerDistribution partitionPoint = new UniformIntegerDistribution(cur, length - 1);
>>> final int next = (i == 4 || cur == length - 1) ? length - 1 : partitionPoint.sample();
>>> ---CUT---
>>>
>>> Thus, AFAIK, the failure did not appear before because there was no
>>> precondition enforcement in "nextInt".
>>>
>>> The question is: Was the code in the test correct (in allowing the same
>>> value for both bounds?
>>>  * In the negative, how to change it?
>>>  * The affirmative would mean that the precondition check in
>>>    "UniformIntegerDistribution" should be relaxed to allow equal bounds.
>>>    Does this make sense?
>>>    If so, can we change it now, or is it forbidden in order to stay
>>>    backwards compatible?
>> Your analysis above is correct.  The failure after the change is due
>> to the fact that post-change the distribution is instantiated before
>> the bounds check.  I changed the test to fix this.
> Thanks.
>
>>  Both the
>> randomData nextInt and the UniformIntegerDistribution constructor
>> now forbid the degenerate case where there is only one point in the
>> domain.  In retrospect, I guess it would have probably been better
>> to allow this degenerate case.  Unfortunately, this would be an
>> incompatible change, so will have to wait until 4.0 if we want to do it.
>>
>> The original code above illustrates the convenience of being able to
>> just make direct calls to randomData.nextXxx, which is why this
>> class exists ;)
> As I wrote in another post, I'm not against the convenience methods. But
> IMO, they should be located in a new "DistributionUtils" class.

Why?  RandomData is pretty descriptive and exactly what these
methods do.  They generate random data.
> And we should also find a way to remove the code duplication (in the
> distribution's "sample()" method and in the corresponding "next..." method).

+1 - the implementations can be moved.   When I last looked
carefully at this, the distribution methods were delegating to impls
in RandomDataImpl.  What we have agreed is to move the impls into
the distribution classes for the basic sampling methods.  That
should not be too hard to do.  I will do that if no one beats me to it.

Phil

I a
>
>
> Gilles
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [MATH] Test failure in Continuum

Posted by Phil Steitz <ph...@gmail.com>.
On 8/6/12 2:28 PM, Gilles Sadowski wrote:
> On Mon, Aug 06, 2012 at 12:44:24PM -0700, Phil Steitz wrote:
>> On 8/6/12 11:41 AM, Gilles Sadowski wrote:
>>>>> [...]
>>>>>>> The RandomData class (or whatever it would be called) does indeed seem useful. If we plan to keep it, we should probably make sure that there is a sample/next/... method in that class for EVERY distribution, as some of them are missing, if I remember correctly. Perhaps this is a separate issue though?
>>>>>>>
>>>>>> All have the method now, but the impls delegate to RandomDataImpl.
>>>>> They do not.
>>>> The method is in the interface.  There is a default, inversion-based implementation in the abstract base class.  Not all distributions override the default impl.
>>>> The ones that do delegate to the specialized methods in RandomDataImpl
>>>                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>
>>> No; as indicated before (cf. message referred to below), the distribution
>>> classes do not delegate anymore to "RandomDataImpl", since revision 1363604
>>> (resolution of MATH-764 and MATH-823).
>> Sorry, my mistake.  I see the code has been duplicated.  I will see
>> if I can modify the methods in RandomDataImpl to use the impls in
>> the distribution classes.  Should be possible by creating instances
>> using the configured RandomGenerator.
> Yes, that was what I also proposed when we started talking about
> transferring the (sampling) code from RAndomDataImpl over to the "sample()"
> methods.
>
>>  See more below.
>>>>>> In some cases, there is nothing better implemented than just inversion, provided by the default inversion sampler.  That is OK.  What we need to do is just move the implementations of the default and specialized samplers to the actual distribution classes.  These can't be static, as they use the RamdomData instance.  I will take care of this.
>>>>> Thanks for reading fully this message
>>>>>  http://markmail.org/message/5fpmwyiiw2xq4o3q
>>>> I don't get the desire to make sample() static.
>>> There is no such desire stated in that message.
>>> The proposed static method is a "helper" that will take everything needed as
>>> arguments. It will be defined in the appropriate distribution class, to be
>>> called both by "sample()" from that class and by the convenience methods in
>>> "RandomDataImpl".
>> OK, I now think I get what you mean.  Non-static methods of
>> "RandomDataImpl" could use static methods from the distribution
>> classes, passing the configured RandomGenerator as an argument.   Is
>> that what you mean?
> Yes! :-)
>
>>  I don't know how much direct usage the static
>> methods in the distribution classes would get; since most actual
>> examples involve generating sequences using the same generator
>> repeatedly (so sample() is more natural).
> I totally agree. They would exist only for the sake of not duplicating the
> sampling code.
>
>>  I guess "RandomDataImpl"
>> would then not have to instantiate distributions, but there is not
>> much overhead to creating the ones we have, so not sure it is worth
>> the effort to add all of these methods.
> Fine then. I'd also prefer not to create new public methods just for the
> sake of a convenience class.
> I.e. we could mention (in "RandomDataImpl") that the convenience comes at a
> slight efficiency cost. [And that for heavy usage of sampling, one should
> use the distribution classes directly.]

OK, sorry to be a little thick on this.  I think I can do this.

Thanks!

Phil
>
>
> Gilles
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [MATH] Test failure in Continuum

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
On Mon, Aug 06, 2012 at 12:44:24PM -0700, Phil Steitz wrote:
> On 8/6/12 11:41 AM, Gilles Sadowski wrote:
> >>> [...]
> >>>>> The RandomData class (or whatever it would be called) does indeed seem useful. If we plan to keep it, we should probably make sure that there is a sample/next/... method in that class for EVERY distribution, as some of them are missing, if I remember correctly. Perhaps this is a separate issue though?
> >>>>>
> >>>> All have the method now, but the impls delegate to RandomDataImpl.
> >>> They do not.
> >> The method is in the interface.  There is a default, inversion-based implementation in the abstract base class.  Not all distributions override the default impl.
> >> The ones that do delegate to the specialized methods in RandomDataImpl
> >                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >
> > No; as indicated before (cf. message referred to below), the distribution
> > classes do not delegate anymore to "RandomDataImpl", since revision 1363604
> > (resolution of MATH-764 and MATH-823).
> 
> Sorry, my mistake.  I see the code has been duplicated.  I will see
> if I can modify the methods in RandomDataImpl to use the impls in
> the distribution classes.  Should be possible by creating instances
> using the configured RandomGenerator.

Yes, that was what I also proposed when we started talking about
transferring the (sampling) code from RAndomDataImpl over to the "sample()"
methods.

>  See more below.
> >
> >>>> In some cases, there is nothing better implemented than just inversion, provided by the default inversion sampler.  That is OK.  What we need to do is just move the implementations of the default and specialized samplers to the actual distribution classes.  These can't be static, as they use the RamdomData instance.  I will take care of this.
> >>> Thanks for reading fully this message
> >>>  http://markmail.org/message/5fpmwyiiw2xq4o3q
> >> I don't get the desire to make sample() static.
> > There is no such desire stated in that message.
> > The proposed static method is a "helper" that will take everything needed as
> > arguments. It will be defined in the appropriate distribution class, to be
> > called both by "sample()" from that class and by the convenience methods in
> > "RandomDataImpl".
> 
> OK, I now think I get what you mean.  Non-static methods of
> "RandomDataImpl" could use static methods from the distribution
> classes, passing the configured RandomGenerator as an argument.   Is
> that what you mean?

Yes! :-)

>  I don't know how much direct usage the static
> methods in the distribution classes would get; since most actual
> examples involve generating sequences using the same generator
> repeatedly (so sample() is more natural).

I totally agree. They would exist only for the sake of not duplicating the
sampling code.

>  I guess "RandomDataImpl"
> would then not have to instantiate distributions, but there is not
> much overhead to creating the ones we have, so not sure it is worth
> the effort to add all of these methods.

Fine then. I'd also prefer not to create new public methods just for the
sake of a convenience class.
I.e. we could mention (in "RandomDataImpl") that the convenience comes at a
slight efficiency cost. [And that for heavy usage of sampling, one should
use the distribution classes directly.]


Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [MATH] Test failure in Continuum

Posted by Phil Steitz <ph...@gmail.com>.
On 8/6/12 11:41 AM, Gilles Sadowski wrote:
>>> [...]
>>>>> The RandomData class (or whatever it would be called) does indeed seem useful. If we plan to keep it, we should probably make sure that there is a sample/next/... method in that class for EVERY distribution, as some of them are missing, if I remember correctly. Perhaps this is a separate issue though?
>>>>>
>>>> All have the method now, but the impls delegate to RandomDataImpl.
>>> They do not.
>> The method is in the interface.  There is a default, inversion-based implementation in the abstract base class.  Not all distributions override the default impl.
>> The ones that do delegate to the specialized methods in RandomDataImpl
>                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> No; as indicated before (cf. message referred to below), the distribution
> classes do not delegate anymore to "RandomDataImpl", since revision 1363604
> (resolution of MATH-764 and MATH-823).

Sorry, my mistake.  I see the code has been duplicated.  I will see
if I can modify the methods in RandomDataImpl to use the impls in
the distribution classes.  Should be possible by creating instances
using the configured RandomGenerator.  See more below.
>
>>>> In some cases, there is nothing better implemented than just inversion, provided by the default inversion sampler.  That is OK.  What we need to do is just move the implementations of the default and specialized samplers to the actual distribution classes.  These can't be static, as they use the RamdomData instance.  I will take care of this.
>>> Thanks for reading fully this message
>>>  http://markmail.org/message/5fpmwyiiw2xq4o3q
>> I don't get the desire to make sample() static.
> There is no such desire stated in that message.
> The proposed static method is a "helper" that will take everything needed as
> arguments. It will be defined in the appropriate distribution class, to be
> called both by "sample()" from that class and by the convenience methods in
> "RandomDataImpl".

OK, I now think I get what you mean.  Non-static methods of
"RandomDataImpl" could use static methods from the distribution
classes, passing the configured RandomGenerator as an argument.   Is
that what you mean?  I don't know how much direct usage the static
methods in the distribution classes would get; since most actual
examples involve generating sequences using the same generator
repeatedly (so sample() is more natural).  I guess "RandomDataImpl"
would then not have to instantiate distributions, but there is not
much overhead to creating the ones we have, so not sure it is worth
the effort to add all of these methods.

Phil
>
>> [...]
>
> Gilles
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [MATH] Test failure in Continuum

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
> > [...]
> >>> 
> >>> The RandomData class (or whatever it would be called) does indeed seem useful. If we plan to keep it, we should probably make sure that there is a sample/next/... method in that class for EVERY distribution, as some of them are missing, if I remember correctly. Perhaps this is a separate issue though?
> >>> 
> >> 
> >> All have the method now, but the impls delegate to RandomDataImpl.
> > 
> > They do not.
> 
> The method is in the interface.  There is a default, inversion-based implementation in the abstract base class.  Not all distributions override the default impl.
> The ones that do delegate to the specialized methods in RandomDataImpl.
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

No; as indicated before (cf. message referred to below), the distribution
classes do not delegate anymore to "RandomDataImpl", since revision 1363604
(resolution of MATH-764 and MATH-823).

> 
> > 
> >> In some cases, there is nothing better implemented than just inversion, provided by the default inversion sampler.  That is OK.  What we need to do is just move the implementations of the default and specialized samplers to the actual distribution classes.  These can't be static, as they use the RamdomData instance.  I will take care of this.
> > 
> > Thanks for reading fully this message
> >  http://markmail.org/message/5fpmwyiiw2xq4o3q
> 
> I don't get the desire to make sample() static.

There is no such desire stated in that message.
The proposed static method is a "helper" that will take everything needed as
arguments. It will be defined in the appropriate distribution class, to be
called both by "sample()" from that class and by the convenience methods in
"RandomDataImpl".

> [...]


Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [MATH] Test failure in Continuum

Posted by Phil Steitz <ph...@gmail.com>.



On Aug 6, 2012, at 6:08 AM, Gilles Sadowski <gi...@harfang.homelinux.org> wrote:

> On Mon, Aug 06, 2012 at 05:48:14AM -0700, Phil Steitz wrote:
>> 
>> 
>> 
>> 
>> On Aug 5, 2012, at 11:21 PM, Dennis Hendriks <D....@tue.nl> wrote:
>> 
>>> See below.
>>> 
>>> On 08/06/2012 12:49 AM, Gilles Sadowski wrote:
>>>> On Sun, Aug 05, 2012 at 12:54:11PM -0700, Phil Steitz wrote:
>>>>> On 8/4/12 10:57 AM, Gilles Sadowski wrote:
>>>>>> Hello.
>>>>>> 
>>>>>> Referring to this failed test (cf. messages from Continuum):
>>>>>> ---CUT---
>>>>>> org.apache.commons.math3.exception.NumberIsTooLargeException: lower bound (65) must be strictly less than upper bound (65)
>>>>>>   at org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:73)
>>>>>>   at org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:53)
>>>>>>   at org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.generatePartition(AggregateSummaryStatisticsTest.java:275)
>>>>>>   at org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.testAggregationConsistency(AggregateSummaryStatisticsTest.java:89)
>>>>>> 
>>>>>> 
>>>>>> It is due to a precondition check while creating the
>>>>>> "UniformIntegerDistribution" instance:
>>>>>> ---CUT---
>>>>>> if (lower>= upper) {
>>>>>>    throw new NumberIsTooLargeException(
>>>>>>                       LocalizedFormats.LOWER_BOUND_NOT_BELOW_UPPER_BOUND,
>>>>>>                       lower, upper, false);
>>>>>> }
>>>>>> ---CUT---
>>>>>> 
>>>>>> The test referred to above was using this code (before I changed it use a
>>>>>> "UniformIntegerDistribution" instance):
>>>>>> ---CUT---
>>>>>> final int next = (i == 4 || cur == length - 1) ? length - 1 : randomData.nextInt(cur, length - 1);
>>>>>> ---CUT---
>>>>>> 
>>>>>> It is now (after the change):
>>>>>> ---CUT---
>>>>>> final IntegerDistribution partitionPoint = new UniformIntegerDistribution(cur, length - 1);
>>>>>> final int next = (i == 4 || cur == length - 1) ? length - 1 : partitionPoint.sample();
>>>>>> ---CUT---
>>>>>> 
>>>>>> Thus, AFAIK, the failure did not appear before because there was no
>>>>>> precondition enforcement in "nextInt".
>>>>>> 
>>>>>> The question is: Was the code in the test correct (in allowing the same
>>>>>> value for both bounds?
>>>>>> * In the negative, how to change it?
>>>>>> * The affirmative would mean that the precondition check in
>>>>>>   "UniformIntegerDistribution" should be relaxed to allow equal bounds.
>>>>>>   Does this make sense?
>>>>>>   If so, can we change it now, or is it forbidden in order to stay
>>>>>>   backwards compatible?
>>>>> 
>>>>> Your analysis above is correct.  The failure after the change is due
>>>>> to the fact that post-change the distribution is instantiated before
>>>>> the bounds check.  I changed the test to fix this.
>>>> 
>>>> Thanks.
>>>> 
>>>>> Both the
>>>>> randomData nextInt and the UniformIntegerDistribution constructor
>>>>> now forbid the degenerate case where there is only one point in the
>>>>> domain.  In retrospect, I guess it would have probably been better
>>>>> to allow this degenerate case.  Unfortunately, this would be an
>>>>> incompatible change, so will have to wait until 4.0 if we want to do it.
>>>>> 
>>>>> The original code above illustrates the convenience of being able to
>>>>> just make direct calls to randomData.nextXxx, which is why this
>>>>> class exists ;)
>>>> 
>>>> As I wrote in another post, I'm not against the convenience methods. But
>>>> IMO, they should be located in a new "DistributionUtils" class.
>>>> And we should also find a way to remove the code duplication (in the
>>>> distribution's "sample()" method and in the corresponding "next..." method).
>>>> 
>>> 
>>> The RandomData class (or whatever it would be called) does indeed seem useful. If we plan to keep it, we should probably make sure that there is a sample/next/... method in that class for EVERY distribution, as some of them are missing, if I remember correctly. Perhaps this is a separate issue though?
>>> 
>> 
>> All have the method now, but the impls delegate to RandomDataImpl.
> 
> They do not.

The method is in the interface.  There is a default, inversion-based implementation in the abstract base class.  Not all distributions override the default impl.  The ones that do delegate to the specialized methods in RandomDataImpl.

> 
>> In some cases, there is nothing better implemented than just inversion, provided by the default inversion sampler.  That is OK.  What we need to do is just move the implementations of the default and specialized samplers to the actual distribution classes.  These can't be static, as they use the RamdomData instance.  I will take care of this.
> 
> Thanks for reading fully this message
>  http://markmail.org/message/5fpmwyiiw2xq4o3q

I don't get the desire to make sample() static.  It uses the parameters of the distribution as well as the random generator.  Generally, repeated sampling calls will want to use the same generator, so it is more convenient (both here and in RandomData) to use a RandomGenerator provided at construction time.



> Gill
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [MATH] Test failure in Continuum

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
On Mon, Aug 06, 2012 at 05:48:14AM -0700, Phil Steitz wrote:
> 
> 
> 
> 
> On Aug 5, 2012, at 11:21 PM, Dennis Hendriks <D....@tue.nl> wrote:
> 
> > See below.
> > 
> > On 08/06/2012 12:49 AM, Gilles Sadowski wrote:
> >> On Sun, Aug 05, 2012 at 12:54:11PM -0700, Phil Steitz wrote:
> >>> On 8/4/12 10:57 AM, Gilles Sadowski wrote:
> >>>> Hello.
> >>>> 
> >>>> Referring to this failed test (cf. messages from Continuum):
> >>>> ---CUT---
> >>>> org.apache.commons.math3.exception.NumberIsTooLargeException: lower bound (65) must be strictly less than upper bound (65)
> >>>>    at org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:73)
> >>>>    at org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:53)
> >>>>    at org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.generatePartition(AggregateSummaryStatisticsTest.java:275)
> >>>>    at org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.testAggregationConsistency(AggregateSummaryStatisticsTest.java:89)
> >>>> 
> >>>> 
> >>>> It is due to a precondition check while creating the
> >>>> "UniformIntegerDistribution" instance:
> >>>> ---CUT---
> >>>> if (lower>= upper) {
> >>>>     throw new NumberIsTooLargeException(
> >>>>                        LocalizedFormats.LOWER_BOUND_NOT_BELOW_UPPER_BOUND,
> >>>>                        lower, upper, false);
> >>>> }
> >>>> ---CUT---
> >>>> 
> >>>> The test referred to above was using this code (before I changed it use a
> >>>> "UniformIntegerDistribution" instance):
> >>>> ---CUT---
> >>>> final int next = (i == 4 || cur == length - 1) ? length - 1 : randomData.nextInt(cur, length - 1);
> >>>> ---CUT---
> >>>> 
> >>>> It is now (after the change):
> >>>> ---CUT---
> >>>> final IntegerDistribution partitionPoint = new UniformIntegerDistribution(cur, length - 1);
> >>>> final int next = (i == 4 || cur == length - 1) ? length - 1 : partitionPoint.sample();
> >>>> ---CUT---
> >>>> 
> >>>> Thus, AFAIK, the failure did not appear before because there was no
> >>>> precondition enforcement in "nextInt".
> >>>> 
> >>>> The question is: Was the code in the test correct (in allowing the same
> >>>> value for both bounds?
> >>>>  * In the negative, how to change it?
> >>>>  * The affirmative would mean that the precondition check in
> >>>>    "UniformIntegerDistribution" should be relaxed to allow equal bounds.
> >>>>    Does this make sense?
> >>>>    If so, can we change it now, or is it forbidden in order to stay
> >>>>    backwards compatible?
> >>> 
> >>> Your analysis above is correct.  The failure after the change is due
> >>> to the fact that post-change the distribution is instantiated before
> >>> the bounds check.  I changed the test to fix this.
> >> 
> >> Thanks.
> >> 
> >>>  Both the
> >>> randomData nextInt and the UniformIntegerDistribution constructor
> >>> now forbid the degenerate case where there is only one point in the
> >>> domain.  In retrospect, I guess it would have probably been better
> >>> to allow this degenerate case.  Unfortunately, this would be an
> >>> incompatible change, so will have to wait until 4.0 if we want to do it.
> >>> 
> >>> The original code above illustrates the convenience of being able to
> >>> just make direct calls to randomData.nextXxx, which is why this
> >>> class exists ;)
> >> 
> >> As I wrote in another post, I'm not against the convenience methods. But
> >> IMO, they should be located in a new "DistributionUtils" class.
> >> And we should also find a way to remove the code duplication (in the
> >> distribution's "sample()" method and in the corresponding "next..." method).
> >> 
> > 
> > The RandomData class (or whatever it would be called) does indeed seem useful. If we plan to keep it, we should probably make sure that there is a sample/next/... method in that class for EVERY distribution, as some of them are missing, if I remember correctly. Perhaps this is a separate issue though?
> > 
> 
> All have the method now, but the impls delegate to RandomDataImpl.

They do not.

>  In some cases, there is nothing better implemented than just inversion, provided by the default inversion sampler.  That is OK.  What we need to do is just move the implementations of the default and specialized samplers to the actual distribution classes.  These can't be static, as they use the RamdomData instance.  I will take care of this.

Thanks for reading fully this message
  http://markmail.org/message/5fpmwyiiw2xq4o3q


Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [MATH] Test failure in Continuum

Posted by Phil Steitz <ph...@gmail.com>.
On 8/7/12 9:36 AM, Phil Steitz wrote:
> On 8/6/12 11:16 PM, Dennis Hendriks wrote:
>> See below.
>>
>> On 08/06/2012 05:29 PM, Phil Steitz wrote:
>>>
>>>
>>>
>>> On Aug 6, 2012, at 6:06 AM, Dennis Hendriks<D....@tue.nl> 
>>> wrote:
>>>
>>>> See below.
>>>>
>>>> Dennis
>>>>
>>>>
>>>> On 08/06/2012 02:48 PM, Phil Steitz wrote:
>>>>>
>>>>>
>>>>>
>>>>> On Aug 5, 2012, at 11:21 PM, Dennis
>>>>> Hendriks<D....@tue.nl>   wrote:
>>>>>
>>>>>> See below.
>>>>>>
>>>>>> On 08/06/2012 12:49 AM, Gilles Sadowski wrote:
>>>>>>> On Sun, Aug 05, 2012 at 12:54:11PM -0700, Phil Steitz wrote:
>>>>>>>> On 8/4/12 10:57 AM, Gilles Sadowski wrote:
>>>>>>>>> Hello.
>>>>>>>>>
>>>>>>>>> Referring to this failed test (cf. messages from Continuum):
>>>>>>>>> ---CUT---
>>>>>>>>> org.apache.commons.math3.exception.NumberIsTooLargeException:
>>>>>>>>> lower bound (65) must be strictly less than upper bound (65)
>>>>>>>>>     at
>>>>>>>>> org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:73)
>>>>>>>>>     at
>>>>>>>>> org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:53)
>>>>>>>>>     at
>>>>>>>>> org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.generatePartition(AggregateSummaryStatisticsTest.java:275)
>>>>>>>>>     at
>>>>>>>>> org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.testAggregationConsistency(AggregateSummaryStatisticsTest.java:89)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> It is due to a precondition check while creating the
>>>>>>>>> "UniformIntegerDistribution" instance:
>>>>>>>>> ---CUT---
>>>>>>>>> if (lower>= upper) {
>>>>>>>>>      throw new NumberIsTooLargeException(
>>>>>>>>>                        
>>>>>>>>> LocalizedFormats.LOWER_BOUND_NOT_BELOW_UPPER_BOUND,
>>>>>>>>>                         lower, upper, false);
>>>>>>>>> }
>>>>>>>>> ---CUT---
>>>>>>>>>
>>>>>>>>> The test referred to above was using this code (before I
>>>>>>>>> changed it use a
>>>>>>>>> "UniformIntegerDistribution" instance):
>>>>>>>>> ---CUT---
>>>>>>>>> final int next = (i == 4 || cur == length - 1) ? length - 1
>>>>>>>>> : randomData.nextInt(cur, length - 1);
>>>>>>>>> ---CUT---
>>>>>>>>>
>>>>>>>>> It is now (after the change):
>>>>>>>>> ---CUT---
>>>>>>>>> final IntegerDistribution partitionPoint = new
>>>>>>>>> UniformIntegerDistribution(cur, length - 1);
>>>>>>>>> final int next = (i == 4 || cur == length - 1) ? length - 1
>>>>>>>>> : partitionPoint.sample();
>>>>>>>>> ---CUT---
>>>>>>>>>
>>>>>>>>> Thus, AFAIK, the failure did not appear before because
>>>>>>>>> there was no
>>>>>>>>> precondition enforcement in "nextInt".
>>>>>>>>>
>>>>>>>>> The question is: Was the code in the test correct (in
>>>>>>>>> allowing the same
>>>>>>>>> value for both bounds?
>>>>>>>>>   * In the negative, how to change it?
>>>>>>>>>   * The affirmative would mean that the precondition check in
>>>>>>>>>     "UniformIntegerDistribution" should be relaxed to allow
>>>>>>>>> equal bounds.
>>>>>>>>>     Does this make sense?
>>>>>>>>>     If so, can we change it now, or is it forbidden in
>>>>>>>>> order to stay
>>>>>>>>>     backwards compatible?
>>>>>>>> Your analysis above is correct.  The failure after the
>>>>>>>> change is due
>>>>>>>> to the fact that post-change the distribution is
>>>>>>>> instantiated before
>>>>>>>> the bounds check.  I changed the test to fix this.
>>>>>>> Thanks.
>>>>>>>
>>>>>>>>   Both the
>>>>>>>> randomData nextInt and the UniformIntegerDistribution
>>>>>>>> constructor
>>>>>>>> now forbid the degenerate case where there is only one point
>>>>>>>> in the
>>>>>>>> domain.  In retrospect, I guess it would have probably been
>>>>>>>> better
>>>>>>>> to allow this degenerate case.  Unfortunately, this would be an
>>>>>>>> incompatible change, so will have to wait until 4.0 if we
>>>>>>>> want to do it.
>>>>>>>>
>>>>>>>> The original code above illustrates the convenience of being
>>>>>>>> able to
>>>>>>>> just make direct calls to randomData.nextXxx, which is why this
>>>>>>>> class exists ;)
>>>>>>> As I wrote in another post, I'm not against the convenience
>>>>>>> methods. But
>>>>>>> IMO, they should be located in a new "DistributionUtils" class.
>>>>>>> And we should also find a way to remove the code duplication
>>>>>>> (in the
>>>>>>> distribution's "sample()" method and in the corresponding
>>>>>>> "next..." method).
>>>>>>>
>>>>>> The RandomData class (or whatever it would be called) does
>>>>>> indeed seem useful. If we plan to keep it, we should probably
>>>>>> make sure that there is a sample/next/... method in that class
>>>>>> for EVERY distribution, as some of them are missing, if I
>>>>>> remember correctly. Perhaps this is a separate issue though?
>>>>>>
>>>>> All have the method now, but the impls delegate to
>>>>> RandomDataImpl.  In some cases, there is nothing better
>>>>> implemented than just inversion, provided by the default
>>>>> inversion sampler.  That is OK.  What we need to do is just
>>>>> move the implementations of the default and specialized
>>>>> samplers to the actual distribution classes.  These can't be
>>>>> static, as they use the RamdomData instance.  I will take care
>>>>> of this.
>>>> I'm not sure if I made myself clear. I meant to say that not all
>>>> distributions have a corresponding nextX method in
>>>> RandomData(Impl). What I propose is to make sure that for every
>>>> distribution class, there is a corresponding method in
>>>> RandomData(Impl) to make sure that RandomData(Impl) is actually
>>>> a substitute for using the distributions.
>>> I get it now, but I don't think I agree.  RandomData is meant to
>>> be a general-purpose  class used for generating random data with
>>> commonly desired characteristics, like coming from commonly used
>>> distributions such as Poisson, Gaussian, etc.  This class
>>> predates the sample() method that has been added to *all*
>>> distributions.  The default implementation of sampling for real
>>> distributions (inversion-based sampling)  has no dependency on
>>> RandomDataImpl, but the specialized implementations (impls that
>>> are better than inversion) for some distributions still live
>>> there.  Here is what I would like to do:
>>>
>>> 1. Move the specialized implementations from RandomDataImpl to
>>> the distributions that they sample from.
>>>
>>> 2.  Rename RandomDataImpl to merge the impl and the interface
>>>
>>> I think we all agree on 1.  Regarding 2, I would personally
>>> prefer to leave the represented distributions as is, sticking
>>> with just the most commonly used distributions, having the
>>> methods accept parameters, but maintaining generators as instance
>>> variables so generator state can be maintained between calls
>>> (like sample() and the existing RandomDataImpl behave now).  If
>>> others feel strongly that every distribution should be
>>> represented, I am OK with that but would see it as clutter in the
>>> RD replacement.
>> I definitely do agree on 1. I also agree on 2 (the rename/merge).
>> However, I thought the goal of having RandomData is to be convenient:
>>
>>  - Distributions require a separate instance per distributions.
>> RandomData requires only a single instance.
>>
>>  - Distributions require a separate instance per combination of
>> parameters. RandomData allows providing the parameters when one
>> actually asks for samples.
>>
>> If so, then I don't see why that would be the case for only some
>> of the distributions, and not for others...? Am I missing
>> something here?
> Only that adding every distribution would clutter the class, IMO. 
> As I said above, RandomData was originally intended to be just a
> sort of souped-up Random, making the underlying PRNG pluggable and
> supporting more kinds of random data generation than what Random
> provides.  If you feel strongly that there are lots of practical
> applications for generating random data from, say Weibull or
> ChiSquare distributions, I am fine adding them all to the RandomData
> replacement.  The rationale for limiting is that it makes it a
> little easier to find the commonly used ones (exponential, poissson,
> gaussian).  If others disagree, I am OK adding them all (as I see
> they have now been added to RandomDataImpl, but not the interface).

In r1375192, I created RandomDataGenerator to replace RandomDataImpl
and included support for all distributions.  I also changed the
versions that were just using nextInversionDeviate to actually use
the configured RandomGenerator.

Phil
>
> Phil
>> Dennis
>>
>>
>>> Phil
>>>
>>>
>>>> Obviously, the implementation should be only in the
>>>> distributions OR in the RandomDataImpl. I agree that in the
>>>> distributions would be better. I like the static methods in
>>>> distributions idea.
>>>>
>>>>> Phil
>>>>>
>>>>>>> Gilles
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>>
>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>>>>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>>
>>>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>>>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>>>>
>>>>> ---------------------------------------------------------------------
>>>>>
>>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
>>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [MATH] Test failure in Continuum

Posted by Phil Steitz <ph...@gmail.com>.
On 8/6/12 11:16 PM, Dennis Hendriks wrote:
> See below.
>
> On 08/06/2012 05:29 PM, Phil Steitz wrote:
>>
>>
>>
>>
>> On Aug 6, 2012, at 6:06 AM, Dennis Hendriks<D....@tue.nl> 
>> wrote:
>>
>>> See below.
>>>
>>> Dennis
>>>
>>>
>>> On 08/06/2012 02:48 PM, Phil Steitz wrote:
>>>>
>>>>
>>>>
>>>>
>>>> On Aug 5, 2012, at 11:21 PM, Dennis
>>>> Hendriks<D....@tue.nl>   wrote:
>>>>
>>>>> See below.
>>>>>
>>>>> On 08/06/2012 12:49 AM, Gilles Sadowski wrote:
>>>>>> On Sun, Aug 05, 2012 at 12:54:11PM -0700, Phil Steitz wrote:
>>>>>>> On 8/4/12 10:57 AM, Gilles Sadowski wrote:
>>>>>>>> Hello.
>>>>>>>>
>>>>>>>> Referring to this failed test (cf. messages from Continuum):
>>>>>>>> ---CUT---
>>>>>>>> org.apache.commons.math3.exception.NumberIsTooLargeException:
>>>>>>>> lower bound (65) must be strictly less than upper bound (65)
>>>>>>>>     at
>>>>>>>> org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:73)
>>>>>>>>     at
>>>>>>>> org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:53)
>>>>>>>>     at
>>>>>>>> org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.generatePartition(AggregateSummaryStatisticsTest.java:275)
>>>>>>>>     at
>>>>>>>> org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.testAggregationConsistency(AggregateSummaryStatisticsTest.java:89)
>>>>>>>>
>>>>>>>>
>>>>>>>> It is due to a precondition check while creating the
>>>>>>>> "UniformIntegerDistribution" instance:
>>>>>>>> ---CUT---
>>>>>>>> if (lower>= upper) {
>>>>>>>>      throw new NumberIsTooLargeException(
>>>>>>>>                        
>>>>>>>> LocalizedFormats.LOWER_BOUND_NOT_BELOW_UPPER_BOUND,
>>>>>>>>                         lower, upper, false);
>>>>>>>> }
>>>>>>>> ---CUT---
>>>>>>>>
>>>>>>>> The test referred to above was using this code (before I
>>>>>>>> changed it use a
>>>>>>>> "UniformIntegerDistribution" instance):
>>>>>>>> ---CUT---
>>>>>>>> final int next = (i == 4 || cur == length - 1) ? length - 1
>>>>>>>> : randomData.nextInt(cur, length - 1);
>>>>>>>> ---CUT---
>>>>>>>>
>>>>>>>> It is now (after the change):
>>>>>>>> ---CUT---
>>>>>>>> final IntegerDistribution partitionPoint = new
>>>>>>>> UniformIntegerDistribution(cur, length - 1);
>>>>>>>> final int next = (i == 4 || cur == length - 1) ? length - 1
>>>>>>>> : partitionPoint.sample();
>>>>>>>> ---CUT---
>>>>>>>>
>>>>>>>> Thus, AFAIK, the failure did not appear before because
>>>>>>>> there was no
>>>>>>>> precondition enforcement in "nextInt".
>>>>>>>>
>>>>>>>> The question is: Was the code in the test correct (in
>>>>>>>> allowing the same
>>>>>>>> value for both bounds?
>>>>>>>>   * In the negative, how to change it?
>>>>>>>>   * The affirmative would mean that the precondition check in
>>>>>>>>     "UniformIntegerDistribution" should be relaxed to allow
>>>>>>>> equal bounds.
>>>>>>>>     Does this make sense?
>>>>>>>>     If so, can we change it now, or is it forbidden in
>>>>>>>> order to stay
>>>>>>>>     backwards compatible?
>>>>>>>
>>>>>>> Your analysis above is correct.  The failure after the
>>>>>>> change is due
>>>>>>> to the fact that post-change the distribution is
>>>>>>> instantiated before
>>>>>>> the bounds check.  I changed the test to fix this.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>>   Both the
>>>>>>> randomData nextInt and the UniformIntegerDistribution
>>>>>>> constructor
>>>>>>> now forbid the degenerate case where there is only one point
>>>>>>> in the
>>>>>>> domain.  In retrospect, I guess it would have probably been
>>>>>>> better
>>>>>>> to allow this degenerate case.  Unfortunately, this would be an
>>>>>>> incompatible change, so will have to wait until 4.0 if we
>>>>>>> want to do it.
>>>>>>>
>>>>>>> The original code above illustrates the convenience of being
>>>>>>> able to
>>>>>>> just make direct calls to randomData.nextXxx, which is why this
>>>>>>> class exists ;)
>>>>>>
>>>>>> As I wrote in another post, I'm not against the convenience
>>>>>> methods. But
>>>>>> IMO, they should be located in a new "DistributionUtils" class.
>>>>>> And we should also find a way to remove the code duplication
>>>>>> (in the
>>>>>> distribution's "sample()" method and in the corresponding
>>>>>> "next..." method).
>>>>>>
>>>>>
>>>>> The RandomData class (or whatever it would be called) does
>>>>> indeed seem useful. If we plan to keep it, we should probably
>>>>> make sure that there is a sample/next/... method in that class
>>>>> for EVERY distribution, as some of them are missing, if I
>>>>> remember correctly. Perhaps this is a separate issue though?
>>>>>
>>>>
>>>> All have the method now, but the impls delegate to
>>>> RandomDataImpl.  In some cases, there is nothing better
>>>> implemented than just inversion, provided by the default
>>>> inversion sampler.  That is OK.  What we need to do is just
>>>> move the implementations of the default and specialized
>>>> samplers to the actual distribution classes.  These can't be
>>>> static, as they use the RamdomData instance.  I will take care
>>>> of this.
>>>
>>> I'm not sure if I made myself clear. I meant to say that not all
>>> distributions have a corresponding nextX method in
>>> RandomData(Impl). What I propose is to make sure that for every
>>> distribution class, there is a corresponding method in
>>> RandomData(Impl) to make sure that RandomData(Impl) is actually
>>> a substitute for using the distributions.
>>
>> I get it now, but I don't think I agree.  RandomData is meant to
>> be a general-purpose  class used for generating random data with
>> commonly desired characteristics, like coming from commonly used
>> distributions such as Poisson, Gaussian, etc.  This class
>> predates the sample() method that has been added to *all*
>> distributions.  The default implementation of sampling for real
>> distributions (inversion-based sampling)  has no dependency on
>> RandomDataImpl, but the specialized implementations (impls that
>> are better than inversion) for some distributions still live
>> there.  Here is what I would like to do:
>>
>> 1. Move the specialized implementations from RandomDataImpl to
>> the distributions that they sample from.
>>
>> 2.  Rename RandomDataImpl to merge the impl and the interface
>>
>> I think we all agree on 1.  Regarding 2, I would personally
>> prefer to leave the represented distributions as is, sticking
>> with just the most commonly used distributions, having the
>> methods accept parameters, but maintaining generators as instance
>> variables so generator state can be maintained between calls
>> (like sample() and the existing RandomDataImpl behave now).  If
>> others feel strongly that every distribution should be
>> represented, I am OK with that but would see it as clutter in the
>> RD replacement.
>
> I definitely do agree on 1. I also agree on 2 (the rename/merge).
> However, I thought the goal of having RandomData is to be convenient:
>
>  - Distributions require a separate instance per distributions.
> RandomData requires only a single instance.
>
>  - Distributions require a separate instance per combination of
> parameters. RandomData allows providing the parameters when one
> actually asks for samples.
>
> If so, then I don't see why that would be the case for only some
> of the distributions, and not for others...? Am I missing
> something here?

Only that adding every distribution would clutter the class, IMO. 
As I said above, RandomData was originally intended to be just a
sort of souped-up Random, making the underlying PRNG pluggable and
supporting more kinds of random data generation than what Random
provides.  If you feel strongly that there are lots of practical
applications for generating random data from, say Weibull or
ChiSquare distributions, I am fine adding them all to the RandomData
replacement.  The rationale for limiting is that it makes it a
little easier to find the commonly used ones (exponential, poissson,
gaussian).  If others disagree, I am OK adding them all (as I see
they have now been added to RandomDataImpl, but not the interface).

Phil
>
> Dennis
>
>
>> Phil
>>
>>
>>>
>>> Obviously, the implementation should be only in the
>>> distributions OR in the RandomDataImpl. I agree that in the
>>> distributions would be better. I like the static methods in
>>> distributions idea.
>>>
>>>> Phil
>>>>
>>>>>>
>>>>>> Gilles
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>>
>>>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>>>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>>
>>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>>
>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [MATH] Test failure in Continuum

Posted by Dennis Hendriks <D....@tue.nl>.
See below.

On 08/06/2012 05:29 PM, Phil Steitz wrote:
>
>
>
>
> On Aug 6, 2012, at 6:06 AM, Dennis Hendriks<D....@tue.nl>  wrote:
>
>> See below.
>>
>> Dennis
>>
>>
>> On 08/06/2012 02:48 PM, Phil Steitz wrote:
>>>
>>>
>>>
>>>
>>> On Aug 5, 2012, at 11:21 PM, Dennis Hendriks<D....@tue.nl>   wrote:
>>>
>>>> See below.
>>>>
>>>> On 08/06/2012 12:49 AM, Gilles Sadowski wrote:
>>>>> On Sun, Aug 05, 2012 at 12:54:11PM -0700, Phil Steitz wrote:
>>>>>> On 8/4/12 10:57 AM, Gilles Sadowski wrote:
>>>>>>> Hello.
>>>>>>>
>>>>>>> Referring to this failed test (cf. messages from Continuum):
>>>>>>> ---CUT---
>>>>>>> org.apache.commons.math3.exception.NumberIsTooLargeException: lower bound (65) must be strictly less than upper bound (65)
>>>>>>>     at org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:73)
>>>>>>>     at org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:53)
>>>>>>>     at org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.generatePartition(AggregateSummaryStatisticsTest.java:275)
>>>>>>>     at org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.testAggregationConsistency(AggregateSummaryStatisticsTest.java:89)
>>>>>>>
>>>>>>>
>>>>>>> It is due to a precondition check while creating the
>>>>>>> "UniformIntegerDistribution" instance:
>>>>>>> ---CUT---
>>>>>>> if (lower>= upper) {
>>>>>>>      throw new NumberIsTooLargeException(
>>>>>>>                         LocalizedFormats.LOWER_BOUND_NOT_BELOW_UPPER_BOUND,
>>>>>>>                         lower, upper, false);
>>>>>>> }
>>>>>>> ---CUT---
>>>>>>>
>>>>>>> The test referred to above was using this code (before I changed it use a
>>>>>>> "UniformIntegerDistribution" instance):
>>>>>>> ---CUT---
>>>>>>> final int next = (i == 4 || cur == length - 1) ? length - 1 : randomData.nextInt(cur, length - 1);
>>>>>>> ---CUT---
>>>>>>>
>>>>>>> It is now (after the change):
>>>>>>> ---CUT---
>>>>>>> final IntegerDistribution partitionPoint = new UniformIntegerDistribution(cur, length - 1);
>>>>>>> final int next = (i == 4 || cur == length - 1) ? length - 1 : partitionPoint.sample();
>>>>>>> ---CUT---
>>>>>>>
>>>>>>> Thus, AFAIK, the failure did not appear before because there was no
>>>>>>> precondition enforcement in "nextInt".
>>>>>>>
>>>>>>> The question is: Was the code in the test correct (in allowing the same
>>>>>>> value for both bounds?
>>>>>>>   * In the negative, how to change it?
>>>>>>>   * The affirmative would mean that the precondition check in
>>>>>>>     "UniformIntegerDistribution" should be relaxed to allow equal bounds.
>>>>>>>     Does this make sense?
>>>>>>>     If so, can we change it now, or is it forbidden in order to stay
>>>>>>>     backwards compatible?
>>>>>>
>>>>>> Your analysis above is correct.  The failure after the change is due
>>>>>> to the fact that post-change the distribution is instantiated before
>>>>>> the bounds check.  I changed the test to fix this.
>>>>>
>>>>> Thanks.
>>>>>
>>>>>>   Both the
>>>>>> randomData nextInt and the UniformIntegerDistribution constructor
>>>>>> now forbid the degenerate case where there is only one point in the
>>>>>> domain.  In retrospect, I guess it would have probably been better
>>>>>> to allow this degenerate case.  Unfortunately, this would be an
>>>>>> incompatible change, so will have to wait until 4.0 if we want to do it.
>>>>>>
>>>>>> The original code above illustrates the convenience of being able to
>>>>>> just make direct calls to randomData.nextXxx, which is why this
>>>>>> class exists ;)
>>>>>
>>>>> As I wrote in another post, I'm not against the convenience methods. But
>>>>> IMO, they should be located in a new "DistributionUtils" class.
>>>>> And we should also find a way to remove the code duplication (in the
>>>>> distribution's "sample()" method and in the corresponding "next..." method).
>>>>>
>>>>
>>>> The RandomData class (or whatever it would be called) does indeed seem useful. If we plan to keep it, we should probably make sure that there is a sample/next/... method in that class for EVERY distribution, as some of them are missing, if I remember correctly. Perhaps this is a separate issue though?
>>>>
>>>
>>> All have the method now, but the impls delegate to RandomDataImpl.  In some cases, there is nothing better implemented than just inversion, provided by the default inversion sampler.  That is OK.  What we need to do is just move the implementations of the default and specialized samplers to the actual distribution classes.  These can't be static, as they use the RamdomData instance.  I will take care of this.
>>
>> I'm not sure if I made myself clear. I meant to say that not all distributions have a corresponding nextX method in RandomData(Impl). What I propose is to make sure that for every distribution class, there is a corresponding method in RandomData(Impl) to make sure that RandomData(Impl) is actually a substitute for using the distributions.
>
> I get it now, but I don't think I agree.  RandomData is meant to be a general-purpose  class used for generating random data with commonly desired characteristics, like coming from commonly used distributions such as Poisson, Gaussian, etc.  This class predates the sample() method that has been added to *all* distributions.  The default implementation of sampling for real distributions (inversion-based sampling)  has no dependency on RandomDataImpl, but the specialized implementations (impls that are better than inversion) for some distributions still live there.  Here is what I would like to do:
>
> 1. Move the specialized implementations from RandomDataImpl to the distributions that they sample from.
>
> 2.  Rename RandomDataImpl to merge the impl and the interface
>
> I think we all agree on 1.  Regarding 2, I would personally prefer to leave the represented distributions as is, sticking with just the most commonly used distributions, having the methods accept parameters, but maintaining generators as instance variables so generator state can be maintained between calls (like sample() and the existing RandomDataImpl behave now).  If others feel strongly that every distribution should be represented, I am OK with that but would see it as clutter in the RD replacement.

I definitely do agree on 1. I also agree on 2 (the rename/merge). However, 
I thought the goal of having RandomData is to be convenient:

  - Distributions require a separate instance per distributions. RandomData 
requires only a single instance.

  - Distributions require a separate instance per combination of 
parameters. RandomData allows providing the parameters when one actually 
asks for samples.

If so, then I don't see why that would be the case for only some of the 
distributions, and not for others...? Am I missing something here?

Dennis


> Phil
>
>
>>
>> Obviously, the implementation should be only in the distributions OR in the RandomDataImpl. I agree that in the distributions would be better. I like the static methods in distributions idea.
>>
>>> Phil
>>>
>>>>>
>>>>> Gilles
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [MATH] Test failure in Continuum

Posted by Dennis Hendriks <D....@tue.nl>.
See below.

Dennis


On 08/06/2012 02:48 PM, Phil Steitz wrote:
>
>
>
>
> On Aug 5, 2012, at 11:21 PM, Dennis Hendriks<D....@tue.nl>  wrote:
>
>> See below.
>>
>> On 08/06/2012 12:49 AM, Gilles Sadowski wrote:
>>> On Sun, Aug 05, 2012 at 12:54:11PM -0700, Phil Steitz wrote:
>>>> On 8/4/12 10:57 AM, Gilles Sadowski wrote:
>>>>> Hello.
>>>>>
>>>>> Referring to this failed test (cf. messages from Continuum):
>>>>> ---CUT---
>>>>> org.apache.commons.math3.exception.NumberIsTooLargeException: lower bound (65) must be strictly less than upper bound (65)
>>>>>     at org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:73)
>>>>>     at org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:53)
>>>>>     at org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.generatePartition(AggregateSummaryStatisticsTest.java:275)
>>>>>     at org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.testAggregationConsistency(AggregateSummaryStatisticsTest.java:89)
>>>>>
>>>>>
>>>>> It is due to a precondition check while creating the
>>>>> "UniformIntegerDistribution" instance:
>>>>> ---CUT---
>>>>> if (lower>= upper) {
>>>>>      throw new NumberIsTooLargeException(
>>>>>                         LocalizedFormats.LOWER_BOUND_NOT_BELOW_UPPER_BOUND,
>>>>>                         lower, upper, false);
>>>>> }
>>>>> ---CUT---
>>>>>
>>>>> The test referred to above was using this code (before I changed it use a
>>>>> "UniformIntegerDistribution" instance):
>>>>> ---CUT---
>>>>> final int next = (i == 4 || cur == length - 1) ? length - 1 : randomData.nextInt(cur, length - 1);
>>>>> ---CUT---
>>>>>
>>>>> It is now (after the change):
>>>>> ---CUT---
>>>>> final IntegerDistribution partitionPoint = new UniformIntegerDistribution(cur, length - 1);
>>>>> final int next = (i == 4 || cur == length - 1) ? length - 1 : partitionPoint.sample();
>>>>> ---CUT---
>>>>>
>>>>> Thus, AFAIK, the failure did not appear before because there was no
>>>>> precondition enforcement in "nextInt".
>>>>>
>>>>> The question is: Was the code in the test correct (in allowing the same
>>>>> value for both bounds?
>>>>>   * In the negative, how to change it?
>>>>>   * The affirmative would mean that the precondition check in
>>>>>     "UniformIntegerDistribution" should be relaxed to allow equal bounds.
>>>>>     Does this make sense?
>>>>>     If so, can we change it now, or is it forbidden in order to stay
>>>>>     backwards compatible?
>>>>
>>>> Your analysis above is correct.  The failure after the change is due
>>>> to the fact that post-change the distribution is instantiated before
>>>> the bounds check.  I changed the test to fix this.
>>>
>>> Thanks.
>>>
>>>>   Both the
>>>> randomData nextInt and the UniformIntegerDistribution constructor
>>>> now forbid the degenerate case where there is only one point in the
>>>> domain.  In retrospect, I guess it would have probably been better
>>>> to allow this degenerate case.  Unfortunately, this would be an
>>>> incompatible change, so will have to wait until 4.0 if we want to do it.
>>>>
>>>> The original code above illustrates the convenience of being able to
>>>> just make direct calls to randomData.nextXxx, which is why this
>>>> class exists ;)
>>>
>>> As I wrote in another post, I'm not against the convenience methods. But
>>> IMO, they should be located in a new "DistributionUtils" class.
>>> And we should also find a way to remove the code duplication (in the
>>> distribution's "sample()" method and in the corresponding "next..." method).
>>>
>>
>> The RandomData class (or whatever it would be called) does indeed seem useful. If we plan to keep it, we should probably make sure that there is a sample/next/... method in that class for EVERY distribution, as some of them are missing, if I remember correctly. Perhaps this is a separate issue though?
>>
>
> All have the method now, but the impls delegate to RandomDataImpl.  In some cases, there is nothing better implemented than just inversion, provided by the default inversion sampler.  That is OK.  What we need to do is just move the implementations of the default and specialized samplers to the actual distribution classes.  These can't be static, as they use the RamdomData instance.  I will take care of this.

I'm not sure if I made myself clear. I meant to say that not all 
distributions have a corresponding nextX method in RandomData(Impl). What I 
propose is to make sure that for every distribution class, there is a 
corresponding method in RandomData(Impl) to make sure that RandomData(Impl) 
is actually a substitute for using the distributions.

Obviously, the implementation should be only in the distributions OR in the 
RandomDataImpl. I agree that in the distributions would be better. I like 
the static methods in distributions idea.

> Phil
>
>>>
>>> Gilles
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [MATH] Test failure in Continuum

Posted by Phil Steitz <ph...@gmail.com>.



On Aug 5, 2012, at 11:21 PM, Dennis Hendriks <D....@tue.nl> wrote:

> See below.
> 
> On 08/06/2012 12:49 AM, Gilles Sadowski wrote:
>> On Sun, Aug 05, 2012 at 12:54:11PM -0700, Phil Steitz wrote:
>>> On 8/4/12 10:57 AM, Gilles Sadowski wrote:
>>>> Hello.
>>>> 
>>>> Referring to this failed test (cf. messages from Continuum):
>>>> ---CUT---
>>>> org.apache.commons.math3.exception.NumberIsTooLargeException: lower bound (65) must be strictly less than upper bound (65)
>>>>    at org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:73)
>>>>    at org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:53)
>>>>    at org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.generatePartition(AggregateSummaryStatisticsTest.java:275)
>>>>    at org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.testAggregationConsistency(AggregateSummaryStatisticsTest.java:89)
>>>> 
>>>> 
>>>> It is due to a precondition check while creating the
>>>> "UniformIntegerDistribution" instance:
>>>> ---CUT---
>>>> if (lower>= upper) {
>>>>     throw new NumberIsTooLargeException(
>>>>                        LocalizedFormats.LOWER_BOUND_NOT_BELOW_UPPER_BOUND,
>>>>                        lower, upper, false);
>>>> }
>>>> ---CUT---
>>>> 
>>>> The test referred to above was using this code (before I changed it use a
>>>> "UniformIntegerDistribution" instance):
>>>> ---CUT---
>>>> final int next = (i == 4 || cur == length - 1) ? length - 1 : randomData.nextInt(cur, length - 1);
>>>> ---CUT---
>>>> 
>>>> It is now (after the change):
>>>> ---CUT---
>>>> final IntegerDistribution partitionPoint = new UniformIntegerDistribution(cur, length - 1);
>>>> final int next = (i == 4 || cur == length - 1) ? length - 1 : partitionPoint.sample();
>>>> ---CUT---
>>>> 
>>>> Thus, AFAIK, the failure did not appear before because there was no
>>>> precondition enforcement in "nextInt".
>>>> 
>>>> The question is: Was the code in the test correct (in allowing the same
>>>> value for both bounds?
>>>>  * In the negative, how to change it?
>>>>  * The affirmative would mean that the precondition check in
>>>>    "UniformIntegerDistribution" should be relaxed to allow equal bounds.
>>>>    Does this make sense?
>>>>    If so, can we change it now, or is it forbidden in order to stay
>>>>    backwards compatible?
>>> 
>>> Your analysis above is correct.  The failure after the change is due
>>> to the fact that post-change the distribution is instantiated before
>>> the bounds check.  I changed the test to fix this.
>> 
>> Thanks.
>> 
>>>  Both the
>>> randomData nextInt and the UniformIntegerDistribution constructor
>>> now forbid the degenerate case where there is only one point in the
>>> domain.  In retrospect, I guess it would have probably been better
>>> to allow this degenerate case.  Unfortunately, this would be an
>>> incompatible change, so will have to wait until 4.0 if we want to do it.
>>> 
>>> The original code above illustrates the convenience of being able to
>>> just make direct calls to randomData.nextXxx, which is why this
>>> class exists ;)
>> 
>> As I wrote in another post, I'm not against the convenience methods. But
>> IMO, they should be located in a new "DistributionUtils" class.
>> And we should also find a way to remove the code duplication (in the
>> distribution's "sample()" method and in the corresponding "next..." method).
>> 
> 
> The RandomData class (or whatever it would be called) does indeed seem useful. If we plan to keep it, we should probably make sure that there is a sample/next/... method in that class for EVERY distribution, as some of them are missing, if I remember correctly. Perhaps this is a separate issue though?
> 

All have the method now, but the impls delegate to RandomDataImpl.  In some cases, there is nothing better implemented than just inversion, provided by the default inversion sampler.  That is OK.  What we need to do is just move the implementations of the default and specialized samplers to the actual distribution classes.  These can't be static, as they use the RamdomData instance.  I will take care of this.

Phil

>> 
>> Gilles
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [MATH] Test failure in Continuum

Posted by Dennis Hendriks <D....@tue.nl>.
See below.

On 08/06/2012 12:49 AM, Gilles Sadowski wrote:
> On Sun, Aug 05, 2012 at 12:54:11PM -0700, Phil Steitz wrote:
>> On 8/4/12 10:57 AM, Gilles Sadowski wrote:
>>> Hello.
>>>
>>> Referring to this failed test (cf. messages from Continuum):
>>> ---CUT---
>>> org.apache.commons.math3.exception.NumberIsTooLargeException: lower bound (65) must be strictly less than upper bound (65)
>>> 	at org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:73)
>>> 	at org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:53)
>>> 	at org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.generatePartition(AggregateSummaryStatisticsTest.java:275)
>>> 	at org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.testAggregationConsistency(AggregateSummaryStatisticsTest.java:89)
>>>
>>>
>>> It is due to a precondition check while creating the
>>> "UniformIntegerDistribution" instance:
>>> ---CUT---
>>> if (lower>= upper) {
>>>      throw new NumberIsTooLargeException(
>>>                         LocalizedFormats.LOWER_BOUND_NOT_BELOW_UPPER_BOUND,
>>>                         lower, upper, false);
>>> }
>>> ---CUT---
>>>
>>> The test referred to above was using this code (before I changed it use a
>>> "UniformIntegerDistribution" instance):
>>> ---CUT---
>>> final int next = (i == 4 || cur == length - 1) ? length - 1 : randomData.nextInt(cur, length - 1);
>>> ---CUT---
>>>
>>> It is now (after the change):
>>> ---CUT---
>>> final IntegerDistribution partitionPoint = new UniformIntegerDistribution(cur, length - 1);
>>> final int next = (i == 4 || cur == length - 1) ? length - 1 : partitionPoint.sample();
>>> ---CUT---
>>>
>>> Thus, AFAIK, the failure did not appear before because there was no
>>> precondition enforcement in "nextInt".
>>>
>>> The question is: Was the code in the test correct (in allowing the same
>>> value for both bounds?
>>>   * In the negative, how to change it?
>>>   * The affirmative would mean that the precondition check in
>>>     "UniformIntegerDistribution" should be relaxed to allow equal bounds.
>>>     Does this make sense?
>>>     If so, can we change it now, or is it forbidden in order to stay
>>>     backwards compatible?
>>
>> Your analysis above is correct.  The failure after the change is due
>> to the fact that post-change the distribution is instantiated before
>> the bounds check.  I changed the test to fix this.
>
> Thanks.
>
>>   Both the
>> randomData nextInt and the UniformIntegerDistribution constructor
>> now forbid the degenerate case where there is only one point in the
>> domain.  In retrospect, I guess it would have probably been better
>> to allow this degenerate case.  Unfortunately, this would be an
>> incompatible change, so will have to wait until 4.0 if we want to do it.
>>
>> The original code above illustrates the convenience of being able to
>> just make direct calls to randomData.nextXxx, which is why this
>> class exists ;)
>
> As I wrote in another post, I'm not against the convenience methods. But
> IMO, they should be located in a new "DistributionUtils" class.
> And we should also find a way to remove the code duplication (in the
> distribution's "sample()" method and in the corresponding "next..." method).
>

The RandomData class (or whatever it would be called) does indeed seem 
useful. If we plan to keep it, we should probably make sure that there is a 
sample/next/... method in that class for EVERY distribution, as some of 
them are missing, if I remember correctly. Perhaps this is a separate issue 
though?

>
> Gilles
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [MATH] Test failure in Continuum

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
On Sun, Aug 05, 2012 at 12:54:11PM -0700, Phil Steitz wrote:
> On 8/4/12 10:57 AM, Gilles Sadowski wrote:
> > Hello.
> >
> > Referring to this failed test (cf. messages from Continuum):
> > ---CUT---
> > org.apache.commons.math3.exception.NumberIsTooLargeException: lower bound (65) must be strictly less than upper bound (65)
> > 	at org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:73)
> > 	at org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:53)
> > 	at org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.generatePartition(AggregateSummaryStatisticsTest.java:275)
> > 	at org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.testAggregationConsistency(AggregateSummaryStatisticsTest.java:89)
> >
> >
> > It is due to a precondition check while creating the
> > "UniformIntegerDistribution" instance:
> > ---CUT---
> > if (lower >= upper) {
> >     throw new NumberIsTooLargeException(
> >                        LocalizedFormats.LOWER_BOUND_NOT_BELOW_UPPER_BOUND,
> >                        lower, upper, false);
> > }
> > ---CUT---
> >
> > The test referred to above was using this code (before I changed it use a
> > "UniformIntegerDistribution" instance):
> > ---CUT---
> > final int next = (i == 4 || cur == length - 1) ? length - 1 : randomData.nextInt(cur, length - 1);
> > ---CUT---
> >
> > It is now (after the change):
> > ---CUT---
> > final IntegerDistribution partitionPoint = new UniformIntegerDistribution(cur, length - 1);
> > final int next = (i == 4 || cur == length - 1) ? length - 1 : partitionPoint.sample();
> > ---CUT---
> >
> > Thus, AFAIK, the failure did not appear before because there was no
> > precondition enforcement in "nextInt".
> >
> > The question is: Was the code in the test correct (in allowing the same
> > value for both bounds?
> >  * In the negative, how to change it?
> >  * The affirmative would mean that the precondition check in
> >    "UniformIntegerDistribution" should be relaxed to allow equal bounds.
> >    Does this make sense?
> >    If so, can we change it now, or is it forbidden in order to stay
> >    backwards compatible?
> 
> Your analysis above is correct.  The failure after the change is due
> to the fact that post-change the distribution is instantiated before
> the bounds check.  I changed the test to fix this.

Thanks.

>  Both the
> randomData nextInt and the UniformIntegerDistribution constructor
> now forbid the degenerate case where there is only one point in the
> domain.  In retrospect, I guess it would have probably been better
> to allow this degenerate case.  Unfortunately, this would be an
> incompatible change, so will have to wait until 4.0 if we want to do it.
> 
> The original code above illustrates the convenience of being able to
> just make direct calls to randomData.nextXxx, which is why this
> class exists ;)

As I wrote in another post, I'm not against the convenience methods. But
IMO, they should be located in a new "DistributionUtils" class.
And we should also find a way to remove the code duplication (in the
distribution's "sample()" method and in the corresponding "next..." method).


Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [MATH] Test failure in Continuum

Posted by Phil Steitz <ph...@gmail.com>.
On 8/4/12 10:57 AM, Gilles Sadowski wrote:
> Hello.
>
> Referring to this failed test (cf. messages from Continuum):
> ---CUT---
> org.apache.commons.math3.exception.NumberIsTooLargeException: lower bound (65) must be strictly less than upper bound (65)
> 	at org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:73)
> 	at org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:53)
> 	at org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.generatePartition(AggregateSummaryStatisticsTest.java:275)
> 	at org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.testAggregationConsistency(AggregateSummaryStatisticsTest.java:89)
>
>
> It is due to a precondition check while creating the
> "UniformIntegerDistribution" instance:
> ---CUT---
> if (lower >= upper) {
>     throw new NumberIsTooLargeException(
>                        LocalizedFormats.LOWER_BOUND_NOT_BELOW_UPPER_BOUND,
>                        lower, upper, false);
> }
> ---CUT---
>
> The test referred to above was using this code (before I changed it use a
> "UniformIntegerDistribution" instance):
> ---CUT---
> final int next = (i == 4 || cur == length - 1) ? length - 1 : randomData.nextInt(cur, length - 1);
> ---CUT---
>
> It is now (after the change):
> ---CUT---
> final IntegerDistribution partitionPoint = new UniformIntegerDistribution(cur, length - 1);
> final int next = (i == 4 || cur == length - 1) ? length - 1 : partitionPoint.sample();
> ---CUT---
>
> Thus, AFAIK, the failure did not appear before because there was no
> precondition enforcement in "nextInt".
>
> The question is: Was the code in the test correct (in allowing the same
> value for both bounds?
>  * In the negative, how to change it?
>  * The affirmative would mean that the precondition check in
>    "UniformIntegerDistribution" should be relaxed to allow equal bounds.
>    Does this make sense?
>    If so, can we change it now, or is it forbidden in order to stay
>    backwards compatible?

Your analysis above is correct.  The failure after the change is due
to the fact that post-change the distribution is instantiated before
the bounds check.  I changed the test to fix this.  Both the
randomData nextInt and the UniformIntegerDistribution constructor
now forbid the degenerate case where there is only one point in the
domain.  In retrospect, I guess it would have probably been better
to allow this degenerate case.  Unfortunately, this would be an
incompatible change, so will have to wait until 4.0 if we want to do it.

The original code above illustrates the convenience of being able to
just make direct calls to randomData.nextXxx, which is why this
class exists ;)

Phil
>
>
> Regards,
> Gilles
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org