You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@commons.apache.org by Gilles Sadowski <gi...@harfang.homelinux.org> on 2012/07/31 01:41:43 UTC

[Math] Package "o.a.c.m.distribution"

Hello.

Some questions about classes in package "o.a.c.m.distribution".

* Why doesn't "KolmogorovSmirnovDistribution" implement one of the
  interfaces of the package (and/or inherit from an abstract class)?
* Why is method "probability(double)" part of the "RealDistribution"
  interface? [All implementations return 0.]
* Shouldn't the "cumulativeProbability(double x0, double x1)" method be
  renamed "probability(double x0, double x1)"?


Regards,
Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [Math] Package "o.a.c.m.distribution"

Posted by Phil Steitz <ph...@gmail.com>.
On 7/31/12 3:21 PM, Gilles Sadowski wrote:
> On Tue, Jul 31, 2012 at 11:52:07AM -0700, Phil Steitz wrote:
>> On 7/31/12 3:34 AM, Dennis Hendriks wrote:
>>> See answers below.
>>>
>>> On 07/31/2012 12:00 PM, Gilles Sadowski wrote:
>>>> On Tue, Jul 31, 2012 at 08:18:13AM +0200, Dennis Hendriks wrote:
>>>>>> * Why doesn't "KolmogorovSmirnovDistribution" implement one of the
>>>>>>     interfaces of the package (and/or inherit from an abstract
>>>>>> class)?
>>>>> See also:
>>>>> http://apache-commons.680414.n4.nabble.com/math-KolmogorovSmirnovDistribution-not-part-of-the-distribution-hierarchy-td4114395.html
>>>> Thanks for digging that thread out.
>>>> Do I rightly deduce that it is unfinished work, which should give
>>>> rise to a
>>>> JIRA ticket?
>>> That would be my guess.
>> Yep.  Per the tread above, the reason this class exists is to enable
>> Kolmogorov-Smirnov tests, which it pretty much does.  See MATH-437. 
>> I have no idea how to implement the missing distribution methods or
>> if anyone would ever use them if we did, so it may be better to just
>> move this class to .stat.inference, which is what it is really for.
> +1
> Shall we deprecate (in "o.a.c.m.distribution") the whole class then?
> [And move the functionality as you propose.]
>
> Is a JIRA ticket needed for this, or a log message referring to MATH-437 is
> enough?
Honestly, I think reopening MATH-437 is probably best.  I will stop
being worthless and do that ;)
>
>>   Making it package scope there and adding a class with some doco
>> and wrapper methods to make setting up and executing K-S tests would
>> likely be more valuable than trying to fill out the distribution
>> methods.  Of course, if anyone has ideas on how to implement the
>> missing methods, patches are welcome :)
>>>>>> * Why is method "probability(double)" part of the
>>>>>> "RealDistribution"
>>>>>>     interface? [All implementations return 0.]
>>>>> My guess would be that it is left over from when integer and real
>>>>> distributions shared a common hierarchy.
>>>> I would agree with the guess.
>>>> Does everyone agree to deprecate this method (in 3.1) and remove
>>>> it (in
>>>> 4.0)?
>>> +1
>> -0 - Without this, we can't represent non-discrete distributions
>> that assign positive mass to singletons.  Could be no big loss, but
>> in the archives there is discussion of non-continuous distributions,
>> which such a beast would have to be, and I think we agreed that we
>> want to be able to support / represent them.
> Adding support later (when we have at least one implementation of such a
> thing) is always feasible.
> Keeping a useless method is mildly confusing.

Yeah, that is why I said -0, not -1.  On the other hand, adding it
back later would be incompatible change as this is an interface, so
may be best to leave as is.
>
>> We have split
>> distributions into Integer (which really means discrete) and "Real"
>> (which means essentially everything else).  So basically while all
>> currently implemented real distributions have probability(-)
>> identically zero, there are (non-discrete) distributions over the
>> reals that do not have this property and we may want to enable users
>> to represent them.
> I still think that CM is far from stable enough that users would base their
> code on CM's set of interfaces.
> IMO, interfaces should collect shared functionality of actual
> implementations, but there there is currently none for which the one-arg
> "probability" method is not trivial.
>
>> Could be these cases are so non-standard as to
>> not be worth worrying about; but I don't see harm in leaving the
>> method in, so -0 for the deprecation.
> I'd propose that, at least, the method is implemented in
> "AbstractRealDistribution", so that actual (non-exotic) distribution
> classes don't need to provide a useless method.

+1 for that.  Could be a good compromise.
>
>>>>>> * Shouldn't the "cumulativeProbability(double x0, double x1)"
>>>>>> method be
>>>>>>    renamed "probability(double x0, double x1)"?
>>>> Does everyone agree to deprecate this method (in 3.1) and remove
>>>> and replace
>>>> it (in 4.0) with a method named "probability(double x0, double x1)"?
>>> I would prefer to:
>>>  - add new method (3.1)
>>>  - deprecate old method (3.1)
>>>  - move impl to new method (3.1)
>>>  - redirect old method to new method (3.1)
>>>  - remove old method (4.0)
>>>
>>> That way, we can now change our code to use the new method, and
>>> not get any deprecation warnings on use of the old deprecated
>>> method, for which no alternative exists...
>> Sounds like a reasonable approach and +1 for the rename.
> All right; only the addition of the new method in the "RealDistribution"
> interface must be postponed to 4.0.

Yes, but the new method could be added to the implementations in 3.1.

>
>
> Regards,
> Gilles
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [Math] Package "o.a.c.m.distribution"

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
On Tue, Jul 31, 2012 at 11:52:07AM -0700, Phil Steitz wrote:
> On 7/31/12 3:34 AM, Dennis Hendriks wrote:
> > See answers below.
> >
> > On 07/31/2012 12:00 PM, Gilles Sadowski wrote:
> >> On Tue, Jul 31, 2012 at 08:18:13AM +0200, Dennis Hendriks wrote:
> >>>> * Why doesn't "KolmogorovSmirnovDistribution" implement one of the
> >>>>     interfaces of the package (and/or inherit from an abstract
> >>>> class)?
> >>>
> >>> See also:
> >>> http://apache-commons.680414.n4.nabble.com/math-KolmogorovSmirnovDistribution-not-part-of-the-distribution-hierarchy-td4114395.html
> >>
> >> Thanks for digging that thread out.
> >> Do I rightly deduce that it is unfinished work, which should give
> >> rise to a
> >> JIRA ticket?
> >
> > That would be my guess.
> 
> Yep.  Per the tread above, the reason this class exists is to enable
> Kolmogorov-Smirnov tests, which it pretty much does.  See MATH-437. 
> I have no idea how to implement the missing distribution methods or
> if anyone would ever use them if we did, so it may be better to just
> move this class to .stat.inference, which is what it is really for.

+1
Shall we deprecate (in "o.a.c.m.distribution") the whole class then?
[And move the functionality as you propose.]

Is a JIRA ticket needed for this, or a log message referring to MATH-437 is
enough?

>   Making it package scope there and adding a class with some doco
> and wrapper methods to make setting up and executing K-S tests would
> likely be more valuable than trying to fill out the distribution
> methods.  Of course, if anyone has ideas on how to implement the
> missing methods, patches are welcome :)
> >
> >>>> * Why is method "probability(double)" part of the
> >>>> "RealDistribution"
> >>>>     interface? [All implementations return 0.]
> >>>
> >>> My guess would be that it is left over from when integer and real
> >>> distributions shared a common hierarchy.
> >>
> >> I would agree with the guess.
> >> Does everyone agree to deprecate this method (in 3.1) and remove
> >> it (in
> >> 4.0)?
> >
> > +1
> -0 - Without this, we can't represent non-discrete distributions
> that assign positive mass to singletons.  Could be no big loss, but
> in the archives there is discussion of non-continuous distributions,
> which such a beast would have to be, and I think we agreed that we
> want to be able to support / represent them.

Adding support later (when we have at least one implementation of such a
thing) is always feasible.
Keeping a useless method is mildly confusing.

> We have split
> distributions into Integer (which really means discrete) and "Real"
> (which means essentially everything else).  So basically while all
> currently implemented real distributions have probability(-)
> identically zero, there are (non-discrete) distributions over the
> reals that do not have this property and we may want to enable users
> to represent them.

I still think that CM is far from stable enough that users would base their
code on CM's set of interfaces.
IMO, interfaces should collect shared functionality of actual
implementations, but there there is currently none for which the one-arg
"probability" method is not trivial.

> Could be these cases are so non-standard as to
> not be worth worrying about; but I don't see harm in leaving the
> method in, so -0 for the deprecation.

I'd propose that, at least, the method is implemented in
"AbstractRealDistribution", so that actual (non-exotic) distribution
classes don't need to provide a useless method.

> >
> >>>> * Shouldn't the "cumulativeProbability(double x0, double x1)"
> >>>> method be
> >>>>    renamed "probability(double x0, double x1)"?
> >>
> >> Does everyone agree to deprecate this method (in 3.1) and remove
> >> and replace
> >> it (in 4.0) with a method named "probability(double x0, double x1)"?
> >
> > I would prefer to:
> >  - add new method (3.1)
> >  - deprecate old method (3.1)
> >  - move impl to new method (3.1)
> >  - redirect old method to new method (3.1)
> >  - remove old method (4.0)
> >
> > That way, we can now change our code to use the new method, and
> > not get any deprecation warnings on use of the old deprecated
> > method, for which no alternative exists...
> 
> Sounds like a reasonable approach and +1 for the rename.

All right; only the addition of the new method in the "RealDistribution"
interface must be postponed to 4.0.


Regards,
Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [Math] Package "o.a.c.m.distribution"

Posted by Phil Steitz <ph...@gmail.com>.
On 7/31/12 3:34 AM, Dennis Hendriks wrote:
> See answers below.
>
> On 07/31/2012 12:00 PM, Gilles Sadowski wrote:
>> On Tue, Jul 31, 2012 at 08:18:13AM +0200, Dennis Hendriks wrote:
>>>> * Why doesn't "KolmogorovSmirnovDistribution" implement one of the
>>>>     interfaces of the package (and/or inherit from an abstract
>>>> class)?
>>>
>>> See also:
>>> http://apache-commons.680414.n4.nabble.com/math-KolmogorovSmirnovDistribution-not-part-of-the-distribution-hierarchy-td4114395.html
>>
>> Thanks for digging that thread out.
>> Do I rightly deduce that it is unfinished work, which should give
>> rise to a
>> JIRA ticket?
>
> That would be my guess.

Yep.  Per the tread above, the reason this class exists is to enable
Kolmogorov-Smirnov tests, which it pretty much does.  See MATH-437. 
I have no idea how to implement the missing distribution methods or
if anyone would ever use them if we did, so it may be better to just
move this class to .stat.inference, which is what it is really for.
  Making it package scope there and adding a class with some doco
and wrapper methods to make setting up and executing K-S tests would
likely be more valuable than trying to fill out the distribution
methods.  Of course, if anyone has ideas on how to implement the
missing methods, patches are welcome :)
>
>>>> * Why is method "probability(double)" part of the
>>>> "RealDistribution"
>>>>     interface? [All implementations return 0.]
>>>
>>> My guess would be that it is left over from when integer and real
>>> distributions shared a common hierarchy.
>>
>> I would agree with the guess.
>> Does everyone agree to deprecate this method (in 3.1) and remove
>> it (in
>> 4.0)?
>
> +1
-0 - Without this, we can't represent non-discrete distributions
that assign positive mass to singletons.  Could be no big loss, but
in the archives there is discussion of non-continuous distributions,
which such a beast would have to be, and I think we agreed that we
want to be able to support / represent them.  We have split
distributions into Integer (which really means discrete) and "Real"
(which means essentially everything else).  So basically while all
currently implemented real distributions have probability(-)
identically zero, there are (non-discrete) distributions over the
reals that do not have this property and we may want to enable users
to represent them.  Could be these cases are so non-standard as to
not be worth worrying about; but I don't see harm in leaving the
method in, so -0 for the deprecation.
>
>>>> * Shouldn't the "cumulativeProbability(double x0, double x1)"
>>>> method be
>>>>    renamed "probability(double x0, double x1)"?
>>
>> Does everyone agree to deprecate this method (in 3.1) and remove
>> and replace
>> it (in 4.0) with a method named "probability(double x0, double x1)"?
>
> I would prefer to:
>  - add new method (3.1)
>  - deprecate old method (3.1)
>  - move impl to new method (3.1)
>  - redirect old method to new method (3.1)
>  - remove old method (4.0)
>
> That way, we can now change our code to use the new method, and
> not get any deprecation warnings on use of the old deprecated
> method, for which no alternative exists...

Sounds like a reasonable approach and +1 for the rename.

Phil
>
>>
>> Regards,
>> Gilles
>>
>> ---------------------------------------------------------------------
>>
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
>
> Dennis
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [Math] Package "o.a.c.m.distribution"

Posted by Dennis Hendriks <D....@tue.nl>.
See answers below.

On 07/31/2012 12:00 PM, Gilles Sadowski wrote:
> On Tue, Jul 31, 2012 at 08:18:13AM +0200, Dennis Hendriks wrote:
>>> * Why doesn't "KolmogorovSmirnovDistribution" implement one of the
>>>     interfaces of the package (and/or inherit from an abstract class)?
>>
>> See also: http://apache-commons.680414.n4.nabble.com/math-KolmogorovSmirnovDistribution-not-part-of-the-distribution-hierarchy-td4114395.html
>
> Thanks for digging that thread out.
> Do I rightly deduce that it is unfinished work, which should give rise to a
> JIRA ticket?

That would be my guess.

>>> * Why is method "probability(double)" part of the "RealDistribution"
>>>     interface? [All implementations return 0.]
>>
>> My guess would be that it is left over from when integer and real
>> distributions shared a common hierarchy.
>
> I would agree with the guess.
> Does everyone agree to deprecate this method (in 3.1) and remove it (in
> 4.0)?

+1

>>> * Shouldn't the "cumulativeProbability(double x0, double x1)" method be
>>>    renamed "probability(double x0, double x1)"?
>
> Does everyone agree to deprecate this method (in 3.1) and remove and replace
> it (in 4.0) with a method named "probability(double x0, double x1)"?

I would prefer to:
  - add new method (3.1)
  - deprecate old method (3.1)
  - move impl to new method (3.1)
  - redirect old method to new method (3.1)
  - remove old method (4.0)

That way, we can now change our code to use the new method, and not get any 
deprecation warnings on use of the old deprecated method, for which no 
alternative exists...

>
> Regards,
> Gilles
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>

Dennis

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [Math] Package "o.a.c.m.distribution"

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
On Tue, Jul 31, 2012 at 08:18:13AM +0200, Dennis Hendriks wrote:
> > * Why doesn't "KolmogorovSmirnovDistribution" implement one of the
> >    interfaces of the package (and/or inherit from an abstract class)?
> 
> See also: http://apache-commons.680414.n4.nabble.com/math-KolmogorovSmirnovDistribution-not-part-of-the-distribution-hierarchy-td4114395.html

Thanks for digging that thread out.
Do I rightly deduce that it is unfinished work, which should give rise to a
JIRA ticket?

> 
> > * Why is method "probability(double)" part of the "RealDistribution"
> >    interface? [All implementations return 0.]
> 
> My guess would be that it is left over from when integer and real
> distributions shared a common hierarchy.

I would agree with the guess.
Does everyone agree to deprecate this method (in 3.1) and remove it (in
4.0)?

> >* Shouldn't the "cumulativeProbability(double x0, double x1)" method be
> >   renamed "probability(double x0, double x1)"?

Does everyone agree to deprecate this method (in 3.1) and remove and replace
it (in 4.0) with a method named "probability(double x0, double x1)"?


Regards,
Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [Math] Package "o.a.c.m.distribution"

Posted by Dennis Hendriks <D....@tue.nl>.
 > * Why doesn't "KolmogorovSmirnovDistribution" implement one of the
 >    interfaces of the package (and/or inherit from an abstract class)?

See also: 
http://apache-commons.680414.n4.nabble.com/math-KolmogorovSmirnovDistribution-not-part-of-the-distribution-hierarchy-td4114395.html

 > * Why is method "probability(double)" part of the "RealDistribution"
 >    interface? [All implementations return 0.]

My guess would be that it is left over from when integer and real 
distributions shared a common hierarchy.

Dennis


On 07/31/2012 01:41 AM, Gilles Sadowski wrote:
> Hello.
>
> Some questions about classes in package "o.a.c.m.distribution".
>
> * Why doesn't "KolmogorovSmirnovDistribution" implement one of the
>    interfaces of the package (and/or inherit from an abstract class)?
> * Why is method "probability(double)" part of the "RealDistribution"
>    interface? [All implementations return 0.]
> * Shouldn't the "cumulativeProbability(double x0, double x1)" method be
>    renamed "probability(double x0, double x1)"?
>
>
> Regards,
> Gilles
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org