You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@commons.apache.org by Gilles <gi...@harfang.homelinux.org> on 2016/05/07 17:11:46 UTC

[Math] Design of "o.a.c.m.stat.descriptive.moment"

Hi.

Looking at some way out for MATH-1228, I uncovered what looks to me
as an abuse of inheritance: "ThirdMoment" extends "SecondMoment".

"Skewness" contains a "ThirdMoment" instance variable but, in order
to compute its value, "m3" (cf. "getResult()"), it also must read
the result of that variable's parent type ("SecondMoment"), i.e. "m2".

IMO, this makes no sense at the conceptual level: a third moment is
_not_ a second moment.

And the problem is compounded as:
  * "FourthMoment" inherits from "ThirdMoment", and
  * "SecondMoment" extends "FirstMoment".

In this strange hierarchy, the fields are all "protected" in order
to be accessed from subclasses, making the whole thing fragile and
inefficient (as pointed out by the OP of MATH-1228).

 From a programming POV, the design is flawed because "getResult()"
is, rightfully, intended to be overridden in subclasses.
However, "protected" access is used to work around that very purpose
of inheritance.
And thus, good practice (cf. MATH-758) cannot be implemented.

Did I miss the point of this package's design?
Or am I right that it should be completely redesigned?


Regards,
Gilles


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [Math] Design of "o.a.c.m.stat.descriptive.moment"

Posted by Gilles <gi...@harfang.homelinux.org>.
Hello.

On Mon, 16 May 2016 16:42:12 -0700, michael.brzustowicz@gmail.com 
wrote:
> Hi Giles,
> I ran into the same problem. Many months ago I asked about higher 
> order
> moments in Multivariate Stats. I pulled the code and implemented 
> third and
> fourth order moments in MVStats. I had to "unprotect" some methods 
> and
> mostly got everything working. The problem arose (I think) in one of 
> the
> unit tests for AggregateBlahBlah where the code actually uses stat
> quantities (like std dev) to go backwards to get 2nd moment. I gave 
> up when
> I saw that :)
> I think it could be reimplemented easier.

I'd think so too.
But the devil is often in the details...

> Why not compose a class called
> StatisticalMoments which contains 1st, 2nd, 3rd, 4th moments as hard 
> coded
> variables, with hard coded methods for updating those moments and 
> merging
> them. Higher order moments (than fourth) could be contained in array 
> and
> the requisite update methods could be the generic expressions for 
> n-order
> update / merge.
> In this way, all other stat classes could inherit the 
> StatisticalMoments
> class.
> Maybe you have all thought of doing it this way already, and it 
> failed
> spectacularly for some reason :)

I did not think of it much.[1]
There are certainly quite a few things that could be improved. A some
are already reported on the bug-tracking system...

If you are willing to try a new approach, do not hesitate.
We can make room for a new package to that end and then compare actual
implementations for design clarity, functionality and performance.

Regards,
Gilles

[1] To be more accurate: I started to think about it, but the 
(political)
     effort was not worth it...

> Thanx,
> Mike Brzustowicz
>
> On Sat, May 7, 2016 at 10:11 AM, Gilles 
> <gi...@harfang.homelinux.org>
> wrote:
>
>> Hi.
>>
>> Looking at some way out for MATH-1228, I uncovered what looks to me
>> as an abuse of inheritance: "ThirdMoment" extends "SecondMoment".
>>
>> "Skewness" contains a "ThirdMoment" instance variable but, in order
>> to compute its value, "m3" (cf. "getResult()"), it also must read
>> the result of that variable's parent type ("SecondMoment"), i.e. 
>> "m2".
>>
>> IMO, this makes no sense at the conceptual level: a third moment is
>> _not_ a second moment.
>>
>> And the problem is compounded as:
>>  * "FourthMoment" inherits from "ThirdMoment", and
>>  * "SecondMoment" extends "FirstMoment".
>>
>> In this strange hierarchy, the fields are all "protected" in order
>> to be accessed from subclasses, making the whole thing fragile and
>> inefficient (as pointed out by the OP of MATH-1228).
>>
>> From a programming POV, the design is flawed because "getResult()"
>> is, rightfully, intended to be overridden in subclasses.
>> However, "protected" access is used to work around that very purpose
>> of inheritance.
>> And thus, good practice (cf. MATH-758) cannot be implemented.
>>
>> Did I miss the point of this package's design?
>> Or am I right that it should be completely redesigned?
>>
>>
>> Regards,
>> Gilles


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [Math] Design of "o.a.c.m.stat.descriptive.moment"

Posted by "michael.brzustowicz@gmail.com" <mi...@gmail.com>.
Hi Giles,
I ran into the same problem. Many months ago I asked about higher order
moments in Multivariate Stats. I pulled the code and implemented third and
fourth order moments in MVStats. I had to "unprotect" some methods and
mostly got everything working. The problem arose (I think) in one of the
unit tests for AggregateBlahBlah where the code actually uses stat
quantities (like std dev) to go backwards to get 2nd moment. I gave up when
I saw that :)
I think it could be reimplemented easier. Why not compose a class called
StatisticalMoments which contains 1st, 2nd, 3rd, 4th moments as hard coded
variables, with hard coded methods for updating those moments and merging
them. Higher order moments (than fourth) could be contained in array and
the requisite update methods could be the generic expressions for n-order
update / merge.
In this way, all other stat classes could inherit the StatisticalMoments
class.
Maybe you have all thought of doing it this way already, and it failed
spectacularly for some reason :)
Thanx,
Mike Brzustowicz

On Sat, May 7, 2016 at 10:11 AM, Gilles <gi...@harfang.homelinux.org>
wrote:

> Hi.
>
> Looking at some way out for MATH-1228, I uncovered what looks to me
> as an abuse of inheritance: "ThirdMoment" extends "SecondMoment".
>
> "Skewness" contains a "ThirdMoment" instance variable but, in order
> to compute its value, "m3" (cf. "getResult()"), it also must read
> the result of that variable's parent type ("SecondMoment"), i.e. "m2".
>
> IMO, this makes no sense at the conceptual level: a third moment is
> _not_ a second moment.
>
> And the problem is compounded as:
>  * "FourthMoment" inherits from "ThirdMoment", and
>  * "SecondMoment" extends "FirstMoment".
>
> In this strange hierarchy, the fields are all "protected" in order
> to be accessed from subclasses, making the whole thing fragile and
> inefficient (as pointed out by the OP of MATH-1228).
>
> From a programming POV, the design is flawed because "getResult()"
> is, rightfully, intended to be overridden in subclasses.
> However, "protected" access is used to work around that very purpose
> of inheritance.
> And thus, good practice (cf. MATH-758) cannot be implemented.
>
> Did I miss the point of this package's design?
> Or am I right that it should be completely redesigned?
>
>
> Regards,
> Gilles
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>