Posted to user@commons.apache.org by "michael.brzustowicz@gmail.com" <mi...@gmail.com> on 2015/12/22 17:58:59 UTC

[math] Summary Stats Higher Moments?

Hi,

I see that org.apache.commons.math3.stat.descriptive.DescriptiveStatistics
uses the singleton update formulas (from Pebay) for calculating
(un-normalized) moments up to the 4th moment. Is there some reason that
org.apache.commons.math3.stat.descriptive.SummaryStatistics excludes both
third and fourth central moments?

Is it just a matter of computational efficiency, i.e., DescriptiveStatistics
calculates moments only when the getter is invoked (and all orders need not
be calculated at once) while the "storeless" SummaryStatistics would need
to calculate all 4 orders at every call to update()? Or is there some other
blocker?

Thanx,
Mike Brzustowicz

Re: [math] Summary Stats Higher Moments?

Posted by Phil Steitz <ph...@gmail.com>.
On 12/28/15 3:55 PM, michael.brzustowicz@gmail.com wrote:
> Hi Phil,
> That would be great! I think adding third, fourth moments and skewness,
> kurtosis would be a very useful addition.
>
> Also, considering the formulas in Pebay, perhaps a method like
>
> void merge(SummaryStatistics ss) {
>     // use Pebay update formulas to merge un-normalized moments of ss
>     // with "this"
>     // if one is singleton use "update" method instead
> }
>
> could be added to SummaryStatistics in addition to "update". I realize
> AggregateSummaryStatistics takes care of merging 1st,2nd order stats, so
> this may be redundant.

I think AggregateSummaryStatistics takes care of the general case
for this; but IIRC we did at one point talk about adding a bivariate
method like you have above.  Seems a reasonable addition to
SummaryStatistics.
>
> If there is something I can do to help, please let me know.
> -Mike Brzustowicz

What you are most welcome to do is open a JIRA issue requesting the
features discussed in this thread and attach patches that implement
them, with test cases.  Ask here or off-list if you have any
questions about how to work with git, JIRA, Maven, etc.

Phil

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org


Re: [math] Summary Stats Higher Moments?

Posted by "michael.brzustowicz@gmail.com" <mi...@gmail.com>.
Hi Phil,
That would be great! I think adding third, fourth moments and skewness,
kurtosis would be a very useful addition.

Also, considering the formulas in Pebay, perhaps a method like

void merge(SummaryStatistics ss) {
    // use Pebay update formulas to merge un-normalized moments of ss
    // with "this"
    // if one is singleton use "update" method instead
}

could be added to SummaryStatistics in addition to "update". I realize
AggregateSummaryStatistics already takes care of merging 1st- and
2nd-order stats, so this may be redundant.
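
To make the merge idea concrete, here is a minimal sketch of the pairwise
formulas for combining two sets of un-normalized central moments up to
order 4. The class MomentState, its fields, and the of() helper are
hypothetical names used only for illustration; none of this is existing
Commons Math API, and it is not a proposed patch.

```java
/** Hypothetical holder: count, mean, and un-normalized central moments. */
final class MomentState {
    final long n;
    final double mean;
    final double m2, m3, m4; // sums of (x - mean)^k for k = 2, 3, 4

    MomentState(long n, double mean, double m2, double m3, double m4) {
        this.n = n;
        this.mean = mean;
        this.m2 = m2;
        this.m3 = m3;
        this.m4 = m4;
    }

    /** Plain two-pass computation, used here only to build inputs. */
    static MomentState of(double... values) {
        long n = values.length;
        if (n == 0) {
            return new MomentState(0, 0, 0, 0, 0);
        }
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / n;
        double m2 = 0, m3 = 0, m4 = 0;
        for (double v : values) {
            double d = v - mean;
            double d2 = d * d;
            m2 += d2;
            m3 += d2 * d;
            m4 += d2 * d2;
        }
        return new MomentState(n, mean, m2, m3, m4);
    }

    /** Returns the moments of the union of the two underlying samples. */
    MomentState merge(MomentState other) {
        if (other.n == 0) {
            return this;
        }
        if (this.n == 0) {
            return other;
        }
        double nA = this.n, nB = other.n, nAB = nA + nB;
        double delta = other.mean - this.mean;
        double dn = delta / nAB; // delta / n, reused in every term
        double mean = this.mean + nB * dn;
        double m2 = this.m2 + other.m2 + delta * dn * nA * nB;
        double m3 = this.m3 + other.m3
                + delta * dn * dn * nA * nB * (nA - nB)
                + 3 * dn * (nA * other.m2 - nB * this.m2);
        double m4 = this.m4 + other.m4
                + delta * dn * dn * dn * nA * nB * (nA * nA - nA * nB + nB * nB)
                + 6 * dn * dn * (nA * nA * other.m2 + nB * nB * this.m2)
                + 4 * dn * (nA * other.m3 - nB * this.m3);
        return new MomentState(this.n + other.n, mean, m2, m3, m4);
    }

    public static void main(String[] args) {
        MomentState merged = of(2, 4, 4, 4).merge(of(5, 5, 7, 9));
        System.out.println(merged.n + " " + merged.mean + " "
                + merged.m2 + " " + merged.m3 + " " + merged.m4);
    }
}
```

The merged values should agree, up to rounding, with a two-pass
computation over all eight values. When "other" holds a single value its
m2, m3 and m4 are zero, and as far as I can tell the formulas then reduce
to the singleton update, so the special case mentioned in the comment
above would be an optimization rather than a requirement.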

If there is something I can do to help, please let me know.
-Mike Brzustowicz

On Wed, Dec 23, 2015 at 5:44 AM, Phil Steitz <ph...@gmail.com> wrote:

> On 12/22/15 9:58 AM, michael.brzustowicz@gmail.com wrote:
> > Hi,
> >
> > I see that org.apache.commons.math3.stat.descriptive.DescriptiveStatistics
> > uses the singleton update formulas (from Pebay) for calculating
> > (un-normalized) moments up to the 4th moment. Is there some reason that
> > org.apache.commons.math3.stat.descriptive.SummaryStatistics excludes both
> > third and fourth central moments?
> >
> > Is it just a matter of computational efficiency, i.e., DescriptiveStatistics
> > calculates moments only when the getter is invoked (and all orders need not
> > be calculated at once) while the "storeless" SummaryStatistics would need
> > to calculate all 4 orders at every call to update()?
>
> Yes, that is the reason; but it is really more a matter of no one
> having asked for this feature.  You are correct that the updating
> formulas make this possible and the nested nature of the moments
> means that there should not be much cost to adding the third and
> fourth moments.  I would be happy to review and apply a patch (with
> tests) adding these.
>
> Phil
>
> >  Or is there some other
> > blocker?
> >
> > Thanx,
> > Mike Brzustowicz
> >

Re: [math] Summary Stats Higher Moments?

Posted by Phil Steitz <ph...@gmail.com>.
On 12/22/15 9:58 AM, michael.brzustowicz@gmail.com wrote:
> Hi,
>
> I see that org.apache.commons.math3.stat.descriptive.DescriptiveStatistics
> uses the singleton update formulas (from Pebay) for calculating
> (un-normalized) moments up to the 4th moment. Is there some reason that
> org.apache.commons.math3.stat.descriptive.SummaryStatistics excludes both
> third and fourth central moments?
>
> Is it just a matter of computational efficiency, i.e., DescriptiveStatistics
> calculates moments only when the getter is invoked (and all orders need not
> be calculated at once) while the "storeless" SummaryStatistics would need
> to calculate all 4 orders at every call to update()?

Yes, that is the reason; but it is really more a matter of no one
having asked for this feature.  You are correct that the updating
formulas make this possible and the nested nature of the moments
means that there should not be much cost to adding the third and
fourth moments.  I would be happy to review and apply a patch (with
tests) adding these.
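
As an illustration of why the extra cost should be small, here is a
minimal sketch of a one-pass update that maintains the un-normalized
second, third and fourth central moments together. The class
RunningMoments and its method names are hypothetical and are not part of
Commons Math. The skewness and kurtosis shown are the plain population
versions, which may differ from the bias-corrected values the library
reports.

```java
/** Hypothetical storeless accumulator for central moments up to order 4. */
final class RunningMoments {
    private long n;
    private double mean;
    private double m2, m3, m4; // sums of (x - mean)^k for k = 2, 3, 4

    /** Folds one value into the running moments in constant time. */
    void update(double x) {
        long n1 = n; // count before this value
        n++;
        double delta = x - mean;
        double deltaN = delta / n;
        double deltaN2 = deltaN * deltaN;
        double term1 = delta * deltaN * n1;
        mean += deltaN;
        // Order matters: m4 needs the old m3 and m2, m3 needs the old m2.
        m4 += term1 * deltaN2 * ((double) n * n - 3.0 * n + 3.0)
                + 6.0 * deltaN2 * m2 - 4.0 * deltaN * m3;
        m3 += term1 * deltaN * (n - 2) - 3.0 * deltaN * m2;
        m2 += term1;
    }

    long getN() { return n; }
    double getMean() { return mean; }
    double getM2() { return m2; }
    double getM3() { return m3; }
    double getM4() { return m4; }

    /** Population skewness; NaN when it is undefined. */
    double skewness() {
        if (n < 2 || m2 == 0) {
            return Double.NaN;
        }
        return Math.sqrt((double) n) * m3 / Math.pow(m2, 1.5);
    }

    /** Population excess kurtosis; NaN when it is undefined. */
    double kurtosis() {
        if (n < 2 || m2 == 0) {
            return Double.NaN;
        }
        return n * m4 / (m2 * m2) - 3.0;
    }

    public static void main(String[] args) {
        RunningMoments rm = new RunningMoments();
        for (double x : new double[] {2, 4, 4, 4, 5, 5, 7, 9}) {
            rm.update(x);
        }
        System.out.println(rm.getM2() + " " + rm.getM3() + " " + rm.getM4());
    }
}
```

Each update reuses delta, delta / n and the lower-order sums, so the
third and fourth moments add only a few multiplications per value on top
of what the variance update already needs.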

Phil

>  Or is there some other
> blocker?
>
> Thanx,
> Mike Brzustowicz
>


