Posted to dev@commons.apache.org by Piotr Kochański <pi...@uw.edu.pl> on 2004/01/19 13:03:34 UTC

[math] Re: bootstrap confidence intervals

Phil Steitz wrote:

> Excellent points. We would appreciate any other comments (or patches :-)
> that you have on the code or algorithms in [math]. 

Calculating confidence intervals with the bootstrap
is a bit complicated, so I started with the standard error (SE).
In principle one can calculate a bootstrap CI from such an SE,
but this is often not good enough in practice.

To calculate the CI properly, a fast algorithm
for computing the normal distribution function is
needed. It is not present in commons-math (as far as
I can remember, I sent some code doing this to
the dev mailing list).
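
For illustration only, here is a minimal sketch of one well-known fast approximation of the normal distribution function. It is not the code mentioned above; the class name NormalCdfSketch is hypothetical, and the erf approximation is the Abramowitz and Stegun formula 7.1.26, whose absolute error is about 1.5e-7.

```java
/** Hypothetical sketch; not the code sent to the list and not part of commons-math. */
final class NormalCdfSketch {

    /** erf(x) by Abramowitz and Stegun 7.1.26 (absolute error about 1.5e-7). */
    static double erf(double x) {
        double ax = Math.abs(x);
        double t = 1.0 / (1.0 + 0.3275911 * ax);
        // Horner evaluation of the degree-5 polynomial in t
        double poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
                + t * (-1.453152027 + t * 1.061405429))));
        double y = 1.0 - poly * Math.exp(-ax * ax);
        // erf is an odd function
        return x >= 0 ? y : -y;
    }

    /** Standard normal cumulative distribution function. */
    static double cdf(double z) {
        return 0.5 * (1.0 + erf(z / Math.sqrt(2.0)));
    }
}
```

The accuracy is enough for quick checks; a library implementation would likely want a more precise method.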

I attach classes that calculate the SE using the bootstrap.
If you are interested in adding this code, please let me know;
I will then have to clean up the code and javadoc and write tests.
Any information about package organization, etc. would be
helpful as well.

For now I have put everything into the o.a.c.math.stat.bootstrap
package.

A short description follows:

The Test class shows how to use the bootstrap to calculate
the standard error.

The Bootstrap class does the actual resampling. It also provides
a method that returns an array of the values of a given
statistic calculated for every resample.
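
The resampling step described above can be sketched roughly as follows. This is not the attached Bootstrap class: the names BootstrapSketch, replicates and standardError are hypothetical, and the sketch uses modern Java (java.util.function) for brevity.

```java
import java.util.Random;
import java.util.function.ToDoubleFunction;

/** Hypothetical sketch of nonparametric bootstrap resampling; not the attached code. */
final class BootstrapSketch {

    /**
     * Draws b resamples of the same size as data, with replacement,
     * and returns the value of the statistic for each resample.
     */
    static double[] replicates(double[] data, int b,
                               ToDoubleFunction<double[]> stat, Random rng) {
        double[] out = new double[b];
        // Buffer is refilled for every resample, so stat must not keep a reference to it
        double[] sample = new double[data.length];
        for (int i = 0; i < b; i++) {
            for (int j = 0; j < sample.length; j++) {
                sample[j] = data[rng.nextInt(data.length)];
            }
            out[i] = stat.applyAsDouble(sample);
        }
        return out;
    }

    /** Bootstrap SE: sample standard deviation (divisor b - 1) of the replicates. */
    static double standardError(double[] replicates) {
        double m = mean(replicates);
        double ss = 0.0;
        for (double r : replicates) {
            ss += (r - m) * (r - m);
        }
        return Math.sqrt(ss / (replicates.length - 1));
    }

    /** Arithmetic mean, used here as the example statistic. */
    static double mean(double[] x) {
        double s = 0.0;
        for (double v : x) {
            s += v;
        }
        return s / x.length;
    }
}
```

A call such as BootstrapSketch.standardError(BootstrapSketch.replicates(data, 200, BootstrapSketch::mean, new Random())) would then give the bootstrap SE of the mean from 200 resamples.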

StandardError is an interface common to
all ways of calculating a standard error.
MeanStandardError and BootstrapStandardError implement
this interface to calculate the SE in a particular situation.

Usage is as follows:
// valSmall holds the sample data (defined in the attached Test class)
StandardError bse = new BootstrapStandardError();

// B = 200, presumably the number of bootstrap resamples
((BootstrapStandardError) bse).setB(200);

bse.setStat(new Mean());
System.out.println("se for the mean (bootstrap): " +
    bse.getStandardError(valSmall));
bse.setStat(new Median());
System.out.println("se for the median (bootstrap): " +
    bse.getStandardError(valSmall));

StandardError se = new MeanStandardError();
se.setStat(new Mean());
System.out.println("se for the mean (standard formula): " +
    se.getStandardError(valSmall));

System.out.println("---");
// this requires patching the Mean class
Mean m = new Mean();
m.setStandardError(bse);
System.out.println("se for the mean (bootstrap thru Mean): " +
    m.getStandardError(valSmall));


I would add the confidence interval calculation code in
a similar way, if this is OK with you.

Regards

Piotr Kochanski

> --- Piotr Kochański <pi...@uw.edu.pl> wrote:
> 
> > Another problem with bootstrap confidence intervals is that
> > they are non-parametric and, inevitably, provide less
> > power in statistical tests than any parametric method.
> > 
> > Some people may be disappointed that it is harder
> > to obtain significant results, so I think that the usual calculation method,
> > based on the normal distribution assumption, should also be provided
> > (it can be done only for a few statistics, though).
> > 
> > There is also a "parametric" bootstrap, but this cannot be programmed
> > in a generic way, applicable in any situation.
> > 
> > Still, the bootstrap is a very safe solution, given its distribution
> > independence and the ease of using it for any statistic.
> > 
> > Piotr Kochanski (pi@uw.edu.pl)
> > 




Re: [math] Re: bootstrap confidence intervals

Posted by Phil Steitz <ph...@steitz.com>.
Piotr,

Thanks for this patch. I like the simple bootstrap impl, which could be 
applied to any of our statistics.  Please do clean up and resubmit with 
the Apache License on each file.  It might be easier to track if you attached 
it to a Bugzilla enhancement request.
> 
> Calculating confidence intervals with the bootstrap
> is a bit complicated, so I started with the standard error (SE).
> In principle one can calculate a bootstrap CI from such an SE,
> but this is often not good enough in practice.

Yes, but the bootstrap SE is valuable, nonetheless.  I would like 
eventually to include an impl of direct bootstrap confidence intervals, 
which really means computing the bootstrap distribution.
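
One simple way to read an interval off the bootstrap distribution is the percentile method, sketched below. This is only an illustration, not code from [math] or from the patch; the class and method names are hypothetical, and the linear-interpolation quantile is one of several common conventions.

```java
import java.util.Arrays;

/** Hypothetical sketch of a percentile bootstrap interval; not commons-math code. */
final class PercentileIntervalSketch {

    /**
     * Returns {lower, upper}: the alpha/2 and 1 - alpha/2 empirical quantiles
     * of the bootstrap replicates of a statistic.
     */
    static double[] percentileInterval(double[] replicates, double alpha) {
        double[] sorted = replicates.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        return new double[] {
            quantile(sorted, alpha / 2.0),
            quantile(sorted, 1.0 - alpha / 2.0)
        };
    }

    /** Quantile of a sorted array with linear interpolation between order statistics. */
    static double quantile(double[] sorted, double p) {
        double pos = p * (sorted.length - 1);
        int lo = (int) Math.floor(pos);
        int hi = Math.min(lo + 1, sorted.length - 1);
        return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
    }
}
```

The replicates array here is the set of statistic values computed on each resample; with alpha = 0.05 the result is a nominal 95% interval.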
> 
> To calculate the CI properly, a fast algorithm
> for computing the normal distribution function is
> needed. It is not present in commons-math (as far as
> I can remember, I sent some code doing this to
> the dev mailing list).

This depends on the statistic, of course.  For means and regression 
parameters, where the associated standard deviations are estimated from 
the data, what you need is the t distribution, which we have.  An example 
is getSlopeConfidenceInterval in the BivariateRegression impl.
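
As a reminder of the textbook form (not taken from the [math] code), the t-based interval for a mean, with the sample standard deviation s estimated from n observations and confidence level 1 - alpha, is:

```latex
\bar{x} \;\pm\; t_{n-1,\,1-\alpha/2}\,\frac{s}{\sqrt{n}}
```

For a simple-regression slope the same shape applies, with n - 2 degrees of freedom and the slope's estimated standard error in place of s / sqrt(n).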

I will review the normal distribution patch, however, as I agree that this 
would be a good addition in any case.

Phil


---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org