Posted to dev@commons.apache.org by Piotr Kochański <pi...@uw.edu.pl> on 2004/01/19 13:03:34 UTC
[math] Re: bootstrap confidence intervals
Phil Steitz wrote:
> Excellent points. We would appreciate any other comments (or patches :-)
> that you have on the code or algorithms in [math].
Calculating confidence intervals with the bootstrap
is a bit complicated, so I started with the standard error (SE).
In principle one can calculate a bootstrap CI from such an SE,
but this is often not good enough in practice.
To calculate the CI properly, a fast algorithm
for the normal distribution function is
needed. It is not present in commons-math (as far as
I can remember, I sent some code doing this to
the dev mailing list).
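For illustration, a normal distribution function of the kind mentioned above could be sketched as follows. This is not the code sent to the list; it is a minimal sketch that assumes the Abramowitz and Stegun rational approximation of erf (formula 7.1.26, absolute error below about 1.5e-7), and the class name NormalCdfSketch is hypothetical.

```java
/** Sketch only: standard normal CDF built on an approximation of erf. */
final class NormalCdfSketch {

    // Coefficients of Abramowitz and Stegun formula 7.1.26 (assumed choice of algorithm).
    private static final double P  = 0.3275911;
    private static final double A1 = 0.254829592;
    private static final double A2 = -0.284496736;
    private static final double A3 = 1.421413741;
    private static final double A4 = -1.453152027;
    private static final double A5 = 1.061405429;

    /** Approximates erf(x) for any real x, using the symmetry erf(-x) = -erf(x). */
    static double erf(double x) {
        double sign = x < 0 ? -1.0 : 1.0;
        double ax = Math.abs(x);
        double t = 1.0 / (1.0 + P * ax);
        // Horner evaluation of A1*t + A2*t^2 + A3*t^3 + A4*t^4 + A5*t^5.
        double poly = ((((A5 * t + A4) * t + A3) * t + A2) * t + A1) * t;
        return sign * (1.0 - poly * Math.exp(-ax * ax));
    }

    /** Standard normal CDF: Phi(z) = 0.5 * (1 + erf(z / sqrt(2))). */
    static double cdf(double z) {
        return 0.5 * (1.0 + erf(z / Math.sqrt(2.0)));
    }

    public static void main(String[] args) {
        System.out.println("Phi(0)    = " + cdf(0.0));
        System.out.println("Phi(1.96) = " + cdf(1.96));
    }
}
```

The approximation trades accuracy for speed: it needs one call to Math.exp and a short polynomial, which is usually sufficient for confidence levels but not for extreme tail probabilities.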
I attach classes which calculate the SE using the bootstrap.
If you are interested in adding this code, please let me know;
I will have to clean up the code, add javadoc and write tests.
Any information about package organization, etc. would be
helpful as well.
For now I put everything into the o.a.c.math.stat.bootstrap
package.
A short description follows:
The Test class shows how to use the bootstrap to calculate
the standard error.
The Bootstrap class does the actual resampling. It also provides
a method that returns an array with the values of a given
statistic calculated for every sample.
StandardError is an interface common to
all possible ways of calculating the standard error.
MeanStandardError and BootstrapStandardError implement
this interface to calculate the SE in a particular situation.
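The resampling idea described above can be sketched roughly as follows. This is not the attached code: the class BootstrapSeSketch and its method names are hypothetical, and it assumes the usual definition of the bootstrap SE as the sample standard deviation (divisor B - 1) of the statistic over B resamples drawn with replacement.

```java
import java.util.Random;
import java.util.function.ToDoubleFunction;

/** Sketch only: non-parametric bootstrap standard error of an arbitrary statistic. */
final class BootstrapSeSketch {

    /** Draws b resamples with replacement and returns the statistic of each one. */
    static double[] replicates(double[] data, int b,
                               ToDoubleFunction<double[]> stat, Random rng) {
        if (data.length == 0 || b < 2) {
            throw new IllegalArgumentException("need non-empty data and b >= 2");
        }
        double[] values = new double[b];
        double[] sample = new double[data.length];
        for (int i = 0; i < b; i++) {
            // Each resample has the same size as the original data.
            for (int j = 0; j < sample.length; j++) {
                sample[j] = data[rng.nextInt(data.length)];
            }
            values[i] = stat.applyAsDouble(sample);
        }
        return values;
    }

    /** Bootstrap SE: standard deviation (divisor b - 1) of the replicates. */
    static double standardError(double[] data, int b,
                                ToDoubleFunction<double[]> stat, Random rng) {
        double[] reps = replicates(data, b, stat, rng);
        double m = mean(reps);
        double sumSq = 0.0;
        for (double r : reps) {
            sumSq += (r - m) * (r - m);
        }
        return Math.sqrt(sumSq / (b - 1));
    }

    /** Arithmetic mean, used here both as a helper and as an example statistic. */
    static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        return sum / x.length;
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        double se = standardError(data, 200, BootstrapSeSketch::mean, new Random());
        System.out.println("se for the mean (bootstrap): " + se);
    }
}
```

Passing the statistic as a function keeps the resampling generic, which is the property that makes the bootstrap applicable to any statistic; passing the Random instance in makes results reproducible when a seed is fixed.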
The usage is the following:
// SE of the mean and of the median, estimated by bootstrap
StandardError bse = new BootstrapStandardError();
((BootstrapStandardError) bse).setB(200);
bse.setStat(new Mean());
System.out.println("se for the mean (bootstrap): " +
    bse.getStandardError(valSmall));
bse.setStat(new Median());
System.out.println("se for the median (bootstrap): " +
    bse.getStandardError(valSmall));
// SE of the mean from the standard formula
StandardError se = new MeanStandardError();
se.setStat(new Mean());
System.out.println("se for the mean (standard formula): " +
    se.getStandardError(valSmall));
System.out.println("---");
// this requires patching the Mean class
Mean m = new Mean();
m.setStandardError(bse);
System.out.println("se for the mean (bootstrap thru Mean): " +
    m.getStandardError(valSmall));
I would add the confidence interval calculation code in
a similar way, if this is OK with you.
Regards
Piotr Kochanski
> --- Piotr Kochański <pi...@uw.edu.pl> wrote:
>
> > Another problem with bootstrap confidence intervals is that
> > they are non-parametric and, inevitably, they provide less
> > power when doing statistical tests than any parametric method.
> >
> > Some people may be disappointed that it is harder
> > to obtain significant results, so I think that the usual calculation method,
> > based on the normal distribution assumption, should also be provided
> > (it can be done only for a few statistics, though).
> >
> > There is also a "parametric" bootstrap, but this cannot be programmed
> > in a generic way, applicable in any situation.
> >
> > Still, the bootstrap is a very safe solution, given its independence
> > of the distribution
> > and the ease of using it for any statistic.
> >
> > Piotr Kochanski (pi@uw.edu.pl)
> >
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: commons-dev-help@jakarta.apache.org
>
Re: [math] Re: bootstrap confidence intervals
Posted by Phil Steitz <ph...@steitz.com>.
Piotr,
Thanks for this patch. I like the simple bootstrap impl, which could be
applied to any of our statistics. Please do clean up and resubmit with the
Apache License on each file. It might be easier to track if you attached
it to a Bugzilla enhancement request.
>
> Calculation of confidence intervals using bootstrap
> is a bit complicated, so I started with Standard Error -
> basically, one might calculate bootstrap CI using such SE,
> however this is often not good enough in practice.
Yes, but the bootstrap SE is valuable, nonetheless. I would like
eventually to include an impl of direct bootstrap confidence intervals,
which really means computing the bootstrap distribution.
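One common way to obtain an interval directly from the bootstrap distribution is the percentile method: sort the bootstrap replicates of the statistic and read off the empirical quantiles. The sketch below assumes that method (the message does not say which variant is intended), and the class PercentileCiSketch and its helpers are hypothetical names, not part of [math].

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.ToDoubleFunction;

/** Sketch only: percentile bootstrap confidence interval for an arbitrary statistic. */
final class PercentileCiSketch {

    /** Returns {lower, upper} for the given confidence level, e.g. 0.95. */
    static double[] percentileInterval(double[] data, int b, double level,
                                       ToDoubleFunction<double[]> stat, Random rng) {
        if (data.length == 0 || b < 2 || level <= 0.0 || level >= 1.0) {
            throw new IllegalArgumentException("invalid data, b or level");
        }
        double[] reps = new double[b];
        double[] sample = new double[data.length];
        for (int i = 0; i < b; i++) {
            // Resample with replacement, same size as the original data.
            for (int j = 0; j < sample.length; j++) {
                sample[j] = data[rng.nextInt(data.length)];
            }
            reps[i] = stat.applyAsDouble(sample);
        }
        // The sorted replicates are the empirical bootstrap distribution.
        Arrays.sort(reps);
        double alpha = 1.0 - level;
        return new double[] {
            quantile(reps, alpha / 2.0),
            quantile(reps, 1.0 - alpha / 2.0)
        };
    }

    /** Empirical quantile of sorted values, interpolating linearly between order statistics. */
    static double quantile(double[] sorted, double p) {
        double pos = p * (sorted.length - 1);
        int lo = (int) Math.floor(pos);
        int hi = (int) Math.ceil(pos);
        return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
    }

    /** Arithmetic mean, used as the example statistic. */
    static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        return sum / x.length;
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        double[] ci = percentileInterval(data, 2000, 0.95, PercentileCiSketch::mean, new Random());
        System.out.println("95% percentile CI for the mean: [" + ci[0] + ", " + ci[1] + "]");
    }
}
```

Unlike an interval built from the SE, this needs no normal or t quantile at all, but it typically requires many more resamples (thousands rather than a few hundred) for the tail quantiles to be stable.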
>
> In order to calculate CI in a right way a fast algorithm
> for calculation of normal distribution function is
> needed, it is not present in commons-math (as far as
> I can remember I sent some code doing this to
> a dev mailing list).
This depends on the statistic, of course. For means and regression
parameters, where the associated standard deviations are estimated from
the data, what you need is the t distribution, which we have. An example
is getSlopeConfidenceInterval in the BivariateRegression impl.
I will review the normal distribution patch, however, as I agree that this
would be a good addition in any case.
Phil