Posted to dev@commons.apache.org by Phil Steitz <ph...@steitz.com> on 2003/05/21 16:10:41 UTC
[math] Priorities, help needed
I am working on getting myself set up with Maven, but I wanted to get
this list out to any who might be willing to a) contribute or b) comment
on priorities or direction.
The proposal presents the following initial scope:
* Simple univariate statistics (mean, standard deviation, n,
confidence intervals)
* Frequency distributions
* t-test, chi-square test
* Random numbers from Gaussian, Exponential, Poisson distributions
* Random sampling/resampling
* Bivariate regression, correlation
and mathematical algorithms such as the following:
* Basic Complex Number representation with algebraic operations
* Newton's method for finding roots
* Binomial coefficients
* Exponential growth and decay (set up for financial applications)
* Polynomial Interpolation (curve fitting)
* Basic Matrix representation with algebraic operations
The following items need completion:
* Univariate needs confidence intervals. I would recommend doing this
by first defining a t-statistic in TestStatistic and then using it.
This is very simple. "Nice to haves" (IMHO) for Univariate would be
the addition of quantiles (1, 5, 10, 25, 50, 75, 90, 95, 99) and
bootstrap confidence intervals for the versions that store data, and
maybe higher order moments (if possible) for UnivariateImpl. I would
prioritize the quantiles (most important) and
t-based confidence intervals over the higher order moments or
bootstrap confidence intervals.
* t-test statistic needs to be added and we should probably add the
capability of actually performing t- and chi-square tests at fixed
significance levels (.1, .05, .01, .001). Down the road, numerical
approximation of the t- and chi-square distributions could be added to
enable user-supplied significance levels. Also, more tests.
* The RealMatrixImpl class is missing some key method implementations.
The critical thing is inversion. We need to implement a numerically
sound inversion algorithm. This will enable solve() and also
support general linear regression.
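To make the inversion item concrete, here is a minimal sketch of one
common approach: Gaussian elimination with partial pivoting, used to
solve a * x = b. The class name LinearSolveSketch and its solve()
signature are hypothetical and are not taken from the submitted
RealMatrixImpl code; the exact-zero singularity check is a
simplification that a real implementation would probably replace with
a tolerance.

```java
/** Hypothetical helper, not part of the submitted RealMatrixImpl code. */
final class LinearSolveSketch {

    private LinearSolveSketch() { }

    /**
     * Solves a * x = b by Gaussian elimination with partial pivoting.
     * Both inputs are copied, so the caller's arrays are left untouched.
     */
    static double[] solve(double[][] a, double[] b) {
        int n = b.length;
        if (a.length != n) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double[][] m = new double[n][];
        for (int i = 0; i < n; i++) {
            if (a[i].length != n) {
                throw new IllegalArgumentException("matrix must be square");
            }
            m[i] = a[i].clone();
        }
        double[] x = b.clone();

        for (int col = 0; col < n; col++) {
            // Partial pivoting: pick the row with the largest entry in this column.
            int pivot = col;
            for (int row = col + 1; row < n; row++) {
                if (Math.abs(m[row][col]) > Math.abs(m[pivot][col])) {
                    pivot = row;
                }
            }
            // Simplification: only an exactly zero pivot is treated as singular.
            if (m[pivot][col] == 0.0) {
                throw new ArithmeticException("matrix is singular");
            }
            double[] tmpRow = m[col];
            m[col] = m[pivot];
            m[pivot] = tmpRow;
            double tmpVal = x[col];
            x[col] = x[pivot];
            x[pivot] = tmpVal;

            // Eliminate the entries below the pivot.
            for (int row = col + 1; row < n; row++) {
                double factor = m[row][col] / m[col][col];
                x[row] -= factor * x[col];
                for (int k = col; k < n; k++) {
                    m[row][k] -= factor * m[col][k];
                }
            }
        }

        // Back substitution on the upper triangular system.
        for (int row = n - 1; row >= 0; row--) {
            double sum = x[row];
            for (int k = row + 1; k < n; k++) {
                sum -= m[row][k] * x[k];
            }
            x[row] = sum / m[row][row];
        }
        return x;
    }
}
```

An inverse could be assembled from this by solving against each column
of the identity matrix, although for regression it is usually enough to
call a solve routine directly rather than form the inverse.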
The following items have no submitted implementation. I will continue
to submit solutions for these things, but obviously we need more,
better, faster:-)
* ComplexNumber interface and implementation. The only tricky thing
here is making division numerically sound and what extended value
topology to adopt. If no one else jumps on this, I will submit a
cleaned up version of what I have, along with some references.
* Bivariate Regression, correlation. This could be done with simple
formulas manipulating arrays and this is probably what we should aim
for in an initial release. Down the road, we should use the
RealMatrixImpl solve() to support general linear regression. I have
an implementation (of simple regression) that I could clean up and
submit; but again, I would be glad to let someone else submit this.
* Binomial coefficients. I have an "exact" implementation that is
limited to what can be stored in a long. This should be extended to
use BigIntegers and potentially to support logarithmic
representations.
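For the binomial coefficient item, the three variants mentioned (exact
in a long, BigInteger, logarithmic) might look roughly like the sketch
below. This is not the implementation referred to above; the class and
method names are hypothetical. The long version throws as soon as an
intermediate product overflows, which can happen slightly before the
final value itself stops fitting in a long.

```java
import java.math.BigInteger;

/** Hypothetical sketch, not the "exact" implementation mentioned in the post. */
final class BinomialSketch {

    private BinomialSketch() { }

    private static void check(int n, int k) {
        if (n < 0 || k < 0 || k > n) {
            throw new IllegalArgumentException("require 0 <= k <= n");
        }
    }

    /** Exact C(n, k) in a long; throws ArithmeticException on overflow. */
    static long binomialExact(int n, int k) {
        check(n, k);
        int m = Math.min(k, n - k); // C(n, k) == C(n, n - k)
        long result = 1;
        for (int i = 1; i <= m; i++) {
            // Before this step result == C(n - m + i - 1, i - 1),
            // so the division by i is always exact.
            result = Math.multiplyExact(result, (long) (n - m + i)) / i;
        }
        return result;
    }

    /** Exact C(n, k) with no size limit. */
    static BigInteger binomialBig(int n, int k) {
        check(n, k);
        int m = Math.min(k, n - k);
        BigInteger result = BigInteger.ONE;
        for (int i = 1; i <= m; i++) {
            result = result.multiply(BigInteger.valueOf(n - m + i))
                           .divide(BigInteger.valueOf(i));
        }
        return result;
    }

    /** Natural log of C(n, k), computed as a sum of logs to avoid overflow. */
    static double binomialLog(int n, int k) {
        check(n, k);
        int m = Math.min(k, n - k);
        double sum = 0.0;
        for (int i = 1; i <= m; i++) {
            sum += Math.log(n - m + i) - Math.log(i);
        }
        return sum;
    }
}
```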
The following are items for which I do not have full Java code:
* Newton's method for finding roots
* Exponential growth and decay (set up for financial applications)
* Polynomial Interpolation (curve fitting)
* Sampling from Collections (maybe belongs in Collections???)
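As a starting point for the first item in this list, a bare-bones
Newton iteration could be sketched as follows. Everything here is an
assumption made for illustration: the class name, the use of
java.util.function.DoubleUnaryOperator (which requires Java 8 or
later), and the requirement that the caller supplies the derivative.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical sketch of Newton's method; all names are illustrative only. */
final class NewtonSketch {

    private NewtonSketch() { }

    /**
     * Iterates x <- x - f(x) / f'(x) starting from x0 until two successive
     * iterates differ by at most tol, or gives up after maxIter steps.
     */
    static double findRoot(DoubleUnaryOperator f, DoubleUnaryOperator df,
                           double x0, double tol, int maxIter) {
        double x = x0;
        for (int i = 0; i < maxIter; i++) {
            double fx = f.applyAsDouble(x);
            double dfx = df.applyAsDouble(x);
            if (dfx == 0.0) {
                throw new ArithmeticException("zero derivative at x = " + x);
            }
            double next = x - fx / dfx;
            if (Math.abs(next - x) <= tol) {
                return next;
            }
            x = next;
        }
        throw new IllegalStateException("no convergence after " + maxIter + " iterations");
    }
}
```

For example, findRoot(x -> x * x - 2, x -> 2 * x, 1.0, 1e-12, 50)
approximates the square root of 2. Convergence depends on the starting
point, so a real implementation would likely want a bracketing
fallback.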
It would be a good idea for us to agree on priorities. Personally, I
would list things more or less in the order presented above.
Obviously, one more thing that we need help on is documentation. My
personal top priority is to get some basic material submitted for the
maven site. Finally, there is *lots* of cleanup to do in the existing
code and javadoc and more test cases to add (esp. tests for the
"rolling" capability in UnivariateImpl).
Regards,
Phil
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: [math] Priorities, help needed
Posted by "Mark R. Diggory" <md...@latte.harvard.edu>.
Phil Steitz wrote:
> Obviously, one more thing that we need help on is documentation. My
> personal top priority is to get some basic material submitted for the
> maven site.
>
Here's another source to reference in the javadoc for particular
implementations:
http://mathworld.wolfram.com/
This is where I quickly grabbed Geometric Mean:
http://mathworld.wolfram.com/GeometricMean.html
They have excellent info, often with great details. I just got done
using Figurate Numbers to quickly determine different sized/shaped
neighborhoods in cellular automata.
http://mathworld.wolfram.com/FigurateNumber.html
-Mark