Posted to dev@commons.apache.org by Phil Steitz <ph...@steitz.com> on 2003/06/05 06:05:14 UTC

[math] more improvement to storage free mean, variance computation

Check out procedures sum.2 and var.2 in

http://www.stanford.edu/~glynn/PDF/0208.pdf

The first looks like Brent's suggestion for a corrected mean 
computation, with no memory required.  The additional computational cost 
that I complained about is documented to be 3x the flops cost of the 
direct computation, but the computation is claimed to be more stable. So 
the question is: do we pay the flops cost to get the numerical 
stability?  The example in the paper is compelling; but it uses small 
words (err, numbers I mean -- sorry, slipped into my native Fortran for 
a moment there ;-)).  So how do we go about deciding whether the 
stability in the mean computation is worth the increased computational 
effort?  I would prefer not to answer "let the user decide".  To make 
the decision harder, we should note that it is actually worse than 3x, 
since in the no storage version, the user may request the mean only 
rarely (if at all) and the 3x comparison is against computing the mean 
for each value added.
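
To make the cost question concrete, here is a minimal sketch of one 
well-known storage-free mean update. It is not necessarily the paper's 
sum.2 procedure, and the class name RunningMean is hypothetical, not 
anything in the current code. The point is only that each add() does a 
subtract, a divide and an add, where a plain running sum does a single 
add and divides only when the mean is requested.

```java
/** Hypothetical storage-free mean; a sketch, not the paper's sum.2. */
final class RunningMean {
    private long n = 0;
    private double mean = 0.0;

    /** Adds a value and updates the mean in place, storing no data. */
    void add(double x) {
        n++;
        // Per value: one subtraction, one division, one addition.
        // A plain running sum would need only one addition here.
        mean += (x - mean) / n;
    }

    /** Returns the current mean, or NaN if no values have been added. */
    double getMean() {
        return n == 0 ? Double.NaN : mean;
    }

    long getN() {
        return n;
    }
}
```

Because the mean is kept current on every add(), the extra work is paid 
even if getMean() is never called, which is the "worse than 3x" concern.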

The variance formula looks better than what we have now, still requiring 
no memory.  Should we implement this for the no storage case?
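
For reference, a minimal sketch of a storage-free variance using the 
commonly cited updating scheme usually attributed to Welford. I am 
assuming this is similar in spirit to var.2, but I have not checked it 
term by term against the paper; the class name RunningVariance and the 
field m2 are hypothetical.

```java
/** Hypothetical storage-free variance (Welford-style update); a sketch only. */
final class RunningVariance {
    private long n = 0;
    private double mean = 0.0;
    // Running sum of squared deviations from the current mean.
    private double m2 = 0.0;

    /** Adds a value, updating the mean and the squared-deviation sum. */
    void add(double x) {
        n++;
        double delta = x - mean;   // deviation from the old mean
        mean += delta / n;
        m2 += delta * (x - mean);  // old deviation times new deviation
    }

    double getMean() {
        return n == 0 ? Double.NaN : mean;
    }

    /** Returns the sample variance (divisor n - 1), or NaN if n < 2. */
    double getVariance() {
        return n < 2 ? Double.NaN : m2 / (n - 1);
    }
}
```

This avoids subtracting two large, nearly equal sums, which is where 
the sum-of-squares formula loses digits when the values are large 
relative to their spread.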

Phil

