Posted to dev@commons.apache.org by "Phil Steitz (JIRA)" <ji...@apache.org> on 2007/04/01 22:21:32 UTC

[jira] Commented: (MATH-163) The evaluate method and the getResult method of class Variance give different results

    [ https://issues.apache.org/jira/browse/MATH-163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12485920 ] 

Phil Steitz commented on MATH-163:
----------------------------------

Thanks for reporting this.  I agree with Rory that the spirit of IEEE 754 (which says to examine the limit as x -> INF when evaluating expressions involving INF) implies that the result of this computation should be positive infinity in this particular case, as the getResult() method gives.  For the reasons described below, however, it may be difficult to handle all INF cases correctly without impacting performance, so I am leaning toward WONTFIX at this point, though I am open to suggestions / patches.

The results of the two methods differ because they use different computing formulas.  The getResult method is meant to be used when the data is not persisted - i.e., after repeatedly calling increment, supplying values in a stream (and updating sums), but not storing the whole set of values.  It therefore uses a "one pass" algorithm ("West's algorithm", referenced in the javadoc) to compute the variance.  The evaluate method exploits the fact that it has the full array of values and uses a two-pass method (the "corrected two-pass algorithm" from Chan, Golub, LeVeque, Algorithms for Computing the Sample Variance, American Statistician, August 1983).  These methods may give different results in some examples, with the second being more accurate.  The javadoc should be improved to make this clearer and to recommend that evaluate be preferred over incrementAll-getResult when the full array of values is available.  I will do that.
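
To make the difference concrete, here is a minimal, self-contained sketch of the two styles of computation.  It is not the Commons Math source: the class name VarianceSketch, the method names onePass and twoPass, and the exact form of the update step are assumptions chosen for illustration, and both methods simply return NaN for fewer than two values.

```java
public class VarianceSketch {

    /**
     * One-pass updating formula (illustrative form, assumed here):
     * only the count, the running mean and the running sum of squared
     * deviations are kept, so the values need not be stored.
     */
    static double onePass(double[] values) {
        long n = 0;
        double mean = 0.0;
        double m2 = 0.0;
        for (double x : values) {
            n++;
            double dev = x - mean;   // deviation from the previous mean
            double nDev = dev / n;   // change applied to the mean
            mean += nDev;
            // dev and nDev have the same sign, so for x = +/-INF this
            // term is INF * INF = INF rather than INF - INF.
            m2 += (n - 1) * dev * nDev;
        }
        return n > 1 ? m2 / (n - 1) : Double.NaN;
    }

    /**
     * Corrected two-pass formula: the first pass computes the mean, the
     * second accumulates squared deviations plus a correction term.
     */
    static double twoPass(double[] values) {
        int n = values.length;
        if (n < 2) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double x : values) {
            sum += x;
        }
        double mean = sum / n;       // INF if any value is +INF
        double sumSq = 0.0;
        double sumDev = 0.0;
        for (double x : values) {
            double dev = x - mean;   // INF - INF = NaN for the infinite value
            sumSq += dev * dev;
            sumDev += dev;
        }
        return (sumSq - sumDev * sumDev / n) / (n - 1);
    }

    public static void main(String[] args) {
        double[] values = {1.0, 2.0, Double.POSITIVE_INFINITY};
        System.out.println(twoPass(values)); // prints NaN
        System.out.println(onePass(values)); // prints Infinity
    }
}
```

In this sketch the two-pass method has to form x - mean with both operands infinite, which is NaN under IEEE 754, while the one-pass update only ever multiplies two infinities of the same sign.  On finite data the two methods agree up to rounding error.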

> The evaluate method and the getResult method of class Variance give different results
> -------------------------------------------------------------------------------------
>
>                 Key: MATH-163
>                 URL: https://issues.apache.org/jira/browse/MATH-163
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.1
>            Reporter: Nele Smeets
>
> Consider the following test code:
>   // construct an array of input values, containing infinity  
>   double[] values = new double[] {1.0, 2.0, Double.POSITIVE_INFINITY};
>   // find the variance using Variance.evaluate(double[])
>   Variance var1 = new Variance();
>   double value1 = var1.evaluate(values);
>   // find the variance using Variance.getResult()
>   Variance var2 = new Variance();
>   var2.incrementAll(values);
>   double value2 = var2.getResult();
>   // print out the results
>   System.out.println(value1);
>   System.out.println(value2);
> This code prints out:
> NaN
> Infinity
> So, we get two different variances, depending on the method we use. 
> (The same is true when we use Double.NEGATIVE_INFINITY as input value instead of Double.POSITIVE_INFINITY.)
