You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@commons.apache.org by Phil Steitz <ph...@gmail.com> on 2011/08/05 20:48:04 UTC

Re: svn commit: r1154250 - in /commons/proper/math/trunk/src: main/java/org/apache/commons/math/util/MathUtils.java site/xdoc/changes.xml

I am +1 on the idea here and am wondering if we should think a
little more radically even.  Greg has - correctly, IMO - complained
that the current structure of the linear package makes it awkward to
implement optimized operations for some classes of matrices and
operations.  I a wonder whether it might not make sense to
encapsulate high-performance, array-based implementations of core
operations in utils classes that can be used like the methods
below.  I remember back in the very early days of [math] suggesting
this for statistical algorithms and being talked out of it in favor
of a more classical OO approach.  I wonder if by organizing things
correctly, we might not be able to have the best of both worlds.

Phil

On 8/5/11 8:01 AM, luc@apache.org wrote:
> Author: luc
> Date: Fri Aug  5 15:01:49 2011
> New Revision: 1154250
>
> URL: http://svn.apache.org/viewvc?rev=1154250&view=rev
> Log:
> Added a few linearCombination utility methods in MathUtils to compute accurately
> linear combinations a1.b1 + a2.b2 + ... + an.bn taking great care to compensate
> for cancellation effects. This both improves and simplify several methods in
> euclidean geometry classes, including linear constructors, dot product and cross
> product.
>
> Modified:
>     commons/proper/math/trunk/src/main/java/org/apache/commons/math/util/MathUtils.java
>     commons/proper/math/trunk/src/site/xdoc/changes.xml
>
> Modified: commons/proper/math/trunk/src/main/java/org/apache/commons/math/util/MathUtils.java
> URL: http://svn.apache.org/viewvc/commons/proper/math/trunk/src/main/java/org/apache/commons/math/util/MathUtils.java?rev=1154250&r1=1154249&r2=1154250&view=diff
> ==============================================================================
> --- commons/proper/math/trunk/src/main/java/org/apache/commons/math/util/MathUtils.java (original)
> +++ commons/proper/math/trunk/src/main/java/org/apache/commons/math/util/MathUtils.java Fri Aug  5 15:01:49 2011
> @@ -19,22 +19,22 @@ package org.apache.commons.math.util;
>  
>  import java.math.BigDecimal;
>  import java.math.BigInteger;
> -import java.util.Arrays;
> -import java.util.List;
>  import java.util.ArrayList;
> -import java.util.Comparator;
> +import java.util.Arrays;
>  import java.util.Collections;
> +import java.util.Comparator;
> +import java.util.List;
>  
> -import org.apache.commons.math.exception.util.Localizable;
> -import org.apache.commons.math.exception.util.LocalizedFormats;
> -import org.apache.commons.math.exception.NonMonotonousSequenceException;
>  import org.apache.commons.math.exception.DimensionMismatchException;
> -import org.apache.commons.math.exception.NullArgumentException;
> -import org.apache.commons.math.exception.NotPositiveException;
>  import org.apache.commons.math.exception.MathArithmeticException;
>  import org.apache.commons.math.exception.MathIllegalArgumentException;
> -import org.apache.commons.math.exception.NumberIsTooLargeException;
> +import org.apache.commons.math.exception.NonMonotonousSequenceException;
>  import org.apache.commons.math.exception.NotFiniteNumberException;
> +import org.apache.commons.math.exception.NotPositiveException;
> +import org.apache.commons.math.exception.NullArgumentException;
> +import org.apache.commons.math.exception.NumberIsTooLargeException;
> +import org.apache.commons.math.exception.util.Localizable;
> +import org.apache.commons.math.exception.util.LocalizedFormats;
>  
>  /**
>   * Some useful additions to the built-in functions in {@link Math}.
> @@ -91,6 +91,9 @@ public final class MathUtils {
>             1307674368000l,     20922789888000l,     355687428096000l,
>          6402373705728000l, 121645100408832000l, 2432902008176640000l };
>  
> +    /** Factor used for splitting double numbers: n = 2^27 + 1. */
> +    private static final int SPLIT_FACTOR = 0x8000001;
> +
>      /**
>       * Private Constructor
>       */
> @@ -2332,4 +2335,277 @@ public final class MathUtils {
>              throw new NullArgumentException();
>          }
>      }
> +
> +    /**
> +     * Compute a linear combination accurately.
> +     * <p>
> +     * This method computes a<sub>1</sub>&times;b<sub>1</sub> +
> +     * a<sub>2</sub>&times;b<sub>2</sub> to high accuracy. It does
> +     * so by using specific multiplication and addition algorithms to
> +     * preserve accuracy and reduce cancellation effects. It is based
> +     * on the 2005 paper <a
> +     * href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.2.1547">
> +     * Accurate Sum and Dot Product</a> by Takeshi Ogita,
> +     * Siegfried M. Rump, and Shin'ichi Oishi published in SIAM J. Sci. Comput.
> +     * </p>
> +     * @param a1 first factor of the first term
> +     * @param b1 second factor of the first term
> +     * @param a2 first factor of the second term
> +     * @param b2 second factor of the second term
> +     * @return a<sub>1</sub>&times;b<sub>1</sub> +
> +     * a<sub>2</sub>&times;b<sub>2</sub>
> +     * @see #linearCombination(double, double, double, double, double, double)
> +     * @see #linearCombination(double, double, double, double, double, double, double, double)
> +     */
> +    public static double linearCombination(final double a1, final double b1,
> +                                           final double a2, final double b2) {
> +
> +        // the code below is split in many additions/subtractions that may
> +        // appear redundant. However, they shoud NOT be simplified, as they
> +        // do use IEEE753 floating point arithmetic rouding properties.
> +        // as an example, the expression "ca1 - (ca1 - a1)" is NOT the same as "a1"
> +        // The variables naming conventions are that xyzHigh contains the most significant
> +        // bits of xyz and xyzLow contains its least significant bits. So theoretically
> +        // xyz is the sum xyzHigh + xyzLow, but in many cases below, this sum cannot
> +        // be represented in only one double precision number so we preserve two numbers
> +        // to hold it as long as we can, combining the high and low order bits together
> +        // only at the end, after cancellation may have occurred on high order bits
> +
> +        // split a1 and b1 as two 26 bits numbers
> +        final double ca1        = SPLIT_FACTOR * a1;
> +        final double a1High     = ca1 - (ca1 - a1);
> +        final double a1Low      = a1 - a1High;
> +        final double cb1        = SPLIT_FACTOR * b1;
> +        final double b1High     = cb1 - (cb1 - b1);
> +        final double b1Low      = b1 - b1High;
> +
> +        // accurate multiplication a1 * b1
> +        final double prod1High  = a1 * b1;
> +        final double prod1Low   = a1Low * b1Low - (((prod1High - a1High * b1High) - a1Low * b1High) - a1High * b1Low);
> +
> +        // split a2 and b2 as two 26 bits numbers
> +        final double ca2        = SPLIT_FACTOR * a2;
> +        final double a2High     = ca2 - (ca2 - a2);
> +        final double a2Low      = a2 - a2High;
> +        final double cb2        = SPLIT_FACTOR * b2;
> +        final double b2High     = cb2 - (cb2 - b2);
> +        final double b2Low      = b2 - b2High;
> +
> +        // accurate multiplication a2 * b2
> +        final double prod2High  = a2 * b2;
> +        final double prod2Low   = a2Low * b2Low - (((prod2High - a2High * b2High) - a2Low * b2High) - a2High * b2Low);
> +
> +        // accurate addition a1 * b1 + a2 * b2
> +        final double s12High    = prod1High + prod2High;
> +        final double s12Prime   = s12High - prod2High;
> +        final double s12Low     = (prod2High - (s12High - s12Prime)) + (prod1High - s12Prime);
> +
> +        // final rounding, s12 may have suffered many cancellations, we try
> +        // to recover some bits from the extra words we have saved up to now
> +        return s12High + (prod1Low + prod2Low + s12Low);
> +
> +    }
> +
> +    /**
> +     * Compute a linear combination accurately.
> +     * <p>
> +     * This method computes a<sub>1</sub>&times;b<sub>1</sub> +
> +     * a<sub>2</sub>&times;b<sub>2</sub> + a<sub>3</sub>&times;b<sub>3</sub>
> +     * to high accuracy. It does so by using specific multiplication and
> +     * addition algorithms to preserve accuracy and reduce cancellation effects.
> +     * It is based on the 2005 paper <a
> +     * href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.2.1547">
> +     * Accurate Sum and Dot Product</a> by Takeshi Ogita,
> +     * Siegfried M. Rump, and Shin'ichi Oishi published in SIAM J. Sci. Comput.
> +     * </p>
> +     * @param a1 first factor of the first term
> +     * @param b1 second factor of the first term
> +     * @param a2 first factor of the second term
> +     * @param b2 second factor of the second term
> +     * @param a3 first factor of the third term
> +     * @param b3 second factor of the third term
> +     * @return a<sub>1</sub>&times;b<sub>1</sub> +
> +     * a<sub>2</sub>&times;b<sub>2</sub> + a<sub>3</sub>&times;b<sub>3</sub>
> +     * @see #linearCombination(double, double, double, double)
> +     * @see #linearCombination(double, double, double, double, double, double, double, double)
> +     */
> +    public static double linearCombination(final double a1, final double b1,
> +                                           final double a2, final double b2,
> +                                           final double a3, final double b3) {
> +
> +        // the code below is split in many additions/subtractions that may
> +        // appear redundant. However, they shoud NOT be simplified, as they
> +        // do use IEEE753 floating point arithmetic rouding properties.
> +        // as an example, the expression "ca1 - (ca1 - a1)" is NOT the same as "a1"
> +        // The variables naming conventions are that xyzHigh contains the most significant
> +        // bits of xyz and xyzLow contains its least significant bits. So theoretically
> +        // xyz is the sum xyzHigh + xyzLow, but in many cases below, this sum cannot
> +        // be represented in only one double precision number so we preserve two numbers
> +        // to hold it as long as we can, combining the high and low order bits together
> +        // only at the end, after cancellation may have occurred on high order bits
> +
> +        // split a1 and b1 as two 26 bits numbers
> +        final double ca1        = SPLIT_FACTOR * a1;
> +        final double a1High     = ca1 - (ca1 - a1);
> +        final double a1Low      = a1 - a1High;
> +        final double cb1        = SPLIT_FACTOR * b1;
> +        final double b1High     = cb1 - (cb1 - b1);
> +        final double b1Low      = b1 - b1High;
> +
> +        // accurate multiplication a1 * b1
> +        final double prod1High  = a1 * b1;
> +        final double prod1Low   = a1Low * b1Low - (((prod1High - a1High * b1High) - a1Low * b1High) - a1High * b1Low);
> +
> +        // split a2 and b2 as two 26 bits numbers
> +        final double ca2        = SPLIT_FACTOR * a2;
> +        final double a2High     = ca2 - (ca2 - a2);
> +        final double a2Low      = a2 - a2High;
> +        final double cb2        = SPLIT_FACTOR * b2;
> +        final double b2High     = cb2 - (cb2 - b2);
> +        final double b2Low      = b2 - b2High;
> +
> +        // accurate multiplication a2 * b2
> +        final double prod2High  = a2 * b2;
> +        final double prod2Low   = a2Low * b2Low - (((prod2High - a2High * b2High) - a2Low * b2High) - a2High * b2Low);
> +
> +        // split a3 and b3 as two 26 bits numbers
> +        final double ca3        = SPLIT_FACTOR * a3;
> +        final double a3High     = ca3 - (ca3 - a3);
> +        final double a3Low      = a3 - a3High;
> +        final double cb3        = SPLIT_FACTOR * b3;
> +        final double b3High     = cb3 - (cb3 - b3);
> +        final double b3Low      = b3 - b3High;
> +
> +        // accurate multiplication a3 * b3
> +        final double prod3High  = a3 * b3;
> +        final double prod3Low   = a3Low * b3Low - (((prod3High - a3High * b3High) - a3Low * b3High) - a3High * b3Low);
> +
> +        // accurate addition a1 * b1 + a2 * b2
> +        final double s12High    = prod1High + prod2High;
> +        final double s12Prime   = s12High - prod2High;
> +        final double s12Low     = (prod2High - (s12High - s12Prime)) + (prod1High - s12Prime);
> +
> +        // accurate addition a1 * b1 + a2 * b2 + a3 * b3
> +        final double s123High   = s12High + prod3High;
> +        final double s123Prime  = s123High - prod3High;
> +        final double s123Low    = (prod3High - (s123High - s123Prime)) + (s12High - s123Prime);
> +
> +        // final rounding, s123 may have suffered many cancellations, we try
> +        // to recover some bits from the extra words we have saved up to now
> +        return s123High + (prod1Low + prod2Low + prod3Low + s12Low + s123Low);
> +
> +    }
> +
> +    /**
> +     * Compute a linear combination accurately.
> +     * <p>
> +     * This method computes a<sub>1</sub>&times;b<sub>1</sub> +
> +     * a<sub>2</sub>&times;b<sub>2</sub> + a<sub>3</sub>&times;b<sub>3</sub> +
> +     * a<sub>4</sub>&times;b<sub>4</sub>
> +     * to high accuracy. It does so by using specific multiplication and
> +     * addition algorithms to preserve accuracy and reduce cancellation effects.
> +     * It is based on the 2005 paper <a
> +     * href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.2.1547">
> +     * Accurate Sum and Dot Product</a> by Takeshi Ogita,
> +     * Siegfried M. Rump, and Shin'ichi Oishi published in SIAM J. Sci. Comput.
> +     * </p>
> +     * @param a1 first factor of the first term
> +     * @param b1 second factor of the first term
> +     * @param a2 first factor of the second term
> +     * @param b2 second factor of the second term
> +     * @param a3 first factor of the third term
> +     * @param b3 second factor of the third term
> +     * @param a4 first factor of the third term
> +     * @param b4 second factor of the third term
> +     * @return a<sub>1</sub>&times;b<sub>1</sub> +
> +     * a<sub>2</sub>&times;b<sub>2</sub> + a<sub>3</sub>&times;b<sub>3</sub> +
> +     * a<sub>4</sub>&times;b<sub>4</sub>
> +     * @see #linearCombination(double, double, double, double)
> +     * @see #linearCombination(double, double, double, double, double, double)
> +     */
> +    public static double linearCombination(final double a1, final double b1,
> +                                           final double a2, final double b2,
> +                                           final double a3, final double b3,
> +                                           final double a4, final double b4) {
> +
> +        // the code below is split in many additions/subtractions that may
> +        // appear redundant. However, they shoud NOT be simplified, as they
> +        // do use IEEE753 floating point arithmetic rouding properties.
> +        // as an example, the expression "ca1 - (ca1 - a1)" is NOT the same as "a1"
> +        // The variables naming conventions are that xyzHigh contains the most significant
> +        // bits of xyz and xyzLow contains its least significant bits. So theoretically
> +        // xyz is the sum xyzHigh + xyzLow, but in many cases below, this sum cannot
> +        // be represented in only one double precision number so we preserve two numbers
> +        // to hold it as long as we can, combining the high and low order bits together
> +        // only at the end, after cancellation may have occurred on high order bits
> +
> +        // split a1 and b1 as two 26 bits numbers
> +        final double ca1        = SPLIT_FACTOR * a1;
> +        final double a1High     = ca1 - (ca1 - a1);
> +        final double a1Low      = a1 - a1High;
> +        final double cb1        = SPLIT_FACTOR * b1;
> +        final double b1High     = cb1 - (cb1 - b1);
> +        final double b1Low      = b1 - b1High;
> +
> +        // accurate multiplication a1 * b1
> +        final double prod1High  = a1 * b1;
> +        final double prod1Low   = a1Low * b1Low - (((prod1High - a1High * b1High) - a1Low * b1High) - a1High * b1Low);
> +
> +        // split a2 and b2 as two 26 bits numbers
> +        final double ca2        = SPLIT_FACTOR * a2;
> +        final double a2High     = ca2 - (ca2 - a2);
> +        final double a2Low      = a2 - a2High;
> +        final double cb2        = SPLIT_FACTOR * b2;
> +        final double b2High     = cb2 - (cb2 - b2);
> +        final double b2Low      = b2 - b2High;
> +
> +        // accurate multiplication a2 * b2
> +        final double prod2High  = a2 * b2;
> +        final double prod2Low   = a2Low * b2Low - (((prod2High - a2High * b2High) - a2Low * b2High) - a2High * b2Low);
> +
> +        // split a3 and b3 as two 26 bits numbers
> +        final double ca3        = SPLIT_FACTOR * a3;
> +        final double a3High     = ca3 - (ca3 - a3);
> +        final double a3Low      = a3 - a3High;
> +        final double cb3        = SPLIT_FACTOR * b3;
> +        final double b3High     = cb3 - (cb3 - b3);
> +        final double b3Low      = b3 - b3High;
> +
> +        // accurate multiplication a3 * b3
> +        final double prod3High  = a3 * b3;
> +        final double prod3Low   = a3Low * b3Low - (((prod3High - a3High * b3High) - a3Low * b3High) - a3High * b3Low);
> +
> +        // split a4 and b4 as two 26 bits numbers
> +        final double ca4        = SPLIT_FACTOR * a4;
> +        final double a4High     = ca4 - (ca4 - a4);
> +        final double a4Low      = a4 - a4High;
> +        final double cb4        = SPLIT_FACTOR * b4;
> +        final double b4High     = cb4 - (cb4 - b4);
> +        final double b4Low      = b4 - b4High;
> +
> +        // accurate multiplication a4 * b4
> +        final double prod4High  = a4 * b4;
> +        final double prod4Low   = a4Low * b4Low - (((prod4High - a4High * b4High) - a4Low * b4High) - a4High * b4Low);
> +
> +        // accurate addition a1 * b1 + a2 * b2
> +        final double s12High    = prod1High + prod2High;
> +        final double s12Prime   = s12High - prod2High;
> +        final double s12Low     = (prod2High - (s12High - s12Prime)) + (prod1High - s12Prime);
> +
> +        // accurate addition a1 * b1 + a2 * b2 + a3 * b3
> +        final double s123High   = s12High + prod3High;
> +        final double s123Prime  = s123High - prod3High;
> +        final double s123Low    = (prod3High - (s123High - s123Prime)) + (s12High - s123Prime);
> +
> +        // accurate addition a1 * b1 + a2 * b2 + a3 * b3 + a4 * b4
> +        final double s1234High  = s123High + prod4High;
> +        final double s1234Prime = s1234High - prod4High;
> +        final double s1234Low   = (prod4High - (s1234High - s1234Prime)) + (s123High - s1234Prime);
> +
> +        // final rounding, s1234 may have suffered many cancellations, we try
> +        // to recover some bits from the extra words we have saved up to now
> +        return s1234High + (prod1Low + prod2Low + prod3Low + prod4Low + s12Low + s123Low + s1234Low);
> +
> +    }
> +
>  }
>
> Modified: commons/proper/math/trunk/src/site/xdoc/changes.xml
> URL: http://svn.apache.org/viewvc/commons/proper/math/trunk/src/site/xdoc/changes.xml?rev=1154250&r1=1154249&r2=1154250&view=diff
> ==============================================================================
> --- commons/proper/math/trunk/src/site/xdoc/changes.xml (original)
> +++ commons/proper/math/trunk/src/site/xdoc/changes.xml Fri Aug  5 15:01:49 2011
> @@ -52,6 +52,13 @@ The <action> type attribute can be add,u
>      If the output is not quite correct, check for invisible trailing spaces!
>       -->
>      <release version="3.0" date="TBD" description="TBD">
> +      <action dev="luc" type="add" >
> +        Added a few linearCombination utility methods in MathUtils to compute accurately
> +        linear combinations a1.b1 + a2.b2 + ... + an.bn taking great care to compensate
> +        for cancellation effects. This both improves and simplify several methods in
> +        euclidean geometry classes, including linear constructors, dot product and cross
> +        product.
> +      </action>
>        <action dev="psteitz" type="fix" issue="MATH-640">
>          Fixed bugs in AbstractRandomGenerator nextInt() and nextLong() default
>          implementations.  Prior to the fix for this issue, these methods
>
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [Math] linearCombination

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
Hi.

> [...]
> >
> >However, I'm concerned that the "stress" test ("testLinearCombination1")
> >taken from your test case for Vector3D might not be stringent enough. [I
> >think that while changing my implemementation, the test was still passing
> >just before I dicovered that I was using an unitialized (set to 0 by
> >default) variable.]
> >Could you devise one sure to detect bugs like this (i.e. where all the
> >various variables in "linearCombination" would be significant)?
> 
> I think we could try something like generating randomly the first
> n-1 components of a and b arrays as high accuracy Dfp, then set up
> the last component of each array so that the higher bits cancels out
> and some non-zero lower bits remains. The Dfp computations would
> represent reference values and we would check if the method can
> retrieve at least some of the lower bits. With a large number of
> different arrays (say a few hundreds, with a few dozen elements
> each), this would stress the algorithm.
> 
> Does this make sense ?

Maybe, but I can't look at Dfp for now. So I'll let you try ;-).


Best,
Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [Math] linearCombination

Posted by Luc Maisonobe <Lu...@free.fr>.
Le 06/08/2011 13:15, Gilles Sadowski a écrit :
> Hi.
>
>>> [...]
>>
>> I've allowed myself to adapt your code to an array version of the
>> accurate "linearCombination" (committed in revision 1154416).

Thanks Gilles,

I wanted to add this too and simply forgot :-(

>>
>> Here is a micro-benchmark (for an array of length 3):
>> -----
>> linearCombination (runs per timed block: 10000, timed blocks: 1000)
>>    direct inline: 1.789185e-05 (1.580520e-05) ms
>>     direct array: 2.009243e-05 (1.257636e-05) ms
>> accurate inline: 4.244761e-05 (5.882379e-05) ms
>>   accurate array: 9.365895e-05 (6.902221e-05) ms
>> -----
>
> Mre efficient implementation in revision 1154485:
> -----
> linearCombination (runs per timed block: 10000, timed blocks: 1000)
>    direct inline: 1.899314e-05 (5.587518e-05) ms
>     direct array: 1.918198e-05 (1.133822e-05) ms
> accurate inline: 4.178593e-05 (5.697845e-05) ms
>   accurate array: 7.144897e-05 (3.897243e-05) ms
> -----
>
> However, I'm concerned that the "stress" test ("testLinearCombination1")
> taken from your test case for Vector3D might not be stringent enough. [I
> think that while changing my implemementation, the test was still passing
> just before I dicovered that I was using an unitialized (set to 0 by
> default) variable.]
> Could you devise one sure to detect bugs like this (i.e. where all the
> various variables in "linearCombination" would be significant)?

I think we could try something like generating randomly the first n-1 
components of a and b arrays as high accuracy Dfp, then set up the last 
component of each array so that the higher bits cancels out and some 
non-zero lower bits remains. The Dfp computations would represent 
reference values and we would check if the method can retrieve at least 
some of the lower bits. With a large number of different arrays (say a 
few hundreds, with a few dozen elements each), this would stress the 
algorithm.

Does this make sense ?

Luc

>
>
> Best,
> Gilles
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [Math] linearCombination

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
Hi.
 
> > [...]
> 
> I've allowed myself to adapt your code to an array version of the
> accurate "linearCombination" (committed in revision 1154416).
> 
> Here is a micro-benchmark (for an array of length 3):
> -----
> linearCombination (runs per timed block: 10000, timed blocks: 1000)
>   direct inline: 1.789185e-05 (1.580520e-05) ms
>    direct array: 2.009243e-05 (1.257636e-05) ms
> accurate inline: 4.244761e-05 (5.882379e-05) ms
>  accurate array: 9.365895e-05 (6.902221e-05) ms
> -----

Mre efficient implementation in revision 1154485:
-----
linearCombination (runs per timed block: 10000, timed blocks: 1000)
  direct inline: 1.899314e-05 (5.587518e-05) ms
   direct array: 1.918198e-05 (1.133822e-05) ms
accurate inline: 4.178593e-05 (5.697845e-05) ms
 accurate array: 7.144897e-05 (3.897243e-05) ms
-----

However, I'm concerned that the "stress" test ("testLinearCombination1")
taken from your test case for Vector3D might not be stringent enough. [I
think that while changing my implemementation, the test was still passing
just before I dicovered that I was using an unitialized (set to 0 by
default) variable.]
Could you devise one sure to detect bugs like this (i.e. where all the
various variables in "linearCombination" would be significant)?


Best,
Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


[Math] linearCombination (Was: Re: svn commit: r1154250 - [...])

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
> [...]

I've allowed myself to adapt your code to an array version of the
accurate "linearCombination" (committed in revision 1154416).

Here is a micro-benchmark (for an array of length 3):
-----
linearCombination (runs per timed block: 10000, timed blocks: 1000)
  direct inline: 1.789185e-05 (1.580520e-05) ms
   direct array: 2.009243e-05 (1.257636e-05) ms
accurate inline: 4.244761e-05 (5.882379e-05) ms
 accurate array: 9.365895e-05 (6.902221e-05) ms
-----


Regards,
Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: svn commit: r1154250 - in /commons/proper/math/trunk/src: main/java/org/apache/commons/math/util/MathUtils.java site/xdoc/changes.xml

Posted by Luc Maisonobe <Lu...@free.fr>.
Le 05/08/2011 20:48, Phil Steitz a écrit :
> I am +1 on the idea here and am wondering if we should think a
> little more radically even.  Greg has - correctly, IMO - complained
> that the current structure of the linear package makes it awkward to
> implement optimized operations for some classes of matrices and
> operations.  I a wonder whether it might not make sense to
> encapsulate high-performance, array-based implementations of core
> operations in utils classes that can be used like the methods
> below.  I remember back in the very early days of [math] suggesting
> this for statistical algorithms and being talked out of it in favor
> of a more classical OO approach.  I wonder if by organizing things
> correctly, we might not be able to have the best of both worlds.

Well, I didn't think so far. These methods have been implemented has a 
side effect of fixing a bug, simply trying to do this so the same kind 
of algorithms could be useful elsewhere.

OO and big bulk loops should not be seen as opposite IMHO. They can both 
be useful and are rather complementary to each other. There *are* cases 
where a long array-based computation is the best approach, so I am 
clearly +1 on this. I even sometimes wonder about having some parts of 
the code generated by macro-like preprocessing, in order to have a more 
compact source code that is automatically expanded to a straightforward 
sequence that the compiler optimizes very efficiently.

There are also two different goals to keep: fast code on one side, and 
accurate code on the other side. This specific commit improved accuracy 
(but is still quite fast, as there are no branches at all). I think we 
need both.

Luc

>
> Phil
>
> On 8/5/11 8:01 AM, luc@apache.org wrote:
>> Author: luc
>> Date: Fri Aug  5 15:01:49 2011
>> New Revision: 1154250
>>
>> URL: http://svn.apache.org/viewvc?rev=1154250&view=rev
>> Log:
>> Added a few linearCombination utility methods in MathUtils to compute accurately
>> linear combinations a1.b1 + a2.b2 + ... + an.bn taking great care to compensate
>> for cancellation effects. This both improves and simplify several methods in
>> euclidean geometry classes, including linear constructors, dot product and cross
>> product.
>>
>> Modified:
>>      commons/proper/math/trunk/src/main/java/org/apache/commons/math/util/MathUtils.java
>>      commons/proper/math/trunk/src/site/xdoc/changes.xml
>>
>> Modified: commons/proper/math/trunk/src/main/java/org/apache/commons/math/util/MathUtils.java
>> URL: http://svn.apache.org/viewvc/commons/proper/math/trunk/src/main/java/org/apache/commons/math/util/MathUtils.java?rev=1154250&r1=1154249&r2=1154250&view=diff
>> ==============================================================================
>> --- commons/proper/math/trunk/src/main/java/org/apache/commons/math/util/MathUtils.java (original)
>> +++ commons/proper/math/trunk/src/main/java/org/apache/commons/math/util/MathUtils.java Fri Aug  5 15:01:49 2011
>> @@ -19,22 +19,22 @@ package org.apache.commons.math.util;
>>
>>   import java.math.BigDecimal;
>>   import java.math.BigInteger;
>> -import java.util.Arrays;
>> -import java.util.List;
>>   import java.util.ArrayList;
>> -import java.util.Comparator;
>> +import java.util.Arrays;
>>   import java.util.Collections;
>> +import java.util.Comparator;
>> +import java.util.List;
>>
>> -import org.apache.commons.math.exception.util.Localizable;
>> -import org.apache.commons.math.exception.util.LocalizedFormats;
>> -import org.apache.commons.math.exception.NonMonotonousSequenceException;
>>   import org.apache.commons.math.exception.DimensionMismatchException;
>> -import org.apache.commons.math.exception.NullArgumentException;
>> -import org.apache.commons.math.exception.NotPositiveException;
>>   import org.apache.commons.math.exception.MathArithmeticException;
>>   import org.apache.commons.math.exception.MathIllegalArgumentException;
>> -import org.apache.commons.math.exception.NumberIsTooLargeException;
>> +import org.apache.commons.math.exception.NonMonotonousSequenceException;
>>   import org.apache.commons.math.exception.NotFiniteNumberException;
>> +import org.apache.commons.math.exception.NotPositiveException;
>> +import org.apache.commons.math.exception.NullArgumentException;
>> +import org.apache.commons.math.exception.NumberIsTooLargeException;
>> +import org.apache.commons.math.exception.util.Localizable;
>> +import org.apache.commons.math.exception.util.LocalizedFormats;
>>
>>   /**
>>    * Some useful additions to the built-in functions in {@link Math}.
>> @@ -91,6 +91,9 @@ public final class MathUtils {
>>              1307674368000l,     20922789888000l,     355687428096000l,
>>           6402373705728000l, 121645100408832000l, 2432902008176640000l };
>>
>> +    /** Factor used for splitting double numbers: n = 2^27 + 1. */
>> +    private static final int SPLIT_FACTOR = 0x8000001;
>> +
>>       /**
>>        * Private Constructor
>>        */
>> @@ -2332,4 +2335,277 @@ public final class MathUtils {
>>               throw new NullArgumentException();
>>           }
>>       }
>> +
>> +    /**
>> +     * Compute a linear combination accurately.
>> +     *<p>
>> +     * This method computes a<sub>1</sub>&times;b<sub>1</sub>  +
>> +     * a<sub>2</sub>&times;b<sub>2</sub>  to high accuracy. It does
>> +     * so by using specific multiplication and addition algorithms to
>> +     * preserve accuracy and reduce cancellation effects. It is based
>> +     * on the 2005 paper<a
>> +     * href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.2.1547">
>> +     * Accurate Sum and Dot Product</a>  by Takeshi Ogita,
>> +     * Siegfried M. Rump, and Shin'ichi Oishi published in SIAM J. Sci. Comput.
>> +     *</p>
>> +     * @param a1 first factor of the first term
>> +     * @param b1 second factor of the first term
>> +     * @param a2 first factor of the second term
>> +     * @param b2 second factor of the second term
>> +     * @return a<sub>1</sub>&times;b<sub>1</sub>  +
>> +     * a<sub>2</sub>&times;b<sub>2</sub>
>> +     * @see #linearCombination(double, double, double, double, double, double)
>> +     * @see #linearCombination(double, double, double, double, double, double, double, double)
>> +     */
>> +    public static double linearCombination(final double a1, final double b1,
>> +                                           final double a2, final double b2) {
>> +
>> +        // the code below is split in many additions/subtractions that may
>> +        // appear redundant. However, they shoud NOT be simplified, as they
>> +        // do use IEEE753 floating point arithmetic rouding properties.
>> +        // as an example, the expression "ca1 - (ca1 - a1)" is NOT the same as "a1"
>> +        // The variables naming conventions are that xyzHigh contains the most significant
>> +        // bits of xyz and xyzLow contains its least significant bits. So theoretically
>> +        // xyz is the sum xyzHigh + xyzLow, but in many cases below, this sum cannot
>> +        // be represented in only one double precision number so we preserve two numbers
>> +        // to hold it as long as we can, combining the high and low order bits together
>> +        // only at the end, after cancellation may have occurred on high order bits
>> +
>> +        // split a1 and b1 as two 26 bits numbers
>> +        final double ca1        = SPLIT_FACTOR * a1;
>> +        final double a1High     = ca1 - (ca1 - a1);
>> +        final double a1Low      = a1 - a1High;
>> +        final double cb1        = SPLIT_FACTOR * b1;
>> +        final double b1High     = cb1 - (cb1 - b1);
>> +        final double b1Low      = b1 - b1High;
>> +
>> +        // accurate multiplication a1 * b1
>> +        final double prod1High  = a1 * b1;
>> +        final double prod1Low   = a1Low * b1Low - (((prod1High - a1High * b1High) - a1Low * b1High) - a1High * b1Low);
>> +
>> +        // split a2 and b2 as two 26 bits numbers
>> +        final double ca2        = SPLIT_FACTOR * a2;
>> +        final double a2High     = ca2 - (ca2 - a2);
>> +        final double a2Low      = a2 - a2High;
>> +        final double cb2        = SPLIT_FACTOR * b2;
>> +        final double b2High     = cb2 - (cb2 - b2);
>> +        final double b2Low      = b2 - b2High;
>> +
>> +        // accurate multiplication a2 * b2
>> +        final double prod2High  = a2 * b2;
>> +        final double prod2Low   = a2Low * b2Low - (((prod2High - a2High * b2High) - a2Low * b2High) - a2High * b2Low);
>> +
>> +        // accurate addition a1 * b1 + a2 * b2
>> +        final double s12High    = prod1High + prod2High;
>> +        final double s12Prime   = s12High - prod2High;
>> +        final double s12Low     = (prod2High - (s12High - s12Prime)) + (prod1High - s12Prime);
>> +
>> +        // final rounding, s12 may have suffered many cancellations, we try
>> +        // to recover some bits from the extra words we have saved up to now
>> +        return s12High + (prod1Low + prod2Low + s12Low);
>> +
>> +    }
>> +
>> +    /**
>> +     * Compute a linear combination accurately.
>> +     *<p>
>> +     * This method computes a<sub>1</sub>&times;b<sub>1</sub>  +
>> +     * a<sub>2</sub>&times;b<sub>2</sub>  + a<sub>3</sub>&times;b<sub>3</sub>
>> +     * to high accuracy. It does so by using specific multiplication and
>> +     * addition algorithms to preserve accuracy and reduce cancellation effects.
>> +     * It is based on the 2005 paper<a
>> +     * href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.2.1547">
>> +     * Accurate Sum and Dot Product</a>  by Takeshi Ogita,
>> +     * Siegfried M. Rump, and Shin'ichi Oishi published in SIAM J. Sci. Comput.
>> +     *</p>
>> +     * @param a1 first factor of the first term
>> +     * @param b1 second factor of the first term
>> +     * @param a2 first factor of the second term
>> +     * @param b2 second factor of the second term
>> +     * @param a3 first factor of the third term
>> +     * @param b3 second factor of the third term
>> +     * @return a<sub>1</sub>&times;b<sub>1</sub>  +
>> +     * a<sub>2</sub>&times;b<sub>2</sub>  + a<sub>3</sub>&times;b<sub>3</sub>
>> +     * @see #linearCombination(double, double, double, double)
>> +     * @see #linearCombination(double, double, double, double, double, double, double, double)
>> +     */
>> +    public static double linearCombination(final double a1, final double b1,
>> +                                           final double a2, final double b2,
>> +                                           final double a3, final double b3) {
>> +
>> +        // the code below is split in many additions/subtractions that may
>> +        // appear redundant. However, they shoud NOT be simplified, as they
>> +        // do use IEEE753 floating point arithmetic rouding properties.
>> +        // as an example, the expression "ca1 - (ca1 - a1)" is NOT the same as "a1"
>> +        // The variables naming conventions are that xyzHigh contains the most significant
>> +        // bits of xyz and xyzLow contains its least significant bits. So theoretically
>> +        // xyz is the sum xyzHigh + xyzLow, but in many cases below, this sum cannot
>> +        // be represented in only one double precision number so we preserve two numbers
>> +        // to hold it as long as we can, combining the high and low order bits together
>> +        // only at the end, after cancellation may have occurred on high order bits
>> +
>> +        // split a1 and b1 as two 26 bits numbers
>> +        final double ca1        = SPLIT_FACTOR * a1;
>> +        final double a1High     = ca1 - (ca1 - a1);
>> +        final double a1Low      = a1 - a1High;
>> +        final double cb1        = SPLIT_FACTOR * b1;
>> +        final double b1High     = cb1 - (cb1 - b1);
>> +        final double b1Low      = b1 - b1High;
>> +
>> +        // accurate multiplication a1 * b1
>> +        final double prod1High  = a1 * b1;
>> +        final double prod1Low   = a1Low * b1Low - (((prod1High - a1High * b1High) - a1Low * b1High) - a1High * b1Low);
>> +
>> +        // split a2 and b2 as two 26 bits numbers
>> +        final double ca2        = SPLIT_FACTOR * a2;
>> +        final double a2High     = ca2 - (ca2 - a2);
>> +        final double a2Low      = a2 - a2High;
>> +        final double cb2        = SPLIT_FACTOR * b2;
>> +        final double b2High     = cb2 - (cb2 - b2);
>> +        final double b2Low      = b2 - b2High;
>> +
>> +        // accurate multiplication a2 * b2
>> +        final double prod2High  = a2 * b2;
>> +        final double prod2Low   = a2Low * b2Low - (((prod2High - a2High * b2High) - a2Low * b2High) - a2High * b2Low);
>> +
>> +        // split a3 and b3 as two 26 bits numbers
>> +        final double ca3        = SPLIT_FACTOR * a3;
>> +        final double a3High     = ca3 - (ca3 - a3);
>> +        final double a3Low      = a3 - a3High;
>> +        final double cb3        = SPLIT_FACTOR * b3;
>> +        final double b3High     = cb3 - (cb3 - b3);
>> +        final double b3Low      = b3 - b3High;
>> +
>> +        // accurate multiplication a3 * b3
>> +        final double prod3High  = a3 * b3;
>> +        final double prod3Low   = a3Low * b3Low - (((prod3High - a3High * b3High) - a3Low * b3High) - a3High * b3Low);
>> +
>> +        // accurate addition a1 * b1 + a2 * b2
>> +        final double s12High    = prod1High + prod2High;
>> +        final double s12Prime   = s12High - prod2High;
>> +        final double s12Low     = (prod2High - (s12High - s12Prime)) + (prod1High - s12Prime);
>> +
>> +        // accurate addition a1 * b1 + a2 * b2 + a3 * b3
>> +        final double s123High   = s12High + prod3High;
>> +        final double s123Prime  = s123High - prod3High;
>> +        final double s123Low    = (prod3High - (s123High - s123Prime)) + (s12High - s123Prime);
>> +
>> +        // final rounding, s123 may have suffered many cancellations, we try
>> +        // to recover some bits from the extra words we have saved up to now
>> +        return s123High + (prod1Low + prod2Low + prod3Low + s12Low + s123Low);
>> +
>> +    }
>> +
>> +    /**
>> +     * Compute a linear combination accurately.
>> +     *<p>
>> +     * This method computes a<sub>1</sub>&times;b<sub>1</sub>  +
>> +     * a<sub>2</sub>&times;b<sub>2</sub>  + a<sub>3</sub>&times;b<sub>3</sub>  +
>> +     * a<sub>4</sub>&times;b<sub>4</sub>
>> +     * to high accuracy. It does so by using specific multiplication and
>> +     * addition algorithms to preserve accuracy and reduce cancellation effects.
>> +     * It is based on the 2005 paper<a
>> +     * href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.2.1547">
>> +     * Accurate Sum and Dot Product</a>  by Takeshi Ogita,
>> +     * Siegfried M. Rump, and Shin'ichi Oishi published in SIAM J. Sci. Comput.
>> +     *</p>
>> +     * @param a1 first factor of the first term
>> +     * @param b1 second factor of the first term
>> +     * @param a2 first factor of the second term
>> +     * @param b2 second factor of the second term
>> +     * @param a3 first factor of the third term
>> +     * @param b3 second factor of the third term
>> +     * @param a4 first factor of the third term
>> +     * @param b4 second factor of the third term
>> +     * @return a<sub>1</sub>&times;b<sub>1</sub>  +
>> +     * a<sub>2</sub>&times;b<sub>2</sub>  + a<sub>3</sub>&times;b<sub>3</sub>  +
>> +     * a<sub>4</sub>&times;b<sub>4</sub>
>> +     * @see #linearCombination(double, double, double, double)
>> +     * @see #linearCombination(double, double, double, double, double, double)
>> +     */
>> +    public static double linearCombination(final double a1, final double b1,
>> +                                           final double a2, final double b2,
>> +                                           final double a3, final double b3,
>> +                                           final double a4, final double b4) {
>> +
>> +        // the code below is split in many additions/subtractions that may
>> +        // appear redundant. However, they shoud NOT be simplified, as they
>> +        // do use IEEE753 floating point arithmetic rouding properties.
>> +        // as an example, the expression "ca1 - (ca1 - a1)" is NOT the same as "a1"
>> +        // The variables naming conventions are that xyzHigh contains the most significant
>> +        // bits of xyz and xyzLow contains its least significant bits. So theoretically
>> +        // xyz is the sum xyzHigh + xyzLow, but in many cases below, this sum cannot
>> +        // be represented in only one double precision number so we preserve two numbers
>> +        // to hold it as long as we can, combining the high and low order bits together
>> +        // only at the end, after cancellation may have occurred on high order bits
>> +
>> +        // split a1 and b1 as two 26 bits numbers
>> +        final double ca1        = SPLIT_FACTOR * a1;
>> +        final double a1High     = ca1 - (ca1 - a1);
>> +        final double a1Low      = a1 - a1High;
>> +        final double cb1        = SPLIT_FACTOR * b1;
>> +        final double b1High     = cb1 - (cb1 - b1);
>> +        final double b1Low      = b1 - b1High;
>> +
>> +        // accurate multiplication a1 * b1
>> +        final double prod1High  = a1 * b1;
>> +        final double prod1Low   = a1Low * b1Low - (((prod1High - a1High * b1High) - a1Low * b1High) - a1High * b1Low);
>> +
>> +        // split a2 and b2 as two 26 bits numbers
>> +        final double ca2        = SPLIT_FACTOR * a2;
>> +        final double a2High     = ca2 - (ca2 - a2);
>> +        final double a2Low      = a2 - a2High;
>> +        final double cb2        = SPLIT_FACTOR * b2;
>> +        final double b2High     = cb2 - (cb2 - b2);
>> +        final double b2Low      = b2 - b2High;
>> +
>> +        // accurate multiplication a2 * b2
>> +        final double prod2High  = a2 * b2;
>> +        final double prod2Low   = a2Low * b2Low - (((prod2High - a2High * b2High) - a2Low * b2High) - a2High * b2Low);
>> +
>> +        // split a3 and b3 as two 26 bits numbers
>> +        final double ca3        = SPLIT_FACTOR * a3;
>> +        final double a3High     = ca3 - (ca3 - a3);
>> +        final double a3Low      = a3 - a3High;
>> +        final double cb3        = SPLIT_FACTOR * b3;
>> +        final double b3High     = cb3 - (cb3 - b3);
>> +        final double b3Low      = b3 - b3High;
>> +
>> +        // accurate multiplication a3 * b3
>> +        final double prod3High  = a3 * b3;
>> +        final double prod3Low   = a3Low * b3Low - (((prod3High - a3High * b3High) - a3Low * b3High) - a3High * b3Low);
>> +
>> +        // split a4 and b4 as two 26 bits numbers
>> +        final double ca4        = SPLIT_FACTOR * a4;
>> +        final double a4High     = ca4 - (ca4 - a4);
>> +        final double a4Low      = a4 - a4High;
>> +        final double cb4        = SPLIT_FACTOR * b4;
>> +        final double b4High     = cb4 - (cb4 - b4);
>> +        final double b4Low      = b4 - b4High;
>> +
>> +        // accurate multiplication a4 * b4
>> +        final double prod4High  = a4 * b4;
>> +        final double prod4Low   = a4Low * b4Low - (((prod4High - a4High * b4High) - a4Low * b4High) - a4High * b4Low);
>> +
>> +        // accurate addition a1 * b1 + a2 * b2
>> +        final double s12High    = prod1High + prod2High;
>> +        final double s12Prime   = s12High - prod2High;
>> +        final double s12Low     = (prod2High - (s12High - s12Prime)) + (prod1High - s12Prime);
>> +
>> +        // accurate addition a1 * b1 + a2 * b2 + a3 * b3
>> +        final double s123High   = s12High + prod3High;
>> +        final double s123Prime  = s123High - prod3High;
>> +        final double s123Low    = (prod3High - (s123High - s123Prime)) + (s12High - s123Prime);
>> +
>> +        // accurate addition a1 * b1 + a2 * b2 + a3 * b3 + a4 * b4
>> +        final double s1234High  = s123High + prod4High;
>> +        final double s1234Prime = s1234High - prod4High;
>> +        final double s1234Low   = (prod4High - (s1234High - s1234Prime)) + (s123High - s1234Prime);
>> +
>> +        // final rounding, s1234 may have suffered many cancellations, we try
>> +        // to recover some bits from the extra words we have saved up to now
>> +        return s1234High + (prod1Low + prod2Low + prod3Low + prod4Low + s12Low + s123Low + s1234Low);
>> +
>> +    }
>> +
>>   }
>>
>> Modified: commons/proper/math/trunk/src/site/xdoc/changes.xml
>> URL: http://svn.apache.org/viewvc/commons/proper/math/trunk/src/site/xdoc/changes.xml?rev=1154250&r1=1154249&r2=1154250&view=diff
>> ==============================================================================
>> --- commons/proper/math/trunk/src/site/xdoc/changes.xml (original)
>> +++ commons/proper/math/trunk/src/site/xdoc/changes.xml Fri Aug  5 15:01:49 2011
>> @@ -52,6 +52,13 @@ The<action>  type attribute can be add,u
>>       If the output is not quite correct, check for invisible trailing spaces!
>>        -->
>>       <release version="3.0" date="TBD" description="TBD">
>> +<action dev="luc" type="add">
>> +        Added a few linearCombination utility methods in MathUtils to compute accurately
>> +        linear combinations a1.b1 + a2.b2 + ... + an.bn taking great care to compensate
>> +        for cancellation effects. This both improves and simplify several methods in
>> +        euclidean geometry classes, including linear constructors, dot product and cross
>> +        product.
>> +</action>
>>         <action dev="psteitz" type="fix" issue="MATH-640">
>>           Fixed bugs in AbstractRandomGenerator nextInt() and nextLong() default
>>           implementations.  Prior to the fix for this issue, these methods
>>
>>
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org