Posted to dev@mahout.apache.org by Peng Cheng <pc...@uowmail.edu.au> on 2013/09/18 23:16:47 UTC

Why is Kahan summation not used anywhere?

For a large-scale computational engine this seems like an oversight. Most 
summation/average and dot product operations on vectors still use naive 
summation despite its O(n) error.

Is there a reason?

All the best,
Yours Peng


Re: Why is Kahan summation not used anywhere?

Posted by Ted Dunning <te...@gmail.com>.
This has come up before.

As background, if you add up lots of numbers using a straightforward
loop, you can lose precision.  In the worst case the loss is O(n \epsilon),
but in virtually all real examples the lossage is O(\epsilon \sqrt(n)).  If
we are summing a billion numbers, the square root is ~3 x 10^4, so we can
potentially lose 4 or 5 sig figs (out of the 17 available with double
precision).
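
(As a quick illustration of the effect, outside of Mahout entirely: the sketch
below sums a constant that has no exact binary representation and compares
against the exact answer.  The class name and the count are arbitrary.)

public class NaiveSumDrift {
    public static void main(String[] args) {
        int n = 10_000_000;
        double sum = 0;
        for (int i = 0; i < n; i++) {
            sum += 0.1;   // 0.1 is not exactly representable in binary
        }
        // exact answer is 1e6; the difference is the accumulated rounding error
        System.out.printf("sum = %.10f  error = %.3e%n", sum, sum - 1e6);
    }
}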

Kahan summation increases the number of floating point operations by 4x,
but uses a clever trick to retain most of the bits that would otherwise be
lost.  Shewchuk summation uses divide and conquer to limit the lossage
with O(log n) storage and no increase in the number of flops.
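
For reference, the divide and conquer version can be written in a few lines.
This is only a sketch to show the shape of the recursion, not code that exists
in Mahout; the method name and the base case size are arbitrary.

public class PairwiseSum {
    // Split the range in half and add the two half-sums.  Stack depth is
    // O(log n) and the flop count is the same as a naive loop.
    static double pairwiseSum(double[] a, int from, int to) {
        int n = to - from;
        if (n <= 32) {                        // small base case: plain loop
            double s = 0;
            for (int i = from; i < to; i++) {
                s += a[i];
            }
            return s;
        }
        int mid = from + n / 2;
        return pairwiseSum(a, from, mid) + pairwiseSum(a, mid, to);
    }

    public static void main(String[] args) {
        double[] a = new double[1_000_000];
        java.util.Arrays.fill(a, 0.1);
        System.out.println(pairwiseSum(a, 0, a.length));   // close to 100000
    }
}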

There are several cases to consider:

1) online algorithms such as OnlineSummarizer.

2) dot product and friends.

3) general matrix decompositions

In the first case, we can often have millions or even billions of numbers
to analyze.  That said, the input data is typically quite noisy, and
signal to noise ratios > 100 are actually kind of rare in Mahout
applications.  Modified Shewchuk estimation (see below for details) could
decrease summation error from a few parts in 10^12 to less than 1 part in
10^12 at minimal cost.  These errors are about 10^10 times smaller than the
noise in our data, so this seems not useful.

In the second case, we are almost always summing products of sparse
vectors.  Having thousands of non-zero elements is common, but millions of
non-zeros are quite rare and billions of non-zeros are unheard of.  This
means that the errors are going to be trivial.
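
To make that concrete, a sparse dot product only accumulates over the
overlapping non-zeros, so the number of terms in the sum is tiny compared to
the nominal dimension.  Here is a sketch using plain sorted index/value arrays
rather than Mahout's Vector classes:

public class SparseDot {
    // Dot product of two sparse vectors stored as sorted (index, value) pairs.
    // The accumulator only sees one term per overlapping non-zero, typically
    // thousands of terms, so naive double accumulation loses very little.
    static double dot(int[] ia, double[] va, int[] ib, double[] vb) {
        double sum = 0;
        int i = 0;
        int j = 0;
        while (i < ia.length && j < ib.length) {
            if (ia[i] == ib[j]) {
                sum += va[i++] * vb[j++];
            } else if (ia[i] < ib[j]) {
                i++;
            } else {
                j++;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] ia = {0, 3, 7};
        double[] va = {1.0, 2.0, 3.0};
        int[] ib = {3, 7, 9};
        double[] vb = {4.0, 5.0, 6.0};
        System.out.println(dot(ia, va, ib, vb));   // 2*4 + 3*5 = 23.0
    }
}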

In the third case, we often have dense matrices, but the sizes are
typically on the order of 100 x 100 or less.  This makes the errors even
smaller than in our common dot products.

To me, this seems to say that this isn't worth doing.  I am happy to be
corrected if you have counter evidence.

Note that BLAS does naive summation and none of the Mahout operations are
implemented using anything except double precision floating point.


Here is an experiment that tests to see how big the problem really is:

    @Test
    public void runKahanSum() {
        Random gen = RandomUtils.getRandom();

        double ksum = 0;                 // Kahan sum
        double c = 0;                    // low order bits for Kahan sum
        double sum = 0;                  // naive sum
        double[] vsum = new double[16];  // 16 way decomposed sum
        for (int i = 0; i < 1e9; i++) {
            double x = gen.nextDouble();

            double y = x - c;      // apply the correction carried from last step
            double t = ksum + y;   // add; low-order bits of y can be lost here
            c = (t - ksum) - y;    // recover the bits that were just lost
            ksum = t;              // new running sum

            sum += x;

            vsum[i % 16] += x;
        }

        // now add up the decomposed pieces
        double zsum = 0;
        for (int i = 0; i < vsum.length; i++) {
            zsum += vsum[i];
        }
        System.out.printf("%.4f %.4f %.4f\n",
            ksum, 1e12 * (sum - ksum) / ksum, 1e12 * (zsum - ksum) / ksum);
    }

A typical result here is that naive summation gives results that are
accurate to within 1 part in 10^12, 8 way summation manages < 0.05 parts in
10^12 and 16 way summation is only slightly better than 8 way summation.

If the random numbers being summed are changed to have a mean of zero, then
the relative error increases to 1.7 parts in 10^12 and 0.3 parts in 10^12,
but the absolute error is much smaller.

Generally, it doesn't make sense to do the accumulation in floats because
these operations are almost always memory channel bound rather than CPU
bound.  Switching to single precision in spite of this decreases the
accuracy to about 500 parts per million and 200 parts per million for
naive summation and 8 way summation respectively.
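
Just to put a rough number on the single precision point, here is a small
sketch (again, not Mahout code; the seed and count are arbitrary) that
accumulates the same stream of values in a float and in a double and prints
the drift of the float accumulator:

import java.util.Random;

public class FloatAccumulation {
    public static void main(String[] args) {
        Random gen = new Random(42);
        float fsum = 0;     // single precision accumulator
        double dsum = 0;    // double precision reference
        for (int i = 0; i < 10_000_000; i++) {
            float x = gen.nextFloat();
            fsum += x;
            dsum += x;
        }
        // the float accumulator typically drifts by parts per million here;
        // pushing the count toward a billion is far worse because small
        // addends eventually stop registering in the float at all
        System.out.printf("float: %.1f  double: %.1f  drift: %.1f ppm%n",
            fsum, dsum, 1e6 * (fsum - dsum) / dsum);
    }
}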






On Wed, Sep 18, 2013 at 2:16 PM, Peng Cheng <pc...@uowmail.edu.au> wrote:

> For a large-scale computational engine this seems like an oversight. Most
> summation/average and dot product operations on vectors still use naive
> summation despite its O(n) error.
>
> Is there a reason?
>
> All the best,
> Yours Peng
>
>