You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@commons.apache.org by Matt Juntunen <ma...@hotmail.com> on 2021/04/19 13:20:01 UTC

[geometry] 1.0-beta2

I'd like to release commons-geometry 1.0-beta2 within the next couple of weeks. I'm planning on including GEOMETRY-118 (additional methods for transform matrix classes) in that if possible. Is there anything else anyone would like to include?

Regards,
Matt J

Re: [geometry] 1.0-beta2

Posted by Alex Herbert <al...@gmail.com>.
On Wed, 21 Apr 2021 at 03:33, Matt Juntunen <ma...@hotmail.com>
wrote:

> Hi Alex,
>
> First of all, thanks for all of the work you've done here! Second, I vote
> to go with the fast dot2s implementation for LinearCombination and use it
> as a static method in commons-geometry. The reason is that the scale of
> accuracy improvements we're talking about here is not going to mean much
> for practical purposes, whereas a performance hit in a critical piece of
> code like this definitely is. Also, I'd prefer to keep the choice of
> LinearCombination algorithm internal to the library and not make it
> configurable. The vast majority of users are not going to want to have to
> think about it and it will keep the API cleaner if we don't explicitly
> allow it to be changed. If users want to use a different algorithm for
> something in commons-geometry, they can write a utility method that
> leverages other parts of commons-numbers.
>

OK. Note that exact computation is most important when small absolute
differences result in large relative differences between the computed and
actual answer. The case for the Complex class was in a computation that
approaches zero. Here the relative error can be very large. If this type of
computation is not being done in Geometry then favouring speed over
precision makes sense. If a critical computation requires this precision
then support can be added for specific use cases when they are found.

I'll create a PR that updates the current method to the faster (and
essentially the same accuracy) dot2s method.

Alex

Re: [geometry] 1.0-beta2

Posted by Matt Juntunen <ma...@hotmail.com>.
Hi Alex,

First of all, thanks for all of the work you've done here! Second, I vote to go with the fast dot2s implementation for LinearCombination and use it as a static method in commons-geometry. The reason is that the scale of accuracy improvements we're talking about here is not going to mean much for practical purposes, whereas a performance hit in a critical piece of code like this definitely is. Also, I'd prefer to keep the choice of LinearCombination algorithm internal to the library and not make it configurable. The vast majority of users are not going to want to have to think about it and it will keep the API cleaner if we don't explicitly allow it to be changed. If users want to use a different algorithm for something in commons-geometry, they can write a utility method that leverages other parts of commons-numbers.

Regards,
Matt J
________________________________
From: Alex Herbert <al...@gmail.com>
Sent: Tuesday, April 20, 2021 12:25 PM
To: Commons Developers List <de...@commons.apache.org>
Subject: Re: [geometry] 1.0-beta2

On Tue, 20 Apr 2021 at 13:51, Gilles Sadowski <gi...@gmail.com> wrote:

> Hi Alex.
>
> Le mar. 20 avr. 2021 à 11:17, Alex Herbert <al...@gmail.com> a
> écrit :
> >
> > [...]
>
> I'm a bit lost in all these bits... ;-)
>

I also had to remind myself of the work since it was a long time ago.


>
> > Any opinions on how changing LinearComination may affect Geometry? Either
> > we clean up the implementation to use the fast dot2s algorithm with
> correct
> > support for splitting 53-bit mantissas, or switch to the extended
> precision
> > version. But I do not think we should leave it as the current
> > implementation which has disadvantages against either of the
> alternatives.
>
> What is your suggestion?
>

Some background...

The dot2s is a specialisation of the DotK algorithm where the K represents
the k-fold increase in precision over a standard scalar product and denotes
qualitatively the level of robustness for a sum with massive cancellation
(e.g. 1e-100 + 1e200 - 2e-100 - 1e200 = -1e-100). Any floating-point sum is
limited by the representation of a 64-bit double as a binary floating point
number with only 53-bits of precision and an exponent.  When adding numbers
the result is computed and any extra bits that cannot fit into the mantissa
are lost. Whether this occurs is dependent on the input numbers, but in
general as the difference in magnitude becomes greater then the result will
contain progressively less of the true result due to the 53-bit mantissa.
If two numbers are different by more than 2^53 then adding them makes no
difference to the bigger number. In the example above a standard precision
result is 0. Only with extended precision can you hold the result of 1e-100
+ 1e200, etc.

In a sum of magnitude this only really matters when cancellation (opposite
signs) may occur. Take the recently discussed code to compute the 3D vector
length:

len = sqrt(x^2 + y^2 + z^2)

This uses a sum where the sign of the terms is the same. There will be no
cancellation and thus the answer will be close to the true result (within a
few ulps).

But if the sum of multiple values has different signs then the answer may
be in between the magnitudes of the terms. So the intermediate sum should
store knowledge of more bits of the intermediate result than the 64-bits of
a double.

The LinearCombination is actually providing two operations, multiplication
and addition:

sum_i {a_i * b_i}

The multiplication has a potential 106-bit mantissa for the result with a
single exponent. Any addition may require a longer mantissa with a single
exponent which is how BigDecimal would store the result. An extended
precision double number would represent the result as an array of two
doubles of very different magnitudes. Adding or multiplying the extended
precision result can increase the length of the result further. The use of
arrays of double to represent a number in extended precision is well
described in Shewchuk (1997): Arbitrary Precision Floating-Point Arithmetic
[1], on which a lot of the work in Numbers-142 is based.

There could be a case for a class to implement this generically:

public class ExtendedDouble {
  public static ExtendedDouble of(double a);
  public ExtendedDouble add(double a);
  public ExtendedDouble add(ExtendedDouble a);
  public ExtendedDouble multiply(double a);
  public ExtendedDouble multiply(ExtendedDouble a);
  public double toDouble();
}

That could be in another module in numbers. Shewchuk does not describe
division but it may be in references he provides. His paper is based upon
developing exact arithmetic for geometry applications so addition and
multiplication are all that are used.

As for LinearCombination then I would suggest that the use case is for any
dot product where cancellation may be expected. If handling cancellation is
more important than speed then using the most accurate method would seem to
be a better choice.

I do recall seeing LinearCombination used in the Geometry code inside a
loop with the result stored as a double on each iteration. This would
result in loss of the extended precision intermediate sum and so could
cause error due to cancellation. So there is a case for going through
Geometry to find use cases for LinearCombination and potentially find
places where changes should be made to compute all the terms in advance and
call LinearCombination once with the two input arrays.


> My impression is that [Geometry] emphasizes accuracy over ultimate
> speed (as opposed to libraries used for real-time rendering, I guess).
>
> However, could it be possible to leave this as a user's decision?
> Quoting from Matt's tutorial:
> ---CUT---
> Typically, users of Commons Geometry will construct a single instance
> of this type for use by multiple objects throughout an entire
> operation, or even application. Since we don't want our class to
> assume such a heavy responsibility, we will simply accept a
> DoublePrecisionContext in the constructor.
> ---CUT---
> Would it be conceivable that the choice of the implementation
> activated by a call to the "LinearCombination" facility is also
> encapsulated in the "DoublePrecisionContext"?
>

Numbers-142 did discuss the change of LinearCombination to an interface.
Then we already have several implementations for the interface where a user
can choose the balance between speed and robustness to cancellation.

It would mean reworking Geometry at all places that use LinearCombination
to use an instance rather than the current static method call. There would
be a default instance or a user supplied one.

I would not opt for using a singleton to define the behaviour of the static
method. This would work for a standalone application using Geometry but
would have issues for any application where different code requires a
different precision level.


[1]
http://www-2.cs.cmu.edu/afs/cs/project/quake/public/papers/robust-arithmetic.ps

Re: [geometry] 1.0-beta2

Posted by Alex Herbert <al...@gmail.com>.
On Tue, 20 Apr 2021 at 13:51, Gilles Sadowski <gi...@gmail.com> wrote:

> Hi Alex.
>
> Le mar. 20 avr. 2021 à 11:17, Alex Herbert <al...@gmail.com> a
> écrit :
> >
> > [...]
>
> I'm a bit lost in all these bits... ;-)
>

I also had to remind myself of the work since it was a long time ago.


>
> > Any opinions on how changing LinearComination may affect Geometry? Either
> > we clean up the implementation to use the fast dot2s algorithm with
> correct
> > support for splitting 53-bit mantissas, or switch to the extended
> precision
> > version. But I do not think we should leave it as the current
> > implementation which has disadvantages against either of the
> alternatives.
>
> What is your suggestion?
>

Some background...

The dot2s is a specialisation of the DotK algorithm where the K represents
the k-fold increase in precision over a standard scalar product and denotes
qualitatively the level of robustness for a sum with massive cancellation
(e.g. 1e-100 + 1e200 - 2e-100 - 1e200 = -1e-100). Any floating-point sum is
limited by the representation of a 64-bit double as a binary floating point
number with only 53-bits of precision and an exponent.  When adding numbers
the result is computed and any extra bits that cannot fit into the mantissa
are lost. Whether this occurs is dependent on the input numbers, but in
general as the difference in magnitude becomes greater then the result will
contain progressively less of the true result due to the 53-bit mantissa.
If two numbers are different by more than 2^53 then adding them makes no
difference to the bigger number. In the example above a standard precision
result is 0. Only with extended precision can you hold the result of 1e-100
+ 1e200, etc.

In a sum of magnitude this only really matters when cancellation (opposite
signs) may occur. Take the recently discussed code to compute the 3D vector
length:

len = sqrt(x^2 + y^2 + z^2)

This uses a sum where the sign of the terms is the same. There will be no
cancellation and thus the answer will be close to the true result (within a
few ulps).

But if the sum of multiple values has different signs then the answer may
be in between the magnitudes of the terms. So the intermediate sum should
store knowledge of more bits of the intermediate result than the 64-bits of
a double.

The LinearCombination is actually providing two operations, multiplication
and addition:

sum_i {a_i * b_i}

The multiplication has a potential 106-bit mantissa for the result with a
single exponent. Any addition may require a longer mantissa with a single
exponent which is how BigDecimal would store the result. An extended
precision double number would represent the result as an array of two
doubles of very different magnitudes. Adding or multiplying the extended
precision result can increase the length of the result further. The use of
arrays of double to represent a number in extended precision is well
described in Shewchuk (1997): Arbitrary Precision Floating-Point Arithmetic
[1], on which a lot of the work in Numbers-142 is based.

There could be a case for a class to implement this generically:

public class ExtendedDouble {
  public static ExtendedDouble of(double a);
  public ExtendedDouble add(double a);
  public ExtendedDouble add(ExtendedDouble a);
  public ExtendedDouble multiply(double a);
  public ExtendedDouble multiply(ExtendedDouble a);
  public double toDouble();
}

That could be in another module in numbers. Shewchuk does not describe
division but it may be in references he provides. His paper is based upon
developing exact arithmetic for geometry applications so addition and
multiplication are all that are used.

As for LinearCombination then I would suggest that the use case is for any
dot product where cancellation may be expected. If handling cancellation is
more important than speed then using the most accurate method would seem to
be a better choice.

I do recall seeing LinearCombination used in the Geometry code inside a
loop with the result stored as a double on each iteration. This would
result in loss of the extended precision intermediate sum and so could
cause error due to cancellation. So there is a case for going through
Geometry to find use cases for LinearCombination and potentially find
places where changes should be made to compute all the terms in advance and
call LinearCombination once with the two input arrays.


> My impression is that [Geometry] emphasizes accuracy over ultimate
> speed (as opposed to libraries used for real-time rendering, I guess).
>
> However, could it be possible to leave this as a user's decision?
> Quoting from Matt's tutorial:
> ---CUT---
> Typically, users of Commons Geometry will construct a single instance
> of this type for use by multiple objects throughout an entire
> operation, or even application. Since we don't want our class to
> assume such a heavy responsibility, we will simply accept a
> DoublePrecisionContext in the constructor.
> ---CUT---
> Would it be conceivable that the choice of the implementation
> activated by a call to the "LinearCombination" facility is also
> encapsulated in the "DoublePrecisionContext"?
>

Numbers-142 did discuss the change of LinearCombination to an interface.
Then we already have several implementations for the interface where a user
can choose the balance between speed and robustness to cancellation.

It would mean reworking Geometry at all places that use LinearCombination
to use an instance rather than the current static method call. There would
be a default instance or a user supplied one.

I would not opt for using a singleton to define the behaviour of the static
method. This would work for a standalone application using Geometry but
would have issues for any application where different code requires a
different precision level.


[1]
http://www-2.cs.cmu.edu/afs/cs/project/quake/public/papers/robust-arithmetic.ps

Re: [geometry] 1.0-beta2

Posted by Gilles Sadowski <gi...@gmail.com>.
Hi Alex.

Le mar. 20 avr. 2021 à 11:17, Alex Herbert <al...@gmail.com> a écrit :
>
> [...]

I'm a bit lost in all these bits... ;-)

> Any opinions on how changing LinearComination may affect Geometry? Either
> we clean up the implementation to use the fast dot2s algorithm with correct
> support for splitting 53-bit mantissas, or switch to the extended precision
> version. But I do not think we should leave it as the current
> implementation which has disadvantages against either of the alternatives.

What is your suggestion?

My impression is that [Geometry] emphasizes accuracy over ultimate
speed (as opposed to libraries used for real-time rendering, I guess).

However, could it be possible to leave this as a user's decision?
Quoting from Matt's tutorial:
---CUT---
Typically, users of Commons Geometry will construct a single instance
of this type for use by multiple objects throughout an entire
operation, or even application. Since we don't want our class to
assume such a heavy responsibility, we will simply accept a
DoublePrecisionContext in the constructor.
---CUT---
Would it be conceivable that the choice of the implementation
activated by a call to the "LinearCombination" facility is also
encapsulated in the "DoublePrecisionContext"?

Regards,
Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [geometry] 1.0-beta2

Posted by Alex Herbert <al...@gmail.com>.
On Mon, 19 Apr 2021 at 22:30, Gilles Sadowski <gi...@gmail.com> wrote:

> Le lun. 19 avr. 2021 à 20:26, Matt Juntunen
>
> > What needs to be done on numbers before we're ready for
> > 1.0 (aside from moving over some code from geometry)?
>
> The most basic utilities haven't fundamentally changed.  It
> will be nice to increase the visibility of the many consistency
> and performance improvements.
> Modules to perhaps be left out are also TBD (in another thread).
>

I did a lot of work investigating improving the accuracy of
LinearCombination (see Numbers 142 [1]). This ended by improving the
accuracy of the linear combination that is used inside the Complex class in
a very special use case summing terms close to zero (i.e. where large
cancellations of terms are significant). However that is private to the
class. The main LinearCombination class remains the same which uses an
approximate quad level accuracy (128-bit) sum of linear terms. This could
be changed to improve accuracy at the expense of runtime speed.

Even if we keep the quad level accuracy the method to split the upper and
lower parts of the number before multiplication can be improved to retain
an extra bit of information and support sub-normal numbers. All the code
for variations and the performance benchmarks are in
'o.a.c.numbers.examples.jmh.arrays'. The main LinearCombinations class in
that package has all variants.

Currently the LinearCombination class is implementing a variant of the
dot2s method of Ogita et al (2005). The method for array inputs uses array
allocation in the computation. It also uses a split of the 53-bit double
mantissa (including the leading 1-bit) to two 26-bit doubles. This can be
changed to the reference dot2s method to avoid array allocation and use
Dekker's split to extract a 27-bit double and a 26-bit double with support
for sub-normal numbers. This would be the implementation in the
LinearCominations.Dot2s class.

Alternatively there is a version which computes an exact summation without
using BigDecimal (LinearCominations.ExtendedPrecision using ExactSum for
the summation). It is slower than the current 2-fold precision method but
much faster than using BigDecimal. Changing to this implementation may have
a big impact on the performance of Geometry which uses LinearCombination
extensively.

The Jira ticket has a lot of performance information. This post [2] seems
to summarise the speed relative to the current implementation as:

Value Method Relative
twoD dot2s 0.73
twoD extended 1.13
threeD dot2s 0.75
threeD extended 1.48
fourD dot2s 0.75
fourD extended 1.68
nD dot2s 0.56
nD extended 2.95

So an exact dot2s implementation is faster and the extended precision
implementation is about 3x slower on long arrays but not much slower on
small combinations. Here the extended precision sum is accurate to 1 ULP.
Making it exact to 0 ULP (extended2) takes a bit more time and the
performance is summarised later as:

"Using the extended2 method versus the fast dot2s will have a performance
impact dependent on the condition number of the dot product and the length
of the dot product. For short lengths (2 to 4) the method is about 1.2 to
2.5x slower. Long dot products are much slower (8x). The runtime is
insignificant compared to using BigDecimal which is a magnitude slower."

Any opinions on how changing LinearComination may affect Geometry? Either
we clean up the implementation to use the fast dot2s algorithm with correct
support for splitting 53-bit mantissas, or switch to the extended precision
version. But I do not think we should leave it as the current
implementation which has disadvantages against either of the alternatives.

Alex


[1] https://issues.apache.org/jira/browse/NUMBERS-142
[2]
https://issues.apache.org/jira/browse/NUMBERS-142?focusedCommentId=17040111&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17040111

Re: [geometry] 1.0-beta2

Posted by Gilles Sadowski <gi...@gmail.com>.
Le lun. 19 avr. 2021 à 20:26, Matt Juntunen
<ma...@hotmail.com> a écrit :
>
> Hi Gilles,
>
> Are you suggesting skipping another beta version and having numbers 1.0 and geometry 1.0 be the next releases?

Yes, that's the question which I'm asking.
There is no point in waiting much longer for feedback that
may never come...
Of course, we aim for the perfect design but if we get it wrong
and must evolve the library in a non-compatible way, all that
will happen is that the base package will change name.

> I could get on board with that. It would be great to have an
> official release of these.

Well, it's up to the main contributor(s) to let us know when it
seems that the design is good enough given the knowledge
shared within the current community.

> What needs to be done on numbers before we're ready for
> 1.0 (aside from moving over some code from geometry)?

The most basic utilities haven't fundamentally changed.  It
will be nice to increase the visibility of the many consistency
and performance improvements.
Modules to perhaps be left out are also TBD (in another thread).

>
> On another note, I don't feel like the enclosing and hull
> modules in geometry are quite ready for prime time yet.
> So, I would leave those out of 1.0.

That would be quite fine, I think.

Gilles

>> [...]

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [geometry] 1.0-beta2

Posted by Matt Juntunen <ma...@hotmail.com>.
Hi Gilles,

Are you suggesting skipping another beta version and having numbers 1.0 and geometry 1.0 be the next releases? I could get on board with that. It would be great to have an official release of these. What needs to be done on numbers before we're ready for 1.0 (aside from moving over some code from geometry)?

On another note, I don't feel like the enclosing and hull modules in geometry are quite ready for prime time yet. So, I would leave those out of 1.0.

Regards,
Matt
________________________________
From: Gilles Sadowski <gi...@gmail.com>
Sent: Monday, April 19, 2021 12:37 PM
To: Commons Developers List <de...@commons.apache.org>
Subject: Re: [geometry] 1.0-beta2

Hello Matt.

Le lun. 19 avr. 2021 à 15:20, Matt Juntunen
<ma...@hotmail.com> a écrit :
>
> I'd like to release commons-geometry 1.0-beta2 within the next couple of weeks. I'm planning on including GEOMETRY-118 (additional methods for transform matrix classes) in that if possible. Is there anything else anyone would like to include?

Thanks a lot for your work on [Geometry].

Could we perhaps make progress with the issue of what code can
be moved to [Numbers]?
Rationale is that some project might be unwilling to allow "beta" code
in its dependencies (if just because there is the risk JAR hell, which
we explicitly permit in beta releases).  Since it isn't likely that additional
feedback will come about the beta components, it would be good to
plan for an "official" release of [Numbers] and [Geometry], in that order.
[It's not mandatory to include all modules.]

WDYT?

Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [geometry] 1.0-beta2

Posted by Gilles Sadowski <gi...@gmail.com>.
Hello Matt.

Le lun. 19 avr. 2021 à 15:20, Matt Juntunen
<ma...@hotmail.com> a écrit :
>
> I'd like to release commons-geometry 1.0-beta2 within the next couple of weeks. I'm planning on including GEOMETRY-118 (additional methods for transform matrix classes) in that if possible. Is there anything else anyone would like to include?

Thanks a lot for your work on [Geometry].

Could we perhaps make progress with the issue of what code can
be moved to [Numbers]?
Rationale is that some project might be unwilling to allow "beta" code
in its dependencies (if just because there is the risk JAR hell, which
we explicitly permit in beta releases).  Since it isn't likely that additional
feedback will come about the beta components, it would be good to
plan for an "official" release of [Numbers] and [Geometry], in that order.
[It's not mandatory to include all modules.]

WDYT?

Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org