You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@commons.apache.org by Luc Maisonobe <Lu...@free.fr> on 2012/08/08 23:19:19 UTC

Re: [math] API proposal for differentiation

Le 24/07/2012 11:51, Anxionnat Julien a écrit :
> 
> 
>>> [...]
>>> I have to deal with functions like that:
>>>
>>>     public double[] value (double t) { ... }
>>>
>>> (The best interface for my subjects would be: public Vector3D value
>> (double t) { ... }, but it's not the point now.)
>>>
>>>
>>>>
>>>> [...]
>>>>
>>>
>>> I'm sorry, I don't understand how it's gonna be for my function :
>>>     value(double t): double[]
>>
>> It would simply be
>>    RallNumber[] value(RallNumber)
> 
> Ok. (I had understood that a RallNumber for vectors was necessary. So I had coded a RallNumber class with double[] value and double[] firstDerivative.)
> 
> 
>>
>>>
>>> I just tried to code the interfaces (UnivariateVectorDifferentiable,
>> UnivariateVectorDerivative, DifferentialPair,
>> UnivariateVectorDifferentiator) and the TwoPointsScheme differentiator.
>>
>> We already have all of this in Nabla (up to 8 points scheme and also
>> algorithmic forward mode directly from bytecode analysis, for simple
>> functions that do not call other functions or store intermediate values
>> in fields), so don't worry about this.
>>
> 
> I didn't see the interfaces for vector functions, so I coded them in local (just to help me to understand):
> - UnivariateVectorDifferentiable:
>     double[] f(double t)
> - UnivariateVectorDerivative:
>     UnivariateVectorDifferentiable getPrimitive()
>     RallNumber[] f(RallNumber t) 
> - UnivariateVectorDifferentiator:
>     UnivariateVectorDerivative differentiate(UnivariateVectorDifferentiable d)
> 
> and a TwoPointsScheme differentiator for vector functions.
> 
> Now, everything it's ok for me.
> 
> I've just one question :
> - Should we have a specific abstract class for finite differences differentiator for vector functions (called FiniteDifferencesVectorDifferentiator) ?
> - OR prefer to implement the UnivariateVectorDifferentiator in the FiniteDifferencesDifferentiator ? and have to code the differentiate method for vectors for the two to eight points schemes (which interest us) ?
> 
> 
>>>
>>> If I get your purpose, DifferentialPair has to contain:
>>>     - "zero-order": x, f(x)
>>>     - first order: f(x), f'(x)
>>>     - second order: f'(x), f''(x)
>>>     - etc.
>>
>> No.
>> If we want to go already to more than 1st order, then we need to use an
>> extension of Rall numbers. See for example the paper "Doubly Recursive
>> Multivariate Automatic Differentiation" by Dan Kalman in 2002
>> (<http://www1.math.american.edu/People/kalman/pdffiles/mmgautodiff.pdf>
>> )
>>
> 
> I'll check this, thanks.

I have implemented a first version in line with Dan Kalman paper. It was
surprisingly difficult, mainly in order to avoid the very large number
of recursive calls in multiplication and function composition. I ended
up with setting a "compiler" class that analyzes only once the recursive
structure (and in fact already in an iterative way, not recursively) and
that creates indirection arrays that will be used for the actual
computation.

For example, when you compute f*g, the "compiler" has already identified
that the formula for the third derivative of d3(f*g)//dxdy^2  is f *
d3g/dxdy2 + 2 * df/dy * d2g/dxdy + d2f/dy2 * dg/dx + df/dx * d2g/dy2 + 2
* d2f/dxdy * dg/dy + d3f/dxdy2 * g". So this computation will only
involve the number of multiplications and additions in this formula and
will pick up the partical derivatives for the arguments automatically.
Without this compilation phase, you would end up with a tremendous
amount of duplicate computations. This is due to the fact recursive
calls do not identify that in the sum a * (b * c) + b * (a * c) + c * (b
* a) + c * (a * b) + ... all the terms are equal and this could be
rewritten n * a * b * c. The compiler identifies this so the computation
engine can use it. This is mainly the Leibniz rule and as the binomial
coefficients grow quite quickly with derivation orders, the gain in
memory, computation and accuracy is important. In fact, before I set up
this identification scheme, I encountered memory exhaustion errors in
the stress tests I used (with derivation orders about 8 or 10 and
perhaps a dozen independent variables).

One of the very interesting property of the Dan Kalman method is that
the same API can be used for either one free parameter and one
derivation order and for several free parameters and higher derivation
orders. This is especially important when a function is implemented as a
program with several independant functions. A typical case is when a
function f depends on several parameters x, y z and its implementation
computes an intermediate variable u which also depends on x, y and z.
Then f calls g(u). Out of context, we would say that g is a univariate
variable and we do not need to preserve lots of data to get its
derivative. But as u depends on several variables, it is really simpler
to have u know itself about its dependencies, and preserve the three
partial derivatives with respect to x, y and z. So when we get the
result g(u), this results automatically knows about dg/dx, dg/dy, dg/dz
(and even d5g/dx2dydz2 if we want to go to 5th order derivative).
Without this multi-parameters/multi-order feature, we would be forced to
do some post processing after the call to g and compute ourselves dg/dx,
dg/dy, dg/dz (and even d5g/dx2dydz2) from dg/du, d2g/du2 ... d5g/du5. Of
course, this is what is done under the hood, but it is done once in the
core classes which represent u and which are provided by [math], the
user is not required to do it by himself.

This new differentiation package is far from being complete. There are
missing functions (the inverse trigonometric functions asin, acos, atan,
atan2) and missing features (Taylor expansion). There are also not yet
any automatic differentiator available. However, the core elements are
available for review, and users can implement directly differentiable
functions to any order and with any number of parameters.

One last comment: do not try to push this too far. Due to Leibniz rule,
things go weird if you use both medium order and medium number of
parameters. For example first derivative for 20 parameters involves only
20 elements, 10th derivative for 1 parameter involves only 10 elements,
but 10th derivative for 20 parameters involves 184756 elements and
multiplication is a combination of these elements, you quickly end up
with millions of operations ...

best regards,
Luc

> 
> Julien
> 
> 
>> We could do this already from the beginning, i.e. our RallNumber class
>> would be built with the order and the number of free variables.
>> However,
>> it would be user responsibility to ensure consistency between free
>> variables x and y in a function like f(x, y). User should make sure
>> that
>> if x is associated with index 0 in one variable, the y is associated
>> with a different index, and in both cases the derivation orders are
>> equal.
>>
>>
>>> => What happens for the zero order (x is a double, not an array of
>> double) ?
>>
>> Then x is just one variable.
>>
>> Luc
>>
>>>
>>>
>>> Regards,
>>> Julien
>>>
>>>
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org