Posted to user@commons.apache.org by Christoph Höger <ch...@tu-berlin.de> on 2013/05/22 13:52:42 UTC

[math] Total derivative of a MultivariateVectorFunction

Dear all,

I am currently working with the DerivativeStructure-based AD framework
integrated into math 3.2.

Calculating the n-th order partial derivatives works fine, but I am
facing some trouble calculating the n-th order total derivative of a
MultivariateVectorFunction.

My idea was to have a class Derivative (implementing
MultivariateVectorFunction) that delegates to its base
MultivariateVectorFunction and returns the first order total derivative
of that function. Chaining many such classes should create the n-th
order total derivative (direct creation would be an optimization).

The class looks like this:

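import org.apache.commons.math3.analysis.differentiation.DerivativeStructure;
import org.apache.commons.math3.analysis.differentiation.MultivariateDifferentiableFunction;
import org.apache.commons.math3.exception.MathIllegalArgumentException;

public final class Derivative implements
MultivariateDifferentiableFunction {

    private final MultivariateDifferentiableFunction base;

    public Derivative(MultivariateDifferentiableFunction base) {
        this.base = base;
    }

    // point = [t, x_1 .. x_n, dx_1/dt .. dx_n/dt]
    @Override
    public double value(double[] point) {
        final int dim = 1 + point.length / 2; // t plus the n original parameters

        final DerivativeStructure[] dpoint = new DerivativeStructure[dim];
        for (int i = 0; i < dim; i++)
            dpoint[i] = new DerivativeStructure(dim, 1, i, point[i]);

        final DerivativeStructure dvalue = base.value(dpoint);

        // orders[k] = 1 selects the first order partial derivative
        // with respect to parameter k
        final int[] orders = new int[dim];
        orders[0] = 1;
        double ret = dvalue.getPartialDerivative(orders); // 𝛿base/𝛿t
        orders[0] = 0;

        for (int i = 1; i < dim; i++) {
            orders[i] = 1;
            // 𝛿base/𝛿point[i] * dpoint[i]/dt
            ret += dvalue.getPartialDerivative(orders) * point[i + dim - 1];
            orders[i] = 0;
        }

        return ret;
    }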

    @Override
    public DerivativeStructure value(DerivativeStructure[] point)
            throws MathIllegalArgumentException {
       //??
        return null;
    }

}

As you can see, the Derivative takes _more_ parameters than the base
function. Those additional parameters are the total derivatives of the
original parameters. The first parameter is the independent variable.

This function is highly regular, and I can probably just calculate the
partial derivatives directly:

The partial derivative of a parameter from 1 to (dim-1) is the second
order partial derivative of that parameter in base. The partial
derivative of a parameter from dim upwards is the corresponding partial
derivative of base.


My problem is: What shall I do with the DerivativeStructure[] point I am
handed from the outside? How to get them into the equation?


-- 
Christoph Höger

Technische Universität Berlin
Fakultät IV - Elektrotechnik und Informatik
Übersetzerbau und Programmiersprachen

Sekr. TEL12-2, Ernst-Reuter-Platz 7, 10587 Berlin

Tel.: +49 (30) 314-24890
E-Mail: christoph.hoeger@tu-berlin.de

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org

Re: [math] Total derivative of a MultivariateVectorFunction

Posted by Christoph Höger <ch...@tu-berlin.de>.

For the record: please replace MultivariateVectorFunction with
MultivariateDifferentiableFunction in my original mail.

Re: [math] Total derivative of a MultivariateVectorFunction

Posted by Luc Maisonobe <Lu...@free.fr>.

On 25/05/2013 21:27, Luc Maisonobe wrote:
>       public DerivativeStructure f(DerivativeStructure x,
>                                    DerivativeStructure y) {
>         return x.multiply(x).add(y.multiply(y)).add(l * l);

Sorry, in the line above, replace .add(l * l) with .subtract(l * l),
otherwise the only solution would be x = 0, y = 0, and only if l is also
= 0 ...

>       }

Re: [math] Total derivative of a MultivariateVectorFunction

Posted by Luc Maisonobe <Lu...@free.fr>.

Hi Christoph,

On 23/05/2013 10:39, Christoph Höger wrote:
> Hi Luc,
> 
> I am trying to motivate the problem:
> 
> Consider the simple pendulum equation
> 
> x² + y² - L² = 0
> 
> I could use DerivativeStructure to solve that equation, e.g. with
> Newton-Raphson.
> 
> But in a model, x and y are actually depending on the free variable t,
> so I may require the total derivative of the above equation, e.g.:
> 
> 2*x*dx + 2*y*dy = 0
> 
> Again, I want to be able to solve for that equation by using an
> iterative method. The thing is: This total derivative has now more
> parameters than the original equation (namely dx and dy).
> 
> If I model it that way, I can pass the x and y parameters to the base
> function, evaluate the partial derivatives (2x and 2y) and multiply them
> with the total derivatives dx and dy. The problem here is that the
> base equation's partial derivatives are not constant (the second order
> partial derivatives are, though). So I somehow need to reflect that for
> the numerical solver. That's why I thought I should make the derivative
> also a MultivariateDifferentiableFunction.

The differentiation framework does not require the number of arguments
of the Java method implementing a mathematical function to equal the
number of free variables. One parameter (say x) can be a function of
another parameter (say t), or even a function of n different parameters,
x = f(p1, p2, p3, ..., pn). You do not even need to know the number of
free variables when you use x, and the same code can be reused with x
being the free variable, x being a function of one free variable t, or x
being a function of n free variables p1, ..., pn. In all cases, when you
compute x * x the appropriate number of derivatives will be computed for
you.

So I would suggest the following in your case: implement in the most
straightforward way your function using a few intermediate functions.
Let's say for example that x = a cos(t) and y = b sin(t) (i.e. you are
working on an ellipse):


      // a, b (ellipse semi-axes) and l (pendulum length) are assumed
      // to be fields of the enclosing class
      public DerivativeStructure x(DerivativeStructure t) {
        return t.cos().multiply(a);
      }

      public DerivativeStructure y(DerivativeStructure t) {
        return t.sin().multiply(b);
      }

      public DerivativeStructure f(DerivativeStructure x,
                                   DerivativeStructure y) {
        // x² + y² - l²
        return x.multiply(x).add(y.multiply(y)).subtract(l * l);
      }

When you want to solve f with respect to t, you would simply do

 UnivariateDifferentiableSolver solver =
   new NewtonRaphsonSolver(absoluteAccuracy);

 UnivariateDifferentiableFunction func =
    new UnivariateDifferentiableFunction() {

      public double value(double t) {
        // not really used
        return value(new DerivativeStructure(1, 0, t)).getValue();
      }

      public DerivativeStructure value(DerivativeStructure t) {
        return f(x(t), y(t));
      }

    };
 double tRoot = solver.solve(maxEval, func, min, max);
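
For readers who want to see the mechanics without Commons Math on the
classpath, the solve step can be sketched with a hand-rolled dual number.
This is only an illustration under stated assumptions: the Dual class, the
newton helper and the sample values a = 2, b = 1, l = 1.5 are hypothetical
and not part of the library, and the loop is a bare Newton iteration rather
than what NewtonRaphsonSolver actually does internally.

```java
public class NewtonOnEllipse {

    /** Hypothetical minimal dual number: a value and its derivative with respect to t. */
    static final class Dual {
        final double v;  // value
        final double d;  // derivative with respect to the single free variable

        Dual(double v, double d) { this.v = v; this.d = d; }

        Dual add(Dual o)        { return new Dual(v + o.v, d + o.d); }
        Dual subtract(double c) { return new Dual(v - c, d); }
        Dual multiply(Dual o)   { return new Dual(v * o.v, d * o.v + v * o.d); }
        Dual multiply(double c) { return new Dual(v * c, d * c); }
        Dual cos()              { return new Dual(Math.cos(v), -Math.sin(v) * d); }
        Dual sin()              { return new Dual(Math.sin(v),  Math.cos(v) * d); }
    }

    // Assumed sample values: ellipse semi-axes A, B and pendulum length L.
    static final double A = 2.0, B = 1.0, L = 1.5;

    static Dual x(Dual t) { return t.cos().multiply(A); }
    static Dual y(Dual t) { return t.sin().multiply(B); }

    static Dual f(Dual x, Dual y) {
        return x.multiply(x).add(y.multiply(y)).subtract(L * L);
    }

    /** Plain Newton iteration on g(t) = f(x(t), y(t)); dg/dt is the dual part. */
    static double newton(double t0, double tol, int maxIter) {
        double t = t0;
        for (int i = 0; i < maxIter; i++) {
            Dual seed = new Dual(t, 1.0);   // dt/dt = 1
            Dual g = f(x(seed), y(seed));
            if (Math.abs(g.v) < tol) {
                return t;
            }
            t -= g.v / g.d;
        }
        throw new IllegalStateException("no convergence after " + maxIter + " iterations");
    }

    public static void main(String[] args) {
        double tRoot = newton(1.0, 1e-12, 50);
        System.out.println("tRoot = " + tRoot);
    }
}
```

Note that newton never sees a formula for dg/dt: the derivative arrives
with the value, which is the same division of labour as between the
solver and the DerivativeStructure-based function above.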

If on the other hand you need to optimize something using x and y
as independent free variables, then you would set up a multivariate
function as:

 MultivariateDifferentiableFunction func =
    new MultivariateDifferentiableFunction() {

      public double value(double[] p) {
        // not really used
        return value(new DerivativeStructure[] {
                       new DerivativeStructure(1, 0, p[0]),
                       new DerivativeStructure(1, 0, p[1])
                     }).getValue();
      }

      public DerivativeStructure value(DerivativeStructure[] p) {
        return f(p[0], p[1]);
      }

    };


There are two things to note.

The first one is that in both cases, you can reuse the same
implementation of the function f. When called in the first case, it will
be called with DerivativeStructure instances that will have one free
variable (which is t) and differentiation order 1, so it will return a
DerivativeStructure with the same structure and which will therefore
contain two values: f and df/dt. When called in the second case, it will
be called with DerivativeStructure instances that will have two free
variables (which are x and y) and differentiation order 1, so it will
return a DerivativeStructure with the same structure and which will
therefore contain three values: f,  df/dx and df/dy. The f function
adapts itself to the structure of its arguments and nothing in the code
of the function knows about the number of free variables or order of
derivation. In fact, if you call f with x and y each having hundreds of
partial derivatives, then the result would have the same number of
partial derivatives.
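
To make the first point concrete without depending on the library, here is
a minimal sketch of a first-order value type that carries one partial
derivative per free variable. The Grad class and its methods are
hypothetical stand-ins (DerivativeStructure is far more general and
supports higher orders), and the numeric values are arbitrary; the only
thing the sketch is meant to show is that f is written once and works
with either one or two free variables.

```java
import java.util.Arrays;

public class GradientDemo {

    /** Hypothetical first-order AD value: a value plus one partial derivative per free variable. */
    static final class Grad {
        final double v;    // function value
        final double[] g;  // partial derivatives, one per free variable

        Grad(double v, double[] g) { this.v = v; this.g = g; }

        /** Free variable number 'index' out of n free variables. */
        static Grad variable(int n, int index, double value) {
            double[] g = new double[n];
            g[index] = 1.0;
            return new Grad(value, g);
        }

        Grad add(Grad o) {
            double[] r = new double[g.length];
            for (int i = 0; i < r.length; i++) r[i] = g[i] + o.g[i];
            return new Grad(v + o.v, r);
        }

        Grad subtract(double c) { return new Grad(v - c, g.clone()); }

        Grad multiply(Grad o) {  // product rule
            double[] r = new double[g.length];
            for (int i = 0; i < r.length; i++) r[i] = g[i] * o.v + v * o.g[i];
            return new Grad(v * o.v, r);
        }

        /** Applies a scalar function with value fv and derivative dfv (chain rule). */
        private Grad compose(double fv, double dfv) {
            double[] r = new double[g.length];
            for (int i = 0; i < r.length; i++) r[i] = dfv * g[i];
            return new Grad(fv, r);
        }

        Grad scale(double c) { return compose(c * v, c); }
        Grad cos()           { return compose(Math.cos(v), -Math.sin(v)); }
        Grad sin()           { return compose(Math.sin(v),  Math.cos(v)); }
    }

    static final double L = 1.5; // assumed pendulum length

    /** f = x^2 + y^2 - L^2, written once, unaware of the number of free variables. */
    static Grad f(Grad x, Grad y) {
        return x.multiply(x).add(y.multiply(y)).subtract(L * L);
    }

    public static void main(String[] args) {
        // Case 1: one free variable t, with x = 2 cos(t) and y = sin(t).
        Grad t = Grad.variable(1, 0, 0.5);
        Grad alongT = f(t.cos().scale(2.0), t.sin());
        System.out.println("df/dt = " + alongT.g[0]);

        // Case 2: two free variables x and y.
        Grad viaXY = f(Grad.variable(2, 0, 3.0), Grad.variable(2, 1, 4.0));
        System.out.println(Arrays.toString(viaXY.g)); // prints [6.0, 8.0]
    }
}
```

In case 1 the result carries a single derivative (df/dt); in case 2 it
carries two (df/dx and df/dy), although f itself is unchanged.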

The second thing to note is a consequence of the first one: you don't
explicitly write the total derivative 2*x*dx + 2*y*dy; the framework
computes it implicitly.
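
For reference, this is the chain-rule computation the framework carries
out for the ellipse example, written out by hand. The symbols a, b and l
are the ones used in the code above; nothing here is specific to the
library.

```latex
\begin{aligned}
f(x, y) &= x^2 + y^2 - l^2, \qquad x = a\cos t, \quad y = b\sin t \\
\frac{df}{dt} &= \frac{\partial f}{\partial x}\frac{dx}{dt}
               + \frac{\partial f}{\partial y}\frac{dy}{dt} \\
              &= 2x\,(-a\sin t) + 2y\,(b\cos t) \\
              &= 2\,(b^2 - a^2)\sin t\cos t
\end{aligned}
```

The second line is exactly the 2*x*dx + 2*y*dy expression, with dx and dy
replaced by the derivatives of x(t) and y(t).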

Does this make sense to you?

Luc

Re: [math] Total derivative of a MultivariateVectorFunction

Posted by Christoph Höger <ch...@tu-berlin.de>.

On 22.05.2013 22:50, Luc Maisonobe wrote:

> I am not sure I understood your use case properly. I'll look at it further in the next few days.
> 
> A first very quick answer is that the interface as it is defined does not seem to correspond to your needs.
> In the current interface, the meaning of the two "value" methods is the same, so the various elements in
> the point array should be the same. In your case, I think you already use the array to represent derivatives
> of the first element, so I think you expanded the content of what should be a single DerivativeStructure instance into a double array.
> 
> Are you sure you should not use a univariate function ? The DerivativeStructure argument you would get
> would contain all the partial derivatives already.
> 
> Once again, I'm not sure I understood your example properly as I did not find the time to think about it for now.
> 
> Best regards,
> Luc

Hi Luc,

I am trying to motivate the problem:

Consider the simple pendulum equation

x² + y² - L² = 0

I could use DerivativeStructure to solve that equation, e.g. with
Newton-Raphson.

But in a model, x and y are actually depending on the free variable t,
so I may require the total derivative of the above equation, e.g.:

2*x*dx + 2*y*dy = 0

Again, I want to be able to solve for that equation by using an
iterative method. The thing is: This total derivative has now more
parameters than the original equation (namely dx and dy).

If I model it that way, I can pass the x and y parameters to the base
function, evaluate the partial derivatives (2x and 2y) and multiply them
with the total derivatives dx and dy. The problem here is that the
base equation's partial derivatives are not constant (the second order
partial derivatives are, though). So I somehow need to reflect that for
the numerical solver. That's why I thought I should make the derivative
also a MultivariateDifferentiableFunction.

Re: [math] Total derivative of a MultivariateVectorFunction

Posted by Luc Maisonobe <lu...@spaceroots.org>.



"Christoph Höger" <ch...@tu-berlin.de> wrote:

>Dear all,

Hi Christoph,

>My problem is: What shall I do with the DerivativeStructure[] point I am
>handed from the outside? How to get them into the equation?

I am not sure I understood your use case properly. I'll look at it further in the next few days.

A first very quick answer is that the interface as it is defined does not seem to correspond to your needs.
In the current interface, the meaning of the two "value" methods is the same, so the various elements in
the point array should be the same. In your case, I think you already use the array to represent derivatives
of the first element, so I think you expanded the content of what should be a single DerivativeStructure instance into a double array.

Are you sure you should not use a univariate function ? The DerivativeStructure argument you would get
would contain all the partial derivatives already.

Once again, I'm not sure I understood your example properly as I did not find the time to think about it for now.

Best regards,
Luc
