Posted to commits@commons.apache.org by lu...@apache.org on 2012/09/28 21:12:53 UTC
svn commit: r1391605 - in /commons/sandbox/nabla/trunk/src/site/xdoc:
internals.xml singularities.xml
Author: luc
Date: Fri Sep 28 19:12:53 2012
New Revision: 1391605
URL: http://svn.apache.org/viewvc?rev=1391605&view=rev
Log:
Updated doc.
Modified:
commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml
commons/sandbox/nabla/trunk/src/site/xdoc/singularities.xml
Modified: commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml
URL: http://svn.apache.org/viewvc/commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml?rev=1391605&r1=1391604&r2=1391605&view=diff
==============================================================================
--- commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml (original)
+++ commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml Fri Sep 28 19:12:53 2012
@@ -29,13 +29,13 @@
<p>
Nabla computes the derivatives by applying the classical
differentiation rules at bytecode level. When an instance
- of a class implementing <code>UnivariateDifferentiable</code>
+ of a class implementing <code>UnivariateFunction</code>
is passed to its <code>differentiate</code> method, Nabla
tracks the mathematical operations flow that leads from the
<code>t</code> parameter to the return value of the
function. At the bytecode instructions level, all operations
are elementary ones. Each elementary operation is then
- changed to compute both a value and a derivative. Nothing is
+ changed to compute both the value and the derivatives. Nothing is
changed to the control flow instructions (loops, branches,
operations scheduling).
</p>
@@ -45,10 +45,10 @@
the core API and the tree API from the <a href="http://asm.objectweb.org/">asm</a>
bytecode manipulation and analysis framework. The entry point of this differentiation
process is the <a
- href="apidocs/org/apache/commons/nabla/algorithmic/forward/analysis/MethodDifferentiator.html#visitEnd()">
- visitEnd</a> method of the <code>MethodDifferentiator</code>
- which is called from the <a href="apidocs/org/apache/commons/nabla/algorithmic/forward/ForwardAlgorithmicDifferentiator.html">
- ForwardAlgorithmicDifferentiator</a> class for processing the <code>f</code>
+ href="apidocs/org/apache/commons/nabla/forward/analysis/MethodDifferentiator.html#differentiate()">
+ differentiate</a> method of the <code>MethodDifferentiator</code>
+ which is called from the <a href="apidocs/org/apache/commons/nabla/forward/ForwardModeDifferentiator.html">
+ ForwardModeDifferentiator</a> class for processing the <code>f</code>
method of the user class.
</p>
@@ -58,28 +58,35 @@
operations (addition, subtraction ...), conversion operations
(double to int, long to double ...), storage instructions (local
variables, functions parameters, instance or class fields ...)
- and calls to elementary functions defined in the <code>Math</code>
- and <code>StrictMath</code> classes. There is really nothing more!
- For each one of these basic bytecode instructions, we know how to
- map it to a mathematical equation and we can combine this equation
- with its derivative to form a pair of equations we will use later.
+ and calls to elementary functions defined in the <code>Math</code>,
+ <code>StrictMath</code>, <code>FastMath</code> or similar classes.
+ There is really nothing more! For each one of these basic bytecode
+ instructions, Nabla knows how to map it to a mathematical equation and
+ how to hand these equations to a class that will compute derivatives.
</p>
<p>
- Lets consider the <code>DADD</code> bytecode instruction. It corresponds
+ Let's consider the <code>DADD</code> bytecode instruction, looking
+ only at the first derivative for now. This instruction corresponds
to the addition of two real numbers and produces a third number
- which is their sum. We map the instruction to the equation:
+ which is their sum. Nabla maps the instruction to the equation:
<pre><code>c=a+b</code></pre>
- and we combine this with its derivative to form the pair:
+ and calls the <a
+ href="http://commons.apache.org/math/apidocs/org/apache/commons/math3/analysis/differentiation/DerivativeStructure.html">
+ DerivativeStructure</a> class provided by <a
+ href="http://commons.apache.org/math/">Apache Commons Math</a> to
+ compute both the value and the first derivative:
<pre><code>(c=a+b, c'=a'+b')</code></pre>
- In this example, we have simply used the linearity property of
- differentiation which implies that the derivative of a sum is the
- sum of the derivatives. Similar rules exist for all arithmetic
- instructions, and the derivative of all basic functions in the
- <code>Math</code> and <code>StrictMath</code> is known. The
- complete rules set is described in the <a
- href="#Differentiation_rules">Differentiation rules</a> section
- below.
+ In this example, the <code>DerivativeStructure</code> class uses only
+ the linearity property of differentiation which implies that the
+ derivative of a sum is the sum of the derivatives. Similar rules exist
+ for all arithmetic instructions. The derivatives of all basic functions
+ in the <code>Math</code>, <code>StrictMath</code> and <code>FastMath</code>
+ classes are known. The rules are also known for any derivation order; they are
+ not limited to first order. In fact, Nabla itself does not know any
+ of these rules: all computations are delegated to <a
+ href="http://commons.apache.org/math/apidocs/org/apache/commons/math3/analysis/differentiation/DerivativeStructure.html">
+ DerivativeStructure</a>.
</p>
<p>
@@ -87,8 +94,8 @@
</p>
<source>
- public class Linear implements UnivariateDifferentiable {
- public double f(double t) {
+ public class Linear implements UnivariateFunction {
+ public double value(double t) {
double result = 1;
for (int i = 0; i < 3; ++i) {
result = result * (2 * t + 1);
@@ -100,10 +107,10 @@
<p>
In this example, the only things that need to be changed for
- differentiating the <code>f</code> method are
+ differentiating the <code>value</code> method are
the <code>t</code> parameter and the <code>result</code>
- local variable which must be adapted to hold the value and
- the derivative, the two multiplications and the addition. In
+ local variable which must be adapted to hold the derivative
+ structure, the two multiplications and the addition. In
order to generate the new function, Nabla will convert these
elements one at a time, starting from the parameter change
and propagating the change using a simple data flow
@@ -116,38 +123,32 @@
</p>
<source>
- public DifferentialPair f(DifferentialPair t) {
+ public DerivativeStructure value(DerivativeStructure t) {
- // source roughly equivalent to code generated at method entry
- double t0 = t.getValue();
- double t1 = t.getFirstDerivative();
-
- // source roughly equivalent to conversion of the result affectation
- double result0 = 1;
- double result1 = 0;
+ // source roughly equivalent to conversion of the result initialization
+ DerivativeStructure result = new DerivativeStructure(t.getFreeParameters(),
+ t.getOrder(),
+ 1.0);
// this loop handling code is not changed at all
for (int i = 0; i < 3; ++i) {
// source roughly equivalent to conversion of "2 * ..."
- double tmpA0 = 2 * t0;
- double tmpA1 = 2 * t1;
+ DerivativeStructure tmpA = t.multiply(2.0);
// source roughly equivalent to conversion of "... + 1"
- tmpA0 += 1;
+ DerivativeStructure tmpB = tmpA.add(1.0);
// source roughly equivalent to conversion of "result * ..."
- double tmpB0 = result0 * tmpA0;
- double tmpB1 = result0 * tmpA1 + result1 * tmpA0;
+ DerivativeStructure tmpC = result.multiply(tmpB);
// source roughly equivalent to conversion of "result = ..."
- result0 = tmpB0;
- result1 = tmpB1;
+ result = tmpC;
}
// source equivalent to code generated at method exit
- return new DifferentialPair(result0, result1);
+ return result;
}
</source>
@@ -159,38 +160,29 @@
</subsection>
- <subsection name="No intermediate DifferentialPair instances">
+ <subsection name="double converted to DerivativeStructure">
<p>
Another thing that is shown in the previous example is that
- <code>DifferentialPair</code> instances appear only at method
- entry and exit. They are not used internally for elementary
- conversions, only pairs of local variables (or operand stack
- cells as we will see later) are used.
+ <code>DerivativeStructure</code> instances appear everywhere
+ a double that depends on the input parameter appears, like the
+ result variable and all the temporary variables. However,
+ doubles that do not depend on the input parameter, like the
+ 2.0 and 1.0 literal constants, remain primitive double values.
</p>
- <p>
- For differentiable functions that call other differentiable functions,
- additional instances of <code>DifferentialPair</code> will be used.
- Some will be built in the caller to pass parameters to the callee, and
- others will be build in the callee to return results to the caller.
- This is <em>not</em> the case for calls to the elementary mathematical
- functions defined in the <code>Math</code> and <code>StrictMath</code>
- classes. For these known functions the derivatives computations are inlined.
- For example a call to the <code>Math.cos</code> function will be inlined
- as a call to <code>Math.cos</code>, a call to <code>Math.sin</code>
- and some intermediate arithmetic operations.
- </p>
+ </subsection>
+
+ <subsection name="no dependency on differentiation order or number of free parameters">
<p>
- The only methods from the <code>DifferentialPair</code> class that
- are used by the algorithmic differentiator are the two arguments
- constructor, the <code>getValue</code> method and the
- <code>getFirstDerivative</code> method. As far as algorithmic
- differentiation is concerned, all the other methods could be removed
- from the class. In fact, they are only present for user convenience
- if they wish to perform additional computation after a function has
- been differentiated.
+ The code above also shows that the generated code does not depend
+ on the derivation order or number of free parameters. In fact, this
+ information is only carried at runtime by the <code>DerivativeStructure</code>
+ instance provided as an input parameter, and the intermediate instances
+ created on the fly will automatically share these values (see the construction
+ of the <code>result</code> variable and the call to <code>getFreeParameters</code>
+ and <code>getOrder</code>).
</p>
</subsection>
@@ -213,31 +205,27 @@
href="#Virtual_machine_execution_model">virtual machine execution
model</a> section) with the instructions that may produce it and the
instructions that may consume it. This task is realized by the <a
- href="apidocs/org/apache/commons/nabla/algorithmic/forward/analysis/TrackingInterpreter.html">
+ href="apidocs/org/apache/commons/nabla/forward/analysis/TrackingInterpreter.html">
TrackingInterpreter</a> and <a
- href="apidocs/org/apache/commons/nabla/algorithmic/forward/analysis/TrackingValue.html">
+ href="apidocs/org/apache/commons/nabla/forward/analysis/TrackingValue.html">
TrackingValue</a> classes.
</p>
<p>
- As explained in the <a href="usage.html#Differential_pairs">Differential
- pairs</a> section of the usage documentation, the signature of
- the <code>f</code> method is changed. The primitive <code>double</code>
- <code>t</code> parameter in the original method is changed during
- the differentiation process to a <code>DifferentialPair</code>
- instance in the generated derivative. This instance is itself
- expanded right from the start of the method into a pair of primitive
- <code>double</code> local variables that can be used throughout the
- method code.
+ As explained in the <a href="usage.html#double_converted_to_DerivativeStructure">double
+ converted to DerivativeStructure</a> section of the usage documentation,
+ the signature of the <code>value</code> method is changed. The primitive
+ <code>double</code> <code>t</code> parameter in the original method is changed during
+ the differentiation process to a <code>DerivativeStructure</code>
+ instance in the generated derivative.
</p>
<p>
All instructions that used the original primitive <code>double</code>
- parameter must be changed to cope with the new pair of primitive
- <code>double</code> local variables. In order to do this, the
- representation of the <code>t</code> parameter in the bytecode is
- marked as pending conversion from one primitive <code>double</code>
- to a pair of primitive <code>doubles</code>. Once this data element has
+ parameter must be changed to cope with the new <code>DerivativeStructure</code>
+ local variables. In order to do this, the representation of the <code>t</code>
+ parameter in the bytecode is marked as pending conversion from one primitive
+ <code>double</code> to a <code>DerivativeStructure</code>. Once this data element has
been marked, the data flow will propagate the mark to other data
elements (both variables and operand stack cells) thanks to the following
rules:
@@ -248,7 +236,7 @@
<li>
each primitive <code>double</code> data element produced by a changed
instruction must be marked as pending conversion from one primitive
- <code>double</code> to a pair of primitive <code>doubles</code>
+ <code>double</code> to a <code>DerivativeStructure</code>
</li>
</ul>
These rules propagate the changes for data and instructions throughout the
@@ -267,7 +255,7 @@
<p>
For straightforward smooth functions, the expanded code
really computes both the value of the equation and its exact
- derivative. This is a simple application of the
+ derivatives. This is a simple application of the
differentiation rules. So the accuracy of the derivative
will be on par with the accuracy of the initial function. If
the initial function is a good model of a physical process,
@@ -308,27 +296,12 @@
<section name="Implementation">
- <subsection name="Differential pairs">
- <p><strong>
- TODO: explain that the various methods provided by DifferentialPair
- are convenience methods for end users. They can be used to perform
- some additional transforms on already derived code. BTW, is this feature
- really useful ?
- </strong></p>
- </subsection>
-
<subsection name="Bytecode transforms">
<p><strong>
TODO
</strong></p>
</subsection>
- <subsection name="Differentiation rules">
- <p><strong>
- TODO: put <em>all</em> rules in a table for reference.
- </strong></p>
- </subsection>
-
<subsection name="Complete differentiation example">
<p><strong>
TODO: step by step analysis of the initial example.
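Illustration (not part of the commit): the internals.xml changes above describe how each elementary operation on a double that depends on t is replaced by the matching operation on a derivative-carrying object, while the loop itself is left untouched. The sketch below mimics that transformation for the Linear example without depending on Apache Commons Math. The Dual class is a hypothetical, first-order-only stand-in for DerivativeStructure, and the names Dual, LinearSketch, constant and variable are assumptions made for this example only; the real class supports any derivation order and any number of free parameters.

```java
/** Hypothetical first-order stand-in for DerivativeStructure: a value and its first derivative. */
final class Dual {
    final double value;
    final double derivative;

    Dual(double value, double derivative) {
        this.value = value;
        this.derivative = derivative;
    }

    /** A constant has a zero derivative. */
    static Dual constant(double c) {
        return new Dual(c, 0.0);
    }

    /** The free parameter t has derivative 1 with respect to itself. */
    static Dual variable(double t) {
        return new Dual(t, 1.0);
    }

    /** (a + c)' = a' when c is a plain double constant. */
    Dual add(double c) {
        return new Dual(value + c, derivative);
    }

    /** (c * a)' = c * a' when c is a plain double constant. */
    Dual multiply(double c) {
        return new Dual(value * c, derivative * c);
    }

    /** Product rule: (a * b)' = a * b' + a' * b. */
    Dual multiply(Dual other) {
        return new Dual(value * other.value,
                        value * other.derivative + derivative * other.value);
    }
}

final class LinearSketch {

    /** Original function, as in the Linear example: (2t + 1)^3 computed with a loop. */
    static double value(double t) {
        double result = 1;
        for (int i = 0; i < 3; ++i) {
            result = result * (2 * t + 1);
        }
        return result;
    }

    /** Hand-written equivalent of the differentiated method; the loop is unchanged. */
    static Dual value(Dual t) {
        Dual result = Dual.constant(1.0);
        for (int i = 0; i < 3; ++i) {
            Dual tmpA = t.multiply(2.0);     // conversion of "2 * ..."
            Dual tmpB = tmpA.add(1.0);       // conversion of "... + 1"
            result = result.multiply(tmpB);  // conversion of "result * ..."
        }
        return result;
    }

    public static void main(String[] args) {
        Dual r = value(Dual.variable(1.0));
        // f(1) = 3^3 = 27 and f'(1) = 6 * 3^2 = 54
        System.out.println(r.value + " " + r.derivative);
    }
}
```

The 2.0 and 1.0 literals stay primitive doubles here, as in the generated code shown in the diff; only the quantities that depend on t become Dual instances.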
Modified: commons/sandbox/nabla/trunk/src/site/xdoc/singularities.xml
URL: http://svn.apache.org/viewvc/commons/sandbox/nabla/trunk/src/site/xdoc/singularities.xml?rev=1391605&r1=1391604&r2=1391605&view=diff
==============================================================================
--- commons/sandbox/nabla/trunk/src/site/xdoc/singularities.xml (original)
+++ commons/sandbox/nabla/trunk/src/site/xdoc/singularities.xml Fri Sep 28 19:12:53 2012
@@ -32,7 +32,7 @@
sub-expressions encountered while evaluating the function.
The function to differentiate may contain conditional or
discontinuous statements like <code>if/then/else</code>
- constructs, calls to the <code>Math.floor()</code> method,
+ constructs, calls to the <code>FastMath.floor()</code> method,
or use of the <code>%</code> operator. From now on, we will
call <em>branching points</em> the values of the
<code>t</code> parameter that trigger changes in these
@@ -85,8 +85,8 @@
</p>
<source>
- UnivariateDifferentiable singular = new UnivariateDifferentiable() {
- public double f(double t) {
+ UnivariateFunction singular = new UnivariateFunction() {
+ public double value(double t) {
if (t < 0) {
return 2 * t;
} else {
@@ -133,8 +133,8 @@
</p>
<source>
- UnivariateDifferentiable nonSingular = new UnivariateDifferentiable() {
- public double f(double t) {
+ UnivariateFunction nonSingular = new UnivariateFunction() {
+ public double value(double t) {
if (t < 0) {
return 2 * t * t;
} else {