Posted to commits@commons.apache.org by lu...@apache.org on 2012/09/28 21:12:53 UTC
svn commit: r1391605 - in /commons/sandbox/nabla/trunk/src/site/xdoc: internals.xml singularities.xml

Author: luc
Date: Fri Sep 28 19:12:53 2012
New Revision: 1391605

URL: http://svn.apache.org/viewvc?rev=1391605&view=rev
Log:
Updated doc.

Modified:
    commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml
    commons/sandbox/nabla/trunk/src/site/xdoc/singularities.xml

Modified: commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml
URL: http://svn.apache.org/viewvc/commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml?rev=1391605&r1=1391604&r2=1391605&view=diff
==============================================================================
--- commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml (original)
+++ commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml Fri Sep 28 19:12:53 2012
@@ -29,13 +29,13 @@
         <p>
           Nabla computes the derivatives by applying the classical
           differentiation rules at bytecode level. When an instance
-          of a class implementing <code>UnivariateDifferentiable</code>
+          of a class implementing <code>UnivariateFunction</code>
           is passed to its <code>differentiate</code> method, Nabla
           tracks the mathematical operations flow that leads from the
           <code>t</code> parameter to the return value of the
           function. At the bytecode instructions level, all operations
           are elementary ones. Each elementary operation is then
-          changed to compute both a value and a derivative. Nothing is
+          changed to compute both the value and the derivatives. Nothing is
           changed to the control flow instructions (loops, branches,
           operations scheduling).
         </p>
@@ -45,10 +45,10 @@
           the core API and the tree API from the <a href="http://asm.objectweb.org/">asm</a>
           bytecode manipulation and analysis framework. The entry point of this differentiation
           process is the <a
-          href="apidocs/org/apache/commons/nabla/algorithmic/forward/analysis/MethodDifferentiator.html#visitEnd()">
-          visitEnd</a> method of the <code>MethodDifferentiator</code>
-          which is called from the <a href="apidocs/org/apache/commons/nabla/algorithmic/forward/ForwardAlgorithmicDifferentiator.html">
-          ForwardAlgorithmicDifferentiator</a> class for processing the <code>f</code>
+          href="apidocs/org/apache/commons/nabla/forward/analysis/MethodDifferentiator.html#differentiate()">
+          differentiate</a> method of the <code>MethodDifferentiator</code>
+          which is called from the <a href="apidocs/org/apache/commons/nabla/forward/ForwardModeDifferentiator.html">
+          ForwardModeDifferentiator</a> class for processing the <code>f</code>
           method of the user class.
         </p>
 
@@ -58,28 +58,35 @@
           operations (addition, subtraction ...), conversion operations
           (double to int, long to double ...), storage instructions (local
           variables, functions parameters, instance or class fields ...)
-          and calls to elementary functions defined in the <code>Math</code>
-          and <code>StrictMath</code> classes. There is really nothing more!
-          For each one of these basic bytecode instructions, we know how to
-          map it to a mathematical equation and we can combine this equation
-          with its derivative to form a pair of equations we will use later.
+          and calls to elementary functions defined in the <code>Math</code>,
+          <code>StrictMath</code>, <code>FastMath</code> or similar classes.
+          There is really nothing more! For each one of these basic bytecode
+          instructions, Nabla knows how to map it to a mathematical equation and
+          how to hand this equation to a class that will compute derivatives.
         </p>
 
         <p>
-          Lets consider the <code>DADD</code> bytecode instruction. It corresponds
+          Let's consider the <code>DADD</code> bytecode instruction, looking
+          only at the first derivative for now. This instruction corresponds
           to the addition of two real numbers and produces a third number
-          which is their sum. We map the instruction to the equation:
+          which is their sum. Nabla maps the instruction to the equation:
           <pre><code>c=a+b</code></pre>
-          and we combine this with its derivative to form the pair:
+          and calls the <a
+          href="http://commons.apache.org/math/apidocs/org/apache/commons/math3/analysis/differentiation/DerivativeStructure.html">
+          DerivativeStructure</a> class provided by <a
+          href="http://commons.apache.org/math/">Apache Commons Math</a> to
+          compute both the value and the first derivative:
           <pre><code>(c=a+b, c'=a'+b')</code></pre>
-          In this example, we have simply used the linearity property of
-          differentiation which implies that the derivative of a sum is the
-          sum of the derivatives. Similar rules exist for all arithmetic
-          instructions, and the derivative of all basic functions in the
-          <code>Math</code> and <code>StrictMath</code> is known. The
-          complete rules set is described in the <a
-          href="#Differentiation_rules">Differentiation rules</a> section
-          below.
+          In this example, the <code>DerivativeStructure</code> class uses only
+          the linearity property of differentiation which implies that the
+          derivative of a sum is the sum of the derivatives. Similar rules exist
+          for all arithmetic instructions. The derivatives of all basic functions
+          in the <code>Math</code>, <code>StrictMath</code> and <code>FastMath</code>
+          classes are known. The rules are also known for any derivation order; they are
+          not limited to first order. In fact, Nabla itself does not know any
+          of these rules; all computations are delegated to <a
+          href="http://commons.apache.org/math/apidocs/org/apache/commons/math3/analysis/differentiation/DerivativeStructure.html">
+          DerivativeStructure</a>.
         </p>
 
         <p>
@@ -87,8 +94,8 @@
         </p>
 
         <source>
-          public class Linear implements UnivariateDifferentiable {
-              public double f(double t) {
+          public class Linear implements UnivariateFunction {
+              public double value(double t) {
                   double result = 1;
                   for (int i = 0; i &lt; 3; ++i) {
                       result = result * (2 * t + 1);
@@ -100,10 +107,10 @@
 
         <p>
           In this example, the only things that need to be changed for
-          differentiating the <code>f</code> method are
+          differentiating the <code>value</code> method are
           the <code>t</code> parameter and the <code>result</code>
-          local variable which must be adapted to hold the value and
-          the derivative, the two multiplications and the addition. In
+          local variable which must be adapted to hold the derivative
+          structure, the two multiplications and the addition. In
           order to generate the new function, Nabla will convert these
           elements one at a time, starting from the parameter change
           and propagating the change using a simple data flow
@@ -116,38 +123,32 @@
         </p>
 
         <source>
-          public DifferentialPair f(DifferentialPair t) {
+          public DerivativeStructure value(DerivativeStructure t) {
 
-              // source roughly equivalent to code generated at method entry
-              double t0 = t.getValue();
-              double t1 = t.getFirstDerivative();
-
-              // source roughly equivalent to conversion of the result affectation
-              double result0  = 1;
-              double result1  = 0;
+              // source roughly equivalent to conversion of the result initialization
+              DerivativeStructure result  = new DerivativeStructure(t.getFreeParameters(),
+                                                                    t.getOrder(),
+                                                                    1.0);
 
               // this loop handling code is not changed at all
               for (int i = 0; i &lt; 3; ++i) {
 
                   // source roughly equivalent to conversion of "2 * ..."
-                  double tmpA0 = 2 * t0;
-                  double tmpA1 = 2 * t1;
+                  DerivativeStructure tmpA = t.multiply(2.0);
 
                   // source roughly equivalent to conversion of "... + 1"
-                  tmpA0 += 1;
+                  DerivativeStructure tmpB = tmpA.add(1.0);
 
                   // source roughly equivalent to conversion of "result * ..."
-                  double tmpB0 = result0 * tmpA0;
-                  double tmpB1 = result0 * tmpA1 + result1 * tmpA0;
+                  DerivativeStructure tmpC = result.multiply(tmpB);
 
                   // source roughly equivalent to conversion of "result = ..."
-                  result0 = tmpB0;
-                  result1 = tmpB1;
+                  result = tmpC;
 
               }
 
               // source equivalent to code generated at method exit
-              return new DifferentialPair(result0, result1);
+              return result;
 
           }
         </source>
@@ -159,38 +160,29 @@
 
       </subsection>
 
-      <subsection name="No intermediate DifferentialPair instances">
+      <subsection name="double converted to DerivativeStructure">
 
         <p>
           Another thing that is shown in the previous example is that
-          <code>DifferentialPair</code> instances appear only at method
-          entry and exit. They are not used internally for elementary
-          conversions, only pairs of local variables (or operand stack
-          cells as we will see later) are used.
+          <code>DerivativeStructure</code> instances appear everywhere
+          a double that depends on the input parameter appears, like the
+          result variable and all the temporary variables. However,
+          doubles that do not depend on the input parameter, like the
+          2.0 and 1.0 literal constants, remain primitive double values.
         </p>
 
-        <p>
-          For differentiable functions that call other differentiable functions,
-          additional instances of <code>DifferentialPair</code> will be used.
-          Some will be built in the caller to pass parameters to the callee, and
-          others will be build in the callee to return results to the caller.
-          This is <em>not</em> the case for calls to the elementary mathematical
-          functions defined in the <code>Math</code> and <code>StrictMath</code>
-          classes. For these known functions the derivatives computations are inlined.
-          For example a call to the <code>Math.cos</code> function will be inlined
-          as a call to <code>Math.cos</code>, a call to <code>Math.sin</code>
-          and some intermediate arithmetic operations.
-        </p>
+      </subsection>
+
+      <subsection name="no dependency on differentiation order or number of free parameters">
 
         <p>
-          The only methods from the <code>DifferentialPair</code> class that
-          are used by the algorithmic differentiator are the two arguments
-          constructor, the <code>getValue</code> method and the
-          <code>getFirstDerivative</code> method. As far as algorithmic
-          differentiation is concerned, all the other methods could be removed
-          from the class. In fact, they are only present for user convenience
-          if they wish to perform additional computation after a function has
-          been differentiated.
+          The code above also shows that the generated code does not depend
+          on the derivation order or number of free parameters. In fact, this
+          information is only carried at runtime by the <code>DerivativeStructure</code>
+          instance provided as an input parameter, and the intermediate instances
+          created on the fly will automatically share these values (see the construction
+          of the <code>result</code> variable and the call to <code>getFreeParameters</code>
+          and <code>getOrder</code>).
         </p>
 
       </subsection>
@@ -213,31 +205,27 @@
           href="#Virtual_machine_execution_model">virtual machine execution
           model</a> section) with the instructions that may produce it and the
           instructions that may consume it. This task is realized by the <a
-          href="apidocs/org/apache/commons/nabla/algorithmic/forward/analysis/TrackingInterpreter.html">
+          href="apidocs/org/apache/commons/nabla/forward/analysis/TrackingInterpreter.html">
           TrackingInterpreter</a> and <a
-          href="apidocs/org/apache/commons/nabla/algorithmic/forward/analysis/TrackingValue.html">
+          href="apidocs/org/apache/commons/nabla/forward/analysis/TrackingValue.html">
           TrackingValue</a> classes.
         </p>
  
         <p>
-          As explained in the <a href="usage.html#Differential_pairs">Differential
-          pairs</a> section of the usage documentation, the signature of
-          the <code>f</code> method is changed. The primitive <code>double</code>
-          <code>t</code> parameter in the original method is changed during
-          the differentiation process to a <code>DifferentialPair</code>
-          instance in the generated derivative. This instance is itself
-          expanded right from the start of the method into a pair of primitive
-          <code>double</code> local variables that can be used throughout the
-          method code.
+          As explained in the <a href="usage.html#double_converted_to_DerivativeStructure">double
+          converted to DerivativeStructure</a> section of the usage documentation,
+          the signature of the <code>value</code> method is changed. The primitive
+          <code>double</code> <code>t</code> parameter in the original method is changed during
+          the differentiation process to a <code>DerivativeStructure</code>
+          instance in the generated derivative.
         </p>
 
         <p>
           All instructions that used the original primitive <code>double</code>
-          parameter must be changed to cope with the new pair of primitive
-          <code>double</code> local variables. In order to do this, the
-          representation of the <code>t</code> parameter in the bytecode is
-          marked as pending conversion from one primitive <code>double</code>
-          to a pair of primitive <code>doubles</code>. Once this data element has
+          parameter must be changed to cope with the new <code>DerivativeStructure</code>
+          local variables. In order to do this, the representation of the <code>t</code>
+          parameter in the bytecode is marked as pending conversion from one primitive
+          <code>double</code> to a <code>DerivativeStructure</code>. Once this data element has
           been marked, the data flow will propagate the mark to other data
           elements (both variables and operand stack cells) thanks to the following
           rules:
@@ -248,7 +236,7 @@
             <li>
               each primitive <code>double</code> data element produced by a changed
               instruction must be marked as pending conversion from one primitive
-              <code>double</code> to a pair of primitive <code>doubles</code>
+              <code>double</code> to a <code>DerivativeStructure</code>
             </li>
           </ul>
           These rules propagate the changes for data and instructions throughout the
@@ -267,7 +255,7 @@
         <p>
           For straightforward smooth functions, the expanded code
           really computes both the value of the equation and its exact
-          derivative. This is a simple application of the
+          derivatives. This is a simple application of the
           differentiation rules. So the accuracy of the derivative
           will be in par with the accuracy of the initial function. If
           the initial function is a good model of a physical process,
@@ -308,27 +296,12 @@
 
     <section name="Implementation">
 
-      <subsection name="Differential pairs">
-          <p><strong>
-              TODO: explain that the various methods provided by DifferentialPair
-              are convenience methods for end users. They can be used to perform
-              some additional transforms on already derived code. BTW, is this feature
-              really useful ?
-          </strong></p>
-      </subsection>
-
       <subsection name="Bytecode transforms">
           <p><strong>
               TODO
           </strong></p>
       </subsection>
 
-      <subsection name="Differentiation rules">
-           <p><strong>
-               TODO: put <em>all</em> rules in a table for reference.
-           </strong></p>
-      </subsection>
-
       <subsection name="Complete differentiation example">
           <p><strong>
             TODO: step by step analysis of the initial example.

Modified: commons/sandbox/nabla/trunk/src/site/xdoc/singularities.xml
URL: http://svn.apache.org/viewvc/commons/sandbox/nabla/trunk/src/site/xdoc/singularities.xml?rev=1391605&r1=1391604&r2=1391605&view=diff
==============================================================================
--- commons/sandbox/nabla/trunk/src/site/xdoc/singularities.xml (original)
+++ commons/sandbox/nabla/trunk/src/site/xdoc/singularities.xml Fri Sep 28 19:12:53 2012
@@ -32,7 +32,7 @@
           sub-expressions encountered while evaluating the function.
           The function to differentiate may contain conditional or
           discontinuous statements like <code>if/then/else</code>
-          constructs, calls to the <code>Math.floor()</code> method,
+          constructs, calls to the <code>FastMath.floor()</code> method,
           or use of the <code>%</code> operator. From now on, we will
           call <em>branching points</em> the values of the
           <code>t</code> parameter that trigger changes in these
@@ -85,8 +85,8 @@
         </p>
 
         <source>
-          UnivariateDifferentiable singular = new UnivariateDifferentiable() {
-              public double f(double t) {
+          UnivariateFunction singular = new UnivariateFunction() {
+              public double value(double t) {
                   if (t &lt; 0) {
                       return 2 * t;
                   } else {
@@ -133,8 +133,8 @@
         </p>
 
         <source>
-          UnivariateDifferentiable nonSingular = new UnivariateDifferentiable() {
-              public double f(double t) {
+          UnivariateFunction nonSingular = new UnivariateFunction() {
+              public double value(double t) {
                   if (t &lt; 0) {
                       return 2 * t * t;
                   } else {
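
Illustration (not part of the commit): the updated internals.xml describes how each elementary operation on a double that depends on t becomes a call on a derivative-carrying object, while the loop itself is left untouched. The sketch below mimics that for the Linear example from the documentation. It is only a rough approximation: the Dual class is a hypothetical stand-in for DerivativeStructure, restricted to one free parameter and first order, and the names ForwardModeSketch, Dual, variable and constant are invented here. It does not use Nabla or Apache Commons Math.

```java
// Hypothetical sketch: a first-order, single-parameter stand-in for
// DerivativeStructure. None of these names come from Nabla or Commons Math.
final class ForwardModeSketch {

    /** A value together with its first derivative with respect to t. */
    static final class Dual {
        final double value;
        final double derivative;

        Dual(double value, double derivative) {
            this.value = value;
            this.derivative = derivative;
        }

        /** The free parameter itself: dt/dt = 1. */
        static Dual variable(double t) {
            return new Dual(t, 1.0);
        }

        /** A constant: its derivative is 0. */
        static Dual constant(double c) {
            return new Dual(c, 0.0);
        }

        /** (a + c)' = a' for a primitive constant c. */
        Dual add(double c) {
            return new Dual(value + c, derivative);
        }

        /** (c * a)' = c * a' for a primitive constant c. */
        Dual multiply(double c) {
            return new Dual(value * c, derivative * c);
        }

        /** Product rule: (a * b)' = a * b' + a' * b. */
        Dual multiply(Dual other) {
            return new Dual(value * other.value,
                            value * other.derivative + derivative * other.value);
        }
    }

    /** Hand-written analogue of the differentiated Linear.value method. */
    static Dual value(Dual t) {
        Dual result = Dual.constant(1.0);
        // the loop is the same as in the original method
        for (int i = 0; i < 3; ++i) {
            Dual tmpA = t.multiply(2.0);        // "2 * t"
            Dual tmpB = tmpA.add(1.0);          // "... + 1"
            result = result.multiply(tmpB);     // "result * ..."
        }
        return result;
    }

    public static void main(String[] args) {
        Dual r = value(Dual.variable(1.0));
        System.out.println(r.value + " " + r.derivative);
    }
}
```

The function computed is (2t + 1)^3, whose derivative is 6(2t + 1)^2, so at t = 1 the sketch yields the value 27 and the derivative 54. The literals 2.0 and 1.0 stay primitive doubles, as the "double converted to DerivativeStructure" subsection describes.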