Posted to dev@commons.apache.org by Luc Maisonobe <Lu...@free.fr> on 2013/05/01 11:16:11 UTC
Re: [math] fluent/builder API for algorithms in 4.0 ?

Hi Sebb,

On 30/04/2013 20:33, sebb wrote:
> On 30 April 2013 19:12, Phil Steitz <ph...@gmail.com> wrote:
> 
>> On 4/30/13 10:27 AM, sebb wrote:
>>> On 30 April 2013 17:28, Phil Steitz <ph...@gmail.com> wrote:
>>>
>>>> On 4/29/13 9:40 AM, Luc Maisonobe wrote:
>>>>> Hi all,
>>>>>
>>>>> Since the 2.x series, we have been struggling in several areas with
>>>>> respect to the algorithms API. The latest change was about optimizers,
>>>>> but it is only one example among others (solvers, integration, ODE and
>>>>> maybe some parts of statistics may be concerned by the proposal below).
>>>>>
>>>>> The various things we want to keep, and which are not always compatible
>>>>> with each other, are:
>>>>>
>>>>>  1) simple use
>>>>>  2) immutability
>>>>>  3) good OO design
>>>>>  4) compatible with reference algorithms implementations
>>>>>  5) maintainable
>>>>>  6) extensible
>>>>>  7) backward compatibility
>>>>>  8) probably many other characteristics ...
>>>>>
>>>>> 3) and 4) often don't work together. 1) 6) and 7) are difficult to
>>>>> handle at once.
>>>>>
>>>>> If we look at optimizers, some progress has been made with respect to
>>>>> extensibility and backward compatibility, but simple use was clearly
>>>>> left behind: it is difficult to know which optimizer supports which
>>>>> feature, as neither strong typing nor fixed arguments are used anymore.
>>>>> However, keeping the older API would have prevented extensibility, as
>>>>> the combinatorial explosion of arguments grows as features are added
>>>>> (and we still need to add several constraint types).
>>>>>
>>>>> If we look at ODE solvers, we are still using the original API from
>>>>> mantissa, but when we add a new feature, we add more and more setters,
>>>>> thus going farther and farther from immutability, and imposing some
>>>>> unwritten scheduling between calls (for example when we set up
>>>>> additional equations, we must also set up the initial additional state,
>>>>> and the user must set up a way to retrieve the final additional state).
>>>>>
>>>>> If we look at solvers, we started with some parameters set up during
>>>>> the call to solve while others were set up at construction time, but
>>>>> this split has changed over time.
>>>>>
>>>>> So I would like to suggest a new approach, which has been largely
>>>>> inspired by a recent discussion on the [CSV] component about the
>>>>> builder API (see <http://markmail.org/thread/o3s2a5hyj6xh4nzj>), by an
>>>>> older discussion on [math] about using a fluent API for vectors (see
>>>>> <http://markmail.org/message/2gmg6wnpm5p2splb>), and by a talk Simone
>>>>> gave last year at ApacheCon Europe. The idea is to use a fluent API to
>>>>> build the algorithm progressively, adding features one at a time using
>>>>> withXxx methods defined in interfaces.
>>>>>
>>>>> As an example, consider just a few features used in optimization:
>>>>> constraints, iteration limit, evaluations limits, search interval,
>>>>> bracketing steps ... Some features are used in several optimizers, some
>>>>> are specific to univariate solvers, some can be used in a family of
>>>>> solvers ... Trying to fit everything in a single class hierarchy is
>>>>> impossible. We tried, but I don't think we succeeded.
>>>>>
>>>>> If we consider each feature separately, we could have interfaces
>>>>> defined for each one as follows:
>>>>>
>>>>>   interface Constrainable<T extends Constrainable<T>>
>>>>>             extends Optimizer {
>>>>>     /** Returns a new optimizer, handling an additional constraint.
>>>>>       * @param c the constraint to add
>>>>>       * @return a new optimizer handling the constraint
>>>>>       * (note that the instance itself is not changed)
>>>>>       */
>>>>>     T withConstraint(Constraint c);
>>>>>   }
>>>>>
>>>>> Basically they would be used where OptimizationData is used today.
>>>>> An optimizer that supports simple bounds constraints and max iterations
>>>>> would be defined as:
>>>>>
>>>>>   public class TheOptimizer
>>>>>          implements Optimizer,
>>>>>                     Constrainable<TheOptimizer>,
>>>>>                     IterationLimited<TheOptimizer> {
>>>>>
>>>>>     private final int maxIter;
>>>>>     private final List<Constraint> constraints;
>>>>>
>>>>>     // internal constructor used for fluent API
>>>>>     private TheOptimizer(..., int maxIter, List<Constraint> list) {
>>>>>       ...
>>>>>       this.maxIter     = maxIter;
>>>>>       this.constraints = list;
>>>>>     }
>>>>>
>>>>>     public TheOptimizer withConstraint(Constraint c) {
>>>>>       List<Constraint> l = new ArrayList<Constraint>(constraints);
>>>>>       l.add(c);
>>>>>       return new TheOptimizer(..., maxIter, l);
>>>>>     }
>>>>>
>>>>>     public TheOptimizer withMaxIter(int m) {
>>>>>       return new TheOptimizer(..., m, constraints);
>>>>>     }
>>>>>
>>>>>  }
>>>>>
>>>>> So basically, the withXxx methods are sort-of setters, but they do
>>>>> preserve immutability (we do not return this, we return a new object).
>>>>> It is easy to add features to existing classes and there is no need to
>>>>> shove everything into a single hierarchy: we have a forest, not a tree.
>>>>> When looking at the API, users clearly see what they can use and what
>>>>> they cannot use: if an optimizer does not support constraints, there
>>>>> will be no way to put a constraint into it. If in a later version
>>>>> constraints become available, the existing functions will not be
>>>>> changed, only new functions will appear.
>>>>>
>>>>> Of course, this creates a bunch of intermediate objects, but they are
>>>>> often quite small and the setting part is not the most
>>>>> computation-intensive one. It also becomes possible to do some
>>>>> parametric studies on some features, using code like:
>>>>>
>>>>>   Algorithm core = new Algorithm().withA(a).withB(b).withC(c);
>>>>>   for (double d = 0.0; d < 1.0; d += 0.001) {
>>>>>     Algorithm dSpecial = core.withD(d);
>>>>>     double result = dSpecial.run();
>>>>>     System.out.println(" d = " + d + ", result = " + result);
>>>>>   }
>>>>>
>>>>> This would work for someone considering feature A is a core feature
>>>>> that should be fixed but feature D is a parameter, but this would
>>>>> equally well work for someone considering the opposite case: they will
>>>>> simply write the loop the other way, the call to withD being outside of
>>>>> the loop and the call to withA being inside the loop.
>>>>>
>>>>> A side effect is also that it becomes possible to copy algorithms
>>>>> safely by just resetting a feature, even when we don't really know what
>>>>> implementation we have. A typical example that causes me problems is
>>>>> duplicating an ODE solver. It cannot be done currently, as some
>>>>> specific elements are required at construction time that depend on the
>>>>> exact type of solver you use (tolerance vectors for adaptive stepsize
>>>>> integrators). So if for example I want to do some Monte-Carlo analysis
>>>>> in parallel and need to duplicate an integrator,
>>>>> I would do it as follows:
>>>>>
>>>>>   FirstOrderIntegrator[]
>>>>>       duplicate(FirstOrderIntegrator integrator, int n) {
>>>>>     FirstOrderIntegrator[] copies = new FirstOrderIntegrator[n];
>>>>>     for (int i = 0; i < n; ++i) {
>>>>>       copies[i] =
>>>>>         integrator.withMaxEvaluations(integrator.getMaxEvaluations());
>>>>>     }
>>>>>     return copies;
>>>>>   }
>>>>>
>>>>> This kind of API could be extended to several algorithms, so it may be
>>>>> set up in a consistent way across the library. As I wrote at the
>>>>> beginning of this message, I first think about root solvers, optimizers
>>>>> and ODE.
>>>>>
>>>>> What do you think?
>>>> I don't have experience implementing this pattern, so don't know
>>>> what best practices / pitfalls there may be.  IIRC, what you are
>>>> getting is formal immutability and more compact code due to withXxx
>>>> inline instead of setXxx at the expense of lots of extra instance
>>>> creation.  I guess the latter is not something to worry much about
>>>> in today's world.  We just have to be careful with the pattern in
>>>> the cases where constructors actually do something.  What is not
>>>> crystal clear to me is what exactly you get in flexibility beyond
>>>> what you would get just adding setters ad hoc (as the withXxx stuff
>>>> effectively does).  It is interesting as well to consider the
>>>> reasons that we favor immutability and whether or not this approach
>>>> actually improves anything (e.g., concurrency: yes, helps; path /
>>>> state complexity: not so much).
>>>>
>>>>
>>> Huh? state complexity is surely much reduced if fields cannot be changed
>>> after instance construction.
>>
>> By - naive, syntactical - definition, yes. But what is important is
>> the overall state / path complexity of whatever app is using this
>> stuff.  It is not clear to me that spreading that mutability over
>> lots of intermediary instances is any better than just mutating a
>> single instance.
>>
>>
> In the implementations I have seen, the withXxxx methods are only used
> during instance creation.
> Effectively they are equivalent to a ctor with lots of parameters, and the
> created instance has final fields.

This is the intended use. As our algorithms have more and more features,
some optional, some mandatory, using a long list of constructor
parameters becomes cumbersome. Also, the instance creation can be split
across a couple of places, for example a global system setting in a
library and an application-specific tuning afterwards.

This can also be used to implement the prototype design pattern (my
example about duplicating an ODE solver typically does that).
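
To make the two uses above concrete, here is a minimal sketch. The
SketchSolver class, its two parameters and their default values are made
up for illustration; they are not existing [math] classes. It shows a
library-level setting, an application-specific tuning applied later, and a
prototype-style duplication that only relies on a withXxx method.

```java
/** Hypothetical immutable solver, for illustration only. */
final class SketchSolver {
    private final int maxEvaluations;
    private final double tolerance;

    /** Public entry point with arbitrary default settings. */
    SketchSolver() {
        this(1000, 1.0e-6);
    }

    /** Internal constructor used by the withXxx methods. */
    private SketchSolver(int maxEvaluations, double tolerance) {
        this.maxEvaluations = maxEvaluations;
        this.tolerance = tolerance;
    }

    SketchSolver withMaxEvaluations(int n) {
        return new SketchSolver(n, tolerance);
    }

    SketchSolver withTolerance(double t) {
        return new SketchSolver(maxEvaluations, t);
    }

    int getMaxEvaluations() {
        return maxEvaluations;
    }

    double getTolerance() {
        return tolerance;
    }
}

final class StagedConfigDemo {
    /** Stage 1: a global setting, as a library could provide it. */
    static final SketchSolver LIBRARY_DEFAULT =
        new SketchSolver().withMaxEvaluations(500);

    /** Prototype-style copy: re-apply one feature to get fresh instances. */
    static SketchSolver[] duplicate(SketchSolver prototype, int n) {
        SketchSolver[] copies = new SketchSolver[n];
        for (int i = 0; i < n; ++i) {
            copies[i] =
                prototype.withMaxEvaluations(prototype.getMaxEvaluations());
        }
        return copies;
    }

    public static void main(String[] args) {
        // Stage 2: application-specific tuning; LIBRARY_DEFAULT is untouched.
        SketchSolver tuned = LIBRARY_DEFAULT.withTolerance(1.0e-10);
        SketchSolver[] copies = duplicate(tuned, 3);
        System.out.println(copies.length + " copies, tolerance = "
                           + copies[0].getTolerance());
    }
}
```

The duplicate method never needs to know which settings the concrete class
requires at construction time, which is the point of the ODE example.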

> 
> i.e. I assume that
> Algorithm dSpecial = core.withD(d);
> would create a new instance of Algorithm with this.d = d; where d is final.

Yes, this is what will be done.
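
A minimal sketch of what that means, assuming a hypothetical
SketchAlgorithm class with only two features (the class, its fields and
the placeholder run() computation are illustrative, not a proposed
implementation):

```java
/** Hypothetical algorithm with two features, for illustration only. */
final class SketchAlgorithm {
    private final double a;
    private final double d;

    SketchAlgorithm() {
        this(0.0, 0.0);
    }

    /** Internal constructor; all fields are final. */
    private SketchAlgorithm(double a, double d) {
        this.a = a;
        this.d = d;
    }

    /** Returns a new instance; this instance is not changed. */
    SketchAlgorithm withA(double newA) {
        return new SketchAlgorithm(newA, d);
    }

    /** Returns a new instance; this instance is not changed. */
    SketchAlgorithm withD(double newD) {
        return new SketchAlgorithm(a, newD);
    }

    double getD() {
        return d;
    }

    /** Placeholder computation standing in for the real algorithm. */
    double run() {
        return a + d;
    }
}

final class WithDDemo {
    public static void main(String[] args) {
        SketchAlgorithm core = new SketchAlgorithm().withA(2.0);
        SketchAlgorithm dSpecial = core.withD(0.5);
        // core keeps its own d; only dSpecial sees the new value.
        System.out.println("core d = " + core.getD()
                           + ", dSpecial d = " + dSpecial.getD());
    }
}
```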

> 
> Of course if that is not the case here, then the withXxxx methods are no
> different from renamed setXxxx methods.
> 
> Another approach is to use a builder where the withXxx methods update a
> temporary class with changes. These then need to be converted to an
> immutable instance, e.g. with a build() method.
> 
> I've also seen suggestions of using
> 
> Algorithm(withXxx().withYyy()...);
> 
> where the with methods again update a temporary class; the advantage is
> that no build() method is needed.
> The compiler processes the with() chain and passes a single class instance
> which is then used to create the final object.

This would prevent both the multi-stage setting and the prototyping
features, which I find useful.

Luc

> 
> Phil
>>>
>>>
>>>> Phil
>>>>
>>>>
>>>>> Luc
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>>>> For additional commands, e-mail: dev-help@commons.apache.org