Posted to dev@commons.apache.org by Luc Maisonobe <Lu...@free.fr> on 2013/05/01 11:16:11 UTC
Re: [math] fluent/builder API for algorithms in 4.0 ?
Hi Sebb,
On 30/04/2013 20:33, sebb wrote:
> On 30 April 2013 19:12, Phil Steitz <ph...@gmail.com> wrote:
>
>> On 4/30/13 10:27 AM, sebb wrote:
>>> On 30 April 2013 17:28, Phil Steitz <ph...@gmail.com> wrote:
>>>
>>>> On 4/29/13 9:40 AM, Luc Maisonobe wrote:
>>>>> Hi all,
>>>>>
>>>>> Since the 2.x series, we have been struggling in several areas with
>>>>> respect to the algorithms API. The latest change was about the
>>>>> optimizers, but it is only one example among others (solvers,
>>>>> integration, ODE and maybe some parts of statistics may be concerned
>>>>> by the proposal below).
>>>>>
>>>>> The various things we want to keep, and which are not always
>>>>> compatible with each other, are:
>>>>>
>>>>> 1) simple use
>>>>> 2) immutability
>>>>> 3) good OO design
>>>>> 4) compatible with reference algorithms implementations
>>>>> 5) maintainable
>>>>> 6) extensible
>>>>> 7) backward compatibility
>>>>> 8) probably many other characteristics ...
>>>>>
>>>>> 3) and 4) often don't work together. 1) 6) and 7) are difficult to
>>>>> handle at once.
>>>>>
>>>>> If we look at optimizers, some progress has been made with respect
>>>>> to extensibility and backward compatibility, but simple use was
>>>>> clearly left behind, as it is difficult to know which optimizer
>>>>> supports which feature now that neither strong typing nor fixed
>>>>> arguments are used anymore. However, keeping the older API would have
>>>>> prevented extensibility, as the combinatorial explosion of arguments
>>>>> grows as features are added (and we still need to add several
>>>>> constraint types).
>>>>>
>>>>> If we look at ODE solvers, we are still using the original API from
>>>>> mantissa, but when we add a new feature, we add more and more setters,
>>>>> thus going farther and farther from immutability, and imposing some
>>>>> unwritten scheduling between calls (for example when we set up
>>>>> additional equations, we must also set up the initial additional state,
>>>>> and the user must set up a way to retrieve the final additional state).
>>>>>
>>>>> If we look at solvers, we started with some parameters set up during
>>>>> the call to solve while others were set up at construction time, but
>>>>> this split has changed over time.
>>>>>
>>>>> So I would like to suggest a new approach, which has been largely
>>>>> inspired by a recent discussion on the [CSV] component about the
>>>>> builder API (see <http://markmail.org/thread/o3s2a5hyj6xh4nzj>), by an
>>>>> older discussion on [math] about using a fluent API for vectors (see
>>>>> <http://markmail.org/message/2gmg6wnpm5p2splb>), and by a talk Simone
>>>>> gave last year at ApacheCon Europe. The idea is to use a fluent API to
>>>>> build the algorithm progressively, adding features one at a time using
>>>>> withXxx methods defined in interfaces.
>>>>>
>>>>> As an example, consider just a few features used in optimization:
>>>>> constraints, iteration limit, evaluations limits, search interval,
>>>>> bracketing steps ... Some features are used in several optimizers, some
>>>>> are specific to univariate solvers, some can be used in a family of
>>>>> solvers ... Trying to fit everything in a single class hierarchy is
>>>>> impossible. We tried, but I don't think we succeeded.
>>>>>
>>>>> If we consider each feature separately, we could have interfaces
>>>>> defined for each one as follows:
>>>>>
>>>>> interface Constrainable<T extends Constrainable<T>>
>>>>>     extends Optimizer {
>>>>>     /** Returns a new optimizer, handling an additional constraint.
>>>>>      * @param c the constraint to add
>>>>>      * @return a new optimizer handling the constraint
>>>>>      * (note that the instance itself is not changed)
>>>>>      */
>>>>>     T withConstraint(Constraint c);
>>>>> }
>>>>>
>>>>> Basically they would be used where OptimizationData is used today.
>>>>> An optimizer that supports simple bounds constraints and a maximum
>>>>> number of iterations would be defined as:
>>>>>
>>>>> public class TheOptimizer
>>>>>     implements Optimizer,
>>>>>                Constrainable<TheOptimizer>,
>>>>>                IterationLimited<TheOptimizer> {
>>>>>
>>>>>     private final int maxIter;
>>>>>     private final List<Constraint> constraints;
>>>>>
>>>>>     // internal constructor used for fluent API
>>>>>     private TheOptimizer(..., int maxIter, List<Constraint> list) {
>>>>>         ...
>>>>>         this.maxIter = maxIter;
>>>>>         this.constraints = list;
>>>>>     }
>>>>>
>>>>>     // returns a new instance with the extra constraint; this is unchanged
>>>>>     public TheOptimizer withConstraint(Constraint c) {
>>>>>         List<Constraint> l = new ArrayList<Constraint>(constraints);
>>>>>         l.add(c);
>>>>>         return new TheOptimizer(..., maxIter, l);
>>>>>     }
>>>>>
>>>>>     // returns a new instance with the new iteration limit
>>>>>     public TheOptimizer withMaxIter(int m) {
>>>>>         return new TheOptimizer(..., m, constraints);
>>>>>     }
>>>>>
>>>>> }
>>>>>
>>>>> So basically, the withXxx methods are sort-of setters, but they do
>>>>> preserve immutability (we do not return this, we return a new object).
>>>>> It is easy to add features to existing classes and there is no need to
>>>>> shove everything within a single hierarchy: we have a forest, not a
>>>>> tree. When looking at the API, users clearly see what they can use and
>>>>> what they cannot use: if an optimizer does not support constraints,
>>>>> there will be no way to put a constraint into it. If in a later version
>>>>> constraints become available, the existing functions will not be
>>>>> changed, only new functions will appear.
>>>>>
>>>>> Of course, this creates a bunch of intermediate objects, but they are
>>>>> often quite small and the setting part is not the most
>>>>> computation-intensive one. It also becomes possible to do some
>>>>> parametric studies on some features, using code like:
>>>>>
>>>>> Algorithm core = new Algorithm().withA(a).withB(b).withC(c);
>>>>> for (double d = 0.0; d < 1.0; d += 0.001) {
>>>>>     Algorithm dSpecial = core.withD(d);
>>>>>     double result = dSpecial.run();
>>>>>     System.out.println(" d = " + d + ", result = " + result);
>>>>> }
>>>>>
>>>>> This would work for someone considering feature A is a core feature
>>>>> that should be fixed but feature D is a parameter, but this would
>>>>> equally well work for someone considering the opposite case: they will
>>>>> simply write the loop the other way, the call to withD being outside
>>>>> of the loop and the call to withA being inside the loop.
>>>>>
>>>>> A side effect is also that it becomes possible to copy algorithms
>>>>> safely by just resetting a feature, even when we don't really know
>>>>> what implementation we have. A typical example that causes me problems
>>>>> is duplicating an ODE solver. It cannot be done currently, as some
>>>>> specific elements are required at construction time that depend on the
>>>>> exact type of solver you use (tolerance vectors for adaptive stepsize
>>>>> integrators). So if, for example, I want to do some Monte-Carlo
>>>>> analysis in parallel and need to duplicate an integrator, I would do
>>>>> it as follows:
>>>>>
>>>>> FirstOrderIntegrator[]
>>>>>     duplicate(FirstOrderIntegrator integrator, int n) {
>>>>>     FirstOrderIntegrator[] copies = new FirstOrderIntegrator[n];
>>>>>     for (int i = 0; i < n; ++i) {
>>>>>         // re-setting a feature to its current value yields a fresh copy
>>>>>         copies[i] =
>>>>>             integrator.withMaxEvaluations(integrator.getMaxEvaluations());
>>>>>     }
>>>>>     return copies;
>>>>> }
>>>>>
>>>>> This kind of API could be extended to several algorithms, so it may be
>>>>> set up in a consistent way across the library. As I wrote at the
>>>>> beginning of this message, I am thinking first of root solvers,
>>>>> optimizers and ODE.
>>>>>
>>>>> What do you think?
>>>> I don't have experience implementing this pattern, so don't know
>>>> what best practices / pitfalls there may be. IIRC, what you are
>>>> getting is formal immutability and more compact code due to withXxx
>>>> inline instead of setXxx at the expense of lots of extra instance
>>>> creation. I guess the latter is not something to worry much about
>>>> in today's world. We just have to be careful with the pattern in
>>>> the cases where constructors actually do something. What is not
>>>> crystal clear to me is what exactly you get in flexibility beyond
>>>> what you would get just adding setters ad hoc (as the withXxx stuff
>>>> effectively does). It is interesting as well to consider the
>>>> reasons that we favor immutability and whether or not this approach
>>>> actually improves anything (e.g., concurrency: yes, helps; path /
>>>> state complexity: not so much).
>>>>
>>>>
>>> Huh? state complexity is surely much reduced if fields cannot be changed
>>> after instance construction.
>>
>> By - naive, syntactical - definition, yes. But what is important is
>> the overall state / path complexity of whatever app is using this
>> stuff. It is not clear to me that spreading that mutability over
>> lots of intermediary instances is any better than just mutating a
>> single instance.
>>
>>
> In the implementations I have seen, the withXxxx methods are only used
> during instance creation.
> Effectively they are equivalent to a ctor with lots of parameters, and the
> created instance has final fields.
This is the intended use. As our algorithms have more and more features,
some optional, some mandatory, using a long list of constructor
parameters becomes cumbersome. Also, the instance creation can be split
across a couple of places, for example a global system setting in a
library and an application-specific tuning afterwards.
This can also be used to implement the prototype design pattern (my
example about duplicating an ODE solver typically does that).
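
A minimal sketch of these two uses, under stated assumptions: every name
below (SketchIntegrator, SketchAdaptiveIntegrator, LibraryDefaults,
PrototypeSketch) is hypothetical and is not existing [math] API. It only
illustrates a library-wide default that an application tunes afterwards,
and a duplicate method that copies through the interface without knowing
the concrete type.

```java
// Hypothetical names throughout; this is not Commons Math API.
interface SketchIntegrator {
    /** Returns a new integrator with the given limit; the receiver is unchanged. */
    SketchIntegrator withMaxEvaluations(int maxEvaluations);

    int getMaxEvaluations();
}

final class SketchAdaptiveIntegrator implements SketchIntegrator {
    private final double tolerance;
    private final int maxEvaluations;

    SketchAdaptiveIntegrator(double tolerance, int maxEvaluations) {
        this.tolerance = tolerance;
        this.maxEvaluations = maxEvaluations;
    }

    @Override
    public SketchAdaptiveIntegrator withMaxEvaluations(int newMaxEvaluations) {
        // The type-specific tolerance is carried over without the caller seeing it.
        return new SketchAdaptiveIntegrator(tolerance, newMaxEvaluations);
    }

    public SketchAdaptiveIntegrator withTolerance(double newTolerance) {
        return new SketchAdaptiveIntegrator(newTolerance, maxEvaluations);
    }

    @Override
    public int getMaxEvaluations() {
        return maxEvaluations;
    }

    public double getTolerance() {
        return tolerance;
    }
}

final class LibraryDefaults {
    // Stage 1: a global setting chosen by the library (values are arbitrary).
    static final SketchAdaptiveIntegrator INTEGRATOR =
        new SketchAdaptiveIntegrator(1.0e-6, 1000);

    private LibraryDefaults() {
    }
}

final class PrototypeSketch {
    // Prototype use: copies are made through the interface only.
    static SketchIntegrator[] duplicate(SketchIntegrator prototype, int n) {
        SketchIntegrator[] copies = new SketchIntegrator[n];
        for (int i = 0; i < n; ++i) {
            copies[i] = prototype.withMaxEvaluations(prototype.getMaxEvaluations());
        }
        return copies;
    }

    public static void main(String[] args) {
        // Stage 2: application-specific tuning; the library default is untouched.
        SketchAdaptiveIntegrator tuned = LibraryDefaults.INTEGRATOR.withTolerance(1.0e-10);
        SketchIntegrator[] copies = duplicate(tuned, 3);
        System.out.println(copies.length + " copies, library tolerance is still "
                           + LibraryDefaults.INTEGRATOR.getTolerance());
    }
}
```

Because each withXxx call returns a new instance, the shared default can
be exposed as a constant without any risk of an application changing it
for other users.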
>
> i.e. I assume that
> Algorithm dSpecial = core.withD(d);
> would create a new instance of Algorithm with this.d = d; where d is final.
Yes, this is what will be done.
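
As a rough sketch of what that means (SketchAlgorithm and its a + d
computation are placeholders assumed for illustration, not a proposed
class), every field is final and withD builds a new instance:

```java
// Hypothetical class for illustration only.
final class SketchAlgorithm {
    private final double a;
    private final double d;

    SketchAlgorithm() {
        this(0.0, 0.0);
    }

    // Internal constructor used by the withXxx methods.
    private SketchAlgorithm(double a, double d) {
        this.a = a;
        this.d = d;
    }

    SketchAlgorithm withA(double newA) {
        return new SketchAlgorithm(newA, d);
    }

    SketchAlgorithm withD(double newD) {
        // this.d is final, so the only way to "change" it is a new instance.
        return new SketchAlgorithm(a, newD);
    }

    double getD() {
        return d;
    }

    double run() {
        return a + d; // placeholder computation
    }
}

final class ParametricStudySketch {
    public static void main(String[] args) {
        SketchAlgorithm core = new SketchAlgorithm().withA(2.0);
        // An integer counter avoids accumulating floating-point error in d.
        for (int i = 0; i < 5; ++i) {
            double d = i * 0.25;
            SketchAlgorithm dSpecial = core.withD(d);
            System.out.println(" d = " + d + ", result = " + dSpecial.run());
        }
    }
}
```

The core instance is never modified by the loop, so it can be reused for
another study afterwards.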
>
> Of course if that is not the case here, then the withXxxx methods are no
> different from renamed setXxxx methods.
>
> Another approach is to use a builder where the withXxx methods update a
> temporary class with changes. These then need to be converted to an
> immutable instance, e.g. with a build() method.
>
> I've also seen suggestions of using
>
> Algorithm(withXxx().withYyy()...);
>
> where the with methods again update a temporary class; the advantage is
> that no build() method is needed.
> The compiler processes the with() chain and passes a single class instance
> which is then used to create the final object.
This would prevent both the multi-stage setting and the prototyping
features, which I find useful.
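
One possible reading of this objection, sketched with the build()
variant sebb mentions rather than the Algorithm(withXxx()...) form, and
with hypothetical names (BuiltAlgorithm, BuilderContrastSketch): the
finished object has no withXxx methods, so it cannot act as a prototype,
and a second configuration stage has to go through the mutable temporary
object, whose earlier settings leak into later builds.

```java
// Hypothetical classes for illustration only.
final class BuiltAlgorithm {
    private final double a;
    private final double d;

    private BuiltAlgorithm(Builder b) {
        this.a = b.a;
        this.d = b.d;
    }

    double getA() {
        return a;
    }

    double getD() {
        return d;
    }

    // Mutable temporary object: the with methods update it in place.
    static final class Builder {
        private double a;
        private double d;

        Builder withA(double newA) {
            this.a = newA;
            return this;
        }

        Builder withD(double newD) {
            this.d = newD;
            return this;
        }

        BuiltAlgorithm build() {
            return new BuiltAlgorithm(this);
        }
    }
}

final class BuilderContrastSketch {
    public static void main(String[] args) {
        BuiltAlgorithm.Builder shared = new BuiltAlgorithm.Builder().withA(1.0);
        BuiltAlgorithm first = shared.withD(0.1).build();
        // A later stage only wants to change a, but d = 0.1 is still set
        // on the shared builder and ends up in the second instance too.
        BuiltAlgorithm second = shared.withA(5.0).build();
        System.out.println("first:  a = " + first.getA() + ", d = " + first.getD());
        System.out.println("second: a = " + second.getA() + ", d = " + second.getD());
    }
}
```

The built instances themselves stay immutable here; it is the shared
builder that carries state from one stage to the next.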
Luc
>
> Phil
>>>
>>>
>>>> Phil
>>>>
>>>>
>>>>> Luc
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>>>> For additional commands, e-mail: dev-help@commons.apache.org