Posted to dev@commons.apache.org by Luc Maisonobe <lu...@spaceroots.org> on 2012/12/28 15:01:25 UTC

[math] major problem with new released version 3.1

Hi all,

I have encountered a major problem with version 3.1 just released.
This problem occurs in the revamped optimizers. One of the
OptimizationData implementations we can pass to multivariate vector
optimizers is the weight of the various observations.

In all cases I can think of, this weight is a vector, but it is in fact
stored as a matrix in which only the diagonal is set. Its main
constructor uses a simple double array (which is used in all test cases
and is also used under the hood by the curve fitters), and some
optimizers also assume only the diagonal is set (for example, the
Gauss-Newton optimizer extracts only the diagonal elements and ignores
the others).

However, it is also possible to set a non-diagonal weight matrix, and
one class (AbstractLeastSquaresOptimizer) performs an eigen decomposition
on the matrix to extract its square root. I don't know any use case for
this, but it has been set up this way, so I guess someone has a use for
non-diagonal weights.

However, merging all cases (vector and matrix) into one (despite the
fact Gauss-Newton ignores non-diagonal elements) leads to huge problems.
I discovered this when attempting a simple polynomial curve fitting on a
large number of points (41201 points). In this case, a full weight
matrix with about 1.7 billion entries was created, and copied... I
ended up with a memory error for something that should have been
completely trivial (fitting a polynomial, when in fact all weights were
equal to 1.0)! So for now, the new curve fitter is really impossible to
use with a large number of points.
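For the record, the footprint Luc describes is easy to check with a
quick back-of-the-envelope computation (class name illustrative):

```java
public class WeightMatrixFootprint {
    public static void main(String[] args) {
        long n = 41201;                          // number of observations
        long entries = n * n;                    // full n x n weight matrix
        double gib = entries * 8.0 / (1L << 30); // 8 bytes per double
        System.out.println(entries + " entries"); // about 1.7 billion
        System.out.printf("%.1f GiB%n", gib);     // roughly 12.6 GiB
    }
}
```

So a single dense copy of the weight matrix already exceeds typical heap
sizes, before the optimizer even starts.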

So I would like to be able to cope with vector-only weights easily. As I
don't understand whether matrix weights are useful or not, I have simply added
a new WeightVector class implementing OptimizationData and added it to
the various parseOptimizationData, while preserving the former matrix
weights class. This does work well and changing the optimizers to
directly use a vector was really straightforward as in fact the weights
are not really used as a matrix (except in the strange eigen
decomposition and associated Jacobian multiplication of
AbstractLeastSquaresOptimizer).
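A minimal sketch of the shape such a WeightVector class could take (the
stand-in OptimizationData interface below replaces the real marker
interface from org.apache.commons.math3.optim, so this is illustrative,
not the committed code):

```java
// Stand-in for the real marker interface, to keep the sketch self-contained.
interface OptimizationData { }

// Diagonal-only weights: stores n doubles instead of an n x n matrix.
final class WeightVector implements OptimizationData {
    private final double[] weight;

    WeightVector(double[] weight) {
        this.weight = weight.clone();   // defensive copy, O(n) not O(n^2)
    }

    double getWeight(int i) {
        return weight[i];
    }

    int size() {
        return weight.length;
    }
}
```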

I wonder if I should simply add this as is or if we should rather remove
the non-diagonal weights feature and support only vector weights.

I also wonder if we should release a bug fix version early.

best regards,
Luc

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [math] major problem with new released version 3.1

Posted by Konstantin Berlin <kb...@gmail.com>.
>> 
>> Several times I brought out some problems and ideas about the package and it seems the only person who has an opinion is Gilles.
>> 
>> I will list what I consider to be major problems
>> 1) The OO design is bad: too much inheritance that could be handled better by interfaces;
> We may have gone too far in favoring abstract classes over
> interfaces.  What led to this historically was getting "locked in"
> to incomplete interface designs due to their immutability wrt
> backward compatibility.
>> the structure has no relation to the actual way parts of optimizers can be mixed and matched. Input functions should also have a clear inheritance structure and should return the value, gradient, and Hessian in one function call.
>> 2) Incorrect handling of constraints. There are only something like 5 possible constraints in optimization, with each implementation of the solver handling some but not all. There is no need for this runtime approach, which creates an incredible amount of confusion. All the possible constraints should be explicitly given in the parameters to a function call; there are only 5. In addition, constraints should be pre-processed a priori! So they should be an input to the constructor, not to the optimization function call.
>> 3) The linear algebra package should be used as an interface and internally to increase performance for larger datasets. Data copying should be avoided as much as possible.
> 
> Can you provide a little more detail and maybe some sample code
> illustrating the above?

Are you referring to the linear algebra comment? I have no examples; I was talking in general terms. For example, the covariance matrix is typically sparse, as I think its inverse would be (?). That means it is possible to actually offer an option of a general weight matrix, if we used the linear algebra framework. The user could simply create a sparse matrix instead of a dense RealMatrix and pass it on to the solver. The code inside the optimizer would naturally take advantage of its sparsity through inheritance, speeding up its inversion as well as its multiplication by other matrices. This would also help with sparse Jacobians in the least-squares method. A diagonal matrix is a special case of a sparse matrix, so we could actually create a class for that. That way the general code can remain, and both matrix and vector weights would be handled efficiently and automatically.
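A rough sketch of the diagonal special case (the MinimalMatrix interface
here is a stand-in; a real version would extend AbstractRealMatrix from
commons-math instead):

```java
// Stand-in for the small part of the matrix API the optimizer would use.
interface MinimalMatrix {
    double getEntry(int row, int col);
    double[] operate(double[] v);   // matrix-vector product
}

// Diagonal matrix: stores only n entries and multiplies in O(n).
final class DiagonalMatrix implements MinimalMatrix {
    private final double[] diag;

    DiagonalMatrix(double[] diag) {
        this.diag = diag.clone();
    }

    public double getEntry(int row, int col) {
        return row == col ? diag[row] : 0.0;
    }

    public double[] operate(double[] v) {
        double[] out = new double[diag.length];
        for (int i = 0; i < diag.length; i++) {
            out[i] = diag[i] * v[i];   // O(n) instead of O(n^2)
        }
        return out;
    }
}
```

General optimizer code written against the interface would then get the
fast path automatically through polymorphism.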


>> 4) Testing should be done on larger problems.
> +1 - I think it would be great to add some test classes not executed
> on each build to test large problems.  Do you have some we can use?
> 

I don't know much about testing libraries; I only know that standard benchmarks exist. I would assume it's not hard to create a larger inverse problem by simply randomly generating several examples, but standard benchmarks would be better.
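One way to randomly generate such a problem, sketched below: build an
overdetermined linear system y = A x + noise, where the known x gives
the expected answer (all names and sizes illustrative):

```java
import java.util.Random;

// Synthetic overdetermined least-squares problem: y = A x + noise.
public class RandomLsqProblem {
    final double[][] a;   // m x n design matrix
    final double[] y;     // m observations
    final double[] x;     // true parameters

    RandomLsqProblem(int m, int n, long seed, double noise) {
        Random rng = new Random(seed);
        x = new double[n];
        for (int j = 0; j < n; j++) x[j] = rng.nextGaussian();
        a = new double[m][n];
        y = new double[m];
        for (int i = 0; i < m; i++) {
            double yi = 0.0;
            for (int j = 0; j < n; j++) {
                a[i][j] = rng.nextGaussian();
                yi += a[i][j] * x[j];
            }
            y[i] = yi + noise * rng.nextGaussian();
        }
    }

    public static void main(String[] args) {
        // e.g. 50000 observations of a 10-parameter model
        RandomLsqProblem p = new RandomLsqProblem(50000, 10, 42L, 1e-3);
        System.out.println(p.y.length + " observations generated");
    }
}
```

Any least-squares optimizer fed a and y should then recover x to within
the noise level, which makes the check automatic.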

> Phil
>> I know the response is that I am free to go implement it, but I think we should at least agree on design principles instead of pushing through your own ideas because the other person is too busy. The only discussion we ever had on this was between me and Gilles; everyone else disappeared.
>> 
>> Thanks,
>> Konstantin
>> 
>> On Dec 28, 2012, at 11:27 AM, Phil Steitz <ph...@gmail.com> wrote:
>> 
>>> On 12/28/12 8:12 AM, Dimitri Pourbaix wrote:
>>>> Luc,
>>>> 
>>>>> So in order to make sure I understand your point, you would be OK
>>>>> if I
>>>>> deprecate the non-diagonal weights, in which case users needing this
>>>>> would have to implement it themselves by premultiplication (as
>>>>> both you
>>>>> and Konstantin seem to propose)?
>>>> Yes, exactly.
>>>> 
>>>>> Sure, but for the record the feature was also a last minute
>>>>> change. This
>>>>> was discussed on the list, and the final decision was to add this
>>>>> feature despite the release was close. No wonder we failed to
>>>>> test it
>>>>> thoroughly.
>>>> Last minute?  I have been discussing this with Gilles for several
>>>> months.
>>> Relevant project discussion happens *on this list*
>>>>> We don't expect our releases to be perfect. We do our best, with the
>>>>> resources we have.
>>>> I perfectly understand this but focusing those resources less on
>>>> rules
>>>> and more on real cases might help.
>>> As stated before, you are more than welcome to *become* one of these
>>> resources.
>>> 
>>> Phil
>>>> Regards,
>>>> Dim.
>>>> ----------------------------------------------------------------------------
>>>> 
>>>> Dimitri Pourbaix                         *      Don't worry, be happy
>>>> Institut d'Astronomie et d'Astrophysique *         and CARPE DIEM.
>>>> CP 226, office 2.N4.211, building NO     *
>>>> Universite Libre de Bruxelles            *      Tel : +32-2-650.35.71
>>>> Boulevard du Triomphe                    *      Fax : +32-2-650.42.26
>>>> B-1050 Bruxelles                        *        NAC: HBZSC RG2Z6
>>>> http://sb9.astro.ulb.ac.be/~pourbaix     *
>>>> mailto:pourbaix@astro.ulb.ac.be
>>>> 


Re: [math] major problem with new released version 3.1

Posted by Konstantin Berlin <kb...@gmail.com>.
On Dec 28, 2012, at 12:23 PM, Dimitri Pourbaix <po...@astro.ulb.ac.be> wrote:

> Hi,
> 
>>> I can understand Dimitri's frustration, it seems the optimization
>> framework gets worse with every iteration. However, we should probably
>> look forward and think about how to design it properly instead.
>> 
>> +1 - though we have the constraint that we need to maintain backward
>> compatibility until 4.0.  If we decide it is too far gone, we have
>> to drop that and either cut 4.0 "quickly" (will probably be hard for
>> us, given all of the other refactoring we want to get done for 4.0)
>> or make an exception.
> 
> I have no frustration wrt CM.  I only think that rather than wanting
> CM to handle everything in every possible way, its developers should
> limit the implementation (e.g. rely upon user's pre-multiplication
> rather than offering a matrix of weights approach) and make sure those
> are intensively tested and bug free.

Maybe it would be better if the OO design provided basic optimization functionality plus a set of wrappers that let the user quickly implement a weighted least-squares approach, instead of complicating the interface to all the solvers every time we want to add another feature. The weights are not part of the optimization algorithm.
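Such a wrapper could look roughly like this (a sketch, not actual API:
minimizing sum_i w_i * (y_i - f_i(p))^2 is the same unweighted problem
once model values and observations are both scaled by sqrt(w_i)):

```java
// Stand-in for a vector-valued model function.
interface VectorModel {
    double[] value(double[] point);
}

// Wraps an unweighted model so a plain least-squares solver minimizes
// the weighted objective.
final class WeightedModel implements VectorModel {
    private final VectorModel inner;
    private final double[] sqrtW;

    WeightedModel(VectorModel inner, double[] weights) {
        this.inner = inner;
        this.sqrtW = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            sqrtW[i] = Math.sqrt(weights[i]);
        }
    }

    public double[] value(double[] point) {
        double[] v = inner.value(point);
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = sqrtW[i] * v[i];          // scale model values
        }
        return out;
    }

    // Scale the observed targets the same way before fitting.
    double[] scaleTarget(double[] y) {
        double[] out = new double[y.length];
        for (int i = 0; i < y.length; i++) out[i] = sqrtW[i] * y[i];
        return out;
    }
}
```

The solver itself then never needs to know about weights at all.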




Re: [math] major problem with new released version 3.1

Posted by Dimitri Pourbaix <po...@astro.ulb.ac.be>.
Hi,

>> I can understand Dimitri's frustration, it seems the optimization
> framework gets worse with every iteration. However, we should probably
> look forward and think about how to design it properly instead.
>
> +1 - though we have the constraint that we need to maintain backward
> compatibility until 4.0.  If we decide it is too far gone, we have
> to drop that and either cut 4.0 "quickly" (will probably be hard for
> us, given all of the other refactoring we want to get done for 4.0)
> or make an exception.

I have no frustration wrt CM.  I only think that rather than wanting
CM to handle everything in every possible way, its developers should
limit the implementation (e.g. rely upon user's pre-multiplication
rather than offering a matrix of weights approach) and make sure those
are intensively tested and bug free.
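The pre-multiplication in question is only a few lines on the user's
side; a sketch (method name illustrative): row i of the Jacobian and
residual i are both multiplied by sqrt(w_i) before the unweighted solver
is called.

```java
public class PreMultiply {
    // Absorb diagonal weights into residuals and Jacobian rows in place.
    static void applyWeights(double[] residuals, double[][] jacobian,
                             double[] weights) {
        for (int i = 0; i < weights.length; i++) {
            double s = Math.sqrt(weights[i]);
            residuals[i] *= s;
            for (int j = 0; j < jacobian[i].length; j++) {
                jacobian[i][j] *= s;
            }
        }
    }
}
```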

>> 4) Testing should be done on larger problems.
> +1 - I think it would be great to add some test classes not executed
> on each build to test large problems.  Do you have some we can use?

In the framework of a space mission, I do test some aspects of the linear
algebra package (QR, SVD) and optimization (LevenbergMarquardt) on millions
of problems.

Dim.
----------------------------------------------------------------------------
Dimitri Pourbaix                         *      Don't worry, be happy
Institut d'Astronomie et d'Astrophysique *         and CARPE DIEM.
CP 226, office 2.N4.211, building NO     *
Universite Libre de Bruxelles            *      Tel : +32-2-650.35.71
Boulevard du Triomphe                    *      Fax : +32-2-650.42.26
  B-1050 Bruxelles                        *        NAC: HBZSC RG2Z6
http://sb9.astro.ulb.ac.be/~pourbaix     * mailto:pourbaix@astro.ulb.ac.be



Re: [math] major problem with new released version 3.1

Posted by Phil Steitz <ph...@gmail.com>.
On 12/28/12 8:51 AM, Konstantin Berlin wrote:
> Hi,
>
> I can understand Dimitri's frustration; it seems the optimization framework gets worse with every iteration. However, we should probably look forward and think about how to design it properly instead.

+1 - though we have the constraint that we need to maintain backward
compatibility until 4.0.  If we decide it is too far gone, we have
to drop that and either cut 4.0 "quickly" (will probably be hard for
us, given all of the other refactoring we want to get done for 4.0)
or make an exception. 

>
> Several times I brought out some problems and ideas about the package and it seems the only person who has an opinion is Gilles.
>
> I will list what I consider to be major problems
> 1) The OO design is bad: too much inheritance that could be handled better by interfaces;
We may have gone too far in favoring abstract classes over
interfaces.  What led to this historically was getting "locked in"
to incomplete interface designs due to their immutability wrt
backward compatibility.
>  the structure has no relation to the actual way parts of optimizers can be mixed and matched. Input functions should also have a clear inheritance structure and should return the value, gradient, and Hessian in one function call.
> 2) Incorrect handling of constraints. There are only something like 5 possible constraints in optimization, with each implementation of the solver handling some but not all. There is no need for this runtime approach, which creates an incredible amount of confusion. All the possible constraints should be explicitly given in the parameters to a function call; there are only 5. In addition, constraints should be pre-processed a priori! So they should be an input to the constructor, not to the optimization function call.
> 3) The linear algebra package should be used as an interface and internally to increase performance for larger datasets. Data copying should be avoided as much as possible.

Can you provide a little more detail and maybe some sample code
illustrating the above?
> 4) Testing should be done on larger problems.
+1 - I think it would be great to add some test classes not executed
on each build to test large problems.  Do you have some we can use?

Phil
> I know the response is that I am free to go implement it, but I think we should at least agree on design principles instead of pushing through your own ideas because the other person is too busy. The only discussion we ever had on this was between me and Gilles; everyone else disappeared.
>
> Thanks,
> Konstantin
>
> On Dec 28, 2012, at 11:27 AM, Phil Steitz <ph...@gmail.com> wrote:
>
>> On 12/28/12 8:12 AM, Dimitri Pourbaix wrote:
>>> Luc,
>>>
>>>> So in order to make sure I understand your point, you would be OK
>>>> if I
>>>> deprecate the non-diagonal weights, in which case users needing this
>>>> would have to implement it themselves by premultiplication (as
>>>> both you
>>>> and Konstantin seem to propose)?
>>> Yes, exactly.
>>>
>>>> Sure, but for the record the feature was also a last minute
>>>> change. This
>>>> was discussed on the list, and the final decision was to add this
>>>> feature despite the release was close. No wonder we failed to
>>>> test it
>>>> thoroughly.
>>> Last minute?  I have been discussing this with Gilles for several
>>> months.
>> Relevant project discussion happens *on this list*
>>>> We don't expect our releases to be perfect. We do our best, with the
>>>> resources we have.
>>> I perfectly understand this but focusing those resources less on
>>> rules
>>> and more on real cases might help.
>> As stated before, you are more than welcome to *become* one of these
>> resources.
>>
>> Phil
>>> Regards,
>>> Dim.


Re: [math] major problem with new released version 3.1

Posted by Ted Dunning <te...@gmail.com>.
The GPU requires native code that is executed on the GPU.  Standard linear
algebra libraries exist for this so if the API can express a standard
linear algebra routine concisely, then the GPU can be used.  General Java
code usually can't be executed on a GPU.

There is some late breaking news on this front, but the way to get
performance is generally to recognize standard idioms that have accelerated
implementations.

In Mahout, for instance, we can recognize many linear algebra operations
from idiomatic use of visitor patterns.  For instance, in this code,

      Vector u, v;
      v.assign(Functions.PLUS, u);

Mahout will recognize the call to assign as a vector addition. This is easy
with vector operations but much harder to recognize matrix operations
expressed with simple visitor patterns.



On Sun, Dec 30, 2012 at 11:26 AM, Sébastien Brisard <
sebastien.brisard@m4x.org> wrote:

> > and hence preclude vector based process operations, such as you would
> find
> > on a GPU. So if the user wanted to speed up the computation using a GPU
> > they would not be able to do it, if we base it on a single element at a
> > time visitor pattern.
> >
> >
> I fail to see how the GPU could not be used. I am no expert on GPU
> programming, but I can easily imagine a new implementation of RealVector,
> say GpuBasedRealVector, where the walkInDefaultOrder method would send
> multiple values at a time to the GPU. I've already done that for multi-core
> machines (using fork/join), and the visitor pattern was certainly not a
> limitation.
>

Re: [math] major problem with new released version 3.1

Posted by Ted Dunning <te...@gmail.com>.
On Sun, Dec 30, 2012 at 12:16 PM, Konstantin Berlin <kb...@gmail.com> wrote:

>
> ...
> > There would be no burden on the user's side: the visitor pattern has been
> > implemented for RealVectors in version 3.1. Besides, we could provide all
> > the relevant visitors (addition, scaling, …)
>
> There is an additional burden on the user: if you require
> implementation of the full RealVector or RealMatrix interface, which
> includes a large set of functions not needed by the optimizer, the
> user has no idea which functions you will use inside the optimizer and which
> he can leave blank. For example, if a user needs to create their own
> implementation of a vector multiplication, because they have a special case
> which can be handled faster, or because they use a GPU, they are still
> burdened with implementing unnecessary support for the dozens of other
> functions which will never be used. For a GPU example, like Ted has
> pointed out, they can implement a GPU version of the basic operations (add,
> multiply, etc.), but to guarantee general support for any Java function using
> the visitor pattern the user would also need to implement a Java version of
> the visitor pattern.
>


With a good abstract class to inherit from, this isn't much of a problem in
practice.  You should need to implement very little and you should be able
to over-ride what you will without much danger.

It also helps to have standardized tests that can pretty exhaustively test
new implementations for correctness.  To a surprising extent, this allows
new implementations to be well tested nearly on the day they are written.
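The standardized-test idea can be sketched as a reusable conformance
check that exercises any implementation against expected behavior (the
Vec interface below is a stand-in, not the real RealVector API):

```java
// Stand-in for the small mutable-vector contract being tested.
interface Vec {
    int size();
    double get(int i);
    void set(int i, double v);
}

// Reusable conformance check: any new implementation can be validated
// against it on the day it is written.
final class VecConformance {
    static void check(Vec v) {
        int n = v.size();
        for (int i = 0; i < n; i++) {
            v.set(i, i + 0.5);
        }
        for (int i = 0; i < n; i++) {
            if (v.get(i) != i + 0.5) {
                throw new AssertionError("round-trip failed at " + i);
            }
        }
    }
}

// Trivial array-backed implementation, to show the check in use.
final class ArrayVec implements Vec {
    private final double[] d;
    ArrayVec(int n) { d = new double[n]; }
    public int size() { return d.length; }
    public double get(int i) { return d[i]; }
    public void set(int i, double v) { d[i] = v; }
}
```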

Re: [math] major problem with new released version 3.1

Posted by Konstantin Berlin <kb...@gmail.com>.
On Dec 30, 2012, at 2:26 PM, Sébastien Brisard <se...@m4x.org> wrote:

> Hello,
> 
> Using the visitor pattern as a way to implement in-place basic linear
> algebra operations was only a suggestion, I am not sure it's the best
> option, even if I do think it's a simple and elegant option. However...
> 
> 
> I think we might have a misunderstanding. What I am discussing is not the
>> general implementation for a matrix, but the base interface that would be
>> required for input into optimizers. I was saying that there should not be a
>> burden of implementing visitor pattern for users creating a custom class
>> for optimization input, if it is not used internally by the optimizers.
> 
> 
> 
> There would be no burden on the user's side: the visitor pattern has been
> implemented for RealVectors in version 3.1. Besides, we could provide all
> the relevant visitors (addition, scaling, …)

There is an additional burden on the user: if you require implementation of the full RealVector or RealMatrix interface, which includes a large set of functions not needed by the optimizer, the user has no idea which functions you will use inside the optimizer and which he can leave blank. For example, if a user needs to create their own implementation of a vector multiplication, because they have a special case which can be handled faster, or because they use a GPU, they are still burdened with implementing unnecessary support for the dozens of other functions which will never be used. For a GPU example, like Ted has pointed out, they can implement a GPU version of the basic operations (add, multiply, etc.), but to guarantee general support for any Java function using the visitor pattern the user would also need to implement a Java version of the visitor pattern.

> .
> 

> 
>> It should only consist of the most general and yet basic set of functions
>> that are required. While visitor pattern can probably be used to implement
>> all the functionality required for optimization, it would not be the most
>> general formulation,
> 
> 
> why? I thought that this pattern aimed at generality? I must be missing
> something.

You need to account for the cases where the actual vector addition (multiplication, etc.) is offloaded to some black-box module.

> 
> 
>> and hence preclude vector based process operations, such as you would find
>> on a GPU. So if the user wanted to speed up the computation using a GPU
>> they would not be able to do it, if we base it on a single element at a
>> time visitor pattern.
>> 
>> 
> I fail to see how the GPU could not be used. I am no expert on GPU
> programming, but I can easily imagine a new implementation of RealVector,
> say GpuBasedRealVector, where the walkInDefaultOrder method would send
> multiple values at a time to the GPU. I've already done that for multi-core
> machines (using fork/join), and the visitor pattern was certainly not a
> limitation.
> 
> Thanks,
>> Konstantin
>> 
>> 
> Again, I'm not saying that the visitor pattern is THE solution. I'm only
> saying that we should not discard it for wrong reasons. Also, I do think
> that the other proposed solution (duplicate all current implementations of
> RealVector with in-place operations) is very *dangerous*.
> 
> Indeed, depending on x being an instance of ArrayRealVector or
> InPlaceArrayRealVector, x.add(y) would have very different side effects (x
> is or is not changed). This means that any algorithm taking a RealVector as
> an input can no longer make the assumption that the vectors are unchanged.
> In some cases, this would lead to copies of the vector on which some basic
> linear algebra operations must be performed, "just in case". So we're back
> to square one, since the initial motivation for in-place operations is
> limitation of memory footprint.
> 
> If we do not want to go for the visitor pattern (which, again, I'm happy
> with), I think the only option is to hard code *new* methods addToSelf and
> the likes.
> 
> Sébastien




Re: [math] major problem with new released version 3.1

Posted by Sébastien Brisard <se...@m4x.org>.
Hello,

Using the visitor pattern as a way to implement in-place basic linear
algebra operations was only a suggestion, I am not sure it's the best
option, even if I do think it's a simple and elegant option. However...


I think we might have a misunderstanding. What I am discussing is not the
> general implementation for a matrix, but the base interface that would be
> required for input into optimizers. I was saying that there should not be a
> burden of implementing visitor pattern for users creating a custom class
> for optimization input, if it is not used internally by the optimizers.



There would be no burden on the user's side: the visitor pattern has been
implemented for RealVectors in version 3.1. Besides, we could provide all
the relevant visitors (addition, scaling, ...).


> It should only consist of the most general and yet basic set of functions
> that are required. While visitor pattern can probably be used to implement
> all the functionality required for optimization, it would not be the most
> general formulation,


why? I thought that this pattern aimed at generality? I must be missing
something.


> and hence preclude vector based process operations, such as you would find
> on a GPU. So if the user wanted to speed up the computation using a GPU
> they would not be able to do it, if we base it on a single element at a
> time visitor pattern.
>
>
I fail to see how the GPU could not be used. I am no expert on GPU
programming, but I can easily imagine a new implementation of RealVector,
say GpuBasedRealVector, where the walkInDefaultOrder method would send
multiple values at a time to the GPU. I've already done that for multi-core
machines (using fork/join), and the visitor pattern was certainly not a
limitation.

Thanks,
> Konstantin
>
>
Again, I'm not saying that the visitor pattern is THE solution. I'm only
saying that we should not discard it for wrong reasons. Also, I do think
that the other proposed solution (duplicate all current implementations of
RealVector with in-place operations) is very *dangerous*.

Indeed, depending on x being an instance of ArrayRealVector or
InPlaceArrayRealVector, x.add(y) would have very different side effects (x
is or is not changed). This means that any algorithm taking a RealVector as
an input can no longer make the assumption that the vectors are unchanged.
In some cases, this would lead to copies of the vector on which some basic
linear algebra operations must be performed, "just in case". So we're back
to square one, since the initial motivation for in-place operations is
limitation of memory footprint.

If we do not want to go for the visitor pattern (which, again, I'm happy
with), I think the only option is to hard code *new* methods addToSelf and
the likes.
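As a rough illustration of that alternative (names hypothetical, not
the actual RealVector API), the point is that the mutating operation
gets its own clearly distinct name, so x.add(y) stays side-effect free:

```java
final class DoubleVector {
    private final double[] data;

    DoubleVector(double[] data) {
        this.data = data.clone();
    }

    // Allocates a result and leaves this vector unchanged.
    DoubleVector add(DoubleVector y) {
        DoubleVector out = new DoubleVector(data);
        for (int i = 0; i < out.data.length; i++) {
            out.data[i] += y.data[i];
        }
        return out;
    }

    // Explicit in-place variant: mutates this vector, no allocation.
    DoubleVector addToSelf(DoubleVector y) {
        for (int i = 0; i < data.length; i++) {
            data[i] += y.data[i];
        }
        return this;
    }

    double get(int i) { return data[i]; }
}
```

Callers who care about memory opt in to addToSelf; everyone else keeps
the immutable contract.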

Sébastien

Re: [math] major problem with new released version 3.1

Posted by Ted Dunning <te...@gmail.com>.
Konstantin,

We are close then.  Yes, optimization should use vector methods where possible.

But visitor functions are very easy to add in an abstract class.  They
impose very little burden on the implementor.

On Sun, Dec 30, 2012 at 8:52 AM, Konstantin Berlin <kb...@gmail.com> wrote:

> I think we might have a misunderstanding. What I am discussing is not the
> general implementation for a matrix, but the base interface that would be
> required for input into optimizers. I was saying that there should not be a
> burden of implementing visitor pattern for users creating a custom class
> for optimization input, if it is not used internally by the optimizers.

Re: [math] major problem with new released version 3.1

Posted by Konstantin Berlin <kb...@gmail.com>.
Hi Ted,

I think we might have a misunderstanding. What I am discussing is not the general implementation for a matrix, but the base interface that would be required for input into optimizers. I was saying that there should not be a burden of implementing the visitor pattern for users creating a custom class for optimization input, if it is not used internally by the optimizers. It should only consist of the most general and yet basic set of functions that are required. While the visitor pattern can probably be used to implement all the functionality required for optimization, it would not be the most general formulation, and hence would preclude vector-based operations, such as you would find on a GPU. So if the user wanted to speed up the computation using a GPU they would not be able to do it, if we base it on a single-element-at-a-time visitor pattern.
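A sketch of what such a minimal input interface could look like (all
names illustrative): the optimizer declares only the bulk operations it
actually uses, so a GPU-backed implementation has nothing extra to fill
in.

```java
// Minimal contract an optimizer would require of its input vectors.
interface OptimizerVector {
    int size();
    double dot(OptimizerVector other);                  // inner product
    OptimizerVector axpy(double a, OptimizerVector x);  // this + a * x
}

// Plain dense reference implementation.
final class DenseOptimizerVector implements OptimizerVector {
    final double[] d;

    DenseOptimizerVector(double[] d) {
        this.d = d.clone();
    }

    public int size() { return d.length; }

    public double dot(OptimizerVector other) {
        DenseOptimizerVector o = (DenseOptimizerVector) other;
        double s = 0.0;
        for (int i = 0; i < d.length; i++) s += d[i] * o.d[i];
        return s;
    }

    public OptimizerVector axpy(double a, OptimizerVector x) {
        DenseOptimizerVector o = (DenseOptimizerVector) x;
        double[] out = new double[d.length];
        for (int i = 0; i < d.length; i++) out[i] = d[i] + a * o.d[i];
        return new DenseOptimizerVector(out);
    }
}
```

A GPU-backed class would implement the same three methods with batched
device calls, and the optimizer would never notice the difference.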

Thanks,
Konstantin

On Dec 29, 2012, at 7:08 PM, Ted Dunning <te...@gmail.com> wrote:

> Who said force?  Linear algebra operations should be available.
> 
> Visitors should be available.
> 
> Your mileage will vary.  That was always true.
> 
> On Sat, Dec 29, 2012 at 3:59 PM, Konstantin Berlin <kb...@gmail.com> wrote:
> 
>> Also. If you have GPU implementation of a matrix, or use another type of a
>> vector processor, there is no way you can program that in if you force
>> vector operations to use a visitor patterns.
>> 
>> On Dec 29, 2012, at 6:43 PM, Konstantin Berlin <kb...@gmail.com> wrote:
>> 
>>> That's a good point about the compiler. I never tested the performance
>> of visitors vs. sequential array access. I just don't want the vector
>> operations to be tied to any particular implementation detail.
>>> 
>>> On Dec 29, 2012, at 6:30 PM, Ted Dunning <te...@gmail.com> wrote:
>>> 
>>>> Actually, the visitor pattern or variants thereof can produce very
>>>> performant linear algebra implementations.  You can't usually get quite
>>>> down to optimized BLAS performance, but you get pretty darned fast code.
>>>> 
>>>> The reason is that the visitor is typically a very simple class which is
>>>> immediately inlined by the JIT.  Then it is subject to all of the normal
>>>> optimizations exactly as if the code were written as a single concrete
>>>> loop.  For many implementations, the bounds checks will be hoisted out
>> of
>>>> the loop so you get pretty decent code.
>>>> 
>>>> More importantly in many cases, visitors allow in place algorithms.
>>>> Combined with view operators that limit visibility to part of a matrix,
>>>> and the inlining phenomenon mentioned above, this can have enormous
>>>> implications to performance.
>>>> 
>>>> A great case in point is the Mahout math library.  With no special
>> efforts
>>>> taken and using the visitor style fairly ubiquitously, I can get about
>> 2 G
>>>> flops from my laptop.  Using Atlas as a LINPACK implementation gives me
>>>> about 5 G flops.
>>>> 
>>>> I agree with the point that linear algebra operators should be used
>> where
>>>> possible, but that just isn't feasible for lots of operations in real
>>>> applications.  Getting solid performance with simple code in those
>>>> applications is a real virtue.
>>>> 
>>>> On Sat, Dec 29, 2012 at 3:22 PM, Konstantin Berlin <kberlin@gmail.com
>>> wrote:
>>>> 
>>>>> While the visitor pattern is a good abstraction, I think it would make for
>>>>> terrible linear algebra performance. All operations should be based on
>>>>> basic vector operations, which internally can take advantage of
>> sequential
>>>>> memory access. For large problems it makes a difference. The visitor
>>>>> pattern is a nice add on, but it should not be the engine driving the
>>>>> package under the hood, in my opinion.
>>>>> 
>>> 
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>> 
>> 




Re: [math] major problem with new released version 3.1

Posted by Ted Dunning <te...@gmail.com>.
Who said force?  Linear algebra operations should be available.

Visitors should be available.

Your mileage will vary.  That was always true.

On Sat, Dec 29, 2012 at 3:59 PM, Konstantin Berlin <kb...@gmail.com>wrote:

> Also, if you have a GPU implementation of a matrix, or use another type of
> vector processor, there is no way you can program that in if you force
> vector operations to use a visitor pattern.
>
> On Dec 29, 2012, at 6:43 PM, Konstantin Berlin <kb...@gmail.com> wrote:
>
> > That's a good point about the compiler. I never tested the performance
> of visitors vs. sequential array access. I just don't want the vector
> operations to be tied to any particular implementation detail.
> >
> > On Dec 29, 2012, at 6:30 PM, Ted Dunning <te...@gmail.com> wrote:
> >
> >> Actually, the visitor pattern or variants thereof can produce very
> >> performant linear algebra implementations.  You can't usually get quite
> >> down to optimized BLAS performance, but you get pretty darned fast code.
> >>
> >> The reason is that the visitor is typically a very simple class which is
> >> immediately inlined by the JIT.  Then it is subject to all of the normal
> >> optimizations exactly as if the code were written as a single concrete
> >> loop.  For many implementations, the bounds checks will be hoisted out
> of
> >> the loop so you get pretty decent code.
> >>
> >> More importantly in many cases, visitors allow in place algorithms.
> >> Combined with view operators that limit visibility to part of a matrix,
> >> and the inlining phenomenon mentioned above, this can have enormous
> >> implications to performance.
> >>
> >> A great case in point is the Mahout math library.  With no special
> efforts
> >> taken and using the visitor style fairly ubiquitously, I can get about
> 2 G
> >> flops from my laptop.  Using Atlas as a LINPAK implementation gives me
> >> about 5 G flops.
> >>
> >> I agree with the point that linear algebra operators should be used
> where
> >> possible, but that just isn't feasible for lots of operations in real
> >> applications.  Getting solid performance with simple code in those
> >> applications is a real virtue.
> >>
> >> On Sat, Dec 29, 2012 at 3:22 PM, Konstantin Berlin <kberlin@gmail.com
> >wrote:
> >>
> >>> While the visitor pattern is a good abstraction, I think it would make for
> >>> terrible linear algebra performance. All operations should be based on
> >>> basic vector operations, which internally can take advantage of
> sequential
> >>> memory access. For large problems it makes a difference. The visitor
> >>> pattern is a nice add on, but it should not be the engine driving the
> >>> package under the hood, in my opinion.
> >>>
> >
>
>
>
>

Re: [math] major problem with new released version 3.1

Posted by Konstantin Berlin <kb...@gmail.com>.
Also, if you have a GPU implementation of a matrix, or use another type of vector processor, there is no way you can program that in if you force vector operations to use a visitor pattern.

On Dec 29, 2012, at 6:43 PM, Konstantin Berlin <kb...@gmail.com> wrote:

> That's a good point about the compiler. I never tested the performance of visitors vs. sequential array access. I just don't want the vector operations to be tied to any particular implementation detail.
> 
> On Dec 29, 2012, at 6:30 PM, Ted Dunning <te...@gmail.com> wrote:
> 
>> Actually, the visitor pattern or variants thereof can produce very
>> performant linear algebra implementations.  You can't usually get quite
>> down to optimized BLAS performance, but you get pretty darned fast code.
>> 
>> The reason is that the visitor is typically a very simple class which is
>> immediately inlined by the JIT.  Then it is subject to all of the normal
>> optimizations exactly as if the code were written as a single concrete
>> loop.  For many implementations, the bounds checks will be hoisted out of
>> the loop so you get pretty decent code.
>> 
>> More importantly in many cases, visitors allow in place algorithms.
>> Combined with view operators that limit visibility to part of a matrix,
>> and the inlining phenomenon mentioned above, this can have enormous
>> implications to performance.
>> 
>> A great case in point is the Mahout math library.  With no special efforts
>> taken and using the visitor style fairly ubiquitously, I can get about 2 G
>> flops from my laptop.  Using Atlas as a LINPAK implementation gives me
>> about 5 G flops.
>> 
>> I agree with the point that linear algebra operators should be used where
>> possible, but that just isn't feasible for lots of operations in real
>> applications.  Getting solid performance with simple code in those
>> applications is a real virtue.
>> 
>> On Sat, Dec 29, 2012 at 3:22 PM, Konstantin Berlin <kb...@gmail.com>wrote:
>> 
>>> While the visitor pattern is a good abstraction, I think it would make for
>>> terrible linear algebra performance. All operations should be based on
>>> basic vector operations, which internally can take advantage of sequential
>>> memory access. For large problems it makes a difference. The visitor
>>> pattern is a nice add on, but it should not be the engine driving the
>>> package under the hood, in my opinion.
>>> 
> 




Re: [math] major problem with new released version 3.1

Posted by Konstantin Berlin <kb...@gmail.com>.
That's a good point about the compiler. I never tested the performance of visitors vs. sequential array access. I just don't want the vector operations to be tied to any particular implementation detail.

On Dec 29, 2012, at 6:30 PM, Ted Dunning <te...@gmail.com> wrote:

> Actually, the visitor pattern or variants thereof can produce very
> performant linear algebra implementations.  You can't usually get quite
> down to optimized BLAS performance, but you get pretty darned fast code.
> 
> The reason is that the visitor is typically a very simple class which is
> immediately inlined by the JIT.  Then it is subject to all of the normal
> optimizations exactly as if the code were written as a single concrete
> loop.  For many implementations, the bounds checks will be hoisted out of
> the loop so you get pretty decent code.
> 
> More importantly in many cases, visitors allow in place algorithms.
> Combined with view operators that limit visibility to part of a matrix,
> and the inlining phenomenon mentioned above, this can have enormous
> implications to performance.
> 
> A great case in point is the Mahout math library.  With no special efforts
> taken and using the visitor style fairly ubiquitously, I can get about 2 G
> flops from my laptop.  Using Atlas as a LINPAK implementation gives me
> about 5 G flops.
> 
> I agree with the point that linear algebra operators should be used where
> possible, but that just isn't feasible for lots of operations in real
> applications.  Getting solid performance with simple code in those
> applications is a real virtue.
> 
> On Sat, Dec 29, 2012 at 3:22 PM, Konstantin Berlin <kb...@gmail.com>wrote:
> 
>> While the visitor pattern is a good abstraction, I think it would make for
>> terrible linear algebra performance. All operations should be based on
>> basic vector operations, which internally can take advantage of sequential
>> memory access. For large problems it makes a difference. The visitor
>> pattern is a nice add on, but it should not be the engine driving the
>> package under the hood, in my opinion.
>> 




Re: [math] major problem with new released version 3.1

Posted by Ted Dunning <te...@gmail.com>.
Actually, the visitor pattern or variants thereof can produce very
performant linear algebra implementations.  You can't usually get quite
down to optimized BLAS performance, but you get pretty darned fast code.

The reason is that the visitor is typically a very simple class which is
immediately inlined by the JIT.  Then it is subject to all of the normal
optimizations exactly as if the code were written as a single concrete
loop.  For many implementations, the bounds checks will be hoisted out of
the loop so you get pretty decent code.

More importantly in many cases, visitors allow in place algorithms.
 Combined with view operators that limit visibility to part of a matrix,
and the inlining phenomenon mentioned above, this can have enormous
implications to performance.

A great case in point is the Mahout math library.  With no special efforts
taken and using the visitor style fairly ubiquitously, I can get about 2 G
flops from my laptop.  Using Atlas as a LINPAK implementation gives me
about 5 G flops.

I agree with the point that linear algebra operators should be used where
possible, but that just isn't feasible for lots of operations in real
applications.  Getting solid performance with simple code in those
applications is a real virtue.

On Sat, Dec 29, 2012 at 3:22 PM, Konstantin Berlin <kb...@gmail.com>wrote:

> While the visitor pattern is a good abstraction, I think it would make for
> terrible linear algebra performance. All operations should be based on
> basic vector operations, which internally can take advantage of sequential
> memory access. For large problems it makes a difference. The visitor
> pattern is a nice add on, but it should not be the engine driving the
> package under the hood, in my opinion.
>
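Ted's inlining argument can be made concrete with a small self-contained sketch (the interface and class names here are hypothetical, not the Commons Math API): the visitor is a tiny class, and as described above a JIT can inline its visit body so the walk behaves like the equivalent hand-written loop.

```java
// Minimal visitor-style traversal, mimicking (not using) the Commons Math
// changing-visitor idea. The visitor body is small enough for the JIT to
// inline, so this runs like the equivalent explicit loop.
interface DoubleVisitor {
    // Returns the new value to store at this index.
    double visit(int index, double value);
}

public class VisitorSketch {
    // Applies the visitor to every entry in place (no temporary array).
    static void walk(double[] data, DoubleVisitor visitor) {
        for (int i = 0; i < data.length; i++) {
            data[i] = visitor.visit(i, data[i]);
        }
    }

    public static void main(String[] args) {
        double[] v = {1.0, 2.0, 3.0};
        walk(v, new DoubleVisitor() {
            public double visit(int index, double value) {
                return 2.0 * value; // scale in place
            }
        });
        System.out.println(java.util.Arrays.toString(v)); // [2.0, 4.0, 6.0]
    }
}
```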

Re: [math] major problem with new released version 3.1

Posted by Konstantin Berlin <kb...@gmail.com>.
>> 
>> I think this issue can be overcome by proper OO design, and hacked on to
>> the current interface for now.
>> 
>> We can create an "InPlace" interface that all linear operators (including
>> RealMatrix etc) would implement. For example in a hypothetical
>> InPlaceMatrix class
>> 
>> function InPlaceVector mult(InPlaceVector x), depending on implementation
>> details, would either return back overwritten x in the return value, or a
>> new value for multiplication leaving x intact. This would allow the user
>> to implement their more efficient version, while also allowing the usage of
>> current linear algebra package if they don't care about performance. The
>> idea is that the optimization classes would rely only on the InPlace
>> interface for all their tasks.
>> 
>> 
> I'm not sure about that. If I understand correctly, this would mean that
> the _user_ has to provide the solver with a so-called InPlaceVector. I
> don't think it should be the responsibility of the user to choose whether or
> not a solver performs in place operations. As I said previously visitors
> are a good way to do that without cluttering the interface. For example, we
> could define a Multiply visitor, which scales all components of a vector. A
> nice idiom would be to have a factory method Multiply.by(double) instead of
> a constructor new Multiply(double). This way, scaling a vector in place
> would read
> v.walkInDefaultOrder(Multiply.by(double))
> 
> What do you think?
> Sébastien

The user would not have to do anything, since any RealMatrix implementation would implement the InPlace interface. The user is free to just pass in a standard RealMatrix class; it would just (temporarily) not actually perform an in-place operation, so it would waste memory, but otherwise would provide correct behavior. The choice of in-place operation (and any other required operation) is controlled by the implementation of the optimization solver, not the user. I think it's better design-wise, since the optimization function does not require anything more than some class that implements a few vector operations. It does not need to rely on something as heavy as a RealMatrix, or as limiting as a double[][], and this would allow the user to implement their own version if they have some large problem that they want to optimize.
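A minimal sketch of the contract being proposed (every name below is hypothetical; nothing here exists in Commons Math):

```java
// Hypothetical "InPlace" contract: mult MAY overwrite its argument, and
// callers must use the returned vector, which is either x itself (a true
// in-place implementation) or a fresh instance (a correct but wasteful one).
interface InPlaceVector {
    int size();
    double get(int i);
    void set(int i, double value);
}

interface InPlaceOperator {
    InPlaceVector mult(InPlaceVector x);
}

// Simple array-backed implementation.
class ArrayVector implements InPlaceVector {
    final double[] data;
    ArrayVector(double[] data) { this.data = data; }
    public int size() { return data.length; }
    public double get(int i) { return data[i]; }
    public void set(int i, double value) { data[i] = value; }
}

public class InPlaceSketch {
    // A diagonal operator that overwrites its argument: no n-by-n matrix
    // and no temporary vector, which is the point for large problems.
    static InPlaceOperator diagonal(final double[] d) {
        return new InPlaceOperator() {
            public InPlaceVector mult(InPlaceVector x) {
                for (int i = 0; i < x.size(); i++) {
                    x.set(i, d[i] * x.get(i));
                }
                return x; // overwritten in place
            }
        };
    }

    public static void main(String[] args) {
        InPlaceVector x = new ArrayVector(new double[] {1.0, 2.0, 3.0});
        InPlaceVector y = InPlaceSketch.diagonal(new double[] {2.0, 2.0, 2.0}).mult(x);
        System.out.println(y == x);   // true: same object, nothing copied
        System.out.println(y.get(2)); // 6.0
    }
}
```

A standard matrix-backed implementation could instead return a new ArrayVector from mult, wasting memory but remaining correct, exactly as described above.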

While the visitor pattern is a good abstraction, I think it would make for terrible linear algebra performance. All operations should be based on basic vector operations, which internally can take advantage of sequential memory access. For large problems it makes a difference. The visitor pattern is a nice add-on, but it should not be the engine driving the package under the hood, in my opinion.

-Konstantin


Re: [math] major problem with new released version 3.1

Posted by Sébastien Brisard <se...@m4x.org>.
Hello,


2012/12/28 Konstantin Berlin <kb...@gmail.com>

> >
> > That's what I was going to write. At the moment, the current
> implementation
> > for sparse matrices and vector is deprecated, for lack of convincing
> fixes
> > for the bugs which have been found. These bugs are mainly related to the
> > fact that zero --the unstored value-- is signed, and the sign is lost in
> > sparse implementations. This might be considered as unimportant, but
> leads
> > to the fact that we cannot achieve the same level of correctness in our
> > sparse impl as in our dense impls.
> >
> > As for data copying, I strongly agree, and it is not really related to
> the
> > implementation of sparse vectors/matrices. For example, our
> implementation
> > of the conjugate gradient only requires a linear operator --not a
> matrix--,
> > which can be as sparse as you want. The problem is that all
> implementations
> > of iterative linear solvers use basic vector operations (add, sub,
> etc...),
> > which cannot be carried out in-place in the current abstract class
> > RealVector. This leads to unnecessary memory allocations for very large
> > problems. I am currently facing this issue. My matrix is not a problem
> > (it's actually a so-called "matrix-free" problem), and the vectors are
> > dense. The only problem is the demand in RAM. We should really give a
> > thought to in-place linear operations, without cluttering too much the
> API!
> > A possible way would be to use visitors.
> >
> > Best regards,
> > Sébastien
> >
>
> I think this issue can be overcome by proper OO design, and hacked on to
> the current interface for now.
>
> We can create an "InPlace" interface that all linear operators (including
> RealMatrix etc) would implement. For example in a hypothetical
> InPlaceMatrix class
>
> function InPlaceVector mult(InPlaceVector x), depending on implementation
> details, would either return back overwritten x in the return value, or a
> new value for multiplication leaving x intact. This would allow the user
> to implement their more efficient version, while also allowing the usage of
> current linear algebra package if they don't care about performance. The
> idea is that the optimization classes would rely only on the InPlace
> interface for all their tasks.
>
>
I'm not sure about that. If I understand correctly, this would mean that
the _user_ has to provide the solver with a so-called InPlaceVector. I
don't think it should be the responsibility of the user to choose whether or
not a solver performs in place operations. As I said previously visitors
are a good way to do that without cluttering the interface. For example, we
could define a Multiply visitor, which scales all components of a vector. A
nice idiom would be to have a factory method Multiply.by(double) instead of
a constructor new Multiply(double). This way, scaling a vector in place
would read
v.walkInDefaultOrder(Multiply.by(double))

What do you think?
Sébastien
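The proposed idiom can be sketched self-containedly as follows (ChangingVisitor and walkInDefaultOrder below are stand-ins mimicking Commons Math's RealVectorChangingVisitor and RealVector.walkInDefaultOrder; Multiply itself is the hypothetical visitor described above):

```java
// Stand-in for Commons Math's changing-visitor interface.
interface ChangingVisitor {
    double visit(int index, double value);
}

public class MultiplySketch {
    // The hypothetical Multiply visitor: scales every entry by a factor.
    static final class Multiply implements ChangingVisitor {
        private final double factor;
        private Multiply(double factor) { this.factor = factor; }
        // Factory-method idiom: Multiply.by(2.0) reads better than
        // new Multiply(2.0).
        static Multiply by(double factor) { return new Multiply(factor); }
        public double visit(int index, double value) { return factor * value; }
    }

    // Stand-in for RealVector.walkInDefaultOrder: applies the visitor to
    // each entry in place, so no temporary vector is allocated.
    static void walkInDefaultOrder(double[] v, ChangingVisitor visitor) {
        for (int i = 0; i < v.length; i++) {
            v[i] = visitor.visit(i, v[i]);
        }
    }

    public static void main(String[] args) {
        double[] v = {1.0, 2.0, 3.0};
        walkInDefaultOrder(v, Multiply.by(3.0)); // v is now {3.0, 6.0, 9.0}
        System.out.println(java.util.Arrays.toString(v));
    }
}
```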

Re: [math] major problem with new released version 3.1

Posted by Konstantin Berlin <kb...@gmail.com>.
> 
> That's what I was going to write. At the moment, the current implementation
> for sparse matrices and vector is deprecated, for lack of convincing fixes
> for the bugs which have been found. These bugs are mainly related to the
> fact that zero --the unstored value-- is signed, and the sign is lost in
> sparse implementations. This might be considered as unimportant, but leads
> to the fact that we cannot achieve the same level of correctness in our
> sparse impl as in our dense impls.
> 
> As for data copying, I strongly agree, and it is not really related to the
> implementation of sparse vectors/matrices. For example, our implementation
> of the conjugate gradient only requires a linear operator --not a matrix--,
> which can be as sparse as you want. The problem is that all implementations
> of iterative linear solvers use basic vector operations (add, sub, etc...),
> which cannot be carried out in-place in the current abstract class
> RealVector. This leads to unnecessary memory allocations for very large
> problems. I am currently facing this issue. My matrix is not a problem
> (it's actually a so-called "matrix-free" problem), and the vectors are
> dense. The only problem is the demand in RAM. We should really give a
> thought to in-place linear operations, without cluttering too much the API!
> A possible way would be to use visitors.
> 
> Best regards,
> Sébastien
> 

I think this issue can be overcome by proper OO design, and hacked on to the current interface for now.

We can create an "InPlace" interface that all linear operators (including RealMatrix etc) would implement. For example in a hypothetical InPlaceMatrix class

a function InPlaceVector mult(InPlaceVector x) would, depending on implementation details, either return the overwritten x itself, or a new vector holding the product, leaving x intact. This would allow the user to implement their own more efficient version, while also allowing the usage of the current linear algebra package if they don't care about performance. The idea is that the optimization classes would rely only on the InPlace interface for all their tasks.


Re: [math] major problem with new released version 3.1

Posted by Sébastien Brisard <se...@m4x.org>.
Hello,


2012/12/28 Luc Maisonobe <Lu...@free.fr>

> Le 28/12/2012 17:51, Konstantin Berlin a écrit :
> > Hi,
> >
> > I can understand Dimitri's frustration, it seems the optimization
> > framework gets worse with every iteration. However, we should
> > probably look forward and think about how to design it properly
> > instead.
> >
> > Several times I brought out some problems and ideas about the package
> > and it seems the only person who has an opinion is Gilles.
>
> Several people contributed to the thread (see
> <http://commons.markmail.org/thread/i6klmc2ytflb6qnt>), but as Gilles
> pointed out in one of the messages, we lack an optimization expert. I
> sincerely would not want my opinion to be taken too seriously on this,
> so I only expressed what I could and did not decide anything by myself
> (only proposed to remove the wrong binding with DerivativeStructure,
> which has since been done).
>
> >
> > I will list what I consider to be major problems
>
> > 1) The OO design is bad, too much inheritance which could be handled
> > better by interfaces, the structure has no relation to the actual way
> > parts of optimizers can be mixed and matched. Input functions should
> > also have clear inheritance structure and should return both the
> > value, gradient, hessian in one function call.
>
> I strongly agree with Konstantin here. Abstract classes allow to add
> methods without breaking compatibility (which is good), but they also
> have some drawbacks (we have seen one drawback with the parameterized
> classes a few months ago, and this huge hierarchy is another one). So
> there is no silver bullet and we keep trying to find the good balance.
> As far as I am concerned, I would prefer we get fewer abstract classes,
> we remove some intermediate level (I don't know which ones), and we use
> more delegation than inheritance.
>
> > 2) Incorrect handling of constraints. There are only something like 5
> > possible constraints in optimization, with each
> > implementation of the solver handling some but not all. There is no
> > need for this runtime approach, which creates an incredible amount of
> > confusion. All the possible constraints should be explicitly given in
> > the parameters to a function call, there are only 5. In addition,
> > constraints should be pre-processed a priori! So they should be an
> > input to the constructor not to the optimization function call.
>
> Our implementation for constraints is really limited (once again, scarce
> resources). What are the 5 types you consider? Simple/double bounds on
> parameters, linear/non-linear bounds and equality?
>
> > 3) Linear algebra package should be used as an interface and
> > internally to increase performance for larger datasets. Data copying
> > should be avoided as much as possible.
>
> Yes, but this would require solving another sub-problem first: having a
> decent sparse linear algebra implementation, which we also lack.


That's what I was going to write. At the moment, the current implementation
for sparse matrices and vector is deprecated, for lack of convincing fixes
for the bugs which have been found. These bugs are mainly related to the
fact that zero --the unstored value-- is signed, and the sign is lost in
sparse implementations. This might be considered as unimportant, but leads
to the fact that we cannot achieve the same level of correctness in our
sparse impl as in our dense impls.
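The signed-zero problem can be seen directly in plain Java (a standalone illustration, not Commons Math code): a dense array can hold -0.0, but a sparse structure that drops "zero" entries, testing with ==, silently loses the sign.

```java
// Why a sparse implementation that unstores "zero" loses information:
// -0.0 == 0.0 is true, so an entry holding -0.0 looks like the default
// unstored value, yet the two doubles are distinguishable.
public class SignedZeroSketch {
    public static void main(String[] args) {
        double minusZero = -0.0;
        // The == test a sparse store would use cannot tell them apart...
        System.out.println(minusZero == 0.0); // true
        // ...but arithmetic can: dividing by -0.0 gives -Infinity.
        System.out.println(1.0 / minusZero);  // -Infinity
        // And the bit patterns differ.
        System.out.println(Double.doubleToLongBits(-0.0)
                == Double.doubleToLongBits(0.0)); // false
    }
}
```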

As for data copying, I strongly agree, and it is not really related to the
implementation of sparse vectors/matrices. For example, our implementation
of the conjugate gradient only requires a linear operator --not a matrix--,
which can be as sparse as you want. The problem is that all implementations
of iterative linear solvers use basic vector operations (add, sub, etc...),
which cannot be carried out in-place in the current abstract class
RealVector. This leads to unnecessary memory allocations for very large
problems. I am currently facing this issue. My matrix is not a problem
(it's actually a so-called "matrix-free" problem), and the vectors are
dense. The only problem is the demand in RAM. We should really give a
thought to in-place linear operations, without cluttering too much the API!
A possible way would be to use visitors.

Best regards,
Sébastien

> Our
> implementation for full matrices was also in a sorry state prior to 3.0,
> but now fortunately this has improved at least for systems up to a few
> thousands rows and columns (so at least we do make progress on some
> points).
>
> > 4) Testing should be done on larger problems.
>
> Yes. I guess there are some general well known problems for that, so we
> should get a few of them and implement them. We did implement a number
> of tests from Minpack, but they focused on difficult cases rather than
> on problem size. I think optimization has a good testing coverage, but
> clearly large-size problems are a needed addition.
>
> >
> > I know the response is that I am free to go implement it, but I think
> > we should at least agree on design principles instead of pushing
> > through your own ideas because the other person is too busy. The only
> > discussion we ever had on this was between me and Gilles, everyone
> > else disappeared.
>
> Well, we tried to keep up as our skills allowed, and we were also
> concerned with 3.1 being released at the same time.
>
> We have more time now than we had a few weeks ago. This is an
> opportunity to restart the discussion. We can refrain from pushing a new
> release (despite I would like this bug fix to be released officially)
> and take some time to think calmly. We could also push 3.2 with only the
> fix and without any revamp and start thinking about 4.0 with a redesign
> of these two main area: optimization and sparse linear algebra.
>
> If you could contribute to this discussion understanding we are not
> experts of this field and we cannot do it by ourselves, it would be great.
>
> best regards,
> Luc
>
> >
> > Thanks, Konstantin
> >
> > On Dec 28, 2012, at 11:27 AM, Phil Steitz <ph...@gmail.com>
> > wrote:
> >
> >> On 12/28/12 8:12 AM, Dimitri Pourbaix wrote:
> >>> Luc,
> >>>
> >>>> So in order to make sure I understand your point, you would be
> >>>> OK if I deprecate the non-diagonal weights, in which case users
> >>>> needing this would have to implement it themselves by
> >>>> premultiplication (as both you and Konstantin seem to
> >>>> propose)?
> >>>
> >>> Yes, exactly.
> >>>
> >>>> Sure, but for the record the feature was also a last minute
> >>>> change. This was discussed on the list, and the final decision
> >>>> was to add this feature despite the release was close. No
> >>>> wonder we failed to test it thoroughly.
> >>>
> >>> Last minute?  I have been discussing this with Gilles for
> >>> several months.
> >>
> >> Relevant project discussion happens *on this list*
> >>>
> >>>> We don't expect our releases to be perfect. We do our best,
> >>>> with the resources we have.
> >>>
> >>> I perfectly understand this but focusing those resources less on
> >>> rules and more on real cases might help.
> >>
> >> As stated before, you are more than welcome to *become* one of
> >> these resources.
> >>
> >> Phil
> >>>
> >>> Regards, Dim.
> >>>
> ----------------------------------------------------------------------------
> >>>
> >>>
> >>>
> Dimitri Pourbaix                         *      Don't worry, be happy
> >>> Institut d'Astronomie et d'Astrophysique *         and CARPE
> >>> DIEM. CP 226, office 2.N4.211, building NO     * Universite Libre
> >>> de Bruxelles            *      Tel : +32-2-650.35.71 Boulevard du
> >>> Triomphe                    *      Fax : +32-2-650.42.26 B-1050
> >>> Bruxelles                        *        NAC: HBZSC RG2Z6
> >>> http://sb9.astro.ulb.ac.be/~pourbaix     *
> >>> mailto:pourbaix@astro.ulb.ac.be
> >>>
> >>> ---------------------------------------------------------------------
> >>>
> >>>
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> >>> For additional commands, e-mail: dev-help@commons.apache.org
> >>>
> >>>
> >>
> >>
> >> ---------------------------------------------------------------------
> >>
> >>
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> >> For additional commands, e-mail: dev-help@commons.apache.org
> >>
> >
> >
> > ---------------------------------------------------------------------
> >
> >
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> > For additional commands, e-mail: dev-help@commons.apache.org
> >
> >
>
>
>
>
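The "matrix-free" point raised in this message can be sketched as follows (LinearOperator here is an illustrative stand-in, not the actual RealLinearOperator API that the Commons Math conjugate gradient uses): an iterative solver only needs y = A*x, so the operator can be defined by code instead of a stored matrix.

```java
// An operator defined by code, not by a stored matrix: this is what lets
// an iterative solver such as the conjugate gradient run "matrix-free".
interface LinearOperator {
    int dimension();
    double[] operate(double[] x); // computes y = A * x
}

public class MatrixFreeSketch {
    // Tridiagonal operator (a 1-D discrete Laplacian) applied without ever
    // storing an n-by-n matrix: O(n) memory instead of O(n^2).
    static LinearOperator laplacian1d(final int n) {
        return new LinearOperator() {
            public int dimension() { return n; }
            public double[] operate(double[] x) {
                double[] y = new double[n];
                for (int i = 0; i < n; i++) {
                    y[i] = 2.0 * x[i];
                    if (i > 0)     y[i] -= x[i - 1];
                    if (i < n - 1) y[i] -= x[i + 1];
                }
                return y;
            }
        };
    }

    public static void main(String[] args) {
        LinearOperator a = laplacian1d(4);
        double[] y = a.operate(new double[] {1.0, 1.0, 1.0, 1.0});
        // Interior entries cancel; only the boundary entries are non-zero.
        System.out.println(java.util.Arrays.toString(y)); // [1.0, 0.0, 0.0, 1.0]
    }
}
```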

Re: [math] major problem with new released version 3.1

Posted by Hassan Siddiqui <hs...@sciops.esa.int>.
> > What about 3.1.1 with just the fix for this, then possibly 3.2 or
> > direct to 4.0.
> 
> As you want. I don't like adding too many sub-release numbers, just
> as
> if someone fears jumping to next version. I remember some Sun
> products
> and Perl versions that end up with something like 5 release digits.
> Well, I don't really care, so it can be 3.1.1 if people prefer that.

+1 on a 3.1.1 patch with that fix. 

I'm working on the same project as Gilles and Dim, and assuming this is the only major problem found, it would be beneficial, at least in my opinion, to have a release close to 3.1 that isolates that fix, before any major revamp.

If other problems are found, this proposal fails, and we might as well wait for a 3.2 or 4.0.

-Hassan.

----- Original Message -----
> From: "Luc Maisonobe" <Lu...@free.fr>
> To: "Commons Developers List" <de...@commons.apache.org>
> Sent: Friday, December 28, 2012 7:35:15 PM
> Subject: Re: [math] major problem with new released version 3.1
> 
> Le 28/12/2012 19:11, Phil Steitz a écrit :
> > On 12/28/12 9:58 AM, Luc Maisonobe wrote:
> >> Le 28/12/2012 17:51, Konstantin Berlin a écrit :
> >>> Hi,
> >>>
> >>> I can understand Dimitri's frustration, it seems the optimization
> >>> framework gets worse with every iteration. However, we should
> >>> probably look forward and think about how to design it properly
> >>> instead.
> >>>
> >>> Several times I brought out some problems and ideas about the
> >>> package
> >>> and it seems the only person who has an opinion is Gilles.
> >> Several people contributed to the thread (see
> >> <http://commons.markmail.org/thread/i6klmc2ytflb6qnt>), but as
> >> Gilles
> >> pointed out in one of the messages, we lack an optimization expert.
> >> I
> >> sincerely would not want my opinion to be taken too seriously on
> >> this,
> >> so I only expressed what I could and did not decide anything by
> >> myself
> >> (only proposed to remove the wrong binding with
> >> DerivativeStructure,
> >> which has since been done).
> >>
> >>> I will list what I consider to be major problems
> >>> 1) The OO design is bad, too much inheritance which could be
> >>> handled
> >>> better by interfaces, the structure has no relation to the actual
> >>> way
> >>> parts of optimizers can be mixed and matched. Input functions
> >>> should
> >>> also have clear inheritance structure and should return both the
> >>> value, gradient, hessian in one function call.
> >> I strongly agree with Konstantin here. Abstract classes allow to
> >> add
> >> methods without breaking compatibility (which is good), but they
> >> also
> >> have some drawbacks (we have seen one drawback with the
> >> parameterized
> >> classes a few months ago, and this huge hierarchy is another one).
> >> So
> >> there is no silver bullet and we keep trying to find the good
> >> balance.
> >> As far as I am concerned, I would prefer we get fewer abstract
> >> classes,
> >> we remove some intermediate level (I don't know which ones), and
> >> we use
> >> more delegation than inheritance.
> >>
> >>> 2) Incorrect handling of constraints. There are only something
> >>> like 5
> >>> possible constraints possible in optimization, with each
> >>> implementation of the solver handling some but not all. There is
> >>> no
> >>> need for this runtime approach, which creates an incredible amount of
> >>> confusion. All the possible constraints should be explicitly
> >>> given in
> >>> the parameters to a function call, there are only 5. In addition,
> >>> constraints should be pre-processed a priori! So they should be
> >>> an
> >>> input to the constructor, not to the optimization function call.
> >> Our implementation for constraints is really limited (once again,
> >> scarce
> >> resources). What are the 5 types you consider? Simple/double
> >> bounds on
> >> parameters, linear/non-linear bounds and equality?
> >>
> >>> 3) Linear algebra package should be used as an interface and
> >>> internally to increase performance for larger datasets. Data
> >>> copying
> >>> should be avoided as much as possible.
> >> Yes, but this would require solving another sub-problem first:
> >> having a
> >> decent sparse linear algebra implementation, which we also lack.
> >> Our
> >> implementation for full matrices was also in a sorry state prior
> >> to 3.0,
> >> but now fortunately this has improved at least for systems up to a
> >> few
> >> thousand rows and columns (so at least we do make progress on
> >> some points).
> >>
> >>> 4) Testing should be done on larger problems.
> >> Yes. I guess there are some general well known problems for that,
> >> so we
> >> should get a few of them and implement them. We did implement a
> >> number
> >> of tests from Minpack, but they focused on difficult cases rather
> >> than
> >> on problem size. I think optimization has a good testing coverage,
> >> but
> >> clearly large-size problems are a needed addition.
> >>
> >>> I know the response is that I am free to go implement it, but I
> >>> think
> >>> we should at least agree on design principles instead of pushing
> >>> through your own ideas because the other person is too busy. The
> >>> only
> >>> discussion we ever had on this was between me and Gilles,
> >>> everyone
> >>> else disappeared.
> >> Well, we tried to keep up as our skills allowed, and we were also
> >> concerned with 3.1 being released at the same time.
> >>
> >> We have more time now than we had a few weeks ago. This is an
> >> opportunity to restart the discussion. We can refrain from pushing
> >> a new
> >> release (although I would like this bug fix to be released
> >> officially)
> >> and take some time to think calmly. We could also push 3.2 with
> >> only the
> >> fix
> > 
> > What about 3.1.1 with just the fix for this, then possibly 3.2 or
> > direct to 4.0.
> 
> As you want. I don't like adding too many sub-release numbers, as
> if someone feared jumping to the next version. I remember some Sun
> products
> and Perl versions that ended up with something like 5 release digits.
> Well, I don't really care, so it can be 3.1.1 if people prefer that.
> 
> Luc
> 
> > 
> > Phil
> >>  and without any revamp and start thinking about 4.0 with a
> >>  redesign
> >> of these two main areas: optimization and sparse linear algebra.
> >>
> >> If you could contribute to this discussion understanding we are
> >> not
> >> experts of this field and we cannot do it by ourselves, it would
> >> be great.
> >>
> >> best regards,
> >> Luc
> >>
> >>> Thanks, Konstantin
> >>>
> >>> On Dec 28, 2012, at 11:27 AM, Phil Steitz <ph...@gmail.com>
> >>> wrote:
> >>>
> >>>> On 12/28/12 8:12 AM, Dimitri Pourbaix wrote:
> >>>>> Luc,
> >>>>>
> >>>>>> So in order to make sure I understand your point, you would be
> >>>>>> OK if I deprecate the non-diagonal weights, in which case
> >>>>>> users
> >>>>>> needing this would have to implement it themselves by
> >>>>>> premultiplication (as both you and Konstantin seem to
> >>>>>> propose)?
> >>>>> Yes, exactly.
> >>>>>
> >>>>>> Sure, but for the record the feature was also a last minute
> >>>>>> change. This was discussed on the list, and the final decision
> >>>>>> was to add this feature even though the release was close. No
> >>>>>> wonder we failed to test it thoroughly.
> >>>>> Last minute?  I have been discussing this with Gilles for
> >>>>> several months.
> >>>> Relevant project discussion happens *on this list*
> >>>>>> We don't expect our releases to be perfect. We do our best,
> >>>>>> with the resources we have.
> >>>>> I perfectly understand this but focusing those resources less
> >>>>> on
> >>>>> rules and more on real cases might help.
> >>>> As stated before, you are more than welcome to *become* one of
> >>>> these resources.
> >>>>
> >>>> Phil
> >>>>> Regards, Dim.
> >>>>> ----------------------------------------------------------------------------
> >>>>> Dimitri Pourbaix                         *      Don't worry, be happy
> >>>>> Institut d'Astronomie et d'Astrophysique *         and CARPE DIEM.
> >>>>> CP 226, office 2.N4.211, building NO     *
> >>>>> Universite Libre de Bruxelles            *      Tel : +32-2-650.35.71
> >>>>> Boulevard du Triomphe                    *      Fax : +32-2-650.42.26
> >>>>> B-1050 Bruxelles                         *        NAC: HBZSC RG2Z6
> >>>>> http://sb9.astro.ulb.ac.be/~pourbaix     *      mailto:pourbaix@astro.ulb.ac.be
> >>>>>
> >>>>> ---------------------------------------------------------------------
> >>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> >>>>> For additional commands, e-mail: dev-help@commons.apache.org
> 
> 



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [math] major problem with new released version 3.1

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
Hello.

[As Phil suggested, please stop extending this thread; and rather open new
ones on specific points in order to focus the discussion(s).]

> > Yes, and I agree with that. However, I found the javadoc to be
> > ambiguous. It says "The following data will be looked for:" followed by
> > a list. There is no distinction between optional and required
> > parameters. As an example, simple bounds are optional whereas initial
> > guess or weight are required, but there is nothing to tell the user.
> > So in this case, either we should provide proper exception or proper
> > documentation. I am OK with both.
> >
> >> It is however straightforward to add a "checkParameters()" method that would
> >> raise more specific exceptions. [Although that would contradict the
> >> conclusion of the previous discussion about NPE in CM. And restart it,
> >> without even getting a chance to go forward with what had been decided!]
> > As long as we identify the parameters that are optional (and hence the user
> > can deduce the other ones are mandatory and will raise an NPE), this
> > would be fine. I don't ask to restart this tiring discussion, just make
> > sure users have a proper way to understand why they get an NPE when
> > they forget weight and why they don't get one when they forget simple bounds.
> 
> +1 - can we just make it clear in the javadoc what is optional, what
> is required and add tests to illustrate?  I assume it is intractable
> to just add full signatures for the activations that "work?"

In that context, "optional" actually means that "optimize" can be called a
_second_ time (or more) without specifying arguments that were given in a
previous call. I'll add the clarification in the Javadoc.
[I'm aware that this is not a common behaviour in CM, or in Java for that
matter, or in the usual way to code in a strongly typed language, but it
greatly simplifies the code when the API tries to represent (too?) many things
at the same time. This is (more) commonly used in other languages e.g. to
allow default values for many parameters. Also, I don't claim that we should
stick with that, but I was led to this design by the goal to stay relatively
close to the existing classes and features while removing many of the
confusing multiplication of interfaces and classes. More on that below...]

No argument is optional in the sense that if the actual optimizer
implementation uses the argument, it must of course be provided by
the caller.
For example, if "CMAESOptimizer" is used, then the "SimpleBounds" _must_ be
provided (or an NPE will be raised when "CMAESOptimizer" tries to retrieve
the bound values). If "PowellOptimizer" is used, specifying "SimpleBounds"
will have no effect (because "PowellOptimizer" does not call the accessor to
the bound values).
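The retrieval behaviour described above can be sketched with a toy model of the varargs pattern. This is a hypothetical illustration, not the actual CM classes: the names `SimpleOptimizer`, `InitialGuess`, and `SimpleBounds` here are stand-ins defined locally. The sketch shows why a missing piece of data surfaces as an NPE only when an optimizer actually tries to use it, and why settings from a previous call make arguments "optional" on later calls.

```java
import java.util.Arrays;

public class OptimizationDataSketch {
    // Marker interface, mirroring the role of OptimizationData in "optim".
    interface OptimizationData {}

    static class InitialGuess implements OptimizationData {
        final double[] point;
        InitialGuess(double[] point) { this.point = point.clone(); }
    }

    static class SimpleBounds implements OptimizationData {
        final double[] lower, upper;
        SimpleBounds(double[] lower, double[] upper) {
            this.lower = lower.clone();
            this.upper = upper.clone();
        }
    }

    // Toy optimizer: scans the varargs, keeps what it recognizes, and
    // silently ignores data it does not use (as PowellOptimizer ignores
    // SimpleBounds). Fields persist between calls, which is why an
    // argument given once need not be given again.
    static class SimpleOptimizer {
        private InitialGuess guess;   // this optimizer dereferences it
        private SimpleBounds bounds;  // parsed but never dereferenced

        double[] optimize(OptimizationData... data) {
            for (OptimizationData d : data) {
                if (d instanceof InitialGuess) {
                    guess = (InitialGuess) d;
                } else if (d instanceof SimpleBounds) {
                    bounds = (SimpleBounds) d;
                }
            }
            // NPE here if the caller never provided an InitialGuess.
            return guess.point.clone();
        }
    }

    public static void main(String[] args) {
        SimpleOptimizer opt = new SimpleOptimizer();
        System.out.println(Arrays.toString(
                opt.optimize(new InitialGuess(new double[] {1, 2}))));
        // Second call without arguments reuses the previous guess.
        System.out.println(Arrays.toString(opt.optimize()));
        try {
            new SimpleOptimizer().optimize(); // no guess ever given
        } catch (NullPointerException e) {
            System.out.println("NPE: missing InitialGuess");
        }
    }
}
```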

> > Also weight should really be optional and have a fallback to identity
> > matrix, but this is another story.
> 
> +1 - and while I have not yet really penetrated this code,

Given what I wrote above, this should be -1. I think that it is simply
dangerous to have defaults (in the sense that users are not made aware that
an algorithm _needs_ and _uses_ user input) on low-level functions. This can
be misleading. Anyway, I hope that this point will not need further
discussion since we all seem to agree that the "weight" feature should be
dropped ASAP.
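For the record, the preprocessing alternative raised earlier in this thread (users premultiplying their data instead of passing weights to CM) is simple in the uncorrelated case. A hedged sketch, with a hypothetical helper name: scaling each residual (and, not shown, each Jacobian row) by sqrt(w_i) reduces the weighted least-squares problem to an unweighted one.

```java
public class WeightPreprocessing {
    /**
     * Scales residuals by sqrt(w_i), so that an unweighted least-squares
     * solver applied to the scaled residuals minimizes sum(w_i * r_i^2).
     * The same row scaling must be applied to the Jacobian.
     */
    static double[] applyWeights(double[] residuals, double[] weights) {
        double[] scaled = new double[residuals.length];
        for (int i = 0; i < residuals.length; i++) {
            scaled[i] = Math.sqrt(weights[i]) * residuals[i];
        }
        return scaled;
    }

    public static void main(String[] args) {
        double[] r = {1.0, 2.0, 3.0};
        double[] w = {4.0, 1.0, 0.25};
        double[] s = applyWeights(r, w);
        // Weighted sum of squares equals the plain sum of squares of the
        // scaled residuals: 4*1 + 1*4 + 0.25*9 = 10.25
        double sum = 0;
        for (double v : s) {
            sum += v * v;
        }
        System.out.println(sum);
    }
}
```

For correlated observations the same idea applies with the matrix square root of the weight matrix, which is exactly the eigendecomposition step users would then perform on their side rather than inside the optimizer.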

> I have
> sympathy for Konstantin's views that it might be better to actually
> separate the impls.  I don't want to open a can of worms forcing us
> to add yet more complexity by merging impls; but we implemented a
> special case of least squares in the stat.regression, where we
> separated GLS from OLS, allowing the OLS code to be simpler. 
> Separation turned out to make things easier both for users and us
> there.  The setup is not perfect, but it is documentable and
> maintainable.  Separation also had the benefit that bugs and lack of
> specification clarity in early versions of GLS did not impact OLS.

As described above in the little rationale for why we are where we are now
in the design of the "optim" package, the crux of the problem was stressed
by Dimitri; and if we were to indeed separate "optimization" (minimizing a
cost) from "fitting" (matching a set of model functions), it would unwind a
tangle of partly conflicting functionalities, that led to e.g. the current
need to have generic classes depending on the "optimize" method's return
value. [This type either contains the (scalar) cost, or the (vector) model
function values.]

Hence I propose that we first change what is actually broken (the basics of
the problems' description) before focusing on supposedly "bad OO
implementation" or implementation details.

Sometimes we have to start from a clean slate, instead of piling up
workarounds.
So, I'm requesting advice on a _set of assumptions_ on which to base a new
design of both the "optim" and "fitting" packages.
[IIUC, everything currently under "optim.nonlinear.vector" should go to
"fitting". Please confirm (in another ML thread).]

> Can we agree at this point that Gilles' commit in r1426758 resolves
> the issue that started this thread (MATH-924)?

I agree. :-)

> I am also +1 on the following:
> 
> 0) continuing in the direction started in r1426758, adding
> specialized matrix impls

+1
That would also enable us to make progress in the cleanup of
"BOBYQAOptimizer" (which uses explicit loops actually representing
operations on e.g. symmetric matrices).

> 1) "undeprecating" the sparse impls

+0
Simply undeprecating is not enough. As we discussed several times,
the current matrix interface is too complicated/verbose (cf. the summary
assembled by Sébastien on JIRA), so that existing implementations
are sometimes inefficient. Creating new specialized implementations is
more cumbersome than necessary.

> 2) moving to in-place vector/matrix operations within
> implementations as much as possible

-1
As was first pointed out here, on this mailing list, mutable classes are
not necessarily more efficient; rather, they were demonstrated to be less so
in a number of cases. Unless there is at least one realistic benchmark that
indicates a huge efficiency gain, we should not shoot ourselves in the foot
by having to maintain classes that are, by definition, more fragile.
[E.g. instead of shiny words like "GPU", it would be rather more instructive
if we could _see_ code making such calls...]
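The tradeoff under discussion can be made concrete with a toy vector class (hypothetical, not a CM API; whether `addInPlace` actually wins in practice is precisely what the benchmark asked for above would have to show):

```java
public class InPlaceSketch {
    static final class Vec {
        final double[] data;
        Vec(double[] data) { this.data = data.clone(); }

        // Immutable style: allocates a fresh result; the receiver is untouched,
        // so instances can be shared freely.
        Vec add(Vec other) {
            double[] out = new double[data.length];
            for (int i = 0; i < data.length; i++) {
                out[i] = data[i] + other.data[i];
            }
            return new Vec(out);
        }

        // In-place style: no allocation, but it mutates the receiver, so any
        // caller holding a reference to this instance sees it change.
        void addInPlace(Vec other) {
            for (int i = 0; i < data.length; i++) {
                data[i] += other.data[i];
            }
        }
    }

    public static void main(String[] args) {
        Vec a = new Vec(new double[] {1, 2});
        Vec b = new Vec(new double[] {3, 4});
        Vec c = a.add(b);   // a is unchanged
        a.addInPlace(b);    // a is now {4, 6}
        System.out.println(c.data[0] + " " + a.data[0]);
    }
}
```

The fragility Gilles mentions is visible in the second method: correctness now depends on aliasing discipline at every call site, which is exactly the maintenance cost that immutability avoids.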

> Item 2) creates some tension with the objective of having the impls
> reflect the mathematics (Gilles good point about "self-documenting
> code"), but there may be clever ways to maintain readability / good
> OO structure while achieving large efficiency gains using the
> visitor pattern or Konstantin's suggestion of an InPlace interface.
> Let's look at the Mahout classes for some inspiration there.  IIUC
> the proposals, this also creates tension with the objective to favor
> immutability.

Cf. my previous point.
And also: "First make the code work; then make it faster, if really
necessary."

Then, if we have _actual_ use cases, we should lay out what options we have
to achieve the "faster code" goal.
As I pointed out previously, some of the CM functionalities could certainly
take advantage of the concurrency features in Java 6 and 7. Those tools are
there, maintained and improved by a community probably several orders of
magnitude larger than CM; it would be insane to embark on reinventing those
wheels because we would want to take advantage of multi-threading while
maintaining Java 5 compatibility.

> Regarding 1), any suggestions on which issues to work on /
> approaches to try first?

What about starting from scratch? Define a bare-bones interface, and develop
specialized implementations in parallel, adding benchmark test cases for all
the "critical" methods. [The situation is not that bad, since
"Array2DRowRealMatrix" (for a small to midsize number of entries) and
"BlockRealMatrix" (for larger numbers of entries) work pretty decently, I
think. However it would be worth separating "basic" (aka efficiency
critical) operations from higher-level ones, to be implemented in other
classes (using inheritance or composition/delegation).]
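A bare-bones interface of the kind proposed might look like the sketch below. To be clear, the names and methods here are hypothetical, not an agreed design. A diagonal implementation of such an interface needs O(n) storage instead of O(n^2), which is exactly what MATH-924 required: the 41201 weights of the polynomial-fitting case fit in about 322 KB instead of the roughly 13 GB a dense 41201 x 41201 matrix of doubles would need.

```java
public class MatrixSketch {
    // Minimal read-only contract; "higher-level" operations (multiply,
    // decompositions, ...) would live in separate utility classes, per
    // the separation suggested above.
    interface Matrix {
        int rows();
        int cols();
        double entry(int i, int j);
    }

    // Specialized implementation: stores only the n diagonal entries.
    static final class Diagonal implements Matrix {
        private final double[] diag;
        Diagonal(double[] diag) { this.diag = diag.clone(); }
        public int rows() { return diag.length; }
        public int cols() { return diag.length; }
        public double entry(int i, int j) {
            return i == j ? diag[i] : 0.0;
        }
    }

    public static void main(String[] args) {
        Matrix w = new Diagonal(new double[] {1.0, 2.0, 3.0});
        System.out.println(w.entry(1, 1) + " " + w.entry(0, 2));
    }
}
```

Symmetric and triangular implementations would fit the same contract with roughly half the dense storage, which covers the other cases mentioned earlier in the thread.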


I look forward to reading from you in other threads, ;-)
Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [math] major problem with new released version 3.1

Posted by Phil Steitz <ph...@gmail.com>.
On 12/29/12 4:16 AM, Luc Maisonobe wrote:
> Le 29/12/2012 12:34, Gilles Sadowski a écrit :
>> On Sat, Dec 29, 2012 at 10:43:11AM +0100, Luc Maisonobe wrote:
>>> Le 29/12/2012 10:08, Luc Maisonobe a écrit :
>>>> Le 29/12/2012 04:09, Gilles Sadowski a écrit :
>>>>> Hi.
>>>> Hi Gilles,
>>>>
>>>>>> [...]
>>>>>>> third is just bug-fix.  Does the fix for the issue that started this
>>>>>>> thread change the API?  If so, we would need to cut 3.2 in any case.
>>>>> The current fix does change the usage (one needs to call optimize with an
>>>>> argument of a different type). IIUC, it also silently removes the handling
>>>>> of correlated observations.
>>>>>
>>>>>> Yes, this fixes the issue. I have created/resolved the issue (MATH-924)
>>>>>> and committed the fix as of r1426616.
>>>>>>
>>>>>> Could someone please review what I have done?
>>>>> I don't like the fix...
>>>> Thanks for reviewing.
>>>>
>>>>> Handling weighted observations must take correlations into account, i.e. use
>>>>> a _matrix_.
>>>>> There is the _practical_ problem of memory. The correct way to solve it is by using
>>>>> a sparse implementation (and this is actually an implementation _detail_).
>>>> Yes.
>>>>
>>>>> If we _need_ such an implementation to solve the practical problem, I
>>>>> strongly suggest that we focus on creating it (or fixing what CM already
>>>>> has, or accept that some inconsistency will be present), rather than
>>>>> reducing the code applicability (i.e. allowing only uncorrelated data).
>>>>> If the observations are not correlated, the matrix is a diagonal matrix, not
>>>>> a vector.
>>>> It's fine with me. I simply thought it wouldn't be that easy. You proved
>>>> me wrong.
>>>>
>>>>> CM also lacks implementations for symmetric, triangular, and diagonal matrices,
>>>>> which all would go some way to solving several practical problems of the same
>>>>> kind as the current one without sacrificing generality.
>>>> Yes, we have known that for years.
>>>>
>>>>> Now, and several years ago, it was noticed that CM does not _have_ to
>>>>> provide the "weights" feature because users can handle this before they
>>>>> call CM code. [IIRC, no CM unit test actually uses weights different from 1.]
>>>>> IMO, the other valid option is thus to have a simpler version of the
>>>>> algorithm, but still a correct one.
>>>>> This would also have the advantage that we won't have the urgent need to
>>>>> keep the sparse matrix implementation.
>>>>> [Then, if at some point we include helper code to handle weights
>>>>> (_including_ correlations), we should also do it in a "preprocessing" step,
>>>>> without touching the optimization algorithms.]
>>>> So what do you suggest: keep the current support (with proper handling
>>>> as you did) or drop it?
>>>>
>>>> Since several people asked for removing it (Dimitri, Konstantin and now
>>>> you), we can do that. Unitl now, this feature was a convenience that was
>>>> really useful for some cases, and it was simple. There were some errors
>>>> in it and Dimitri solved them in 2010, but no other problems appeared
>>>> since them, so it made sense simply keeping it as is. Now we are hit by
>>>> a second problem, and it seems it opens a can of worm. Dropping it
>>>> completely as a not so useful convenience which is tricky to set up
>>>> right would be fair.
>>>>
>>>>>> I also think (but did not test it), that there may be a problem with
>>>>> missing OptimizationData. If someone calls the optimizer but forgets to
>>>>>> set the weight (or the target, or some other mandatory parameters), then
>>>>>> I'm not sure we fail with an appropriate error. Looking for example at
>>>>>> the private checkParameters method in the MultivariateVectorOptimizer
>>>>> abstract class, I guess we may have a NullPointerException if either
>>>>>> Target or Weight/NonCorrelatedWeight has not been parsed in the
>>>>>> parseOptimizationData method. Could someone confirm this?
>>>>> Yes.
>>>>> And this (not checking for missing data) _could_ be considered a feature, as
>>>>> I stressed several times on this ML, and in the code documentation
>>>>> (eliciting zero constructive comment).
>>>>> We also _agreed_ that users not passing needed data will result in NPE.
>>>>> [I imagined that applications would check that valid and complete input is
>>>>> passed to the lower level "optimize(OptimizationData ...)" methods.]
>>>> Yes, and I agree with that. However, I found the javadoc to be
>>>> ambiguous. It says "The following data will be looked for:" followed by
>>>> a list. There is no distinction between optional and required
>>>> parameters. As an example, simple bounds are optional whereas initial
>>>> guess or weight are required, but there is nothing to tell the user.
>>>> So in this case, either we should provide proper exception or proper
>>>> documentation. I am OK with both.
>>>>
>>>>> It is however straightforward to add a "checkParameters()" method that would
>>>>> raise more specific exceptions. [Although that would contradict the
>>>>> conclusion of the previous discussion about NPE in CM. And restart it,
>>>>> without even getting a chance to go forward with what had been decided!]
>>>> As long as we identify the parameters that are optional (and hence the user
>>>> can deduce the other ones are mandatory and will raise an NPE), this
>>>> would be fine. I don't ask to restart this tiring discussion, just make
>>>> sure users have a proper way to understand why they get an NPE when
>>>> they forget weight and why they don't get one when they forget simple bounds.
>>>>
>>>> Also weight should really be optional and have a fallback to identity
>>>> matrix, but this is another story.
>>>>
>>>>>
>>>>> Hence, to summarize:
>>>>>  * The fix, in a 3.2 release, should be to replace the matrix implementation
>>>>>    with one that does not exhaust the memory, e.g. "OpenMapRealMatrix"[1] or
>>>>>    "DiagonalMatrix" (see my patch for MATH-924), but not change the API.
>>>> +1
>>> I forgot to ask: do you want me to first revert my change before you
>>> commit yours
>> Yes, please.
> Done as of r1426751.
>
>>> or will you do both at the same time?
>> I wouldn't know how to do that.
>> What is the svn command to revert a list of files to their state in a given
>> revision?
> I don't know. I now use git-svn to connect to our subversion repository
> while using git commands locally.

You can use svn merge to do this
svn merge -r [current_version]:[previous_version] [path]
then look at diff and commit. See [1].

Phil

[1] http://s.apache.org/NQ0

>
> Luc
>
>> Thanks,
>> Gilles
>>
>>> Luc
>>>
>>>>>  * We must decide whether to deprecate the weights feature in that release
>>>>>    (and thus remove it in 4.0) or to keep it in its general form (and thus
>>>>>    un-deprecate "OpenMapRealMatrix"[2]).
>>>> +1 to deprecate.
>>>>
>>>> best regards
>>>> Luc
>>>>
>>>>>
>>>>> Best regards,
>>>>> Gilles
>>>>>
>>>>> [1] The inconsistencies that led to the deprecation will have no bearing
>>>>>     on usage within the optimizers.
>>>>> [2] The latter option seems likely to entail a fair amount of work to
>>>>>     improve the performance of "OpenMapRealMatrix" (which is quite poor
>>>>>     for some operations, e.g. "multiply").
>>>>>
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [math] major problem with new released version 3.1

Posted by Luc Maisonobe <Lu...@free.fr>.
Le 29/12/2012 12:34, Gilles Sadowski a écrit :
> On Sat, Dec 29, 2012 at 10:43:11AM +0100, Luc Maisonobe wrote:
>> Le 29/12/2012 10:08, Luc Maisonobe a écrit :
>>> Le 29/12/2012 04:09, Gilles Sadowski a écrit :
>>>> Hi.
>>>
>>> Hi Gilles,
>>>
>>>>
>>>>> [...]
>>>>
>>>>>> third is just bug-fix.  Does the fix for the issue that started this
>>>>>> thread change the API?  If so, we would need to cut 3.2 in any case.
>>>>
>>>> The current fix does change the usage (one needs to call optimize with an
>>>> argument of a different type). IIUC, it also silently removes the handling
>>>> of uncorrelated observations.
>>>>
>>>>> Yes, this fixes the issue. I have created/resolved the issue (MATH-924)
>>>>> and committed the fix as of r1426616.
>>>>>
>>>>> Could someone please review what I have done?
>>>>
>>>> I don't like the fix...
>>>
>>> Thanks for reviewing.
>>>
>>>>
>>>> Handling weighted observations must take correlations into account, i.e. use
>>>> a _matrix_.
>>>> There is the _practical_ problem of memory. The correct way to solve it is by using
>>>> a sparse implementation (and this is actually an implementation _detail_).
>>>
>>> Yes.
>>>
>>>> If we _need_ such an implementation to solve the practical problem, I
>>>> strongly suggest that we focus on creating it (or fixing what CM already
>>>> has, or accept that some inconsistency will be present), rather than
>>>> reducing the code applicability (i.e. allowing only uncorrelated data).
>>>> If the observations are not correlated, the matrix is a diagonal matrix, not
>>>> a vector.
>>>
>>> It's fine with me. I simply thought it wouldn't be that easy. You proved
>>> me wrong.
>>>
>>>> CM also lacks implementations for symmetric, triangular, and diagonal matrices,
>>>> which all would go some way to solving several practical problems of the same
>>>> kind as the current one without sacrificing generality.
>>>
>>> Yes, we have known that for years.
>>>
>>>>
>>>> Now, and several years ago, it was noticed that CM does not _have_ to
>>>> provide the "weights" feature because users can handle this before they
>>>> call CM code. [IIRC, no CM unit test actually uses weights different from 1.]
>>>> IMO, the other valid option is thus to have a simpler version of the
>>>> algorithm, but still a correct one.
>>>> This would also have the advantage that we won't have the urgent need to
>>>> keep the sparse matrix implementation.
>>>> [Then, if at some point we include helper code to handle weights
>>>> (_including_ correlations), we should also do it in a "preprocessing" step,
>>>> without touching the optimization algorithms.]
>>>
>>> So what do you suggest: keep the current support (with proper handling
>>> as you did) or drop it?
>>>
>>> Since several people asked for removing it (Dimitri, Konstantin and now
>>> you), we can do that. Until now, this feature was a convenience that was
>>> really useful for some cases, and it was simple. There were some errors
>>> in it and Dimitri solved them in 2010, but no other problems appeared
>>> since then, so it made sense simply keeping it as is. Now we are hit by
>>> a second problem, and it seems it opens a can of worms. Dropping it
>>> completely as a not so useful convenience which is tricky to set up
>>> right would be fair.
>>>
>>>>
>>>>> I also think (but did not test it), that there may be a problem with
>>>>> missing OptimizationData. If someone calls the optimizer but forgets to
>>>>> set the weight (or the target, or some other mandatory parameters), then
>>>>> I'm not sure we fail with an appropriate error. Looking for example at
>>>>> the private checkParameters method in the MultivariateVectorOptimizer
>>>>> abstract class, I guess we may have a NullPointerException if either
>>>>> Target or Weight/NonCorrelatedWeight has not been parsed in the
>>>>> parseOptimizationData method. Could someone confirm this?
>>>>
>>>> Yes.
>>>> And this (not checking for missing data) _could_ be considered a feature, as
>>>> I stressed several times on this ML, and in the code documentation
>>>> (eliciting zero constructive comment).
>>>> We also _agreed_ that users not passing needed data will result in NPE.
>>>> [I imagined that applications would check that valid and complete input is
>>>> passed to the lower level "optimize(OptimizationData ...)" methods.]
>>>
>>> Yes, and I agree with that. However, I found the javadoc to be
>>> ambiguous. It says "The following data will be looked for:" followed by
>>> a list. There is no distinction between optional and required
>>> parameters. As an example, simple bounds are optional whereas initial
>>> guess or weight are required, but there is nothing to tell the user.
>>> So in this case, either we should provide proper exception or proper
>>> documentation. I am OK with both.
>>>
>>>>
>>>> It is however straightforward to add a "checkParameters()" method that would
>>>> raise more specific exceptions. [Although that would contradict the
>>>> conclusion of the previous discussion about NPE in CM. And restart it,
>>>> without even getting a chance to go forward with what had been decided!]
>>>
>>> As long as we identify the parameters that are optional (and hence the user
>>> can deduce the other ones are mandatory and will raise an NPE), this
>>> would be fine. I don't ask to restart this tiring discussion, just make
>>> sure users have a proper way to understand why they get an NPE when
>>> they forget weight and why they don't get one when they forget simple bounds.
>>>
>>> Also weight should really be optional and have a fallback to identity
>>> matrix, but this is another story.
>>>
>>>>
>>>>
>>>> Hence, to summarize:
>>>>  * The fix, in a 3.2 release, should be to replace the matrix implementation
>>>>    with one that does not exhaust the memory, e.g. "OpenMapRealMatrix"[1] or
>>>>    "DiagonalMatrix" (see my patch for MATH-924), but not change the API.
>>>
>>> +1
>>
>> I forgot to ask: do you want me to first revert my change before you
>> commit yours
> 
> Yes, please.

Done as of r1426751.

> 
>> or will you do both at the same time?
> 
> I wouldn't know how to do that.
> What is the svn command to revert a list of files to their state in a given
> revision?

I don't know. I now use git-svn to connect to our subversion repository
while using git commands locally.

Luc

> 
> Thanks,
> Gilles
> 
>>
>> Luc
>>
>>>
>>>>  * We must decide whether to deprecate the weights feature in that release
>>>>    (and thus remove it in 4.0) or to keep it in its general form (and thus
>>>>    un-deprecate "OpenMapRealMatrix"[2]).
>>>
>>> +1 to deprecate.
>>>
>>> best regards
>>> Luc
>>>
>>>>
>>>>
>>>> Best regards,
>>>> Gilles
>>>>
>>>> [1] The inconsistencies that led to the deprecation will have no bearing
>>>>     on usage within the optimizers.
>>>> [2] The latter option seems likely to entail a fair amount of work to
>>>>     improve the performance of "OpenMapRealMatrix" (which is quite poor
>>>>     for some operations, e.g. "multiply").
>>>>
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org
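The "DiagonalMatrix" replacement agreed on above can be sketched in a few lines: storing only the diagonal keeps an n-point problem at O(n) memory instead of the ~1.7 billion entries that the full matrix needed for 41201 points. The class and method names below are illustrative only, not the actual Commons Math API.

```java
// Illustrative sketch only -- not the Commons Math DiagonalMatrix API.
// An n-point problem stores n doubles instead of n * n.
class DiagonalWeight {
    private final double[] diag; // one weight per observation

    DiagonalWeight(double[] weights) {
        this.diag = weights.clone();
    }

    /** Entry (i, j) of the implied full weight matrix. */
    double getEntry(int i, int j) {
        return (i == j) ? diag[i] : 0.0;
    }

    /** The square root of a diagonal matrix is the element-wise sqrt,
     *  so no eigendecomposition is needed in the uncorrelated case. */
    DiagonalWeight sqrt() {
        double[] r = new double[diag.length];
        for (int i = 0; i < diag.length; i++) {
            r[i] = Math.sqrt(diag[i]);
        }
        return new DiagonalWeight(r);
    }
}
```

For the polynomial fit that triggered MATH-924 (41201 points, all weights 1.0), this is roughly 320 KB of storage instead of over 13 GB for the dense matrix.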


Re: [math] major problem with new released version 3.1

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
On Sat, Dec 29, 2012 at 10:43:11AM +0100, Luc Maisonobe wrote:
> On 29/12/2012 10:08, Luc Maisonobe wrote:
> > On 29/12/2012 04:09, Gilles Sadowski wrote:
> >> Hi.
> > 
> > Hi Gilles,
> > 
> >>
> >>> [...]
> >>
> >>>> third is just bug-fix.  Does the fix for the issue that started this
> >>>> thread change the API?  If so, we would need to cut 3.2 in any case.
> >>
> >> The current fix does change the usage (one needs to call optimize with an
> >> argument of a different type). IIUC, it also silently removes the handling
> >> of uncorrelated observations.
> >>
> >>> Yes, this fixes the issue. I have created/resolved the issue (MATH-924)
> >>> and committed the fix as of r1426616.
> >>>
> >>> Could someone please review what I have done?
> >>
> >> I don't like the fix...
> > 
> > Thanks for reviewing.
> > 
> >>
> >> Handling weighted observations must take correlations into account, i.e. use
> >> a _matrix_.
> >> There is the _practical_ problem of memory. Solving it correctly is by using
> >> a sparse implementation (and this is actually an implementation _detail_).
> > 
> > Yes.
> > 
> >> If we _need_ such an implementation to solve the practical problem, I
> >> strongly suggest that we focus on creating it (or fixing what CM already
> >> has, or accept that some inconsistency will be present), rather than
> >> reducing the code applicability (i.e. allowing only uncorrelated data).
> >> If the observations are not correlated, the matrix is a diagonal matrix, not
> >> a vector.
> > 
> > It's fine with me. I simply thought it wouldn't be that easy. You proved
> > me wrong.
> > 
> >> CM also lacks implementations for symmetric, triangular, and diagonal matrix,
> >> which all would go some way to solving several practical problems of the same
> >> kind as the current one without sacrificing generality.
> > 
> > Yes, we have known that for years.
> > 
> >>
> >> Now, and several years ago, it was noticed that CM does not _have_ to
> >> provide the "weights" feature because users can handle this before they
> >> call CM code. [IIRC, no CM unit test actually uses weights different from 1.]
> >> IMO, the other valid option is thus to have a simpler version of the
> >> algorithm, but still a correct one.
> >> This would also have the advantage that we won't have the urgent need to
> >> keep the sparse matrix implementation.
> >> [Then, if at some point we include helper code to handle weights
> >> (_including_ correlations), we should also do it in a "preprocessing" step,
> >> without touching the optimization algorithms.]
> > 
> > So what do you suggest: keep the current support (with proper handling
> > as you did) or drop it?
> > 
> > Since several people asked for removing it (Dimitri, Konstantin and now
> > you), we can do that. Until now, this feature was a convenience that was
> > really useful for some cases, and it was simple. There were some errors
> > in it and Dimitri solved them in 2010, but no other problems appeared
> > since then, so it made sense simply keeping it as is. Now we are hit by
> > a second problem, and it seems it opens a can of worms. Dropping it
> > completely as a not so useful convenience which is tricky to set up
> > right would be fair.
> > 
> >>
> >>> I also think (but did not test it), that there may be a problem with
>>> missing OptimizationData. If someone calls the optimizer but forgets to
> >>> set the weight (or the target, or some other mandatory parameters), then
> >>> I'm not sure we fail with an appropriate error. Looking for example at
> >>> the private checkParameters method in the MultivariateVectorOptimizer
>>> abstract class, I guess we may have a NullPointerException if either
> >>> Target or Weight/NonCorrelatedWeight has not been parsed in the
> >>> parseOptimizationData method. Could someone confirm this?
> >>
> >> Yes.
> >> And this (not checking for missing data) _could_ be considered a feature, as
> >> I stressed several times on this ML, and in the code documentation
> >> (eliciting zero constructive comment).
> >> We also _agreed_ that users not passing needed data will result in NPE.
> >> [I imagined that applications would check that valid and complete input is
> >> passed to the lower level "optimize(OptimizationData ...)" methods.]
> > 
> > Yes, and I agree with that. However, I found the javadoc to be
> > ambiguous. It says "The following data will be looked for:" followed by
> > a list. There is no distinction between optional and required
> > parameters. As an example, simple bounds are optional whereas initial
> > guess or weight are required, but there is nothing to tell this to the user.
> > So in this case, either we should provide proper exception or proper
> > documentation. I am OK with both.
> > 
> >>
> >> It is however straightforward to add a "checkParameters()" method that would
> >> raise more specific exceptions. [Although that would contradict the
> >> conclusion of the previous discussion about NPE in CM. And restart it,
> >> without even getting a chance to go forward with what had been decided!]
> > 
> > As long as we identify the parameters that are optional (and hence the user
> > can deduce the other ones are mandatory and will raise an NPE), this
> > would be fine. I don't ask to restart this tiring discussion, just make
> > sure users have the proper way to understand why they get an NPE when
> > they forget the weight and why they don't get one when they forget the simple bounds.
> > 
> > Also, weights should really be optional and have a fallback to the identity
> > matrix, but this is another story.
> > 
> >>
> >>
> >> Hence, to summarize:
> >>  * The fix, in a 3.2 release, should be to replace the matrix implementation
> >>    with one that does not exhaust the memory, e.g. "OpenMapRealMatrix"[1] or
> >>    "DiagonalMatrix" (see my patch for MATH-924), but not change the API.
> > 
> > +1
> 
> I forgot to ask: do you want me to first revert my change before you
> commit yours

Yes, please.

> or will you do both at the same time?

I wouldn't know how to do that.
What is the svn command to revert a list of files to their state in a given
revision?

Thanks,
Gilles

> 
> Luc
> 
> > 
> >>  * We must decide whether to deprecate the weights feature in that release
> >>    (and thus remove it in 4.0) or to keep it in its general form (and thus
> >>    un-deprecate "OpenMapRealMatrix"[2]).
> > 
> > +1 to deprecate.
> > 
> > best regards
> > Luc
> > 
> >>
> >>
> >> Best regards,
> >> Gilles
> >>
> >> [1] The inconsistencies that led to the deprecation will have no bearing
> >>     on usage within the optimizers.
> >> [2] The latter option seems likely to entail a fair amount of work to
> >>     improve the performance of "OpenMapRealMatrix" (which is quite poor
> >>     for some operations, e.g. "multiply").
> >>
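The "checkParameters()" idea discussed in this message could look roughly like the sketch below: after the OptimizationData varargs have been parsed, each required field is verified and a specific exception replaces the bare NullPointerException. The method shape and exception types here are illustrative guesses, not the actual Commons Math code.

```java
// Illustrative sketch of a checkParameters() that fails fast with a
// descriptive message; the real method would inspect the fields filled
// in by parseOptimizationData (Target, Weight, initial guess, ...).
class RequiredDataCheck {
    static void checkParameters(double[] target, double[] weight) {
        if (target == null) {
            throw new IllegalStateException(
                "required optimization data missing: Target");
        }
        if (weight == null) {
            throw new IllegalStateException(
                "required optimization data missing: Weight");
        }
        if (target.length != weight.length) {
            throw new IllegalArgumentException(
                "dimension mismatch: " + target.length + " != " + weight.length);
        }
        // Optional data (e.g. SimpleBounds) is deliberately not checked.
    }
}
```

This keeps the documented contract simple: required data raises a specific exception when absent, optional data is silently skipped.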


Re: [math] major problem with new released version 3.1

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
Hi.

> > [...]
> >
> > [1] The inconsistencies that led to the deprecation will have no bearing
> >     on usage within the optimizers.
> > [2] The latter option seems likely to entail a fair amount of work to
> >     improve the performance of "OpenMapRealMatrix" (which is quite poor
> >     for some operations, e.g. "multiply").
> >
> > I have never been happy with deprecating the sparse implementations of
> vectors and matrices. I also think we should "undeprecate" them and work on
> them, with some help!
> Regarding [1], my choice would be to point at the problematic cases in the
> Javadoc (and possibly the user's guide), and live with them.
> Regarding [2], would there be any performance gain if we designed an
> immutable sparse vector/matrix? I guess it would. The question is: do we
> need these objects to be mutable?

I don't think so. And certainly not for the usage in the optimizers.

Anyways, I assume that it would be quite useful to have a new matrix
hierarchy (with more types than we have now, and appropriately optimized for
storage and/or speed). And a good starting point would be to assume
immutability.


Best,
Gilles

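An immutable sparse matrix of the kind suggested here could start from something as small as the sketch below: only non-zero entries are stored, and with no setters the internal map never needs defensive copying. All names are illustrative, not a proposal for the actual CM hierarchy.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of an immutable sparse matrix: coordinate input,
// map-backed storage of non-zeros, zero returned everywhere else.
final class ImmutableSparseMatrix {
    private final int rows;
    private final int cols;
    private final Map<Long, Double> entries; // key = row * cols + col

    private ImmutableSparseMatrix(int rows, int cols, Map<Long, Double> entries) {
        this.rows = rows;
        this.cols = cols;
        this.entries = entries;
    }

    /** Builds a matrix from coordinate triples (row[k], col[k], value[k]). */
    static ImmutableSparseMatrix of(int rows, int cols,
                                    int[] row, int[] col, double[] value) {
        Map<Long, Double> m = new HashMap<>();
        for (int k = 0; k < value.length; k++) {
            m.put((long) row[k] * cols + col[k], value[k]);
        }
        return new ImmutableSparseMatrix(rows, cols, m);
    }

    double getEntry(int row, int col) {
        if (row < 0 || row >= rows || col < 0 || col >= cols) {
            throw new IndexOutOfBoundsException(row + ", " + col);
        }
        return entries.getOrDefault((long) row * cols + col, 0.0);
    }
}
```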


Re: [math] major problem with new released version 3.1

Posted by Ted Dunning <te...@gmail.com>.
Dim has it exactly right here.

On Sun, Dec 30, 2012 at 10:38 AM, Dimitri Pourbaix <pourbaix@astro.ulb.ac.be> wrote:

> In optimization, the user supplies the function to be minimised.  In curve
> fitting, the user supplies a series of observations and the model to be
> fitted.  Trying to combine both into a unique scheme (however highly abstract
> it is) is simply misleading.
>

Re: [math] major problem with new released version 3.1

Posted by Dimitri Pourbaix <po...@astro.ulb.ac.be>.
Hi,

> In my opinion, the whole weights fiasco is a consequence of
> improper design, as much as anything else. All components should be
> as simple as possible, with any additional add-ons, like weights,
> not added to the base implementation, but instead done as an extension
> of these classes. If this was done, all the optimisation packages
> would have been just fine, and only the extension classes would need
> fixing.
>
> I am very against having a correlation function be an input to the
> basic optimiser. The eigendecomposition of the matrix is an O(N^3)
> operation, which could be actually more expensive than the whole
> optimization. In addition you are doing this inversion every time you
> call the optimize method. If you are trying to do multiple starting
> points, you are forcing an inversion to be done each time.

Rather than (or besides) an improper design, CM also tries to combine
pears and apples in a single scheme.

A typical unconstrained optimization problem assumes g, a function from
R^n to R.  A general optimization algorithm takes g for sure.  Depending
on the nature of g (continuous, differentiable, linear or not, ...),
the different algorithms might need more (g, its gradient, its Hessian
matrix, a starting point, ...).  There is no notion of weight there.

Curve fitting (whether in the 1-norm, 2-norm or infinity-norm) can be
seen as a minimisation problem, in which case, once again, one needs g
(e.g. the sum of the residual squared, ...), its gradient, a starting
point, ... depending on the algorithm adopted.  However, some specific
algorithms are designed for curve fitting only, e.g. Levenberg-Marquardt,
and rely upon the availability of the model function f, the series of
observations, a starting point, a weighting scheme, ...

In optimization, the user supplies the function to be minimised.  In curve
fitting, the user supplies a series of observations and the model to be
fitted.  Trying to combine both into a unique scheme (however highly abstract
it is) is simply misleading.

Dim.
----------------------------------------------------------------------------
Dimitri Pourbaix                         *      Don't worry, be happy
Institut d'Astronomie et d'Astrophysique *         and CARPE DIEM.
CP 226, office 2.N4.211, building NO     *
Universite Libre de Bruxelles            *      Tel : +32-2-650.35.71
Boulevard du Triomphe                    *      Fax : +32-2-650.42.26
  B-1050 Bruxelles                        *        NAC: HBZSC RG2Z6
http://sb9.astro.ulb.ac.be/~pourbaix     * mailto:pourbaix@astro.ulb.ac.be

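Dimitri's distinction can be made concrete: a curve-fitting problem turns into a plain minimization problem once the residual sum of squares is wrapped up as the objective g. The sketch below does this for a hypothetical one-parameter model f_a(x) = a * x; all names are illustrative.

```java
import java.util.function.DoubleUnaryOperator;

// Illustrative sketch: curve fitting recast as general minimization.
// g(a) = sum_i (f_a(x_i) - y_i)^2 for the toy model f_a(x) = a * x;
// any scalar minimizer can then be applied to g with no notion of
// observations, models or weights.
class FittingAsMinimization {
    static DoubleUnaryOperator objective(double[] x, double[] y) {
        return a -> {
            double s = 0.0;
            for (int i = 0; i < x.length; i++) {
                double r = a * x[i] - y[i];
                s += r * r;
            }
            return s;
        };
    }
}
```

The dedicated curve-fitting algorithms (e.g. Levenberg-Marquardt) bypass this reduction and work on the residuals directly, which is exactly why the two problem statements do not merge cleanly.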


Re: [math] major problem with new released version 3.1

Posted by Konstantin Berlin <kb...@gmail.com>.
Hi,

In my opinion, the whole weights fiasco is a consequence of improper design, as much as anything else. All components should be as simple as possible, with any additional add-ons, like weights, not added to the base implementation, but instead done as an extension of these classes. If this was done, all the optimization packages would have been just fine, and only the extension classes would need fixing.

I am very against having a correlation function be an input to the basic optimizers. The eigendecomposition of the matrix is an O(N^3) operation, which could be actually more expensive than the whole optimization. In addition you are doing this inversion every time you call the optimize method. If you are trying to do multiple starting points, you are forcing an inversion to be done each time.


On Dec 29, 2012, at 7:29 AM, Gilles Sadowski <gi...@harfang.homelinux.org> wrote:

> On Sat, Dec 29, 2012 at 10:22:20AM +0100, Dimitri Pourbaix wrote:
>> Gilles,
>> 
>>> Handling weighted observations must take correlations into account, i.e. use
>>> a _matrix_.
>>> There is the _practical_ problem of memory. Solving it correctly is by
>>> using a sparse implementation (and this is actually an implementation
>>> _detail_).
>> 
>> The problem is where something becomes a detail!  You are right that the
>> general least square problem copes with a matrix of weights ... but the
>> way it is implemented is a detail.
> 
> That's what I said above, although I suspect that we don't mean the same
> thing. OO programming allows to define types that will represent the
> "real" concepts: in this case, if the problem is expressed in terms of a
> a (mathematical) matrix, the algorithm should use a "Matrix" (type).
> This is not an implementation detail; the goal is for the code to be as
> close as possible to the mathematical description of the procedure
> (self-documenting code).
> 
> The implementation detail is how the matrix type stores its data internally;
> and this can be the subject of any necessary efficiency improvements,
> independently of the matrix concept used at a higher level (e.g. in the
> optimization algorithms).
> 
>> As already pointed out, even the
>> vector of weights API allows for a complicated matrix of weights.  The user
>> premultiplies by the 'square root' of that matrix and sets all the compo-
>> nents of the weight vector to 1.  So, your enthusiasm to generalise the
>> vector of weights to a matrix was a detail to make the life of very few
>> users easier ... without adding any functionality.
> 
> This is a backward description of my change.
> In reality:
> 1. The handling of weights was there.
> 2. Assuming that people wanted to keep it, I added the functionality to
>   handle correlated observations.
> 
> If indeed the weight feature is independent of the optimization procedure,
> then _all_ references to weights should be banned.
> [If only because keeping an array of "ones" and doing loops that "multiply
> by one" are obviously not going to improve clarity and performance.]
> Eventually, this seems to be the accepted compromise now (IIUC).
> 
>> There are so many different configurations (e.g. block diagonal, ...), I
>> doubt you can handle all of them in the most efficient way
> 
> Actually, my "Weight" class trivially handles _any_ "RealMatrix" (thanks to
> inheritance!).
> 
>> so it is likely
>> preferable to have the user taking care of them.
> 
> This is exactly what "Weight" does.
> The problem is that CM does not provide efficient implementations for
> matrix forms suited for this context (symmetric, sparse, diagonal).[1]
> 
> Above and in the previous post, I agreed that this would not be a problem if
> we entirely drop the support for weights in the optimizers.
> 
>> It is however true that simple weights (i.e. vector form) are a very usual
>> situation ... which is also very common in fitting tools.  So, I think CM
>> should offer that approach as well.
> 
> Where? In the fitting tools or in the optimizers?
> We just said that weights could be handled independenttly from the
> optimization procedure. But we could indeed put weights back where they are
> most useful (e.g. in the curve fitting) without dragging everywhere (where
> most of the time they'd be equal to one...).
> 
>> In conclusion: the old CM 3.0 API was enough! :)
> 
> If that's so, then people can just copy/paste the source code of that
> version and not care about subsequent versions of CM.
> 
> 
> Cordially,
> Gilles
> 
> [1] Actually, the problem is that some people complain that we don't do
>    enough to their taste: In the past, at least 3 persons raised issues
>    with matrix implementations, but without providing any useful help,
>    unfortunately (to be clear, I'm not talking of current contributors!).
> 


Re: [math] major problem with new released version 3.1

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
On Sat, Dec 29, 2012 at 10:22:20AM +0100, Dimitri Pourbaix wrote:
> Gilles,
> 
> >Handling weighted observations must take correlations into account, i.e. use
> >a _matrix_.
> >There is the _practical_ problem of memory. Solving it correctly is by
> >using a sparse implementation (and this is actually an implementation
> >_detail_).
> 
> The problem is where something becomes a detail!  You are right that the
> general least square problem copes with a matrix of weights ... but the
> way it is implemented is a detail.

That's what I said above, although I suspect that we don't mean the same
thing. OO programming allows one to define types that will represent the
"real" concepts: in this case, if the problem is expressed in terms of
a (mathematical) matrix, the algorithm should use a "Matrix" (type).
This is not an implementation detail; the goal is for the code to be as
close as possible to the mathematical description of the procedure
(self-documenting code).

The implementation detail is how the matrix type stores its data internally;
and this can be the subject of any necessary efficiency improvements,
independently of the matrix concept used at a higher level (e.g. in the
optimization algorithms).

> As already pointed out, even the
> vector of weights API allows for a complicated matrix of weights.  The user
> premultiplies by the 'square root' of that matrix and sets all the compo-
> nents of the weight vector to 1.  So, your enthusiasm to generalise the
> vector of weights to a matrix was a detail to make the life of very few
> users easier ... without adding any functionality.

This is a backward description of my change.
In reality:
1. The handling of weights was there.
2. Assuming that people wanted to keep it, I added the functionality to
   handle correlated observations.

If indeed the weight feature is independent of the optimization procedure,
then _all_ references to weights should be banned.
[If only because keeping an array of "ones" and doing loops that "multiply
by one" are obviously not going to improve clarity and performance.]
Eventually, this seems to be the accepted compromise now (IIUC).

> There are so many different configurations (e.g. block diagonal, ...), I
> doubt you can handle all of them in the most efficient way

Actually, my "Weight" class trivially handles _any_ "RealMatrix" (thanks to
inheritance!).

> so it is likely
> preferable to have the user taking care of them.

This is exactly what "Weight" does.
The problem is that CM does not provide efficient implementations for
matrix forms suited for this context (symmetric, sparse, diagonal).[1]

Above and in the previous post, I agreed that this would not be a problem if
we entirely drop the support for weights in the optimizers.

> It is however true that simple weights (i.e. vector form) are a very usual
> situation ... which is also very common in fitting tools.  So, I think CM
> should offer that approach as well.

Where? In the fitting tools or in the optimizers?
We just said that weights could be handled independently from the
optimization procedure. But we could indeed put weights back where they are
most useful (e.g. in the curve fitting) without dragging them everywhere (where
most of the time they'd be equal to one...).

> In conclusion: the old CM 3.0 API was enough! :)

If that's so, then people can just copy/paste the source code of that
version and not care about subsequent versions of CM.


Cordially,
Gilles

[1] Actually, the problem is that some people complain that we don't do
    enough to their taste: In the past, at least 3 persons raised issues
    with matrix implementations, but without providing any useful help,
    unfortunately (to be clear, I'm not talking of current contributors!).



Re: [math] major problem with new released version 3.1

Posted by Dimitri Pourbaix <po...@astro.ulb.ac.be>.
Gilles,

> Handling weighted observations must take correlations into account, i.e. use
> a _matrix_.
> There is the _practical_ problem of memory. Solving it correctly is by
> using a sparse implementation (and this is actually an implementation
> _detail_).

The problem is where something becomes a detail!  You are right that the
general least square problem copes with a matrix of weights ... but the
way it is implemented is a detail.  As already pointed out, even the
vector of weights API allows for a complicated matrix of weights.  The user
premultiplies by the 'square root' of that matrix and sets all the compo-
nents of the weight vector to 1.  So, your enthusiasm to generalise the
vector of weights to a matrix was a detail to make the life of very few
users easier ... without adding any functionality.

There are so many different configurations (e.g. block diagonal, ...), I
doubt you can handle all of them in the most efficient way so it is likely
preferable to have the user taking care of them.

It is however true that simple weights (i.e. vector form) are a very usual
situation ... which is also very common in fitting tools.  So, I think CM
should offer that approach as well.

In conclusion: the old CM 3.0 API was enough! :)

Regards,
  Dim.
----------------------------------------------------------------------------
Dimitri Pourbaix                         *      Don't worry, be happy
Institut d'Astronomie et d'Astrophysique *         and CARPE DIEM.
CP 226, office 2.N4.211, building NO     *
Universite Libre de Bruxelles            *      Tel : +32-2-650.35.71
Boulevard du Triomphe                    *      Fax : +32-2-650.42.26
  B-1050 Bruxelles                        *        NAC: HBZSC RG2Z6
http://sb9.astro.ulb.ac.be/~pourbaix     * mailto:pourbaix@astro.ulb.ac.be

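The "premultiply by the square root" device Dimitri describes is ordinary whitening; for uncorrelated weights no eigendecomposition is involved, just an element-wise sqrt, and a plain least-squares optimizer applied to the scaled residuals reproduces the weighted result. A minimal sketch with illustrative names, not the CM API:

```java
// Illustrative sketch of handling weights as a preprocessing step,
// outside the optimizer (diagonal / uncorrelated case).
class WeightPreprocessing {
    /** Scales residuals by sqrt(w_i): plain least squares on the result
     *  equals weighted least squares on the input. */
    static double[] whiten(double[] residuals, double[] weights) {
        double[] out = new double[residuals.length];
        for (int i = 0; i < residuals.length; i++) {
            out[i] = Math.sqrt(weights[i]) * residuals[i];
        }
        return out;
    }

    /** sum_i w_i * r_i^2, used to check the equivalence. */
    static double weightedSumSq(double[] r, double[] w) {
        double s = 0.0;
        for (int i = 0; i < r.length; i++) {
            s += w[i] * r[i] * r[i];
        }
        return s;
    }
}
```

With this done up front, the optimizer itself needs no weight parameter at all, which is the compromise the thread converges on.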


Re: [math] major problem with new released version 3.1

Posted by Luc Maisonobe <lu...@spaceroots.org>.
On 29/12/2012 10:08, Luc Maisonobe wrote:
> On 29/12/2012 04:09, Gilles Sadowski wrote:
>> Hi.
> 
> Hi Gilles,
> 
>>
>>> [...]
>>
>>>> third is just bug-fix.  Does the fix for the issue that started this
>>>> thread change the API?  If so, we would need to cut 3.2 in any case.
>>
>> The current fix does change the usage (one needs to call optimize with an
>> argument of a different type). IIUC, it also silently removes the handling
>> of uncorrelated observations.
>>
>>> Yes, this fixes the issue. I have created/resolved the issue (MATH-924)
>>> and committed the fix as of r1426616.
>>>
>>> Could someone please review what I have done?
>>
>> I don't like the fix...
> 
> Thanks for reviewing.
> 
>>
>> Handling weighted observations must take correlations into account, i.e. use
>> a _matrix_.
>> There is the _practical_ problem of memory. Solving it correctly is by using
>> a sparse implementation (and this is actually an implementation _detail_).
> 
> Yes.
> 
>> If we _need_ such an implementation to solve the practical problem, I
>> strongly suggest that we focus on creating it (or fixing what CM already
>> has, or accept that some inconsistency will be present), rather than
>> reducing the code applicability (i.e. allowing only uncorrelated data).
>> If the observations are not correlated, the matrix is a diagonal matrix, not
>> a vector.
> 
> It's fine with me. I simply thought it wouldn't be that easy. You proved
> me wrong.
> 
>> CM also lacks implementations for symmetric, triangular, and diagonal matrix,
>> which all would go some way to solving several practical problems of the same
>> kind as the current one without sacrificing generality.
> 
> Yes, we have known that for years.
> 
>>
>> Now, and several years ago, it was noticed that CM does not _have_ to
>> provide the "weights" feature because users can handle this before they
>> call CM code. [IIRC, no CM unit test actually uses weights different from 1.]
>> IMO, the other valid option is thus to have a simpler version of the
>> algorithm, but still a correct one.
>> This would also have the advantage that we won't have the urgent need to
>> keep the sparse matrix implementation.
>> [Then, if at some point we include helper code to handle weights
>> (_including_ correlations), we should also do it in a "preprocessing" step,
>> without touching the optimization algorithms.]
> 
> So what do you suggest: keep the current support (with proper handling
> as you did) or drop it?
> 
> Since several people asked for removing it (Dimitri, Konstantin and now
> you), we can do that. Until now, this feature was a convenience that was
> really useful for some cases, and it was simple. There were some errors
> in it and Dimitri solved them in 2010, but no other problems appeared
> since then, so it made sense simply keeping it as is. Now we are hit by
> a second problem, and it seems it opens a can of worms. Dropping it
> completely as a not so useful convenience which is tricky to set up
> right would be fair.
> 
>>
>>> I also think (but did not test it), that there may be a problem with
> missing OptimizationData. If someone calls the optimizer but forgets to
>>> set the weight (or the target, or some other mandatory parameters), then
>>> I'm not sure we fail with an appropriate error. Looking for example at
>>> the private checkParameters method in the MultivariateVectorOptimizer
> abstract class, I guess we may have a NullPointerException if either
>>> Target or Weight/NonCorrelatedWeight has not been parsed in the
>>> parseOptimizationData method. Could someone confirm this?
>>
>> Yes.
>> And this (not checking for missing data) _could_ be considered a feature, as
>> I stressed several times on this ML, and in the code documentation
>> (eliciting zero constructive comment).
>> We also _agreed_ that users not passing needed data will result in NPE.
>> [I imagined that applications would check that valid and complete input is
>> passed to the lower level "optimize(OptimizationData ...)" methods.]
> 
> Yes, and I agree with that. However, I found the javadoc to be
> ambiguous. It says "The following data will be looked for:" followed by
> a list. There is no distinction between optional and required
> parameters. As an example, simple bounds are optional whereas initial
> guess or weight are required, but there is nothing to tell this to the user.
> So in this case, either we should provide proper exception or proper
> documentation. I am OK with both.
> 
>>
>> It is however straightforward to add a "checkParameters()" method that would
>> raise more specific exceptions. [Although that would contradict the
>> conclusion of the previous discussion about NPE in CM. And restart it,
>> without even getting a chance to go forward with what had been decided!]
> 
> As long as we identify the parameters that are optional (and hence the user
> can deduce the other ones are mandatory and will raise an NPE), this
> would be fine. I don't ask to restart this tiring discussion, just make
> sure users have the proper way to understand why they get an NPE when
> they forget the weight and why they don't get one when they forget the simple bounds.
> 
> Also, weights should really be optional and have a fallback to the identity
> matrix, but this is another story.
> 
>>
>>
>> Hence, to summarize:
>>  * The fix, in a 3.2 release, should be to replace the matrix implementation
>>    with one that does not exhaust the memory, e.g. "OpenMapRealMatrix"[1] or
>>    "DiagonalMatrix" (see my patch for MATH-924), but not change the API.
> 
> +1

I forgot to ask: do you want me to first revert my change before you
commit yours or will you do both at the same time?

Luc

> 
>>  * We must decide whether to deprecate the weights feature in that release
>>    (and thus remove it in 4.0) or to keep it in its general form (and thus
>>    un-deprecate "OpenMapRealMatrix"[2]).
> 
> +1 to deprecate.
> 
> best regards
> Luc
> 
>>
>>
>> Best regards,
>> Gilles
>>
>> [1] The inconsistencies that led to the deprecation will have no bearing
>>     on usage within the optimizers.
>> [2] The latter option seems likely to entail a fair amount of work to
>>     improve the performance of "OpenMapRealMatrix" (which is quite poor
>>     for some operations, e.g. "multiply").
>>


Re: [math] major problem with new released version 3.1

Posted by Sébastien Brisard <se...@m4x.org>.
Hi,


2012/12/29 Gilles Sadowski <gi...@harfang.homelinux.org>

> Hi.
>
> > [...]
>
> > > third is just bug-fix.  Does the fix for the issue that started this
> > > thread change the API?  If so, we would need to cut 3.2 in any case.
>
> The current fix does change the usage (one needs to call optimize with an
> argument of a different type). IIUC, it also silently removes the handling
> of uncorrelated observations.
>
> > Yes, this fixes the issue. I have created/resolved the issue (MATH-924)
> > and committed the fix as of r1426616.
> >
> > Could someone please review what I have done?
>
> I don't like the fix...
>
> Handling weighted observations must take correlations into account, i.e.
> use
> a _matrix_.
> There is the _practical_ problem of memory. Solving it correctly means
> using a sparse implementation (and this is actually an implementation
> _detail_).
> If we _need_ such an implementation to solve the practical problem, I
> strongly suggest that we focus on creating it (or fixing what CM already
> has, or accept that some inconsistency will be present), rather than
> reducing the code applicability (i.e. allowing only uncorrelated data).
> If the observations are not correlated, the matrix is a diagonal matrix,
> not
> a vector.
> CM also lacks implementations for symmetric, triangular, and diagonal
> matrix,
> which all would go some way to solving several practical problems of the
> same
> kind as the current one without sacrificing generality.
>
> Now, and several years ago, it was noticed that CM does not _have_ to
> provide the "weights" feature because users can handle this before they
> call CM code. [IIRC, no CM unit test actually use weights different from
> 1.]
> IMO, the other valid option is thus to have a simpler version of the
> algorithm, but still a correct one.
> This would also have the advantage that we won't have the urgent need to
> keep the sparse matrix implementation.
> [Then, if at some point we include helper code to handle weights
> (_including_ correlations), we should also do it in a "preprocessing" step,
> without touching the optimization algorithms.]
>
> > I also think (but did not test it) that there may be a problem with
> > missing OptimizationData. If someone calls the optimizer but forgets to
> > set the weight (or the target, or some other mandatory parameters), then
> > I'm not sure we fail with an appropriate error. Looking for example at
> > the private checkParameters method in the MultivariateVectorOptimizer
> > abstract class, I guess we may have a NullPointerException if either
> > Target or Weight/NonCorrelatedWeight has not been parsed in the
> > parseOptimizationData method. Could someone confirm this?
>
> Yes.
> And this (not checking for missing data) _could_ be considered a feature,
> as
> I stressed several times on this ML, and in the code documentation
> (eliciting zero constructive comment).
> We also _agreed_ that users not passing needed data will result in NPE.
> [I imagined that applications would check that valid and complete input is
> passed to the lower level "optimize(OptimizationData ...)" methods.]
>
> It is however straightforward to add a "checkParameters()" method that
> would
> raise more specific exceptions. [Although that would contradict the
> conclusion of the previous discussion about NPE in CM. And restart it,
> without even getting a chance to go forward with what had been decided!]
>
>
> Hence, to summarize:
>  * The fix, in a 3.2 release, should be to replace the matrix
> implementation
>    with one that does not exhaust the memory, e.g. "OpenMapRealMatrix"[1]
> or
>    "DiagonalMatrix" (see my patch for MATH-924), but not change the API.
>  * We must decide whether to deprecate the weights feature in that release
>    (and thus remove it in 4.0) or to keep it in its general form (and thus
>    un-deprecate "OpenMapRealMatrix"[2]).
>
>
> Best regards,
> Gilles
>
> [1] The inconsistencies that led to the deprecation will have no bearing
>     on usage within the optimizers.
> [2] The latter option seems likely to entail a fair amount of work to
>     improve the performance of "OpenMapRealMatrix" (which is quite poor
>     for some operations, e.g. "multiply").
>
I have never been happy with deprecating the sparse implementations of
vectors and matrices. I also think we should "undeprecate" them and work on
them, with some help!
Regarding [1], my choice would be to point at the problematic cases in the
Javadoc (and possibly the user's guide), and live with them.
Regarding [2], would there be any performance gain if we designed an
immutable sparse vector/matrix? I guess there would. The question is: do we
need these objects to be mutable?

Sébastien

>
>

Re: [math] major problem with new released version 3.1

Posted by Phil Steitz <ph...@gmail.com>.
On 12/29/12 1:08 AM, Luc Maisonobe wrote:
> Le 29/12/2012 04:09, Gilles Sadowski a écrit :
>> Hi.
> Hi Gilles,
>
>>> [...]
>>>> third is just bug-fix.  Does the fix for the issue that started this
>>>> thread change the API?  If so, we would need to cut 3.2 in any case.
>> The current fix does change the usage (one needs to call optimize with an
>> argument of a different type). IIUC, it also silently removes the handling
>> of uncorrelated observations.
>>
>>> Yes, this fixes the issue. I have created/resolved the issue (MATH-924)
>>> and committed the fix as of r1426616.
>>>
>>> Could someone please review what I have done?
>> I don't like the fix...
> Thanks for reviewing.
>
>> Handling weighted observations must take correlations into account, i.e. use
>> a _matrix_.
>> There is the _practical_ problem of memory. Solving it correctly means
>> using a sparse implementation (and this is actually an implementation
>> _detail_).
> Yes.
>
>> If we _need_ such an implementation to solve the practical problem, I
>> strongly suggest that we focus on creating it (or fixing what CM already
>> has, or accept that some inconsistency will be present), rather than
>> reducing the code applicability (i.e. allowing only uncorrelated data).
>> If the observations are not correlated, the matrix is a diagonal matrix, not
>> a vector.
> It's fine with me. I simply thought it wouldn't be that easy. You proved
> me wrong.
>
>> CM also lacks implementations for symmetric, triangular, and diagonal matrix,
>> which all would go some way to solving several practical problems of the same
>> kind as the current one without sacrificing generality.
> Yes, we have known that for years.
>
>> Now, and several years ago, it was noticed that CM does not _have_ to
>> provide the "weights" feature because users can handle this before they
>> call CM code. [IIRC, no CM unit test actually use weights different from 1.]
>> IMO, the other valid option is thus to have a simpler version of the
>> algorithm, but still a correct one.
>> This would also have the advantage that we won't have the urgent need to
>> keep the sparse matrix implementation.
>> [Then, if at some point we include helper code to handle weights
>> (_including_ correlations), we should also do it in a "preprocessing" step,
>> without touching the optimization algorithms.]
> So what do you suggest: keep the current support (with proper handling
> as you did) or drop it?
>
> Since several people asked for removing it (Dimitri, Konstantin and now
> you), we can do that. Until now, this feature was a convenience that was
> really useful for some cases, and it was simple. There were some errors
> in it and Dimitri solved them in 2010, but no other problems appeared
> since then, so it made sense simply keeping it as is. Now we are hit by
> a second problem, and it seems it opens a can of worms. Dropping it
> completely, as a not-so-useful convenience which is tricky to set up
> right, would be fair.
>
>>> I also think (but did not test it) that there may be a problem with
>>> missing OptimizationData. If someone calls the optimizer but forgets to
>>> set the weight (or the target, or some other mandatory parameters), then
>>> I'm not sure we fail with an appropriate error. Looking for example at
>>> the private checkParameters method in the MultivariateVectorOptimizer
>>> abstract class, I guess we may have a NullPointerException if either
>>> Target or Weight/NonCorrelatedWeight has not been parsed in the
>>> parseOptimizationData method. Could someone confirm this?
>> Yes.
>> And this (not checking for missing data) _could_ be considered a feature, as
>> I stressed several times on this ML, and in the code documentation
>> (eliciting zero constructive comment).
>> We also _agreed_ that users not passing needed data will result in NPE.
>> [I imagined that applications would check that valid and complete input is
>> passed to the lower level "optimize(OptimizationData ...)" methods.]
> Yes, and I agree with that. However, I found the javadoc to be
> ambiguous. It says "The following data will be looked for:" followed by
> a list. There is no distinction between optional and required
> parameters. As an example, simple bounds are optional whereas initial
> guess or weight are required, but there is nothing to tell this to the
> user. So in this case, we should provide either a proper exception or
> proper documentation. I am OK with both.
>
>> It is however straightforward to add a "checkParameters()" method that would
>> raise more specific exceptions. [Although that would contradict the
>> conclusion of the previous discussion about NPE in CM. And restart it,
>> without even getting a chance to go forward with what had been decided!]
> As long as we identify the parameters that are optional (and hence users
> can deduce the other ones are mandatory and will raise an NPE), this
> would be fine. I don't ask to restart this tiring discussion, just to make
> sure users have a proper way to understand why they get an NPE when
> they forget the weight and why they don't get one when they forget
> simple bounds.

+1 - can we just make it clear in the javadoc what is optional, what
is required, and add tests to illustrate?  I assume it is intractable
to just add full signatures for the invocations that "work"?
>
> Also weight should really be optional and have a fallback to identity
> matrix, but this is another story.

+1 - and while I have not yet really penetrated this code, I have
sympathy for Konstantin's views that it might be better to actually
separate the impls.  I don't want to open a can of worms forcing us
to add yet more complexity by merging impls; but we implemented a
special case of least squares in the stat.regression, where we
separated GLS from OLS, allowing the OLS code to be simpler. 
Separation turned out to make things easier both for users and us
there.  The setup is not perfect, but it is documentable and
maintainable.  Separation also had the benefit that bugs and lack of
specification clarity in early versions of GLS did not impact OLS.

Can we agree at this point that Gilles' commit in r1426758 resolves
the issue that started this thread (MATH-924)?

I am also +1 on the following:

0) continuing in the direction started in r1426758, adding
specialized matrix impls
1) "undeprecating" the sparse impls
2) moving to in-place vector/matrix operations within
implementations as much as possible

Item 2) creates some tension with the objective of having the impls
reflect the mathematics (Gilles' good point about "self-documenting
code"), but there may be clever ways to maintain readability / good
OO structure while achieving large efficiency gains using the
visitor pattern or Konstantin's suggestion of an InPlace interface.
Let's look at the Mahout classes for some inspiration there.  IIUC
the proposals, this also creates tension with the objective to favor
immutability.
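[Editorial illustration of the visitor idea above: a minimal, self-contained sketch in plain Java. This is not the Commons Math API; all names (ChangingVisitor, DenseVector, mapToSelf) are hypothetical. The point is that a "changing" visitor lets an implementation update its own storage in place, so no intermediate vector is allocated.]

```java
// Sketch only: a changing visitor returns the new value for each entry,
// and the vector stores it back into its own array (no copy).
interface ChangingVisitor {
    double visit(int index, double value); // returns the replacement value
}

final class DenseVector {
    private final double[] data;

    DenseVector(double[] data) { this.data = data.clone(); }

    /** Applies the visitor to every entry, storing the returned values in place. */
    void mapToSelf(ChangingVisitor v) {
        for (int i = 0; i < data.length; i++) {
            data[i] = v.visit(i, data[i]);
        }
    }

    double get(int i) { return data[i]; }
}

public class InPlaceDemo {
    public static void main(String[] args) {
        DenseVector v = new DenseVector(new double[] {1.0, 2.0, 3.0});
        v.mapToSelf((i, x) -> 2.0 * x); // in-place scaling, no intermediate vector
        System.out.println(v.get(2));   // 6.0
    }
}
```

The caller never sees the mutation as an API concern: externally, mapToSelf behaves like "replace every entry", while internally no temporary allocation occurs.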

Regarding 1), any suggestions on which issues to work on /
approaches to try first?

Phil
>
>>
>> Hence, to summarize:
>>  * The fix, in a 3.2 release, should be to replace the matrix implementation
>>    with one that does not exhaust the memory, e.g. "OpenMapRealMatrix"[1] or
>>    "DiagonalMatrix" (see my patch for MATH-924), but not change the API.
> +1
>
>>  * We must decide whether to deprecate the weights feature in that release
>>    (and thus remove it in 4.0) or to keep it in its general form (and thus
>>    un-deprecate "OpenMapRealMatrix"[2]).
> +1 to deprecate.
>
> best regards
> Luc
>
>>
>> Best regards,
>> Gilles
>>
>> [1] The inconsistencies that led to the deprecation will have no bearing
>>     on usage within the optimizers.
>> [2] The latter option seems likely to entail a fair amount of work to
>>     improve the performance of "OpenMapRealMatrix" (which is quite poor
>>     for some operations, e.g. "multiply").
>>
>>
>>
>
>
>




Re: [math] major problem with new released version 3.1

Posted by Luc Maisonobe <Lu...@free.fr>.
Le 29/12/2012 04:09, Gilles Sadowski a écrit :
> Hi.

Hi Gilles,

> 
>> [...]
> 
>>> third is just bug-fix.  Does the fix for the issue that started this
>>> thread change the API?  If so, we would need to cut 3.2 in any case.
> 
> The current fix does change the usage (one needs to call optimize with an
> argument of a different type). IIUC, it also silently removes the handling
> of uncorrelated observations.
> 
>> Yes, this fixes the issue. I have created/resolved the issue (MATH-924)
>> and committed the fix as of r1426616.
>>
>> Could someone please review what I have done?
> 
> I don't like the fix...

Thanks for reviewing.

> 
> Handling weighted observations must take correlations into account, i.e. use
> a _matrix_.
> There is the _practical_ problem of memory. Solving it correctly means
> using a sparse implementation (and this is actually an implementation
> _detail_).

Yes.

> If we _need_ such an implementation to solve the practical problem, I
> strongly suggest that we focus on creating it (or fixing what CM already
> has, or accept that some inconsistency will be present), rather than
> reducing the code applicability (i.e. allowing only uncorrelated data).
> If the observations are not correlated, the matrix is a diagonal matrix, not
> a vector.

It's fine with me. I simply thought it wouldn't be that easy. You proved
me wrong.

> CM also lacks implementations for symmetric, triangular, and diagonal matrix,
> which all would go some way to solving several practical problems of the same
> kind as the current one without sacrificing generality.

Yes, we have known that for years.

> 
> Now, and several years ago, it was noticed that CM does not _have_ to
> provide the "weights" feature because users can handle this before they
> call CM code. [IIRC, no CM unit test actually use weights different from 1.]
> IMO, the other valid option is thus to have a simpler version of the
> algorithm, but still a correct one.
> This would also have the advantage that we won't have the urgent need to
> keep the sparse matrix implementation.
> [Then, if at some point we include helper code to handle weights
> (_including_ correlations), we should also do it in a "preprocessing" step,
> without touching the optimization algorithms.]

So what do you suggest: keep the current support (with proper handling
as you did) or drop it?

Since several people asked for removing it (Dimitri, Konstantin and now
you), we can do that. Until now, this feature was a convenience that was
really useful for some cases, and it was simple. There were some errors
in it and Dimitri solved them in 2010, but no other problems appeared
since then, so it made sense simply keeping it as is. Now we are hit by
a second problem, and it seems it opens a can of worms. Dropping it
completely, as a not-so-useful convenience which is tricky to set up
right, would be fair.

> 
>> I also think (but did not test it) that there may be a problem with
>> missing OptimizationData. If someone calls the optimizer but forgets to
>> set the weight (or the target, or some other mandatory parameters), then
>> I'm not sure we fail with an appropriate error. Looking for example at
>> the private checkParameters method in the MultivariateVectorOptimizer
>> abstract class, I guess we may have a NullPointerException if either
>> Target or Weight/NonCorrelatedWeight has not been parsed in the
>> parseOptimizationData method. Could someone confirm this?
> 
> Yes.
> And this (not checking for missing data) _could_ be considered a feature, as
> I stressed several times on this ML, and in the code documentation
> (eliciting zero constructive comment).
> We also _agreed_ that users not passing needed data will result in NPE.
> [I imagined that applications would check that valid and complete input is
> passed to the lower level "optimize(OptimizationData ...)" methods.]

Yes, and I agree with that. However, I found the javadoc to be
ambiguous. It says "The following data will be looked for:" followed by
a list. There is no distinction between optional and required
parameters. As an example, simple bounds are optional whereas initial
guess or weight are required, but there is nothing to tell this to the
user. So in this case, we should provide either a proper exception or
proper documentation. I am OK with both.

> 
> It is however straightforward to add a "checkParameters()" method that would
> raise more specific exceptions. [Although that would contradict the
> conclusion of the previous discussion about NPE in CM. And restart it,
> without even getting a chance to go forward with what had been decided!]

As long as we identify the parameters that are optional (and hence users
can deduce the other ones are mandatory and will raise an NPE), this
would be fine. I don't ask to restart this tiring discussion, just to make
sure users have a proper way to understand why they get an NPE when
they forget the weight and why they don't get one when they forget
simple bounds.

Also weight should really be optional and have a fallback to identity
matrix, but this is another story.
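[Editorial sketch of the fallback idea above, in self-contained Java. This is a hypothetical helper, not current Commons Math code: a missing weight specification is treated as the identity, i.e. all weights equal to 1, instead of surfacing later as an NPE.]

```java
public class WeightFallbackDemo {
    /**
     * Hypothetical helper: if no weights were supplied, fall back to the
     * identity (all weights equal to 1.0); if weights were supplied, check
     * their length against the number of observations.
     */
    static double[] weightsOrIdentity(double[] weights, int n) {
        if (weights == null) {
            double[] identity = new double[n];
            java.util.Arrays.fill(identity, 1.0);
            return identity;
        }
        if (weights.length != n) {
            throw new IllegalArgumentException(
                "expected " + n + " weights, got " + weights.length);
        }
        return weights;
    }

    public static void main(String[] args) {
        double[] w = weightsOrIdentity(null, 3); // user did not pass a Weight
        System.out.println(w[0] + " " + w[1] + " " + w[2]); // 1.0 1.0 1.0
    }
}
```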

> 
> 
> Hence, to summarize:
>  * The fix, in a 3.2 release, should be to replace the matrix implementation
>    with one that does not exhaust the memory, e.g. "OpenMapRealMatrix"[1] or
>    "DiagonalMatrix" (see my patch for MATH-924), but not change the API.

+1

>  * We must decide whether to deprecate the weights feature in that release
>    (and thus remove it in 4.0) or to keep it in its general form (and thus
>    un-deprecate "OpenMapRealMatrix"[2]).

+1 to deprecate.

best regards
Luc

> 
> 
> Best regards,
> Gilles
> 
> [1] The inconsistencies that led to the deprecation will have no bearing
>     on usage within the optimizers.
> [2] The latter option seems likely to entail a fair amount of work to
>     improve the performance of "OpenMapRealMatrix" (which is quite poor
>     for some operations, e.g. "multiply").
> 
> 
> 




Re: [math] major problem with new released version 3.1

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
Hi.

> [...]

> > third is just bug-fix.  Does the fix for the issue that started this
> > thread change the API?  If so, we would need to cut 3.2 in any case.

The current fix does change the usage (one needs to call optimize with an
argument of a different type). IIUC, it also silently removes the handling
of uncorrelated observations.

> Yes, this fixes the issue. I have created/resolved the issue (MATH-924)
> and committed the fix as of r1426616.
> 
> Could someone please review what I have done?

I don't like the fix...

Handling weighted observations must take correlations into account, i.e. use
a _matrix_.
There is the _practical_ problem of memory. Solving it correctly means using
a sparse implementation (and this is actually an implementation _detail_).
If we _need_ such an implementation to solve the practical problem, I
strongly suggest that we focus on creating it (or fixing what CM already
has, or accept that some inconsistency will be present), rather than
reducing the code applicability (i.e. allowing only uncorrelated data).
If the observations are not correlated, the matrix is a diagonal matrix, not
a vector.
CM also lacks implementations for symmetric, triangular, and diagonal matrix,
which all would go some way to solving several practical problems of the same
kind as the current one without sacrificing generality.

Now, and several years ago, it was noticed that CM does not _have_ to
provide the "weights" feature because users can handle this before they
call CM code. [IIRC, no CM unit test actually use weights different from 1.]
IMO, the other valid option is thus to have a simpler version of the
algorithm, but still a correct one.
This would also have the advantage that we won't have the urgent need to
keep the sparse matrix implementation.
[Then, if at some point we include helper code to handle weights
(_including_ correlations), we should also do it in a "preprocessing" step,
without touching the optimization algorithms.]
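[Editorial sketch of the "preprocessing" step described above, for the uncorrelated case: premultiplying each observation and model value by the square root of its weight turns the weighted least-squares sum into a plain unweighted one. Self-contained Java; the method names are hypothetical, and the general correlated case would use a square root of the full weight matrix instead.]

```java
public class WeightPreprocessDemo {
    /** Weighted sum of squared residuals: sum_i w_i (y_i - f_i)^2. */
    static double weightedSumOfSquares(double[] y, double[] f, double[] w) {
        double s = 0;
        for (int i = 0; i < y.length; i++) {
            double r = y[i] - f[i];
            s += w[i] * r * r;
        }
        return s;
    }

    /**
     * Same quantity after premultiplying observation and model by sqrt(w_i):
     * an unweighted solver applied to the scaled data gives the weighted fit.
     */
    static double preprocessedSumOfSquares(double[] y, double[] f, double[] w) {
        double s = 0;
        for (int i = 0; i < y.length; i++) {
            double sq = Math.sqrt(w[i]);
            double r = sq * y[i] - sq * f[i]; // scaled observation minus scaled model
            s += r * r;
        }
        return s;
    }

    public static void main(String[] args) {
        double[] y = {1.0, 2.0, 4.0};
        double[] f = {1.5, 2.5, 3.0};
        double[] w = {4.0, 1.0, 0.25};
        System.out.println(weightedSumOfSquares(y, f, w));     // 1.5
        System.out.println(preprocessedSumOfSquares(y, f, w)); // 1.5
    }
}
```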

> I also think (but did not test it) that there may be a problem with
> missing OptimizationData. If someone calls the optimizer but forgets to
> set the weight (or the target, or some other mandatory parameters), then
> I'm not sure we fail with an appropriate error. Looking for example at
> the private checkParameters method in the MultivariateVectorOptimizer
> abstract class, I guess we may have a NullPointerException if either
> Target or Weight/NonCorrelatedWeight has not been parsed in the
> parseOptimizationData method. Could someone confirm this?

Yes.
And this (not checking for missing data) _could_ be considered a feature, as
I stressed several times on this ML, and in the code documentation
(eliciting zero constructive comment).
We also _agreed_ that users not passing needed data will result in NPE.
[I imagined that applications would check that valid and complete input is
passed to the lower level "optimize(OptimizationData ...)" methods.]

It is however straightforward to add a "checkParameters()" method that would
raise more specific exceptions. [Although that would contradict the
conclusion of the previous discussion about NPE in CM. And restart it,
without even getting a chance to go forward with what had been decided!]
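[Editorial sketch of such a "checkParameters()" in self-contained Java. The field and message names are hypothetical, not the actual CM implementation: the point is to report *which* mandatory optimization datum is missing, instead of letting a later dereference surface as a bare NPE.]

```java
public class CheckParametersDemo {
    // Hypothetical stand-ins for data parsed from OptimizationData arguments.
    private final double[] target;
    private final double[] start;

    CheckParametersDemo(double[] target, double[] start) {
        this.target = target;
        this.start = start;
    }

    /** Raises a specific exception naming the missing mandatory datum. */
    void checkParameters() {
        if (target == null) {
            throw new IllegalStateException("no Target was passed to optimize()");
        }
        if (start == null) {
            throw new IllegalStateException("no InitialGuess was passed to optimize()");
        }
    }

    public static void main(String[] args) {
        CheckParametersDemo opt = new CheckParametersDemo(null, new double[] {0.0});
        try {
            opt.checkParameters();
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // names the missing datum
        }
    }
}
```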


Hence, to summarize:
 * The fix, in a 3.2 release, should be to replace the matrix implementation
   with one that does not exhaust the memory, e.g. "OpenMapRealMatrix"[1] or
   "DiagonalMatrix" (see my patch for MATH-924), but not change the API.
 * We must decide whether to deprecate the weights feature in that release
   (and thus remove it in 4.0) or to keep it in its general form (and thus
   un-deprecate "OpenMapRealMatrix"[2]).
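[Editorial sketch of why a "DiagonalMatrix" solves the memory problem; this is an illustration only, not the MATH-924 patch, and the class names are hypothetical. Diagonal storage needs n doubles (about 322 KB for the 41201-point fit mentioned earlier) and multiplies in O(n), versus n^2 entries (roughly 13 GB of doubles) for a dense weight matrix.]

```java
public class DiagonalWeightDemo {
    /** Minimal diagonal-matrix sketch: stores only the n diagonal entries. */
    static final class Diagonal {
        private final double[] diag;

        Diagonal(double[] diag) { this.diag = diag.clone(); }

        /** y = D * x, computed in O(n) instead of O(n^2). */
        double[] operate(double[] x) {
            double[] y = new double[diag.length];
            for (int i = 0; i < diag.length; i++) {
                y[i] = diag[i] * x[i];
            }
            return y;
        }
    }

    public static void main(String[] args) {
        Diagonal w = new Diagonal(new double[] {2.0, 0.5, 1.0});
        double[] y = w.operate(new double[] {3.0, 4.0, 5.0});
        System.out.println(y[0] + " " + y[1] + " " + y[2]); // 6.0 2.0 5.0
    }
}
```

Because only the diagonal is stored, the API change is invisible to callers: optimizers that already read only the diagonal (e.g. Gauss-Newton) behave identically.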


Best regards,
Gilles

[1] The inconsistencies that led to the deprecation will have no bearing
    on usage within the optimizers.
[2] The latter option seems likely to entail a fair amount of work to
    improve the performance of "OpenMapRealMatrix" (which is quite poor
    for some operations, e.g. "multiply").



Re: [math] major problem with new released version 3.1

Posted by Luc Maisonobe <Lu...@free.fr>.
Le 28/12/2012 19:47, Phil Steitz a écrit :
> On 12/28/12 10:35 AM, Luc Maisonobe wrote:
>> Le 28/12/2012 19:11, Phil Steitz a écrit :
>>> On 12/28/12 9:58 AM, Luc Maisonobe wrote:
>>>> Le 28/12/2012 17:51, Konstantin Berlin a écrit :
>>>>> Hi,
>>>>>
>>>>> I can understand Dimitri's frustration, it seems the optimization
>>>>> framework gets worse with every iteration. However, we should
>>>>> probably look forward and think about how to design it properly
>>>>> instead.
>>>>>
>>>>> Several times I brought out some problems and ideas about the package
>>>>> and it seems the only person who has an opinion is Gilles.
>>>> Several people contributed to the thread (see
>>>> <http://commons.markmail.org/thread/i6klmc2ytflb6qnt>), but as Gilles
>>>> pointed out in one of the message, we lack an optimization expert. I
>>>> sincerely would not want my opinion to be taken too seriously on this,
>>>> so I only expressed what I could and did not decide anything by myself
>>>> (only proposed to remove the wrong binding with DerivativeStructure,
>>>> which has since been done).
>>>>
>>>>> I will list what I consider to be major problems
>>>>> 1) The OO design is bad, too much inheritance which could be handled
>>>>> better by interfaces, the structure has no relation to the actual way
>>>>> parts of optimizers can be mixed and matched. Input functions should
>>>>> also have clear inheritance structure and should return both the
>>>>> value, gradient, hessian in one function call.
>>>> I strongly agree with Konstantin here. Abstract classes make it possible
>>>> to add methods without breaking compatibility (which is good), but they also
>>>> have some drawbacks (we have seen one drawback with the parameterized
>>>> classes a few months ago, and this huge hierarchy is another one). So
>>>> there is no silver bullet and we keep trying to find the good balance.
>>>> As far as I am concerned, I would prefer we get fewer abstract classes,
>>>> we remove some intermediate level (I don't know which ones), and we use
>>>> more delegation than inheritance.
>>>>
>>>>> 2) Incorrect handling of constraints. There are only something like 5
>>>>> possible constraints possible in optimization, with each
>>>>> implementation of the solver handling some but not all. There is no
>>>>> need to this runtime approach, which creates incredible amount of
>>>>> confusion. All the possible constraints should be explicitly given in
>>>>> the parameters to a function call, there are only 5. In addition,
>>>>> constraints should be pre-processed a priori! So they should be an
>>>>> input to the constructor not to the optimization function call.
>>>> Our implementation for constraints is really limited (once again, scarce
>>>> resources). What are the 5 types you consider? Simple/double bounds on
>>>> parameters, linear/non-linear bounds and equality?
>>>>
>>>>> 3) Linear algebra package should be used as an interface and
>>>>> internally to increase performance for larger datasets. Data copying
>>>>> should be avoided as much as possible.
>>>> Yes, but this would require solving another sub-problem first: having a
>>>> decent sparse linear algebra implementation, which we also lack. Our
>>>> implementation for full matrices was also in a sorry state prior to 3.0,
>>>> but now fortunately this has improved at least for systems up to a few
>>>> thousands rows and columns (so at least we do make progress on some points).
>>>>
>>>>> 4) Testing should be done on larger problems.
>>>> Yes. I guess there are some general well known problems for that, so we
>>>> should get a few of them and implement them. We did implement a number
>>>> of tests from Minpack, but they focused on difficult cases rather than
>>>> on problem size. I think optimization has a good testing coverage, but
>>>> clearly large size problems is a needed addition.
>>>>
>>>>> I know the response is that I am free to go implemented, but I think
>>>>> we should at least agree on design principles instead of pushing
>>>>> through your own ideas because the other person is too busy. The only
>>>>> discussion we ever had on this was between me and Gilles, everyone
>>>>> else disappeared.
>>>> Well, we tried to keep up as our skills allowed, and we were also
>>>> concerned with 3.1 being released at the same time.
>>>>
>>>> We have more time now than we had a few weeks ago. This is an
>>>> opportunity to restart the discussion. We can refrain from pushing a new
>>>> release (despite I would like this bug fix to be released officially)
>>>> and take some time to think calmly. We could also push 3.2 with only the
>>>> fix
>>> What about 3.1.1 with just the fix for this, then possibly 3.2 or
>>> direct to 4.0.
>> As you want. I don't like adding too many sub-release numbers, as
>> if someone feared jumping to the next version. I remember some Sun
>> products and Perl versions that ended up with something like 5 release
>> digits. Well, I don't really care, so it can be 3.1.1 if people prefer
>> that.
> 
> I would personally like to see us move in that direction.  Now that
> Gilles has done the hard work to get us to a pretty well-automated
> release process, it should not be that hard for us to crank
> releases.  I have always admired the tomcat community's approach to
> releases - full automation and regular dot releases.  This enables
> them to fix issues found in releases quickly and keep progressing. 
> Last I checked, they just used three levels.  The Commons versioning
> guidelines allow and support this - first number is major (can break
> compat) second is minor (can add new features and APIs compatibly),
> third is just bug-fix.  Does the fix for the issue that started this
> thread change the API?  If so, we would need to cut 3.2 in any case.

Yes, this fixes the issue. I have created/resolved the issue (MATH-924)
and committed the fix as of r1426616.

Could someone please review what I have done?

I also think (but did not test it) that there may be a problem with
missing OptimizationData. If someone calls the optimizer but forgets to
set the weight (or the target, or some other mandatory parameters), then
I'm not sure we fail with an appropriate error. Looking for example at
the private checkParameters method in the MultivariateVectorOptimizer
abstract class, I guess we may have a NullPointerException if either
Target or Weight/NonCorrelatedWeight has not been parsed in the
parseOptimizationData method. Could someone confirm this?

best regards,
Luc

> 
> Phil
>>
>> Luc
>>
>>> Phil
>>>>  and without any revamp and start thinking about 4.0 with a redesign
>>>> of these two main area: optimization and sparse linear algebra.
>>>>
>>>> If you could contribute to this discussion understanding we are not
>>>> experts of this field and we cannot do it by ourselves, it would be great.
>>>>
>>>> best regards,
>>>> Luc
>>>>
>>>>> Thanks, Konstantin
>>>>>
>>>>> On Dec 28, 2012, at 11:27 AM, Phil Steitz <ph...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> On 12/28/12 8:12 AM, Dimitri Pourbaix wrote:
>>>>>>> Luc,
>>>>>>>
>>>>>>>> So in order to make sure I understand your point, you would be
>>>>>>>> OK if I deprecate the non-diagonal weights, in which case users
>>>>>>>> needing this would have to implement it themselves by
>>>>>>>> premultiplication (as both you and Konstantin seem to
>>>>>>>> propose)?
>>>>>>> Yes, exactly.
>>>>>>>
>>>>>>>> Sure, but for the record the feature was also a last minute 
>>>>>>>> change. This was discussed on the list, and the final decision
>>>>>>>> was to add this feature despite the release was close. No
>>>>>>>> wonder we failed to test it thoroughly.
>>>>>>> Last minute?  I have been discussing this with Gilles for
>>>>>>> several months.
>>>>>> Relevant project discussion happens *on this list*
>>>>>>>> We don't expect our releases to be perfect. We do our best,
>>>>>>>> with the resources we have.
>>>>>>> I perfectly understand this but focusing those resources less on 
>>>>>>> rules and more on real cases might help.
>>>>>> As stated before, you are more than welcome to *become* one of
>>>>>> these resources.
>>>>>>
>>>>>> Phil
>>>>>>> Regards, Dim. 
>>>>>>> ----------------------------------------------------------------------------
>>>>>>>
>>>>>>>
>>>>>>>
>>>> Dimitri Pourbaix                         *      Don't worry, be happy
>>>>>>> Institut d'Astronomie et d'Astrophysique *         and CARPE
>>>>>>> DIEM. CP 226, office 2.N4.211, building NO     * Universite Libre
>>>>>>> de Bruxelles            *      Tel : +32-2-650.35.71 Boulevard du
>>>>>>> Triomphe                    *      Fax : +32-2-650.42.26 B-1050
>>>>>>> Bruxelles                        *        NAC: HBZSC RG2Z6 
>>>>>>> http://sb9.astro.ulb.ac.be/~pourbaix     * 
>>>>>>> mailto:pourbaix@astro.ulb.ac.be
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>>
>>>>>>>
>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>>>>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>>>>>
>>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>>
>>>>>>
>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>>>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>>>>
>>>>> ---------------------------------------------------------------------
>>>>>
>>>>>
>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>>>
>>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>>
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
>>
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [math] major problem with new released version 3.1

Posted by Phil Steitz <ph...@gmail.com>.
On 12/28/12 10:35 AM, Luc Maisonobe wrote:
> Le 28/12/2012 19:11, Phil Steitz a écrit :
>> On 12/28/12 9:58 AM, Luc Maisonobe wrote:
>>> Le 28/12/2012 17:51, Konstantin Berlin a écrit :
>>>> Hi,
>>>>
>>>> I can understand Dimitri's frustration, it seems the optimization
>>>> framework gets worse with every iteration. However, we should
>>>> probably look forward and think about how to design it properly
>>>> instead.
>>>>
>>>> Several times I brought out some problems and ideas about the package
>>>> and it seems the only person who has an opinion is Gilles.
>>> Several people contributed to the thread (see
>>> <http://commons.markmail.org/thread/i6klmc2ytflb6qnt>), but as Gilles
>>> pointed out in one of the messages, we lack an optimization expert. I
>>> sincerely would not want my opinion to be taken too seriously on this,
>>> so I only expressed what I could and did not decide anything by myself
>>> (only proposed to remove the wrong binding with DerivativeStructure,
>>> which has since been done).
>>>
>>>> I will list what I consider to be major problems
>>>> 1) The OO design is bad, too much inheritance which could be handled
>>>> better by interfaces, the structure has no relation to the actual way
>>>> parts of optimizers can be mixed and matched. Input functions should
>>>> also have clear inheritance structure and should return both the
>>>> value, gradient, hessian in one function call.
>>> I strongly agree with Konstantin here. Abstract classes allow adding
>>> methods without breaking compatibility (which is good), but they also
>>> have some drawbacks (we have seen one drawback with the parameterized
>>> classes a few months ago, and this huge hierarchy is another one). So
>>> there is no silver bullet and we keep trying to find the good balance.
>>> As far as I am concerned, I would prefer we get fewer abstract classes,
>>> we remove some intermediate level (I don't know which ones), and we use
>>> more delegation than inheritance.
>>>
>>>> 2) Incorrect handling of constraints. There are only something like 5
>>>> possible constraints possible in optimization, with each
>>>> implementation of the solver handling some but not all. There is no
>>>> need for this runtime approach, which creates an incredible amount of
>>>> confusion. All the possible constraints should be explicitly given in
>>>> the parameters to a function call, there are only 5. In addition,
>>>> constraints should be pre-processed a priori! So they should be an
>>>> input to the constructor not to the optimization function call.
>>> Our implementation for constraints is really limited (once again, scarce
>>> resources). What are the 5 types you consider? Simple/double bounds on
>>> parameters, linear/non-linear bounds and equality?
>>>
>>>> 3) Linear algebra package should be used as an interface and
>>>> internally to increase performance for larger datasets. Data copying
>>>> should be avoided as much as possible.
>>> Yes, but this would require solving another sub-problem first: having a
>>> decent sparse linear algebra implementation, which we also lack. Our
>>> implementation for full matrices was also in a sorry state prior to 3.0,
>>> but now fortunately this has improved at least for systems up to a few
>>> thousand rows and columns (so at least we do make progress on some points).
>>>
>>>> 4) Testing should be done on larger problems.
>>> Yes. I guess there are some general well known problems for that, so we
>>> should get a few of them and implement them. We did implement a number
>>> of tests from Minpack, but they focused on difficult cases rather than
>>> on problem size. I think optimization has a good testing coverage, but
>>> clearly large problems are a needed addition.
>>>
>>>> I know the response is that I am free to go implement it, but I think
>>>> we should at least agree on design principles instead of pushing
>>>> through your own ideas because the other person is too busy. The only
>>>> discussion we ever had on this was between me and Gilles, everyone
>>>> else disappeared.
>>> Well, we tried to keep up as our skills allowed, and we were also
>>> concerned with 3.1 being released at the same time.
>>>
>>> We have more time now than we had a few weeks ago. This is an
>>> opportunity to restart the discussion. We can refrain from pushing a new
>>> release (despite I would like this bug fix to be released officially)
>>> and take some time to think calmly. We could also push 3.2 with only the
>>> fix
>> What about 3.1.1 with just the fix for this, then possibly 3.2 or
>> direct to 4.0.
> As you want. I don't like adding too many sub-release numbers, just as
> if someone fears jumping to next version. I remember some Sun products
> and Perl versions that end up with something like 5 release digits.
> Well, I don't really care, so it can be 3.1.1 if people prefer that.

I would personally like to see us move in that direction.  Now that
Gilles has done the hard work to get us to a pretty well-automated
release process, it should not be that hard for us to crank out
releases.  I have always admired the tomcat community's approach to
releases - full automation and regular dot releases.  This enables
them to fix issues found in releases quickly and keep progressing.
Last I checked, they just used three levels.  The Commons versioning
guidelines allow and support this - the first number is major (can
break compat), the second is minor (can add new features and APIs
compatibly), and the third is just bug-fix.  Does the fix for the
issue that started this thread change the API?  If so, we would need
to cut 3.2 in any case.

Phil


Re: [math] major problem with new released version 3.1

Posted by Luc Maisonobe <Lu...@free.fr>.
Le 28/12/2012 19:11, Phil Steitz a écrit :
> On 12/28/12 9:58 AM, Luc Maisonobe wrote:
>> Le 28/12/2012 17:51, Konstantin Berlin a écrit :
>>> Hi,
>>>
>>> I can understand Dimitri's frustration, it seems the optimization
>>> framework gets worse with every iteration. However, we should
>>> probably look forward and think about how to design it properly
>>> instead.
>>>
>>> Several times I brought out some problems and ideas about the package
>>> and it seems the only person who has an opinion is Gilles.
>> Several people contributed to the thread (see
>> <http://commons.markmail.org/thread/i6klmc2ytflb6qnt>), but as Gilles
>> pointed out in one of the messages, we lack an optimization expert. I
>> sincerely would not want my opinion to be taken too seriously on this,
>> so I only expressed what I could and did not decide anything by myself
>> (only proposed to remove the wrong binding with DerivativeStructure,
>> which has since been done).
>>
>>> I will list what I consider to be major problems
>>> 1) The OO design is bad, too much inheritance which could be handled
>>> better by interfaces, the structure has no relation to the actual way
>>> parts of optimizers can be mixed and matched. Input functions should
>>> also have clear inheritance structure and should return both the
>>> value, gradient, hessian in one function call.
>> I strongly agree with Konstantin here. Abstract classes allow adding
>> methods without breaking compatibility (which is good), but they also
>> have some drawbacks (we have seen one drawback with the parameterized
>> classes a few months ago, and this huge hierarchy is another one). So
>> there is no silver bullet and we keep trying to find the good balance.
>> As far as I am concerned, I would prefer we get fewer abstract classes,
>> we remove some intermediate level (I don't know which ones), and we use
>> more delegation than inheritance.
>>
>>> 2) Incorrect handling of constraints. There are only something like 5
>>> possible constraints possible in optimization, with each
>>> implementation of the solver handling some but not all. There is no
>>> need for this runtime approach, which creates an incredible amount of
>>> confusion. All the possible constraints should be explicitly given in
>>> the parameters to a function call, there are only 5. In addition,
>>> constraints should be pre-processed a priori! So they should be an
>>> input to the constructor not to the optimization function call.
>> Our implementation for constraints is really limited (once again, scarce
>> resources). What are the 5 types you consider? Simple/double bounds on
>> parameters, linear/non-linear bounds and equality?
>>
>>> 3) Linear algebra package should be used as an interface and
>>> internally to increase performance for larger datasets. Data copying
>>> should be avoided as much as possible.
>> Yes, but this would require solving another sub-problem first: having a
>> decent sparse linear algebra implementation, which we also lack. Our
>> implementation for full matrices was also in a sorry state prior to 3.0,
>> but now fortunately this has improved at least for systems up to a few
>> thousand rows and columns (so at least we do make progress on some points).
>>
>>> 4) Testing should be done on larger problems.
>> Yes. I guess there are some general well known problems for that, so we
>> should get a few of them and implement them. We did implement a number
>> of tests from Minpack, but they focused on difficult cases rather than
>> on problem size. I think optimization has a good testing coverage, but
>> clearly large problems are a needed addition.
>>
>>> I know the response is that I am free to go implement it, but I think
>>> we should at least agree on design principles instead of pushing
>>> through your own ideas because the other person is too busy. The only
>>> discussion we ever had on this was between me and Gilles, everyone
>>> else disappeared.
>> Well, we tried to keep up as our skills allowed, and we were also
>> concerned with 3.1 being released at the same time.
>>
>> We have more time now than we had a few weeks ago. This is an
>> opportunity to restart the discussion. We can refrain from pushing a new
>> release (despite I would like this bug fix to be released officially)
>> and take some time to think calmly. We could also push 3.2 with only the
>> fix
> 
> What about 3.1.1 with just the fix for this, then possibly 3.2 or
> direct to 4.0.

As you want. I don't like adding too many sub-release numbers, as if
we feared jumping to the next version. I remember some Sun products
and Perl versions that ended up with something like 5 release digits.
Well, I don't really care, so it can be 3.1.1 if people prefer that.

Luc



Re: [math] major problem with new released version 3.1

Posted by Phil Steitz <ph...@gmail.com>.
On 12/28/12 9:58 AM, Luc Maisonobe wrote:
> Le 28/12/2012 17:51, Konstantin Berlin a écrit :
>> Hi,
>>
>> I can understand Dimitri's frustration, it seems the optimization
>> framework gets worse with every iteration. However, we should
>> probably look forward and think about how to design it properly
>> instead.
>>
>> Several times I brought out some problems and ideas about the package
>> and it seems the only person who has an opinion is Gilles.
> Several people contributed to the thread (see
> <http://commons.markmail.org/thread/i6klmc2ytflb6qnt>), but as Gilles
> pointed out in one of the messages, we lack an optimization expert. I
> sincerely would not want my opinion to be taken too seriously on this,
> so I only expressed what I could and did not decide anything by myself
> (only proposed to remove the wrong binding with DerivativeStructure,
> which has since been done).
>
>> I will list what I consider to be major problems
>> 1) The OO design is bad, too much inheritance which could be handled
>> better by interfaces, the structure has no relation to the actual way
>> parts of optimizers can be mixed and matched. Input functions should
>> also have clear inheritance structure and should return both the
>> value, gradient, hessian in one function call.
> I strongly agree with Konstantin here. Abstract classes allow adding
> methods without breaking compatibility (which is good), but they also
> have some drawbacks (we have seen one drawback with the parameterized
> classes a few months ago, and this huge hierarchy is another one). So
> there is no silver bullet and we keep trying to find the good balance.
> As far as I am concerned, I would prefer we get fewer abstract classes,
> we remove some intermediate level (I don't know which ones), and we use
> more delegation than inheritance.
>
>> 2) Incorrect handling of constraints. There are only something like 5
>> possible constraints possible in optimization, with each
>> implementation of the solver handling some but not all. There is no
>> need for this runtime approach, which creates an incredible amount of
>> confusion. All the possible constraints should be explicitly given in
>> the parameters to a function call, there are only 5. In addition,
>> constraints should be pre-processed a priori! So they should be an
>> input to the constructor not to the optimization function call.
> Our implementation for constraints is really limited (once again, scarce
> resources). What are the 5 types you consider? Simple/double bounds on
> parameters, linear/non-linear bounds and equality?
>
>> 3) Linear algebra package should be used as an interface and
>> internally to increase performance for larger datasets. Data copying
>> should be avoided as much as possible.
> Yes, but this would require solving another sub-problem first: having a
> decent sparse linear algebra implementation, which we also lack. Our
> implementation for full matrices was also in a sorry state prior to 3.0,
> but now fortunately this has improved at least for systems up to a few
> thousand rows and columns (so at least we do make progress on some points).
>
>> 4) Testing should be done on larger problems.
> Yes. I guess there are some general well known problems for that, so we
> should get a few of them and implement them. We did implement a number
> of tests from Minpack, but they focused on difficult cases rather than
> on problem size. I think optimization has a good testing coverage, but
> clearly large problems are a needed addition.
>
>> I know the response is that I am free to go implement it, but I think
>> we should at least agree on design principles instead of pushing
>> through your own ideas because the other person is too busy. The only
>> discussion we ever had on this was between me and Gilles, everyone
>> else disappeared.
> Well, we tried to keep up as our skills allowed, and we were also
> concerned with 3.1 being released at the same time.
>
> We have more time now than we had a few weeks ago. This is an
> opportunity to restart the discussion. We can refrain from pushing a new
> release (despite I would like this bug fix to be released officially)
> and take some time to think calmly. We could also push 3.2 with only the
> fix

What about 3.1.1 with just the fix for this, then possibly 3.2 or
going direct to 4.0?

Phil


Re: [math] major problem with new released version 3.1

Posted by Luc Maisonobe <Lu...@free.fr>.
Le 28/12/2012 17:51, Konstantin Berlin a écrit :
> Hi,
> 
> I can understand Dimitri's frustration, it seems the optimization
> framework gets worse with every iteration. However, we should
> probably look forward and think about how to design it properly
> instead.
> 
> Several times I have brought up problems and ideas about the package,
> and it seems the only person who has an opinion is Gilles.

Several people contributed to the thread (see
<http://commons.markmail.org/thread/i6klmc2ytflb6qnt>), but as Gilles
pointed out in one of the messages, we lack an optimization expert. I
sincerely would not want my opinion to be taken too seriously on this,
so I only expressed what I could and did not decide anything by myself
(I only proposed to remove the wrong binding with DerivativeStructure,
which has since been done).

> 
> I will list what I consider to be major problems

> 1) The OO design is bad: too much inheritance that could be handled
> better by interfaces, and the structure has no relation to the actual
> way parts of optimizers can be mixed and matched. Input functions
> should also have a clear inheritance structure and should return the
> value, gradient, and Hessian in one function call.

I strongly agree with Konstantin here. Abstract classes allow adding
methods without breaking compatibility (which is good), but they also
have some drawbacks (we saw one with the parameterized classes a few
months ago, and this huge hierarchy is another). So there is no silver
bullet, and we keep trying to find the right balance. As far as I am
concerned, I would prefer that we have fewer abstract classes, remove
some intermediate levels (I don't know which ones), and use more
delegation than inheritance.
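The two ideas above (delegation over inheritance, and one evaluation call
returning value and gradient together) can be combined in a small sketch.
This is illustrative only, not a proposed Commons Math API; all the names
below are hypothetical.

```java
// Illustrative sketch: an optimizer step that delegates to a function
// contract returning value and gradient in a single call, instead of
// subclassing a deep abstract hierarchy. Names are hypothetical.
public class DelegationSketch {
    /** Result of one model evaluation: value and gradient together. */
    static final class Evaluation {
        final double value;
        final double[] gradient;
        Evaluation(double value, double[] gradient) {
            this.value = value;
            this.gradient = gradient;
        }
    }

    /** One call yields everything the optimizer needs at a point. */
    interface DifferentiableFunction {
        Evaluation evaluate(double[] point);
    }

    /** An optimizer piece that delegates to the function contract. */
    static final class SteepestDescentStep {
        private final DifferentiableFunction f; // delegation, not inheritance
        SteepestDescentStep(DifferentiableFunction f) { this.f = f; }

        double[] step(double[] x, double learningRate) {
            Evaluation e = f.evaluate(x);
            double[] next = x.clone();
            for (int i = 0; i < x.length; i++) {
                next[i] -= learningRate * e.gradient[i];
            }
            return next;
        }
    }

    public static void main(String[] args) {
        // f(x) = x0^2 + x1^2, gradient (2*x0, 2*x1)
        DifferentiableFunction sq = p -> new Evaluation(
                p[0] * p[0] + p[1] * p[1],
                new double[] { 2 * p[0], 2 * p[1] });
        double[] x = new SteepestDescentStep(sq).step(new double[] {1.0, 1.0}, 0.25);
        System.out.println(x[0] + " " + x[1]); // one step toward the minimum
    }
}
```

The optimizer holds a reference to the function rather than extending an
abstract base class, so the two can be mixed and matched freely.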

> 2) Incorrect handling of constraints. There are only something like 5
> possible constraints in optimization, with each implementation of the
> solver handling some but not all. There is no need for this runtime
> approach, which creates an incredible amount of confusion. All the
> possible constraints should be explicitly given in the parameters to
> a function call; there are only 5. In addition, constraints should be
> pre-processed a priori! So they should be an input to the constructor,
> not to the optimization function call.

Our implementation for constraints is really limited (once again, scarce
resources). What are the 5 types you consider? Simple/double bounds on
parameters, linear/non-linear bounds, and equality?
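Konstantin's "constraints as constructor input" idea could be sketched as
below. This is a hypothetical illustration, not the Commons Math API, and
the five constraint kinds follow Luc's question above; all names are
made up for the example.

```java
// Hypothetical sketch: declare supported constraint kinds up front so an
// unsupported combination fails at construction time, not mid-optimization.
import java.util.EnumSet;

public class ConstraintSketch {
    enum ConstraintKind {
        SIMPLE_BOUNDS,        // l <= x <= u on individual parameters
        LINEAR_INEQUALITY,    // A x <= b
        LINEAR_EQUALITY,      // A x == b
        NONLINEAR_INEQUALITY, // c(x) <= 0
        NONLINEAR_EQUALITY    // c(x) == 0
    }

    static final class Solver {
        Solver(EnumSet<ConstraintKind> supported, EnumSet<ConstraintKind> given) {
            // Pre-process constraints a priori: reject what we cannot handle.
            if (!supported.containsAll(given)) {
                EnumSet<ConstraintKind> missing = EnumSet.copyOf(given);
                missing.removeAll(supported);
                throw new IllegalArgumentException(
                        "unsupported constraints: " + missing);
            }
        }
    }

    public static void main(String[] args) {
        // A bound-constrained solver rejects nonlinear equality immediately.
        try {
            new Solver(EnumSet.of(ConstraintKind.SIMPLE_BOUNDS),
                       EnumSet.of(ConstraintKind.NONLINEAR_EQUALITY));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```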

> 3) The linear algebra package should be used both as the interface and
> internally, to increase performance for larger datasets. Data copying
> should be avoided as much as possible.

Yes, but this would require solving another sub-problem first: having a
decent sparse linear algebra implementation, which we also lack. Our
implementation for full matrices was also in a sorry state prior to 3.0,
but fortunately this has now improved, at least for systems up to a few
thousand rows and columns (so we do make progress on some points).

> 4) Testing should be done on larger problems.

Yes. I guess there are some well-known benchmark problems for that, so
we should pick a few of them and implement them. We did implement a
number of tests from Minpack, but they focused on difficult cases
rather than on problem size. I think optimization has good testing
coverage, but large problems are clearly a needed addition.

> 
> I know the response is that I am free to go implement it, but I think
> we should at least agree on design principles instead of pushing
> through your own ideas because the other person is too busy. The only
> discussion we ever had on this was between me and Gilles; everyone
> else disappeared.

Well, we tried to keep up as our skills allowed, and we were also
concerned with 3.1 being released at the same time.

We have more time now than we had a few weeks ago. This is an
opportunity to restart the discussion. We can refrain from pushing a new
release (even though I would like this bug fix to be released
officially) and take some time to think calmly. We could also push 3.2
with only the fix and without any revamp, and start thinking about 4.0
with a redesign of these two main areas: optimization and sparse linear
algebra.

If you could contribute to this discussion, understanding that we are
not experts in this field and cannot do it by ourselves, it would be great.

best regards,
Luc

> 
> Thanks, Konstantin
> 
> On Dec 28, 2012, at 11:27 AM, Phil Steitz <ph...@gmail.com>
> wrote:
> 
>> On 12/28/12 8:12 AM, Dimitri Pourbaix wrote:
>>> Luc,
>>> 
>>>> So in order to make sure I understand your point, you would be
>>>> OK if I deprecate the non-diagonal weights, in which case users
>>>> needing this would have to implement it themselves by
>>>> premultiplication (as both you and Konstantin seem to
>>>> propose)?
>>> 
>>> Yes, exactly.
>>> 
>>>> Sure, but for the record the feature was also a last minute 
>>>> change. This was discussed on the list, and the final decision
>>>> was to add this feature despite the release was close. No
>>>> wonder we failed to test it thoroughsly.
>>> 
>>> Last minute?  I have been discussing this with Gilles for
>>> several months.
>> 
>> Relevant project discussion happens *on this list*
>>> 
>>>> We don't expect our releases to be perfect. We do our best,
>>>> with the resources we have.
>>> 
>>> I perfectly understand this but focusing those resources less on 
>>> rules and more on real cases might help.
>> 
>> As stated before, you are more than welcome to *become* one of
>> these resources.
>> 
>> Phil
>>> 
>>> Regards, Dim. 
>>> ----------------------------------------------------------------------------
>>> Dimitri Pourbaix                         *      Don't worry, be happy
>>> Institut d'Astronomie et d'Astrophysique *         and CARPE DIEM.
>>> CP 226, office 2.N4.211, building NO     *
>>> Universite Libre de Bruxelles            *      Tel : +32-2-650.35.71
>>> Boulevard du Triomphe                    *      Fax : +32-2-650.42.26
>>>  B-1050 Bruxelles                        *        NAC: HBZSC RG2Z6
>>> http://sb9.astro.ulb.ac.be/~pourbaix     * mailto:pourbaix@astro.ulb.ac.be
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>> For additional commands, e-mail: dev-help@commons.apache.org
>>> 
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [math] major problem with new released version 3.1

Posted by Konstantin Berlin <kb...@gmail.com>.
Hi,

I can understand Dimitri's frustration; it seems the optimization framework gets worse with every iteration. However, we should probably look forward and think about how to design it properly instead.

Several times I have brought up problems and ideas about the package, and it seems the only person who has an opinion is Gilles.

I will list what I consider to be the major problems:
1) The OO design is bad: too much inheritance that would be handled better by interfaces, and the structure has no relation to the way parts of optimizers can actually be mixed and matched. Input functions should also have a clear inheritance structure and should return the value, gradient, and Hessian in one function call.
2) Incorrect handling of constraints. There are only about five possible constraint types in optimization, with each solver implementation handling some but not all of them. There is no need for this runtime approach, which creates an incredible amount of confusion. All the possible constraints should be explicitly given as parameters to a function call. In addition, constraints should be pre-processed a priori, so they should be an input to the constructor, not to the optimization function call.
3) The linear algebra package should be used as an interface and internally, to increase performance for larger datasets. Data copying should be avoided as much as possible.
4) Testing should be done on larger problems.
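
As a rough sketch of what point 1 could look like (class and method
names are hypothetical, not an actual Commons Math interface; the
example model is just f(x) = sum of x_i^2):

```java
// Hypothetical sketch of point 1: one evaluation call returning value,
// gradient and Hessian together, so shared intermediate computations are
// done once. Names are illustrative, not an actual Commons Math API.
public class CombinedEvaluation {
    public final double value;
    public final double[] gradient;
    public final double[][] hessian;

    public CombinedEvaluation(double value, double[] gradient, double[][] hessian) {
        this.value = value;
        this.gradient = gradient;
        this.hessian = hessian;
    }

    /** Example model f(x) = sum(x_i^2): gradient is 2x, Hessian is 2I. */
    public static CombinedEvaluation sumOfSquares(double[] x) {
        int n = x.length;
        double v = 0.0;
        double[] g = new double[n];
        double[][] h = new double[n][n];
        for (int i = 0; i < n; i++) {
            v += x[i] * x[i];
            g[i] = 2.0 * x[i];
            h[i][i] = 2.0;
        }
        return new CombinedEvaluation(v, g, h);
    }

    public static void main(String[] args) {
        CombinedEvaluation e = sumOfSquares(new double[] {1.0, 2.0});
        System.out.println(e.value); // 5.0
    }
}
```

An optimizer needing only the gradient could simply ignore the Hessian
field, while Newton-type solvers get everything from a single call.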

I know the response is that I am free to go implement it, but I think we should at least agree on design principles instead of pushing through your own ideas because the other person is too busy. The only discussion we ever had on this was between me and Gilles; everyone else disappeared.

Thanks,
Konstantin

On Dec 28, 2012, at 11:27 AM, Phil Steitz <ph...@gmail.com> wrote:

> On 12/28/12 8:12 AM, Dimitri Pourbaix wrote:
>> Luc,
>> 
>>> So in order to make sure I understand your point, you would be OK
>>> if I
>>> deprecate the non-diagonal weights, in which case users needing this
>>> would have to implement it themselves by premultiplication (as
>>> both you
>>> and Konstantin seem to propose)?
>> 
>> Yes, exactly.
>> 
>>> Sure, but for the record the feature was also a last minute
>>> change. This
>>> was discussed on the list, and the final decision was to add this
>>> feature despite the release was close. No wonder we failed to
>>> test it
>>> thoroughly.
>> 
>> Last minute?  I have been discussing this with Gilles for several
>> months.
> 
> Relevant project discussion happens *on this list*
>> 
>>> We don't expect our releases to be perfect. We do our best, with the
>>> resources we have.
>> 
>> I perfectly understand this but focusing those resources less on
>> rules
>> and more on real cases might help.
> 
> As stated before, you are more than welcome to *become* one of these
> resources.
> 
> Phil
>> 
>> Regards,
>> Dim.
>> ----------------------------------------------------------------------------
>> 
>> Dimitri Pourbaix                         *      Don't worry, be happy
>> Institut d'Astronomie et d'Astrophysique *         and CARPE DIEM.
>> CP 226, office 2.N4.211, building NO     *
>> Universite Libre de Bruxelles            *      Tel : +32-2-650.35.71
>> Boulevard du Triomphe                    *      Fax : +32-2-650.42.26
>> B-1050 Bruxelles                        *        NAC: HBZSC RG2Z6
>> http://sb9.astro.ulb.ac.be/~pourbaix     *
>> mailto:pourbaix@astro.ulb.ac.be
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>> 
>> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
> 




Re: [math] major problem with new released version 3.1

Posted by Phil Steitz <ph...@gmail.com>.
On 12/28/12 8:12 AM, Dimitri Pourbaix wrote:
> Luc,
>
>> So in order to make sure I understand your point, you would be OK
>> if I
>> deprecate the non-diagonal weights, in which case users needing this
>> would have to implement it themselves by premultiplication (as
>> both you
>> and Konstantin seem to propose)?
>
> Yes, exactly.
>
>> Sure, but for the record the feature was also a last minute
>> change. This
>> was discussed on the list, and the final decision was to add this
>> feature despite the release was close. No wonder we failed to
>> test it
>> thoroughly.
>
> Last minute?  I have been discussing this with Gilles for several
> months.

Relevant project discussion happens *on this list*
>
>> We don't expect our releases to be perfect. We do our best, with the
>> resources we have.
>
> I perfectly understand this but focusing those resources less on
> rules
> and more on real cases might help.

As stated before, you are more than welcome to *become* one of these
resources.

Phil
>
> Regards,
>  Dim.
> ----------------------------------------------------------------------------
>
> Dimitri Pourbaix                         *      Don't worry, be happy
> Institut d'Astronomie et d'Astrophysique *         and CARPE DIEM.
> CP 226, office 2.N4.211, building NO     *
> Universite Libre de Bruxelles            *      Tel : +32-2-650.35.71
> Boulevard du Triomphe                    *      Fax : +32-2-650.42.26
>  B-1050 Bruxelles                        *        NAC: HBZSC RG2Z6
> http://sb9.astro.ulb.ac.be/~pourbaix     *
> mailto:pourbaix@astro.ulb.ac.be
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>




Re: [math] major problem with new released version 3.1

Posted by Dimitri Pourbaix <po...@astro.ulb.ac.be>.
Luc,

> So in order to make sure I understand your point, you would be OK if I
> deprecate the non-diagonal weights, in which case users needing this
> would have to implement it themselves by premultiplication (as both you
> and Konstantin seem to propose)?

Yes, exactly.

> Sure, but for the record the feature was also a last minute change. This
> was discussed on the list, and the final decision was to add this
> feature despite the release was close. No wonder we failed to test it
> thoroughly.

Last minute?  I have been discussing this with Gilles for several months.

> We don't expect our releases to be perfect. We do our best, with the
> resources we have.

I perfectly understand this but focusing those resources less on rules
and more on real cases might help.

Regards,
  Dim.
----------------------------------------------------------------------------
Dimitri Pourbaix                         *      Don't worry, be happy
Institut d'Astronomie et d'Astrophysique *         and CARPE DIEM.
CP 226, office 2.N4.211, building NO     *
Universite Libre de Bruxelles            *      Tel : +32-2-650.35.71
Boulevard du Triomphe                    *      Fax : +32-2-650.42.26
  B-1050 Bruxelles                        *        NAC: HBZSC RG2Z6
http://sb9.astro.ulb.ac.be/~pourbaix     * mailto:pourbaix@astro.ulb.ac.be



Re: [math] major problem with new released version 3.1

Posted by Konstantin Berlin <kb...@gmail.com>.
Luc,

Agreed. Dimitri brought up a good point about correlated errors that I hadn't thought about. All these cases can be handled by the user modifying his/her input, but it would be nice to think about how this can be supported in the future. I think this problem showed up for two general reasons, which I brought up before.

1) We are not testing our optimization classes on large datasets.
2) The linear algebra framework should be used internally, so that we can leverage sparse matrix operations.

-Konstantin

On Dec 28, 2012, at 10:48 AM, Luc Maisonobe <lu...@spaceroots.org> wrote:

> Le 28/12/2012 16:08, Dimitri Pourbaix a écrit :
>> Luc,
>> 
>>> However, it is also possible to set a non-diagonal weight matrix, and
>>> one class (AbstractLeastSquaresOptimizer) performs an eigen decomposition
>>> on the matrix to extract its square root. I don't know any use case for
>>> this, but it has been set up this way, so I guess someone has a use for
>>> non-diagonal weights.
>> 
>> Such a situation occurs when observations are correlated.  That is
>> actually the most general expression for a least square problem.
>> 
>>> I wonder if I should simply add this as is or if we should rather remove
>>> the non-diagonal weights feature and support only vector weights.
>> 
>> Even if a vector of weights is convenient, it would only cover a subset
>> of situations.  However, even a vector of weights is not needed if both
>> the models and the observations are pre-multiplied by the square root
>> of their weight.  By the way, I remind you that those weights already
>> caused some bugs in the 2.0 release.
>> 
>> Personally, I could live with a vector form.
> 
> So in order to make sure I understand your point, you would be OK if I
> deprecate the non-diagonal weights, in which case users needing this
> would have to implement it themselves by premultiplication (as both you
> and Konstantin seem to propose)?
> 
>> 
>> As a more general comment, I find it amazing that all the +1 for the
>> release were only concerned by the compliance with (commons) rules,
>> configuration files, ... Just 4 days after the release, you suddenly
>> figure out that a user is in trouble and you want a quick fix.  Maybe
>> such a test would have been needed BEFORE the release!
> 
> Sure, but for the record the feature was also a last minute change. This
> was discussed on the list, and the final decision was to add this
> feature despite the release was close. No wonder we failed to test it
> thoroughly.
> 
> We don't expect our releases to be perfect. We do our best, with the
> resources we have.
> 
>> 
>> Regards,
>> Dim.
> 
> 
> Le 28/12/2012 16:13, Konstantin Berlin a écrit :
>> Hi,
>> 
>> I am not clear on the use of weights in a non-least squares problem
>> is. In a least-squares problem the weights can simply be taken in by
>> the user in their input function. So the weights option is a
>> convenience feature, any non-standard matrix weights could be
>> programmed in by the user, if they need to. I have a hard time
>> imagining a case when the weights are non-diagonal. I think your idea
>> is good and you should go on with it. The quadratic memory scaling
>> with the number of data points is not acceptable.
> 
> Thanks Konstantin. So if Dimitri confirms I understood his point
> correctly, I will proceed with deprecating the Weight class and
> introduce a WeightVector class, understanding it is a convenience
> feature suited for simple non-correlated observations.
> 
>> 
>> -Konstantin
> 
> best regards,
> Luc
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
> 




Re: [math] major problem with new released version 3.1

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
On Fri, Dec 28, 2012 at 04:48:59PM +0100, Luc Maisonobe wrote:
> Le 28/12/2012 16:08, Dimitri Pourbaix a écrit :
> > Luc,
> > 
> >> However, it is also possible to set a non-diagonal weight matrix, and
> >> one class (AbstractLeastSquaresOptimizer) performs an eigen decomposition
> >> on the matrix to extract its square root. I don't know any use case for
> >> this, but it has been set up this way, so I guess someone has a use for
> >> non-diagonal weights.
> > 
> > Such a situation occurs when observations are correlated.  That is
> > actually the most general expression for a least square problem.
> > 
> >> I wonder if I should simply add this as is or if we should rather remove
> >> the non-diagonal weights feature and support only vector weights.
> > 
> > Even if a vector of weights is convenient, it would only cover a subset
> > of situations.  However, even a vector of weights is not needed if both
> > the models and the observations are pre-multiplied by the square root
> > of their weight.  By the way, I remind you that those weights already
> > caused some bugs in the 2.0 release.
> > 
> > Personally, I could live with a vector form.
> 
> So in order to make sure I understand your point, you would be OK if I
> deprecate the non-diagonal weights, in which case users needing this
> would have to implement it themselves by premultiplication (as both you
> and Konstantin seem to propose)?
> 
> > 
> > As a more general comment, I find it amazing that all the +1 for the
> > release were only concerned by the compliance with (commons) rules,
> > configuration files, ... Just 4 days after the release, you suddenly
> > figure out that a user is in trouble and you want a quick fix.  Maybe
> > such a test would have been needed BEFORE the release!
> 
> Sure, but for the record the feature was also a last minute change. This
> was discussed on the list, and the final decision was to add this
> feature despite the release was close. No wonder we failed to test it
> thoroughly.

I'm responsible for the new feature that turned out to be a problematic change,
but

  2 months ago != last minute

[And we were also aware that CM lacks performance testing. And that the
developers' respective use-cases were on datasets with a small number of
observations.]

That same problem is also present in the deprecated "optimization" package.

> We don't expect our releases to be perfect. We do our best, with the
> resources we have.
> 
> > 
> > Regards,
> >  Dim.
> 
> 
> Le 28/12/2012 16:13, Konstantin Berlin a écrit :
> > Hi,
> >
> > I am not clear on the use of weights in a non-least squares problem
> > is. In a least-squares problem the weights can simply be taken in by
> > the user in their input function. So the weights option is a
> > convenience feature, any non-standard matrix weights could be
> > programmed in by the user, if they need to. I have a hard time
> > imagining a case when the weights are non-diagonal. I think your idea
> > is good and you should go on with it. The quadratic memory scaling
> > with the number of data points is not acceptable.
> 
> Thanks Konstantin. So if Dimitri confirms I understood his point
> correctly, I will proceed with deprecating the Weight class and
> introduce a WeightVector class, understanding it is a convenience
> feature suited for simple non-correlated observations.

-1
[Cf. previous post.]


Gilles



Re: [math] major problem with new released version 3.1

Posted by Luc Maisonobe <lu...@spaceroots.org>.
Le 28/12/2012 16:08, Dimitri Pourbaix a écrit :
> Luc,
> 
>> However, it is also possible to set a non-diagonal weight matrix, and
>> one class (AbstractLeastSquaresOptimizer) performs an eigen decomposition
>> on the matrix to extract its square root. I don't know any use case for
>> this, but it has been set up this way, so I guess someone has a use for
>> non-diagonal weights.
> 
> Such a situation occurs when observations are correlated.  That is
> actually the most general expression for a least square problem.
> 
>> I wonder if I should simply add this as is or if we should rather remove
>> the non-diagonal weights feature and support only vector weights.
> 
> Even if a vector of weights is convenient, it would only cover a subset
> of situations.  However, even a vector of weights is not needed if both
> the models and the observations are pre-multiplied by the square root
> of their weight.  By the way, I remind you that those weights already
> caused some bugs in the 2.0 release.
> 
> Personally, I could live with a vector form.

So in order to make sure I understand your point, you would be OK if I
deprecate the non-diagonal weights, in which case users needing this
would have to implement it themselves by premultiplication (as both you
and Konstantin seem to propose)?

> 
> As a more general comment, I find it amazing that all the +1 for the
> release were only concerned by the compliance with (commons) rules,
> configuration files, ... Just 4 days after the release, you suddenly
> figure out that a user is in trouble and you want a quick fix.  Maybe
> such a test would have been needed BEFORE the release!

Sure, but for the record the feature was also a last-minute change. This
was discussed on the list, and the final decision was to add this
feature even though the release was close. No wonder we failed to test it
thoroughly.

We don't expect our releases to be perfect. We do our best, with the
resources we have.

> 
> Regards,
>  Dim.


Le 28/12/2012 16:13, Konstantin Berlin a écrit :
> Hi,
>
> I am not clear on the use of weights in a non-least squares problem
> is. In a least-squares problem the weights can simply be taken in by
> the user in their input function. So the weights option is a
> convenience feature, any non-standard matrix weights could be
> programmed in by the user, if they need to. I have a hard time
> imagining a case when the weights are non-diagonal. I think your idea
> is good and you should go on with it. The quadratic memory scaling
> with the number of data points is not acceptable.

Thanks Konstantin. So if Dimitri confirms I understood his point
correctly, I will proceed with deprecating the Weight class and
introducing a WeightVector class, understanding it is a convenience
feature suited for simple non-correlated observations.

>
> -Konstantin

best regards,
Luc



Re: [math] major problem with new released version 3.1

Posted by Dimitri Pourbaix <po...@astro.ulb.ac.be>.
Phil,

> You are certainly welcome to help verify future release candidates,
> as well as the commits that go into them.

I did not wait for your kind offer.  Gilles got my comments on features I
do use every day on a huge dataset.

Regards,
  Dim.
----------------------------------------------------------------------------
Dimitri Pourbaix                         *      Don't worry, be happy
Institut d'Astronomie et d'Astrophysique *         and CARPE DIEM.
CP 226, office 2.N4.211, building NO     *
Universite Libre de Bruxelles            *      Tel : +32-2-650.35.71
Boulevard du Triomphe                    *      Fax : +32-2-650.42.26
  B-1050 Bruxelles                        *        NAC: HBZSC RG2Z6
http://sb9.astro.ulb.ac.be/~pourbaix     * mailto:pourbaix@astro.ulb.ac.be



Re: [math] major problem with new released version 3.1

Posted by Phil Steitz <ph...@gmail.com>.
On 12/28/12 7:08 AM, Dimitri Pourbaix wrote:
> Luc,
>
>> However, it is also possible to set a non-diagonal weight matrix,
>> and
>> one class (AbstractLeastSquaresOptimizer) performs an eigen
>> dcomposition
>> on the matrix to extract its square root. I don't know any use
>> case for
>> this, but it has been set up this way, so I guess someone has a
>> use for
>> non-diagonal weights.
>
> Such a situation occurs when observations are correlated.  That is
> actually the most general expression for a least square problem.
>
>> I wonder if I should simply add this as is or if we should rather
>> remove
>> the non-diagonal weights feature and support only vector weights.
>
> Even if a vector of weights is convenient, it would only cover a
> subset
> of situations.  However, even a vector of weights is not needed if
> both
> the models and the observations are pre-multiplied by the square root
> of their weight.  By the way, I remind you that those weights already
> caused some bugs in the 2.0 release.
>
> Personally, I could live with a vector form.
>
> As a more general comment, I find it amazing that all the +1 for the
> release were only concerned by the compliance with (commons) rules,
> configuration files, ... Just 4 days after the release, you suddenly
> figure out that a user is in trouble and you want a quick fix.  Maybe
> such a test would have been needed BEFORE the release!

You are certainly welcome to help verify future release candidates,
as well as the commits that go into them.

Phil
>
> Regards,
>  Dim.
> ----------------------------------------------------------------------------
>
> Dimitri Pourbaix                         *      Don't worry, be happy
> Institut d'Astronomie et d'Astrophysique *         and CARPE DIEM.
> CP 226, office 2.N4.211, building NO     *
> Universite Libre de Bruxelles            *      Tel : +32-2-650.35.71
> Boulevard du Triomphe                    *      Fax : +32-2-650.42.26
>  B-1050 Bruxelles                        *        NAC: HBZSC RG2Z6
> http://sb9.astro.ulb.ac.be/~pourbaix     *
> mailto:pourbaix@astro.ulb.ac.be
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>




Re: [math] major problem with new released version 3.1

Posted by Dimitri Pourbaix <po...@astro.ulb.ac.be>.
Luc,

> However, it is also possible to set a non-diagonal weight matrix, and
> one class (AbstractLeastSquaresOptimizer) performs an eigen decomposition
> on the matrix to extract its square root. I don't know any use case for
> this, but it has been set up this way, so I guess someone has a use for
> non-diagonal weights.

Such a situation occurs when observations are correlated.  That is
actually the most general expression for a least square problem.

> I wonder if I should simply add this as is or if we should rather remove
> the non-diagonal weights feature and support only vector weights.

Even if a vector of weights is convenient, it would only cover a subset
of situations.  However, even a vector of weights is not needed if both
the models and the observations are pre-multiplied by the square root
of their weight.  By the way, I remind you that those weights already
caused some bugs in the 2.0 release.

Personally, I could live with a vector form.
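
For diagonal weights, the premultiplication described above amounts to a
one-pass rescaling. A minimal sketch with plain arrays (no optimizer
involved; the class and method names are illustrative only):

```java
// Illustrative sketch of the premultiplication described above: with
// diagonal weights w, scaling row i of the model Jacobian and observation i
// by sqrt(w[i]) turns a weighted least-squares problem into an unweighted
// one. Plain arrays, no optimizer involved; names are hypothetical.
public class WeightPremultiplier {
    /** Rescales Jacobian rows and observations in place. */
    public static void apply(double[][] jacobian, double[] observations, double[] weights) {
        for (int i = 0; i < weights.length; i++) {
            double s = Math.sqrt(weights[i]);
            observations[i] *= s;
            for (int j = 0; j < jacobian[i].length; j++) {
                jacobian[i][j] *= s;
            }
        }
    }

    public static void main(String[] args) {
        double[][] jac = { {1.0, 2.0}, {3.0, 4.0} };
        double[] obs = {1.0, 1.0};
        apply(jac, obs, new double[] {4.0, 9.0});
        System.out.println(obs[0] + " " + jac[1][1]); // 2.0 12.0
    }
}
```

After this rescaling an ordinary unweighted least-squares solver
minimizes exactly the weighted residual sum of squares.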

As a more general comment, I find it amazing that all the +1 for the
release were only concerned by the compliance with (commons) rules,
configuration files, ... Just 4 days after the release, you suddenly
figure out that a user is in trouble and you want a quick fix.  Maybe
such a test would have been needed BEFORE the release!

Regards,
  Dim.
----------------------------------------------------------------------------
Dimitri Pourbaix                         *      Don't worry, be happy
Institut d'Astronomie et d'Astrophysique *         and CARPE DIEM.
CP 226, office 2.N4.211, building NO     *
Universite Libre de Bruxelles            *      Tel : +32-2-650.35.71
Boulevard du Triomphe                    *      Fax : +32-2-650.42.26
  B-1050 Bruxelles                        *        NAC: HBZSC RG2Z6
http://sb9.astro.ulb.ac.be/~pourbaix     * mailto:pourbaix@astro.ulb.ac.be



Re: [math] major problem with new released version 3.1

Posted by Konstantin Berlin <kb...@gmail.com>.
Hi,

I am not clear on what the use of weights in a non-least-squares problem is. In a least-squares problem the weights can simply be taken in by the user in their input function. So the weights option is a convenience feature; any non-standard matrix weights could be programmed in by the user if they need to. I have a hard time imagining a case where the weights are non-diagonal. I think your idea is good and you should go ahead with it. The quadratic memory scaling with the number of data points is not acceptable.
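
To put rough numbers on that scaling (my own arithmetic, not
measurements from the release): for the 41201-point fit mentioned in
this thread, a full weight matrix needs about 1.7e9 doubles (~12.6 GiB),
while a diagonal weight vector needs only ~322 KiB:

```java
// Back-of-the-envelope arithmetic for the scaling problem (my own numbers,
// not from the release): memory for a full n-by-n weight matrix of doubles
// versus a diagonal weight vector of length n.
public class WeightMemory {
    public static long fullMatrixBytes(long n) {
        return n * n * 8L; // 8 bytes per double
    }

    public static long vectorBytes(long n) {
        return n * 8L;
    }

    public static void main(String[] args) {
        long n = 41201L; // the curve-fitting case from this thread
        System.out.println(fullMatrixBytes(n)); // 13580179208 (~12.6 GiB)
        System.out.println(vectorBytes(n));     // 329608 (~322 KiB)
    }
}
```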

-Konstantin

On Dec 28, 2012, at 9:01 AM, Luc Maisonobe <lu...@spaceroots.org> wrote:

> Hi all,
> 
> I have encountered a major problem with version 3.1 just released.
> This problem occurs in the revamped optimizers. One of the
> OptimizationData implementations we can pass to multivariate vector
> optimizers is the weight of the various observations.
> 
> In all cases I can think of, this weight is a vector, but it is in fact
> stored as a matrix in which only the diagonal is set. Its main
> constructor uses a simple double array (which is used in all test cases
> and is also used under the hood by curve fitters), and some optimizers
> also assume only the diagonal is set (for example the Gauss-Newton
> optimizer extracts only the diagonal elements and ignores the others).
> 
> However, it is also possible to set a non-diagonal weight matrix, and
> one class (AbstractLeastSquaresOptimizer) performs an eigen decomposition
> on the matrix to extract its square root. I don't know any use case for
> this, but it has been set up this way, so I guess someone has a use for
> non-diagonal weights.
> 
> However, merging all cases (vector and matrix) into one (despite the
> fact Gauss-Newton ignores non-diagonal elements) leads to huge problems.
> I discovered this when attempting a simple polynomial curve fitting on a
> large number of points (41201 points). In this case, a full weight
> matrix with about 1.7 billion entries was created, and copied... I
> ended up with a memory error for something that should have been
> completely trivial (fitting a polynomial, when in fact all weights were
> equal to 1.0)! So for now, the new curve fitter is really impossible to
> use with a large number of points.
> 
> So I would like to be able to cope with vector-only weights easily. As I
> don't understand whether matrix weights are useful or not, I have simply added
> a new WeightVector class implementing OptimizationData and added it to
> the various parseOptimizationData, while preserving the former matrix
> weights class. This does work well and changing the optimizers to
> directly use a vector was really straightforward as in fact the weights
> are not really used as a matrix (except in the strange eigen
> decomposition and associated Jacobian multiplication of
> AbstractLeastSquaresOptimizer).
> 
> I wonder if I should simply add this as is or if we should rather remove
> the non-diagonal weights feature and support only vector weights.
> 
> I also wonder if we should release a bug fix version early.
> 
> best regards,
> Luc
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
> 

