Posted to dev@commons.apache.org by Sébastien Brisard <se...@m4x.org> on 2012/03/29 07:58:17 UTC

[math] Refactoring the matrix and vector interfaces

Hello,
as agreed, I've started a JIRA ticket on this long-standing issue (see
MATH-765). This ticket is really meant as a summary of the discussions
which should take place on the mailing list, so please refrain from
adding comments (unless absolutely necessary). Child tickets will
be linked to this parent ticket once we agree on concrete tasks.

At the moment, there is nothing really fancy on this ticket. I merely
compared the interfaces of RealMatrix and RealVector, and drew a few
conclusions. I would be interested in your feedback on the comments
I've made.

My first suggestion would be on the visitor design pattern vs.
map(UnivariateFunction). The former is specified in the RealMatrix
interface, the latter is specified in the RealVector abstract class. I
think both concepts are similar, and both are useful:
  - visitors know about the cell they are visiting,
  - map() doesn't.
Maybe it would be nice as a first step to unify these concepts. There
are two options:
1. Specify both in both interfaces,
2. Specify only the visitor design pattern, and create a factory which
would return a visitor from a UnivariateFunction (ignoring the indices
of the current cell).
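
To make option 2 concrete, here is a minimal sketch, not the Commons Math
API. CellVisitor, Visitors.fromFunction and Visitors.walkInRowOrder are
hypothetical names invented for illustration, and
java.util.function.DoubleUnaryOperator stands in for UnivariateFunction.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical, simplified stand-in for a "changing" matrix visitor. */
interface CellVisitor {
    /** Returns the new value for the entry at (row, column). */
    double visit(int row, int column, double value);
}

/** Hypothetical factory and walker, for illustration only. */
final class Visitors {
    private Visitors() {}

    /** Adapts a plain function; the visitor ignores the cell indices. */
    static CellVisitor fromFunction(DoubleUnaryOperator f) {
        return (row, column, value) -> f.applyAsDouble(value);
    }

    /** Walks the array in row order, replacing each entry in place. */
    static void walkInRowOrder(double[][] data, CellVisitor visitor) {
        for (int i = 0; i < data.length; i++) {
            for (int j = 0; j < data[i].length; j++) {
                data[i][j] = visitor.visit(i, j, data[i][j]);
            }
        }
    }
}

class VisitorFactoryDemo {
    public static void main(String[] args) {
        double[][] m = {{1.0, 4.0}, {9.0, 16.0}};
        // map()-style use: the function knows nothing about indices.
        Visitors.walkInRowOrder(m, Visitors.fromFunction(Math::sqrt));
        // Visitor-style use: the indices are available (keep the diagonal).
        Visitors.walkInRowOrder(m, (i, j, x) -> i == j ? x : 0.0);
        System.out.println(Arrays.deepToString(m));
    }
}
```

With such a factory, map() becomes a special case of the visitor walk, so
only the visitor methods would need to be specified in the interfaces.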

A second step could then be to remove most of the norm calculations
from the matrix and vector interfaces, and implement these
functionalities as visitors.
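
As a rough sketch of what "norms as visitors" could look like (the
interface and class names below are hypothetical and simplified; they are
not the existing Commons Math visitor interfaces):

```java
/** Hypothetical, simplified stand-in for a read-only ("preserving") visitor. */
interface ReadOnlyVisitor {
    void visit(int row, int column, double value);

    /** Returns the result accumulated over the walk. */
    double end();
}

/** Frobenius norm: square root of the sum of the squared entries. */
final class FrobeniusNormVisitor implements ReadOnlyVisitor {
    private double sumOfSquares;

    @Override
    public void visit(int row, int column, double value) {
        sumOfSquares += value * value;
    }

    @Override
    public double end() {
        return Math.sqrt(sumOfSquares);
    }
}

/** Largest absolute entry (the L-infinity norm when the data is a vector). */
final class MaxAbsVisitor implements ReadOnlyVisitor {
    private double max;

    @Override
    public void visit(int row, int column, double value) {
        max = Math.max(max, Math.abs(value));
    }

    @Override
    public double end() {
        return max;
    }
}

final class Walker {
    private Walker() {}

    /** Visits every entry once and returns the visitor's result. */
    static double walk(double[][] data, ReadOnlyVisitor visitor) {
        for (int i = 0; i < data.length; i++) {
            for (int j = 0; j < data[i].length; j++) {
                visitor.visit(i, j, data[i][j]);
            }
        }
        return visitor.end();
    }
}

class NormDemo {
    public static void main(String[] args) {
        double[][] m = {{3.0, -4.0}};
        System.out.println(Walker.walk(m, new FrobeniusNormVisitor()));
        System.out.println(Walker.walk(m, new MaxAbsVisitor()));
    }
}
```

The interfaces would then only need the walk methods; each norm lives in
its own small class and works for any implementation that can be walked.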

Other major issues are
  - tagging interfaces/marker methods to dispatch the objects to the
most optimized algorithm (e.g. multiplication of sparse, symmetric,
and so on matrices). See
http://markmail.org/thread/vkwe5x2jtozcjkge
http://markmail.org/thread/j4xjdtchpw33xpgr
  - implementation of "views" of a matrix/vector. IIRC, Ted suggested
such a feature a while ago. I think it would be a very useful addition
(just like a[3:5] in MATLAB, Octave, Python and the like).
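
To illustrate the first point, here is a minimal sketch of dispatching on
a tagging interface. All names (SquareMatrix, DiagonalTag, Dense, Diagonal,
Products) are hypothetical, and the diagonal case is only an assumption
chosen because it is the shortest structure to show; sparse or symmetric
matrices would follow the same idea.

```java
import java.util.Arrays;

/** Hypothetical minimal square-matrix type, to keep the sketch short. */
interface SquareMatrix {
    int size();
    double get(int i, int j);
}

/** Hypothetical tagging interface: promises that off-diagonal entries are zero. */
interface DiagonalTag {}

final class Dense implements SquareMatrix {
    private final double[][] data;
    Dense(double[][] data) { this.data = data; }
    @Override public int size() { return data.length; }
    @Override public double get(int i, int j) { return data[i][j]; }
}

final class Diagonal implements SquareMatrix, DiagonalTag {
    private final double[] diag;
    Diagonal(double[] diag) { this.diag = diag; }
    @Override public int size() { return diag.length; }
    @Override public double get(int i, int j) { return i == j ? diag[i] : 0.0; }
}

final class Products {
    private Products() {}

    /** Computes m * v, picking the cheaper loop when the tag is present. */
    static double[] operate(SquareMatrix m, double[] v) {
        int n = m.size();
        double[] out = new double[n];
        if (m instanceof DiagonalTag) {
            // O(n): only the diagonal contributes.
            for (int i = 0; i < n; i++) {
                out[i] = m.get(i, i) * v[i];
            }
            return out;
        }
        // Generic O(n^2) fallback for untagged matrices.
        for (int i = 0; i < n; i++) {
            double sum = 0.0;
            for (int j = 0; j < n; j++) {
                sum += m.get(i, j) * v[j];
            }
            out[i] = sum;
        }
        return out;
    }
}

class DispatchDemo {
    public static void main(String[] args) {
        double[] v = {1.0, 1.0};
        System.out.println(Arrays.toString(
                Products.operate(new Diagonal(new double[] {2.0, 3.0}), v)));
        System.out.println(Arrays.toString(
                Products.operate(new Dense(new double[][] {{2.0, 0.0}, {0.0, 3.0}}), v)));
    }
}
```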

These are just a few thoughts, I'm curious to read what everyone
thinks. Thanks in advance for your feedback!

Sébastien


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [math] Refactoring the matrix and vector interfaces

Posted by Sébastien Brisard <se...@m4x.org>.
Hello,

2012/3/30 Gilles Sadowski <gi...@harfang.homelinux.org>:
> Hello.
>
>>
>> >
>> > Although it is nice to work on a new design, I now have a problem with the
>> > issue as a whole: Where are the people (and applications) that need this?
>> > At least four persons made suggestions/comments on what was maybe wrong or
>> > could be improved, but they either are not active developers, or are not
>> > even posting to the ML anymore.
>> >
>> Fair enough. I guess the amount of interest raised by this thread will
>> reveal how "urgent" this issue actually is...
>>
>> >
>> > Thus we lack the potential "real-life" use-cases (plural) that should guide
>> > the refactoring.
>> > Certainly, some rationalizing can be performed for its own sake (like some of
>> > your proposals below) but a thorough redesign runs the risk of being not
>> > vastly better that what we current have. It's not the problem that we could
>> > be wrong (that's allowed, I think ;-), and we'll just re-redesign later on),
>> > but it's an issue of priority.
>> >
>> > Is it wise to spend time working on something that nobody really needs
>> > (judging from the number of comments which your otherwise valuable review
>> > work has brought to this ML)?
>> >
>> > Thus...
>> >
>> Point taken!
>>
>> >
>> >>
>> >> My first suggestion would be on the visitor design pattern vs.
>> >> map(UnivariateFunction). The former is specified in the RealMatrix
>> >> interface, the latter is specified in the RealVector abstract class. I
>> >> think both concepts are similar, and both are useful:
>> >>   - visitors know about the cell they are visiting,
>> >>   - map() doesn't.
>> >> Maybe it would be nice as a first step to unify these concepts. Two
>> >> options there
>> >> 1. Specify both in both interfaces,
>> >> 2. Specify only the visitor design pattern, and create a factory which
>> >> would return a visitor from a UnivariateFunction (ignoring the indices
>> >> of the current cell).
>> >
>> > ... this is a well-defined (limited in scope) rationalization.
>> > [And I'd vote for option 2.]
>> > But ...
>> >
>> I can start a new thread/JIRA ticket on this particular issue.
>>
>> >
>> >> A second step could then be to remove most of the norm calculations
>> >> from the matrix and vector interfaces, and implement these
>> >> functionalities as visitors.
>> >>
>> >> Other major issues are
>> >>   - tagging interfaces/marker methods to dispatch the objects to the
>> >> most optimized algorithm (e.g. multiplication of sparse, symmetric,
>> >> and so on matrices). See
>> >> http://markmail.org/thread/vkwe5x2jtozcjkge
>> >> http://markmail.org/thread/j4xjdtchpw33xpgr
>> >>   - implementation of "views" of a matrix/vector. IIRC, Ted suggested
>> >> such a feature a while ago. This would be a very useful feature (just
>> >> like a[3:5] in matlab, octave, python and the likes). This I think
>> >> would also be a very useful addition.
>> >
>> > ... this seems just "nice" (i.e. not justified by the advertised uses of
>> > CM).
>> >
>> While I agree with you on the tagging interfaces/marker methods issue,
>> I think that views would really be a useful addition. There must be a
>> reason why the ":" operator is implemented in matlab, python + scipy +
>> numpy, R, octave, and so on... I can provide you with a particular use
>> case I'm meeting constantly those days.
>
> If you know how it must implemented, then fine.
> IIUC, a "view" is linked to (a part of )the underlying data: Modifying the
> view in reflected in the original data.
>
Absolutely

>
> Thus this feature must be implemented in a specifc way for each matrix
> implementation. If so, this entails a lot of changes (in order that the
> feature is available at the "RealMatrix" interface level).
>
Yes, that's the downside. Maybe it's not as urgent as other issues. So
maybe this is something I'll keep thinking about, and develop at a
slower pace. I do think it would be a very nice feature.

>> >
>> > Unless those people who, some time ago, expressed a willingness to
>> > contribute to the matrix interfaces come back, I'd much prefer to devote
>> > some time to get "BOBYQAOptimizer" in a better shape (Java-wise).[1] By
>> > the way, since there are many computations there that deal with matrices,
>> > it could also give some clues as to what is missing in the CM matrix
>> > functionality in order to improve another part of CM.
>> >
>> Unfortunately, I'll be pretty useless on this issue, as my background
>> in advanced numerical optimization is pretty thin...
>
> So is mine.
> The issue is not so much one of optimization algorithms but rather of code
> cleanup (rationalization, style, performance, and all that...).
>
> The big problem is that the "BOBYQAOptimizer" code is huge. And this is
> partly due to all the matrix computations being done explicitly (i.e. with
> loops) instead of calling methods from the matrix and vector interfaces.
> Among other things, some computations could not be done as simple method
> calls (like "matrix.multiply(vector)") but rather needed some combination of
> "map" and "visitor". [There was a thread about that, a few months ago.]
> IMO, one of the goal of the ticket you'll open about map/visitor revamping
> should be to be able to perform each of the matrix manipulations that appear
> in "BOBYQAOptimizer" as a single method call.
> As a "real-life" use-case, it's as good as can be.
>
This sounds both frightening and exciting. So I'll *visit* this code
with the visitor pattern in mind ;-).

>>
>> To sum up: if this thread does not raise a major discussion, maybe we
>> will stick with tiny API changes. I do think that RealMatrix and
>> RealVector should be as similar as possible, and would like to improve
>> the APIs in this direction. It was never my intention to rewrite the
>> whole thing anyway.
>> What would you think of this reduced scope?
>
> Always +1 for improving consistency. That makes it much easier to change
> the design afterwards.
>
Right. I'll revisit the JIRA ticket from this perspective.
Sébastien




Re: [math] Refactoring the matrix and vector interfaces

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
Hello.

> 
> >
> > Although it is nice to work on a new design, I now have a problem with the
> > issue as a whole: Where are the people (and applications) that need this?
> > At least four persons made suggestions/comments on what was maybe wrong or
> > could be improved, but they either are not active developers, or are not
> > even posting to the ML anymore.
> >
> Fair enough. I guess the amount of interest raised by this thread will
> reveal how "urgent" this issue actually is...
> 
> >
> > Thus we lack the potential "real-life" use-cases (plural) that should guide
> > the refactoring.
> > Certainly, some rationalizing can be performed for its own sake (like some of
> > your proposals below) but a thorough redesign runs the risk of being not
> > vastly better that what we current have. It's not the problem that we could
> > be wrong (that's allowed, I think ;-), and we'll just re-redesign later on),
> > but it's an issue of priority.
> >
> > Is it wise to spend time working on something that nobody really needs
> > (judging from the number of comments which your otherwise valuable review
> > work has brought to this ML)?
> >
> > Thus...
> >
> Point taken!
> 
> >
> >>
> >> My first suggestion would be on the visitor design pattern vs.
> >> map(UnivariateFunction). The former is specified in the RealMatrix
> >> interface, the latter is specified in the RealVector abstract class. I
> >> think both concepts are similar, and both are useful:
> >>   - visitors know about the cell they are visiting,
> >>   - map() doesn't.
> >> Maybe it would be nice as a first step to unify these concepts. Two
> >> options there
> >> 1. Specify both in both interfaces,
> >> 2. Specify only the visitor design pattern, and create a factory which
> >> would return a visitor from a UnivariateFunction (ignoring the indices
> >> of the current cell).
> >
> > ... this is a well-defined (limited in scope) rationalization.
> > [And I'd vote for option 2.]
> > But ...
> >
> I can start a new thread/JIRA ticket on this particular issue.
> 
> >
> >> A second step could then be to remove most of the norm calculations
> >> from the matrix and vector interfaces, and implement these
> >> functionalities as visitors.
> >>
> >> Other major issues are
> >>   - tagging interfaces/marker methods to dispatch the objects to the
> >> most optimized algorithm (e.g. multiplication of sparse, symmetric,
> >> and so on matrices). See
> >> http://markmail.org/thread/vkwe5x2jtozcjkge
> >> http://markmail.org/thread/j4xjdtchpw33xpgr
> >>   - implementation of "views" of a matrix/vector. IIRC, Ted suggested
> >> such a feature a while ago. This would be a very useful feature (just
> >> like a[3:5] in matlab, octave, python and the likes). This I think
> >> would also be a very useful addition.
> >
> > ... this seems just "nice" (i.e. not justified by the advertised uses of
> > CM).
> >
> While I agree with you on the tagging interfaces/marker methods issue,
> I think that views would really be a useful addition. There must be a
> reason why the ":" operator is implemented in matlab, python + scipy +
> numpy, R, octave, and so on... I can provide you with a particular use
> case I'm meeting constantly those days.

If you know how it must be implemented, then fine.
IIUC, a "view" is linked to (a part of) the underlying data: Modifying the
view is reflected in the original data.
Thus this feature must be implemented in a specific way for each matrix
implementation. If so, this entails a lot of changes (in order for the
feature to be available at the "RealMatrix" interface level).
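
For illustration, a minimal sketch of these write-through semantics for a
plain array-backed vector. VectorView is a hypothetical class, not part of
CM, and the half-open range [start, end) is an assumption borrowed from
Python's a[3:5].

```java
/** Hypothetical write-through view over the index range [start, end) of an array. */
final class VectorView {
    private final double[] data;
    private final int start;
    private final int length;

    VectorView(double[] data, int start, int end) {
        if (start < 0 || end > data.length || start > end) {
            throw new IndexOutOfBoundsException("bad range [" + start + ", " + end + ")");
        }
        this.data = data;          // shared with the caller, not copied
        this.start = start;
        this.length = end - start;
    }

    int getDimension() {
        return length;
    }

    double getEntry(int i) {
        return data[start + checked(i)];
    }

    /** Writes through to the underlying array. */
    void setEntry(int i, double value) {
        data[start + checked(i)] = value;
    }

    private int checked(int i) {
        if (i < 0 || i >= length) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        return i;
    }
}

class ViewDemo {
    public static void main(String[] args) {
        double[] a = {0.0, 1.0, 2.0, 3.0, 4.0, 5.0};
        VectorView view = new VectorView(a, 3, 5);   // covers a[3] and a[4]
        view.setEntry(0, -1.0);                      // modifies a[3]
        a[4] = 9.0;                                  // visible as view entry 1
        System.out.println(a[3] + " " + view.getEntry(1));
    }
}
```

Each storage scheme (dense array, block, sparse) would need its own index
translation, which is where the per-implementation work comes from.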

> >
> > Unless those people who, some time ago, expressed a willingness to
> > contribute to the matrix interfaces come back, I'd much prefer to devote
> > some time to get "BOBYQAOptimizer" in a better shape (Java-wise).[1] By
> > the way, since there are many computations there that deal with matrices,
> > it could also give some clues as to what is missing in the CM matrix
> > functionality in order to improve another part of CM.
> >
> Unfortunately, I'll be pretty useless on this issue, as my background
> in advanced numerical optimization is pretty thin...

So is mine.
The issue is not so much one of optimization algorithms but rather of code
cleanup (rationalization, style, performance, and all that...).

The big problem is that the "BOBYQAOptimizer" code is huge. And this is
partly due to all the matrix computations being done explicitly (i.e. with
loops) instead of calling methods from the matrix and vector interfaces.
Among other things, some computations could not be done as simple method
calls (like "matrix.multiply(vector)") but rather needed some combination of
"map" and "visitor". [There was a thread about that, a few months ago.]
IMO, one of the goals of the ticket you'll open about map/visitor revamping
should be to be able to perform each of the matrix manipulations that appear
in "BOBYQAOptimizer" as a single method call.
As a "real-life" use-case, it's as good as can be.
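
As a purely illustrative sketch of that goal (this is not code taken from
"BOBYQAOptimizer"; SmallMatrix, EntryUpdater and addOuterProduct are
hypothetical names), an index-aware visitor lets an operation such as a
rank-one update be exposed as one method call instead of a hand-written
double loop at every call site:

```java
/** Hypothetical index-aware visitor that returns the new entry value. */
interface EntryUpdater {
    double update(int row, int column, double value);
}

/** Hypothetical array-backed matrix, just large enough for the sketch. */
final class SmallMatrix {
    private final double[][] data;

    SmallMatrix(double[][] data) {
        this.data = data;
    }

    double getEntry(int i, int j) {
        return data[i][j];
    }

    /** Single place where the double loop is written. */
    void walk(EntryUpdater updater) {
        for (int i = 0; i < data.length; i++) {
            for (int j = 0; j < data[i].length; j++) {
                data[i][j] = updater.update(i, j, data[i][j]);
            }
        }
    }

    /** Rank-one update m += alpha * u * v^T, written as one visitor walk. */
    void addOuterProduct(double alpha, double[] u, double[] v) {
        walk((i, j, x) -> x + alpha * u[i] * v[j]);
    }
}

class SingleCallDemo {
    public static void main(String[] args) {
        SmallMatrix m = new SmallMatrix(new double[][] {{1.0, 0.0}, {0.0, 1.0}});
        // The caller sees one method call; no explicit loops.
        m.addOuterProduct(2.0, new double[] {1.0, 2.0}, new double[] {3.0, 4.0});
        System.out.println(m.getEntry(0, 0) + " " + m.getEntry(1, 1));
    }
}
```

A plain map() could not express this update, since the new value of each
entry depends on its row and column.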

> 
> To sum up: if this thread does not raise a major discussion, maybe we
> will stick with tiny API changes. I do think that RealMatrix and
> RealVector should be as similar as possible, and would like to improve
> the APIs in this direction. It was never my intention to rewrite the
> whole thing anyway.
> What would you think of this reduced scope?

Always +1 for improving consistency. That makes it much easier to change
the design afterwards.


Best regards,
Gilles



Re: [math] Refactoring the matrix and vector interfaces

Posted by Sébastien Brisard <se...@m4x.org>.
Hi Gilles,

>
> Although it is nice to work on a new design, I now have a problem with the
> issue as a whole: Where are the people (and applications) that need this?
> At least four persons made suggestions/comments on what was maybe wrong or
> could be improved, but they either are not active developers, or are not
> even posting to the ML anymore.
>
Fair enough. I guess the amount of interest raised by this thread will
reveal how "urgent" this issue actually is...

>
> Thus we lack the potential "real-life" use-cases (plural) that should guide
> the refactoring.
> Certainly, some rationalizing can be performed for its own sake (like some of
> your proposals below) but a thorough redesign runs the risk of being not
> vastly better that what we current have. It's not the problem that we could
> be wrong (that's allowed, I think ;-), and we'll just re-redesign later on),
> but it's an issue of priority.
>
> Is it wise to spend time working on something that nobody really needs
> (judging from the number of comments which your otherwise valuable review
> work has brought to this ML)?
>
> Thus...
>
Point taken!

>
>>
>> My first suggestion would be on the visitor design pattern vs.
>> map(UnivariateFunction). The former is specified in the RealMatrix
>> interface, the latter is specified in the RealVector abstract class. I
>> think both concepts are similar, and both are useful:
>>   - visitors know about the cell they are visiting,
>>   - map() doesn't.
>> Maybe it would be nice as a first step to unify these concepts. Two
>> options there
>> 1. Specify both in both interfaces,
>> 2. Specify only the visitor design pattern, and create a factory which
>> would return a visitor from a UnivariateFunction (ignoring the indices
>> of the current cell).
>
> ... this is a well-defined (limited in scope) rationalization.
> [And I'd vote for option 2.]
> But ...
>
I can start a new thread/JIRA ticket on this particular issue.

>
>> A second step could then be to remove most of the norm calculations
>> from the matrix and vector interfaces, and implement these
>> functionalities as visitors.
>>
>> Other major issues are
>>   - tagging interfaces/marker methods to dispatch the objects to the
>> most optimized algorithm (e.g. multiplication of sparse, symmetric,
>> and so on matrices). See
>> http://markmail.org/thread/vkwe5x2jtozcjkge
>> http://markmail.org/thread/j4xjdtchpw33xpgr
>>   - implementation of "views" of a matrix/vector. IIRC, Ted suggested
>> such a feature a while ago. This would be a very useful feature (just
>> like a[3:5] in matlab, octave, python and the likes). This I think
>> would also be a very useful addition.
>
> ... this seems just "nice" (i.e. not justified by the advertised uses of
> CM).
>
While I agree with you on the tagging interfaces/marker methods issue,
I think that views would really be a useful addition. There must be a
reason why the ":" operator is implemented in MATLAB, Python + SciPy +
NumPy, R, Octave, and so on... I can provide you with a particular use
case I keep running into these days.

>
> Unless those people who, some time ago, expressed a willingness to
> contribute to the matrix interfaces come back, I'd much prefer to devote
> some time to get "BOBYQAOptimizer" in a better shape (Java-wise).[1] By
> the way, since there are many computations there that deal with matrices,
> it could also give some clues as to what is missing in the CM matrix
> functionality in order to improve another part of CM.
>
Unfortunately, I'll be pretty useless on this issue, as my background
in advanced numerical optimization is pretty thin...

To sum up: if this thread does not raise a major discussion, maybe we
will stick with tiny API changes. I do think that RealMatrix and
RealVector should be as similar as possible, and would like to improve
the APIs in this direction. It was never my intention to rewrite the
whole thing anyway.
What would you think of this reduced scope?

Sébastien




Re: [math] Refactoring the matrix and vector interfaces

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
Hello.

> as agreed, I've started a JIRA ticket on this long-standing issue (see
> MATH-765). This ticket is really meant as a summary of the discussions
> which should take place on the mailing list, so please refrain from
> adding comments (unless absolutely necessary). Children tickets will
> be linked to this parent ticket once we agree on concrete tasks.
> 
> At the moment, there is nothing really fancy on this ticket. I merely
> compared the interfaces of RealMatrix and RealVector, and drew a few
> conclusions. I would be interested by your feed-back on the comments
> I've made.

Although it is nice to work on a new design, I now have a problem with the
issue as a whole: Where are the people (and applications) that need this?
At least four people made suggestions/comments on what was maybe wrong or
could be improved, but they either are not active developers, or are not
even posting to the ML anymore.
Thus we lack the potential "real-life" use-cases (plural) that should guide
the refactoring.
Certainly, some rationalizing can be performed for its own sake (like some of
your proposals below) but a thorough redesign runs the risk of not being
vastly better than what we currently have. The problem is not that we could
be wrong (that's allowed, I think ;-), and we'll just re-redesign later on);
it's an issue of priority.

Is it wise to spend time working on something that nobody really needs
(judging from the number of comments which your otherwise valuable review
work has brought to this ML)?

Thus...

> 
> My first suggestion would be on the visitor design pattern vs.
> map(UnivariateFunction). The former is specified in the RealMatrix
> interface, the latter is specified in the RealVector abstract class. I
> think both concepts are similar, and both are useful:
>   - visitors know about the cell they are visiting,
>   - map() doesn't.
> Maybe it would be nice as a first step to unify these concepts. Two
> options there
> 1. Specify both in both interfaces,
> 2. Specify only the visitor design pattern, and create a factory which
> would return a visitor from a UnivariateFunction (ignoring the indices
> of the current cell).

... this is a well-defined (limited in scope) rationalization.
[And I'd vote for option 2.]
But ...

> A second step could then be to remove most of the norm calculations
> from the matrix and vector interfaces, and implement these
> functionalities as visitors.
> 
> Other major issues are
>   - tagging interfaces/marker methods to dispatch the objects to the
> most optimized algorithm (e.g. multiplication of sparse, symmetric,
> and so on matrices). See
> http://markmail.org/thread/vkwe5x2jtozcjkge
> http://markmail.org/thread/j4xjdtchpw33xpgr
>   - implementation of "views" of a matrix/vector. IIRC, Ted suggested
> such a feature a while ago. This would be a very useful feature (just
> like a[3:5] in matlab, octave, python and the likes). This I think
> would also be a very useful addition.

... this seems just "nice" (i.e. not justified by the advertised uses of
CM).

> These are just a few thoughts, I'm curious to read what everyone
> thinks. Thanks beforehand for your feedback!

Unless those people who, some time ago, expressed a willingness to
contribute to the matrix interfaces come back, I'd much prefer to devote
some time to get "BOBYQAOptimizer" in a better shape (Java-wise).[1] By
the way, since there are many computations there that deal with matrices,
it could also give some clues as to what is missing in the CM matrix
functionality in order to improve another part of CM.


Best,
Gilles

[1] In its current state, it fails many basic goals of CM: robustness,
    efficiency, programming style, code clarity, etc.
