Posted to dev@commons.apache.org by Phil Steitz <ph...@steitz.com> on 2003/07/02 06:07:37 UTC

[math] abstract nonsense was Re: [math][functor] More Design Concerns

The changed subject line is a pun that I hope none will find insulting - 
sort of a little math joke. "Abstract nonsense" is the term that some 
mathematicians (including some who love the stuff) use to refer to 
category theory, the birthplace of the functor concept.  To conserve 
bandwidth, I am going to try to respond to the whole thread in one message.

First, I agree that the functor concept, or more importantly functional 
programming, represents a very powerful technique that is certainly 
widely relevant and applicable to mathematical programming. Exactly what 
is relevant and useful to commons-math, however, is not obvious to me. 
Brent's examples are not compelling to me.  My main concern is that at 
least initially, commons-math is primarily an applied math package, 
aimed at direct applications doing computations with real and complex 
numbers.  I do not see strictly mathematical applications as in scope -- 
at least initially.  By this I mean things like applications to finite 
fields, groups, etc, which is where I personally see the value of the 
"abstract nonsense" really kicking in.

As I said in an earlier post, I do not see the main distinction to be 
between "objects" and "primitives" but rather between reals, integers, 
complex numbers and more abstract mathematical objects such as group, 
field, ring elements or elements of topological spaces with certain 
properties. To me, doubles are "natively supported reals" and these are 
by far the most important objects that any applied math package will 
ever work with.  Almost every (another little pun) real statistical 
application uses real-valued random variables, for example.

Brent's "rootfinding" example illustrates what I mean. If this kind of 
thing is really useful, what is useful is the notion of convergence in a 
dense linear ordering without endpoints -- moderately interesting from a 
mathematical standpoint, but not compelling, IMHO from an engineering or 
applied math perspective.  The "vector convergence" example is 
contrived. What is practically valuable in the rootfinding framework is 
rootfinding for real-valued functions of a real variable.
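
To make "rootfinding for real-valued functions of a real variable" concrete, here is a minimal sketch of a strongly typed, double-only bisection solver. The names RealFunction and BisectionSketch are hypothetical, chosen only for illustration; they are not taken from the commons-math code base.

```java
/** Hypothetical interface: a real-valued function of one real variable. */
interface RealFunction {
    double value(double x);
}

/** Illustrative bisection solver working purely on doubles. */
final class BisectionSketch {
    private BisectionSketch() {}

    /** Finds a root of f in [lo, hi], assuming lo < hi and a sign change on the interval. */
    static double solve(RealFunction f, double lo, double hi, double tolerance) {
        double fLo = f.value(lo);
        double fHi = f.value(hi);
        if (fLo == 0.0) return lo;
        if (fHi == 0.0) return hi;
        if (fLo * fHi > 0.0) {
            throw new IllegalArgumentException("f(lo) and f(hi) must differ in sign");
        }
        while (hi - lo > tolerance) {
            double mid = lo + (hi - lo) / 2.0;
            if (mid <= lo || mid >= hi) break; // interval can no longer be split
            double fMid = f.value(mid);
            if (fMid == 0.0) return mid;
            if (fLo * fMid < 0.0) {
                hi = mid;          // root lies in the lower half
            } else {
                lo = mid;          // root lies in the upper half
                fLo = fMid;
            }
        }
        return lo + (hi - lo) / 2.0;
    }

    public static void main(String[] args) {
        RealFunction f = new RealFunction() {
            public double value(double x) { return x * x - 2.0; }
        };
        System.out.println(solve(f, 0.0, 2.0, 1e-12));
    }
}
```

No wrapper objects are created inside the loop and no casts are needed, which is the property being argued for here.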

I see no point in a) introducing the object creations/gc overhead and b) 
losing the strong typing to introduce "typeless" functors into 
commons-math at this time.  I would even go so far as to say that I 
would *never* want to see "typeless" functors introduced, even if we 
decide that we want to be Mathematica when we grow up. As and when the 
need for more abstract mathematical objects arises, we should model them 
and their morphisms directly, using naturally defined mathematical 
objects. The functor pattern could certainly play a role here; but I 
would want to see at least the algebraic properties of the morphisms 
(functors) themselves defined explicitly following some standard 
mathematical definitions. I may be manifestly missing the point of the 
o.a.c.functor package here, in which case I would appreciate (gentle) 
enlightenment.
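
As a rough illustration of the two costs named above (object creation and loss of strong typing), the sketch below contrasts an Object-based function with a double-based one. Both interfaces are hypothetical stand-ins written for this comparison; they are not the o.a.c.functor API.

```java
/** Hypothetical "typeless" function: everything travels as Object. */
interface TypelessFunction {
    Object evaluate(Object arg);
}

/** Hypothetical strongly typed alternative working on primitives. */
interface RealValuedFunction {
    double value(double x);
}

final class TypingSketch {
    private TypingSketch() {}

    static final TypelessFunction SQUARE_OBJ = new TypelessFunction() {
        public Object evaluate(Object arg) {
            // Runtime cast: a wrong argument type is only detected here.
            double x = ((Number) arg).doubleValue();
            // The result has to be wrapped in an object again.
            return Double.valueOf(x * x);
        }
    };

    static final RealValuedFunction SQUARE = new RealValuedFunction() {
        public double value(double x) {
            return x * x; // no casts, no wrapper objects
        }
    };

    public static void main(String[] args) {
        System.out.println(SQUARE.value(3.0));
        System.out.println(SQUARE_OBJ.evaluate(Double.valueOf(3.0)));
    }
}
```

With the typeless version, passing a String compiles and fails only at run time with a ClassCastException; with the typed version the compiler rejects it.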

One final point.  A few comments were made about performance and what 
commons-math should aim for. My perspective is that performance is an 
important consideration and we should avoid adding computational and/or 
resource management overhead unless there is a compelling reason to do 
so.  As David Graham pointed out in an earlier post, Jakarta Commons 
components need to target server application deployment. This means that 
we cannot do things that kill scalability, which bad performance and 
excessive resource consumption will do. While I do not see commons-math 
as a "numerics package", I do see it as a package that provides some 
basic numerical analysis capabilities and it needs to do this in as 
efficient, stable and standard a way as possible in Java.

I agree with Al that we should try our best to stay focused on the actual 
application use cases and let these drive design.  From my perspective, 
what I see now are real-valued random variables, real-valued functions 
and a few other objects that we have modelled in a straightforward way 
(e.g., real matrices) that both mathematical and non-mathematical users 
will find relatively easy to understand.  I am not convinced that either 
for internal use or certainly for exposed interfaces we will get any 
value out of introducing additional abstractions at this time.  Of 
course, I may just be missing the point of Brent's utopian vision and/or 
the universal applicability of the functor concept.

Phil




---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org


[math] Generics was: Re: [math] abstract nonsense was ...

Posted by "Mark R. Diggory" <md...@latte.harvard.edu>.
Craig R. McClanahan wrote:

>On Thu, 3 Jul 2003, Al Chou wrote:
>
>  
>
>>Date: Thu, 3 Jul 2003 12:05:49 -0700 (PDT)
>>From: Al Chou <ho...@yahoo.com>
>>Reply-To: Jakarta Commons Developers List <co...@jakarta.apache.org>
>>To: Jakarta Commons Developers List <co...@jakarta.apache.org>
>>Subject: Re: [math] abstact nonsense was Re: [math][functor] More Design
>>    Concerns
>>
>>--- "Mark R. Diggory" <md...@latte.harvard.edu> wrote:
>>    
>>
>>>Anton Tagunov wrote:
>>>      
>>>
>>>>3)
>>>>
>>>>BTW, probably does the future introduction of Generics (Java 1.5)
>>>>promise any opportunities to work with primitive values and yet
>>>>have no code duplication (a bit like STL)?
>>>>
>>>>        
>>>>
>>>I've not spent much time looking at Generics yet. I have a lot to learn
>>>in this area.
>>>      
>>>
>>I believe for our purposes it suffices to describe generics as the ability to
>>write something like (I'm too lazy to look up the exact syntax):
>>
>>ArrayList<Double> myArray = new ArrayList<Double>() ;
>>Double d ;
>>int position = 0 ;
>>
>>myArray.add( new Double( 1.0 ) ) ;
>>
>>// Here's the part where generics make life easier.  Look, Ma, no cast:
>>d = myArray.get( position ) ;
>>
>>    
>>
>
>My understanding is that this is exactly what you'll get from the
>auto-unboxing capability.  The compiler will be able to see that the right
>hand side returns a Double, and generate the code to unbox it into a
>double primitive for you.
>
>This is separate from Generics because it also works in other scenarios:
>
>  Double d1 = new Double(1.0); // A lowly scalar instance of the wrapper
>  double d2 = d1;              // But no cast here either!
>  d1 = d2 + 0.5;               // Or here ... it is bidirectional
>  
>
>From http://java.sun.com/javaone you can download a webcast of the
>Technical General Session on Tuesday morning that covered the 1.5 language
>changes, including both of the topics above.  It was the first time I'd
>ever seen cheering sections for the different proposed features :-).
>
>Craig
>
>  
>
That auto-boxing capability alone would make me jump for joy!
JavaOne: a very worthwhile subscription I can get my employer to cover ;-)

-Mark




Re: [math] abstract nonsense was Re: [math][functor] More Design Concerns

Posted by Al Chou <ho...@yahoo.com>.
--- "Craig R. McClanahan" <cr...@apache.org> wrote:
> My understanding is that this is exactly what you'll get from the
> auto-unboxing capability.  The compiler will be able to see that the right
> hand side returns a Double, and generate the code to unbox it into a
> double primitive for you.
> 
> This is separate from Generics because it also works in other scenarios:
> 
>   Double d1 = new Double(1.0); // A lowly scalar instance of the wrapper
>   double d2 = d1;              // But no cast here either!
>   d1 = d2 + 0.5;               // Or here ... it is bidirectional

IMO, that's even more useful than generics (or limited C++-style templates, if
you prefer to think of them that way).  As a math-using person, I really never
wanted to have to make much of a distinction between a double and a Double
(although I somewhat understand the CS considerations that could make having
the distinction desirable).


Al

=====
Albert Davidson Chou

    Get answers to Mac questions at http://www.Mac-Mgrs.org/ .



Re: [math] abstract nonsense was Re: [math][functor] More Design Concerns

Posted by "Craig R. McClanahan" <cr...@apache.org>.

On Thu, 3 Jul 2003, Al Chou wrote:

> Date: Thu, 3 Jul 2003 12:05:49 -0700 (PDT)
> From: Al Chou <ho...@yahoo.com>
> Reply-To: Jakarta Commons Developers List <co...@jakarta.apache.org>
> To: Jakarta Commons Developers List <co...@jakarta.apache.org>
> Subject: Re: [math] abstact nonsense was Re: [math][functor] More Design
>     Concerns
>
> --- "Mark R. Diggory" <md...@latte.harvard.edu> wrote:
> > Anton Tagunov wrote:
> > > 3)
> > >
> > > BTW, probably does the future introduction of Generics (Java 1.5)
> > > promise any opportunities to work with primitive values and yet
> > > have no code duplication (a bit like STL)?
> > >
> >
> > I've not spent much time looking at Generics yet. I have a lot to learn
> > in this area.
>
> I believe for our purposes it suffices to describe generics as the ability to
> write something like (I'm too lazy to look up the exact syntax):
>
> ArrayList<Double> myArray = new ArrayList<Double>() ;
> Double d ;
> int position = 0 ;
>
> myArray.add( new Double( 1.0 ) ) ;
>
> // Here's the part where generics make life easier.  Look, Ma, no cast:
> d = myArray.get( position ) ;
>

My understanding is that this is exactly what you'll get from the
auto-unboxing capability.  The compiler will be able to see that the right
hand side returns a Double, and generate the code to unbox it into a
double primitive for you.

This is separate from Generics because it also works in other scenarios:

  Double d1 = new Double(1.0); // A lowly scalar instance of the wrapper
  double d2 = d1;              // But no cast here either!
  d1 = d2 + 0.5;               // Or here ... it is bidirectional
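
For reference, a small self-contained sketch of how the two features read once generics and autoboxing were released in Java 5. The class name is made up for illustration. One point relevant to Anton's question: type arguments must be reference types, so a List<double> does not compile and the elements of a List<Double> are still boxed objects.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: generics remove the cast, autoboxing removes the wrapper calls. */
final class GenericsAndBoxingSketch {
    private GenericsAndBoxingSketch() {}

    /** Sums a list of boxed doubles; each element is unboxed automatically. */
    static double sum(List<Double> values) {
        double total = 0.0;
        for (Double v : values) {
            total += v; // auto-unboxing: Double -> double
        }
        return total;
    }

    public static void main(String[] args) {
        List<Double> myArray = new ArrayList<Double>();
        myArray.add(1.0);           // autoboxing: double -> Double
        myArray.add(2.5);

        Double d = myArray.get(0);  // generics: no cast needed
        double d2 = d;              // unboxing, independent of generics
        d = d2 + 0.5;               // boxing in the other direction

        System.out.println(d + " " + sum(myArray));
    }
}
```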

> Al

From http://java.sun.com/javaone you can download a webcast of the
Technical General Session on Tuesday morning that covered the 1.5 language
changes, including both of the topics above.  It was the first time I'd
ever seen cheering sections for the different proposed features :-).

Craig



Re: [math] abstract nonsense was Re: [math][functor] More Design Concerns

Posted by Al Chou <ho...@yahoo.com>.
--- "Mark R. Diggory" <md...@latte.harvard.edu> wrote:
> Anton Tagunov wrote:
> > 3)
> > 
> > BTW, probably does the future introduction of Generics (Java 1.5)
> > promise any opportunities to work with primitive values and yet
> > have no code duplication (a bit like STL)?
> > 
> 
> I've not spent much time looking at Generics yet. I have a lot to learn 
> in this area.

I believe for our purposes it suffices to describe generics as the ability to
write something like (I'm too lazy to look up the exact syntax):

ArrayList<Double> myArray = new ArrayList<Double>() ;
Double d ;
int position = 0 ;

myArray.add( new Double( 1.0 ) ) ;

// Here's the part where generics make life easier.  Look, Ma, no cast:
d = myArray.get( position ) ;


Al

=====
Albert Davidson Chou

    Get answers to Mac questions at http://www.Mac-Mgrs.org/ .



Re: [math] abstract nonsense was Re: [math][functor] More Design Concerns

Posted by "Mark R. Diggory" <md...@latte.harvard.edu>.

Anton Tagunov wrote:
> Hello Mark!
> 
> 1)
> 
> MRD> One idea is it have Custom Iterators. A Custom Iterator could walk
> MRD> through the objects in a collection (or the double values in an array)
> 
> These iterators would also provide other nice capability,
> receiving the values incrementally, "on the fly"
> (say, from an InputStream or properly obtained
> MySQL result set) w/o having to allocate
> intermediate storage for them.

Yes, these are excellent examples of where custom iterators could come in 
handy. I like the InputStream idea, I could also imagine something along 
the lines of a SAX XMLFilter approach that could take SAX Events and 
process them to generate stats.

> 
> 2)
> 
> MRD> Then its up to the implementor of the Iterator how "efficient" it works
> MRD> with the collection or double[], in the double[] case it can just return the
> MRD> value, in the Collection case it may preform a number of tasks prior to 
> MRD> returning a value.
> 
> Looks like it still has to be decided upon
> whether the iterator should return double or Object
> does not it?
> 
> Are the questions
> a) double or Object
> b) iterator or not iterator
> orthogonal?


With a custom iterator, the questions become related in that the regular 
"Iterator" interface returns Objects. I've written a custom 
"DoubleIterator" interface that returns doubles as well as Objects, so 
there's no reason that a "DoubleIterator" can't act as a regular 
"Iterator".

A default approach that deals with objects of "Iterator" could be 
written; in cases where it can be detected (instanceof) that we're 
working with a "DoubleIterator", the Statistic could move to using the 
DoubleIterator interface instead.
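
A minimal sketch of that dispatch, written against modern Java. Only the name DoubleIterator comes from the message above; the method name nextDouble, the ArrayDoubleIterator class, and the MeanSketch statistic are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Assumed shape: an Iterator that can also hand back primitive doubles. */
interface DoubleIterator extends Iterator<Double> {
    double nextDouble();
}

/** Hypothetical iterator over a double[] that never boxes on the fast path. */
final class ArrayDoubleIterator implements DoubleIterator {
    private final double[] values;
    private int index = 0;

    ArrayDoubleIterator(double[] values) { this.values = values; }

    public boolean hasNext() { return index < values.length; }

    public double nextDouble() {
        if (!hasNext()) throw new NoSuchElementException();
        return values[index++];
    }

    public Double next() { return Double.valueOf(nextDouble()); } // Object view

    public void remove() { throw new UnsupportedOperationException(); }
}

/** Hypothetical statistic that prefers the primitive path when it is available. */
final class MeanSketch {
    private MeanSketch() {}

    static double mean(Iterator<? extends Number> it) {
        double sum = 0.0;
        long n = 0;
        if (it instanceof DoubleIterator) {
            DoubleIterator di = (DoubleIterator) it;   // primitive path
            while (di.hasNext()) { sum += di.nextDouble(); n++; }
        } else {
            while (it.hasNext()) { sum += it.next().doubleValue(); n++; } // default path
        }
        return n == 0 ? Double.NaN : sum / n;
    }

    public static void main(String[] args) {
        System.out.println(mean(new ArrayDoubleIterator(new double[] {1.0, 2.0, 3.0})));
        System.out.println(mean(Arrays.asList(1, 2, 3).iterator()));
    }
}
```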


> 
> 3)
> 
> BTW, probably does the future introduction of Generics (Java 1.5)
> promise any opportunities to work with primitive values and yet
> have no code duplication (a bit like STL)?
> 

I've not spent much time looking at Generics yet. I have a lot to learn 
in this area.

> 4)
> 
> Apologies, if this break-in was totally "the wrong sound",
> I certainly lack the knowledge of the current math code
> and interfaces, speaking more "in theory" :-)
> 
> -Anton

No, glad to see more involvement and input from others outside the 
group, we always need more input.

> 
> P.S.
> 
> As for the 'double' vs 'Object' issue, if I ever have to use
> a math library I, as an at most purely applied mathematician,
> (yup, my diploma says I am, but I really doubt that myself :)
> will probably prefer 'double'.
> 

I also think that, in instances where it's important, we should always try to 
get down to "doubles". Object bridging to double can always be part of a 
higher architecture or user implementation.

> But it's not a qualified opinion, plz disregard it :)
> 
> I wouldn't write a separate mail on this as there
> already are qualified advocates for both viewpoints!
> 
> The Functor-style approach looks promising but I've got deep
> reservations about performance (and yes, that's what C-background
> guys will probably think, not only Fortran-background! ;-)
> 

Yes, this has been the argument to date, but I think that it's possible 
to organize the code in a modular fashion without impacting performance 
heavily. We are definitely taking such issues into heavy consideration 
in the direction such projects will take the codebase.

-thanks for the input,
Mark





Re[2]: [math] abstract nonsense was Re: [math][functor] More Design Concerns

Posted by Anton Tagunov <at...@mail.cnt.ru>.
Hello Mark!

1)

MRD> One idea is it have Custom Iterators. A Custom Iterator could walk
MRD> through the objects in a collection (or the double values in an array)

These iterators would also provide other nice capability,
receiving the values incrementally, "on the fly"
(say, from an InputStream or properly obtained
MySQL result set) w/o having to allocate
intermediate storage for them.

2)

MRD> Then its up to the implementor of the Iterator how "efficient" it works
MRD> with the collection or double[], in the double[] case it can just return the
MRD> value, in the Collection case it may preform a number of tasks prior to 
MRD> returning a value.

It looks like it still has to be decided
whether the iterator should return double or Object,
doesn't it?

Are the questions
a) double or Object
b) iterator or not iterator
orthogonal?

3)

BTW, does the future introduction of Generics (Java 1.5) perhaps
promise any opportunities to work with primitive values and yet
have no code duplication (a bit like STL)?

4)

Apologies, if this break-in was totally "the wrong sound",
I certainly lack the knowledge of the current math code
and interfaces, speaking more "in theory" :-)

-Anton

P.S.

As for the 'double' vs 'Object' issue, if I ever have to use
a math library I, as an at most purely applied mathematician,
(yup, my diploma says I am, but I really doubt that myself :)
will probably prefer 'double'.

But it's not a qualified opinion, plz disregard it :)

I wouldn't write a separate mail on this as there
already are qualified advocates for both viewpoints!

The Functor-style approach looks promising but I've got deep
reservations about performance (and yes, that's what C-background
guys will probably think, not only Fortran-background! ;-)

And also if all we need are real value computations,
the 'double' interfaces are probably more "usable",
"user-friendly" and less "scary".




Re: [math] abstract nonsense was Re: [math][functor] More Design Concerns

Posted by "Mark R. Diggory" <md...@latte.harvard.edu>.
Phil Steitz wrote:

> Brent Worden wrote:
>
>>
>> Unfortunately, Java has created a huge distinction between objects and
>> primitives.  They're incompatible types.  Objects have to be treated in a
>> distinctly different manner than primitive values.  I prefer objects over
>> primitives because the other commons projects we depend on are built around
>> objects.  For instance, 90%+ of the functionality in commons-collections is
>> geared towards objects and unusable by our primitive approach.  I wager we
>> could see a significant code reduction in the univariate classes if we could
>> incorporate some of those object driven routines.  Yeah.
>>
> I am not convinced of this.  There really is not that much code there. 
> If what you think you can eliminate is all of the DoubleArray stuff, 
> that is probably true but at a significant loss of performance and 
> flexibility.  I would in any case always want to keep the array-based 
> implementations for speed and ease of use.  That would result in code 
> swell (and smell).  Yuk.

Commons [collections] has a task I've been considering working on, 
having to do with "primitive array collections". I think that our 
DoubleArray objects can eventually be "Collection" objects themselves 
(with our current DoubleArray interface as a "Primitive Array 
Collection" interface). It is not out of the realm of possibility to have 
DoubleArrays polymorph into Collections without impacting our API in 
the least bit. After such an adoption, the lines between Collection and 
DoubleArray objects become less cumbersome.

>
>>
>>> but rather between reals, integers,
>>> complex numbers and more abstract mathematical objects such as group,
>>> field, ring elements or elements of topological spaces with certain
>>> properties. To me, doubles are "natively supported reals" and these are
>>> by far the most important objects that any applied math package will
>>> ever work with.  Almost every (another little pun) real statistical
>>> application uses real-valued random variables, for example.
>>
>>
>>
>> Statistical data analysis also involves dates, times, categories, etc.  None
>> of which can be handled by the univariate classes without converting them to
>> doubles before adding them to the container and reversing the conversion
>> when accessing metrics.  This is hardly convenient to the user.
>
>
> You are missing the point. To use the continuous methods, you *must* 
> convert to real in any case.  I would prefer to have the user control 
> this conversion.  Think through the use cases.  What does the mean of 
> a collection of dates mean? Need to decide discrete vs continuous and 
> set up a mapping -- a *random variable*.  I would prefer to let the 
> user do this explicitly and provide efficient, well-documented 
> computation support in commons-math. For the discrete case, I agree 
> that Frequency can certainly be improved/extended to accommodate 
> different sorts of objects, but there again, it is going to come down 
> to string representation of the discrete values and then floating 
> point computations to analyze the distributions.  I think that Tim's 
> "BeanList" stuff is the kind of thing that we should be looking at in 
> terms of extending to support collections, but even there the linkage 
> to the core computational infrastructure is real-valued properties.


Yes, these are strong points. The user needs to be able to control 
which objects and which methods are used in such a calculation. I was 
looking over BeanListUnivariate since you brought it up as an example, 
and I now have a stronger criticism about the design. We see the 
downfall of things like having the "addValue" method in the Univariate 
interface here. It is a "storage" related method. In this case there are 
methods "cropping" up which do not fit properly with the Univariate 
Interface and there are methods in the Univariate Interface itself, 
which can no longer be implemented in ListUnivariates.

    /* (non-Javadoc)
     * @see org.apache.commons.math.Univariate#addValue(double)
     */
    public void addValue(double v) {
        String msg = "The BeanListUnivariateImpl does not accept values " +
            "through the addValue method.  Because elements of this list " +
            "are JavaBeans, one must be sure to set the 'propertyName' " +
            "property and add new Beans to the underlying list via the " +
            "addBean(Object bean) method";
        throw new UnsupportedOperationException( msg );
    }

    /**
     * Adds a bean to this list.
     *
     * @param bean Bean to add to the list
     */
    public void addObject(Object bean) {
        list.add(bean);
    }

I think we've determined through experience that having interface 
methods which cannot be supported across all implementations is a poor 
design. This is an example of where the current design is failing.  In 
this case what I feel we are seeing is too much of the "type" of the 
"Data Structure" getting bound up in the "statistical operation". Here 
is where separating the concerns of DataStorage and Operation is 
important. This is where the idea of Mathematical Operation Functors 
comes into play. Now, the argument about whether the actual methods that 
calc results should stay double or Object oriented (in input or output) 
is a smaller argument within the bigger problem observed above. It would 
be nice to have a simple means to take a collection, define the objects 
that you want to collect info on, define the method/bean property from 
which values will be gathered, and have the particular statistic you're 
evaluating not have to implement methods to handle these details. 
BeanListUnivariate is a nice first pass, but we see some problems with 
being stuck in the traditional Interface<--Implementation approach for 
this now.

One idea is to have Custom Iterators. A Custom Iterator could walk 
through the objects in a collection (or the double values in an array) 
and evaluate them to collate information, the collection contains the 
objects, the Iterator object encapsulates the functionality for 
translating/mapping between Object and the return value that will be 
applied in the Statistic. We write the Statistic, we provide some 
generic iterators that can be extended, the user extends these to work 
with their collection. Calculating a statistic on a collection is simply 
grabbing that collection, instantiating a particular iterator and 
plugging "it" (not the collection) into the statistic. Then it's up to 
the implementor of the Iterator how efficiently it works with the 
collection or double[]: in the double[] case it can just return the 
value; in the Collection case it may perform a number of tasks prior to 
returning a value.
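
A rough sketch of that idea in modern Java: the iterator owns the Object-to-double mapping, and the statistic only ever sees doubles. MappingDoubleIterator, SampleBean and IteratorMeanSketch are hypothetical names invented here; java.util.function.ToDoubleFunction (Java 8+) stands in for the user-supplied mapping.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Hypothetical iterator that maps each element of a collection to a double. */
final class MappingDoubleIterator<T> {
    private final Iterator<? extends T> source;
    private final ToDoubleFunction<? super T> mapper;

    MappingDoubleIterator(Collection<? extends T> items, ToDoubleFunction<? super T> mapper) {
        this.source = items.iterator();
        this.mapper = mapper;
    }

    boolean hasNext() { return source.hasNext(); }

    /** Applies the user's mapping to the next element. */
    double nextDouble() { return mapper.applyAsDouble(source.next()); }
}

/** Hypothetical bean with one real-valued property. */
final class SampleBean {
    private final double weight;
    SampleBean(double weight) { this.weight = weight; }
    double getWeight() { return weight; }
}

/** The statistic knows nothing about beans or collections, only about doubles. */
final class IteratorMeanSketch {
    private IteratorMeanSketch() {}

    static double mean(MappingDoubleIterator<?> values) {
        double sum = 0.0;
        long n = 0;
        while (values.hasNext()) {
            sum += values.nextDouble();
            n++;
        }
        return n == 0 ? Double.NaN : sum / n;
    }

    public static void main(String[] args) {
        List<SampleBean> beans = Arrays.asList(new SampleBean(2.0), new SampleBean(4.0));
        System.out.println(
            mean(new MappingDoubleIterator<SampleBean>(beans, SampleBean::getWeight)));
    }
}
```

The storage concern (the list of beans) and the operation (the mean) stay separate; only the iterator knows how to turn one into input for the other.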

-Mark

-- 
Mark Diggory
Software Developer
Harvard MIT Data Center
http://www.hmdc.harvard.edu





Re: [math] Fwd: Re: Change utility class to singleton with interface?

Posted by "Mark R. Diggory" <md...@latte.harvard.edu>.
This could be something as simple as:

package org.apache.commons.math.stat2;

import org.apache.commons.math.stat2.univariate.moment.Mean;
import org.apache.commons.math.stat2.univariate.moment.Variance;

public class StatUtils {
   
    private static Mean _mean;
    private static Variance _variance;
   
    public double mean(double[] values) {
        return _mean.evaluate(values);       
    }

    public double variance(double[] values) {
        return _variance.evaluate(values);       
    }
   
    static {
        _mean = new Mean();
        _variance = new Variance();
    }
}

thanks Al,
Mark

Al Chou wrote:

>The beginnings of a thread in the Refactoring Yahoo list that may be of
>interest for our current design discussions:
>
>Change utility class to singleton with interface?
>http://groups.yahoo.com/group/refactoring/message/3797
>
>
>Al
>
>=====
>Albert Davidson Chou
>
>    Get answers to Mac questions at http://www.Mac-Mgrs.org/ .
>




[math] Fwd: Re: Change utility class to singleton with interface?

Posted by Al Chou <ho...@yahoo.com>.
The beginnings of a thread in the Refactoring Yahoo list that may be of
interest for our current design discussions:

Change utility class to singleton with interface?
http://groups.yahoo.com/group/refactoring/message/3797


Al

=====
Albert Davidson Chou

    Get answers to Mac questions at http://www.Mac-Mgrs.org/ .



Re: [math] abstract nonsense was Re: [math][functor] More Design Concerns

Posted by Phil Steitz <ph...@steitz.com>.
Brent Worden wrote:
>>-----Original Message-----
>>From: Phil Steitz [mailto:phil@steitz.com]
>>Sent: Tuesday, July 01, 2003 11:08 PM
>>To: Jakarta Commons Developers List
>>Subject: [math] abstact nonsense was Re: [math][functor] More Design
>>Concerns
>>
>>
>>The changed subject line is a pun that I hope none will find insulting -
>>sort of a little math joke. "Abstract nonsense" is the term that some
>>mathematicians (including some who love the stuff) use to refer to
>>category theory, the birthplace of the functor concept.  To conserve
>>bandwidth, I am going to try to respond to the whole thread in
>>one message.
>>
>>First, I agree that the functor concept, or more importantly functional
>>programming, represents a very powerful technique that is certainly
>>widely relevant and applicable to mathematical programming. Exactly what
>>is relevant and useful to commons-math, however, is not obvious to me.
>>Brent's examples are not compelling to me.  My main concern is that at
>>least initially, commons-math is primarily an applied math package,
>>aimed at direct applications doing computations with real and complex
>>numbers.
> 
> 
> Problem with this is if we add a complex number data type, none of the
> storage and computation facilities can handle the new type.  If we keep
> everything as is, the only way to support complex matrices is to duplicate
> the real matrix functionality in a new complex matrix functionality.  Yuck.
> 
For the specific case of real matrices, you definitely have a point.  My 
opinion, however, is that a) it is not obvious to me that we will be 
adding complex matrices any time soon b) the current RealMatrix 
interface and implementation make essential use of the fact that the 
entries are real -- both for computational efficiency and for ease of 
use and c) we can always insert a base "Matrix" and refactor the 
implementation later without changing the RealMatrix interface, which I 
maintain is *much* better for actual applied work than a "Field element" 
based interface forcing users to cast everything and losing the 
efficiency and ability to represent and work with large matrices that 
the native double[][] implementation provides.

> 
>> I do not see strictly mathematical applications as in scope --
>>at least initially.  By this I mean things like applications to finite
>>fields, groups, etc, which is where I personally see the value of the
>>"abstract nonsense" really kicking in.
>>
>>As I said in an earlier post, I do not see the main distinction to be
>>between "objects" and "primitives"
> 
> 
> Unfortunately, Java has created a huge distinction between objects and
> primitives.  They're incompatible types.  Objects have to be treated in a
> distinctly different manner than primitive values.  I prefer objects over
> primitives because the other commons projects we depend on are built around
> objects.  For instance, 90%+ of the functionality in commons-collections is
> geared towards objects and unusable by our primitive approach.  I wager we
> could see a significant code reduction in the univariate classes if we could
> incorporate some of those object driven routines.  Yeah.
> 
I am not convinced of this.  There really is not that much code there. 
If what you think you can eliminate is all of the DoubleArray stuff, 
that is probably true but at a significant loss of performance and 
flexibility.  I would in any case always want to keep the array-based 
implementations for speed and ease of use.  That would result in code 
swell (and smell).  Yuk.

> 
>>but rather between reals, integers,
>>complex numbers and more abstract mathematical objects such as group,
>>field, ring elements or elements of topological spaces with certain
>>properties. To me, doubles are "natively supported reals" and these are
>>by far the most important objects that any applied math package will
>>ever work with.  Almost every (another little pun) real statistical
>>application uses real-valued random variables, for example.
> 
> 
> Statistical data analysis also involves dates, times, categories, etc.  None
> of which can be handled by the univariate classes without converting them to
> doubles before adding them to the container and reversing the conversion
> when accessing metrics.  This is hardly convenient to the user.

You are missing the point. To use the continuous methods, you *must* 
convert to real in any case.  I would prefer to have the user control 
this conversion.  Think through the use cases.  What does the mean of a 
collection of dates mean? You need to decide discrete vs. continuous and set 
up a mapping -- a *random variable*.  I would prefer to let the user do 
this explicitly and provide efficient, well-documented computation 
support in commons-math. For the discrete case, I agree that Frequency 
can certainly be improved/extended to accommodate different sorts of 
objects, but there again, it is going to come down to string 
representation of the discrete values and then floating point 
computations to analyze the distributions.  I think that Tim's 
"BeanList" stuff is the kind of thing that we should be looking at in 
terms of extending to support collections, but even there the linkage to 
the core computational infrastructure is real-valued properties.
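
As a rough illustration of the user-controlled mapping described above, the sketch below maps dates to reals (days since the epoch), runs an ordinary real-valued mean, and maps the result back. The class and method names are hypothetical, and java.time.LocalDate is a modern Java API chosen only for brevity; the choice of "epoch days" as the mapping is itself an assumption that the user, not the library, would make.

```java
import java.time.LocalDate;
import java.util.List;

// Hypothetical sketch; not part of commons-math.
final class DateMeanSketch {

    // The explicit "random variable": date -> real number (epoch days).
    static double[] toEpochDays(List<LocalDate> dates) {
        double[] values = new double[dates.size()];
        for (int i = 0; i < values.length; i++) {
            values[i] = dates.get(i).toEpochDay();
        }
        return values;
    }

    // Plain real-valued computation on a double[].
    static double mean(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no data");
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Reverse mapping, rounding to the nearest whole day.
    static LocalDate meanDate(List<LocalDate> dates) {
        return LocalDate.ofEpochDay(Math.round(mean(toEpochDays(dates))));
    }
}
```

The rounding step in meanDate is where the user's interpretation shows up: a different application might want to keep the fractional day, or use a different origin or unit altogether.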
> 
> 
>>Brent's "rootfinding" example illustrates what I mean. If this kind of
>>thing is really useful, what is useful is the notion of convergence in a
>>dense linear ordering without endpoints -- moderately interesting from a
>>mathematical standpoint, but not compelling, IMHO, from an engineering or
>>applied math perspective.  The "vector convergence" example is
>>contrived.
> 
> 
> Are finding eigenvalues, iterative refinement of linear system solutions,
> solving of linear systems, finding roots of complex equations, finding roots
> of bivariate equations, etc. contrived examples?  No, they are practical
> applications that could be addressed using a bisection method or other root
> finding techniques. And all of these applications could be addressed using
> the same implementation using different functors.  Yeah. Yeah.

Maybe I am dense, but I do not see the huge value of a DLO convergence 
algorithm for the mathematical problems that you are describing above. 
Sure, you could probably wrap real solutions to these problems in an 
abstract "convergence" functor, but I see no value in this.  You will 
still be doing the real work in the implementations.  If, on the other 
hand, what you were suggesting was something like an abstract field 
class or a functor class based on field operations, that might turn out 
to be useful (and in fact provide some of the basis for a generic 
Matrix), but I simply do not see the need for this now.
> 
> 
>>What is practically valuable in the rootfinding framework is
>>rootfinding for real-valued functions of a real variable.
> 
> 
> No one is talking about taking that away.  I would prefer the convenience
> methods for the standard applications stay in place.  What I would like to
> see is when a complex number data type is added to the library that the root
> finding methods we have in place can be applied to this type.  If we keep
> the solvers as is, we would have to write a whole new root finding framework
> for complex functions.  Yuck. Yuck.

I do not see "root-finding for complex functions" as something that we 
will likely ever implement in a way that is logically similar enough to 
root-finding for R->R functions that a common infrastructure will be 
useful.  I know that is a bold statement, but if you think carefully 
about the use cases, I think (hope?) that you will agree -- i.e., what 
will be left of "rootfinding" when it is abstracted to the level where 
it both makes sense and can be implemented in a utility fashion that can 
be reused across both of these domains is not much.  If you look at NR 
for example, or Colt or other numerics packages, you will see that the 
"special case" of real-valued functions (and matrices) really is 
addressed directly and specifically.  That is not just because these 
guys are all old Fortran programmers who can't handle abstraction (like 
me ;-)).
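
For reference, a minimal sketch of bisection specialized to real-valued functions of a real variable (hypothetical class name, not the commons-math solver API). Note how much of it leans on properties of the reals: the sign of f, the ordering of the bracket endpoints, and the midpoint. The complex numbers have no ordering compatible with their arithmetic, so the sign-change test has no direct analogue there.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical sketch; not the commons-math solver API.
final class BisectionSketch {

    // Finds a root of f in [lo, hi], assuming f is continuous and
    // f(lo), f(hi) have opposite signs.
    static double solve(DoubleUnaryOperator f, double lo, double hi, double tol) {
        double fLo = f.applyAsDouble(lo);
        double fHi = f.applyAsDouble(hi);
        if (fLo == 0.0) {
            return lo;
        }
        if (fHi == 0.0) {
            return hi;
        }
        if (Math.signum(fLo) == Math.signum(fHi)) {
            throw new IllegalArgumentException("root is not bracketed");
        }
        while (hi - lo > tol) {
            double mid = lo + (hi - lo) / 2.0;
            double fMid = f.applyAsDouble(mid);
            if (fMid == 0.0) {
                return mid;
            }
            // Keep the half-interval on which the sign change survives.
            if (Math.signum(fMid) == Math.signum(fLo)) {
                lo = mid;
                fLo = fMid;
            } else {
                hi = mid;
            }
        }
        return lo + (hi - lo) / 2.0;
    }
}
```

A call such as BisectionSketch.solve(x -> x * x - 2.0, 0.0, 2.0, 1e-9) approximates the square root of 2.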

Phil

> 
> Brent Worden
> http://www.brent.worden.org
> 
> 




---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org


RE: [math] abstact nonsense was Re: [math][functor] More Design Concerns

Posted by Brent Worden <br...@worden.org>.
> -----Original Message-----
> From: Phil Steitz [mailto:phil@steitz.com]
> Sent: Tuesday, July 01, 2003 11:08 PM
> To: Jakarta Commons Developers List
> Subject: [math] abstact nonsense was Re: [math][functor] More Design
> Concerns
>
>
> The changed subject line is a pun that I hope none will find insulting -
> sort of a little math joke. "Abstract nonsense" is the term that some
> mathematicians (including some who love the stuff) use to refer to
> category theory, the birthplace of the functor concept.  To conserve
> bandwidth, I am going to try to respond to the whole thread in
> one message.
>
> First, I agree that the functor concept, or more importantly functional
> programming, represents a very powerful technique that is certainly
> widely relevant and applicable to mathematical programming. Exactly what
> is relevant and useful to commons-math, however, is not obvious to me.
> Brent's examples are not compelling to me.  My main concern is that at
> least initially, commons-math is primarily an applied math package,
> aimed at direct applications doing computations with real and complex
> numbers.

The problem with this is that if we add a complex number data type, none of the
storage and computation facilities can handle the new type.  If we keep
everything as is, the only way to support complex matrices is to duplicate
the real matrix functionality in a new complex matrix functionality.  Yuck.

>  I do not see strictly mathematical applications as in scope --
> at least initially.  By this I mean things like applications to finite
> fields, groups, etc, which is where I personally see the value of the
> "abstract nonsense" really kicking in.
>
> As I said in an earlier post, I do not see the main distinction to be
> between "objects" and "primitives"

Unfortunately, Java has created a huge distinction between objects and
primitives.  They're incompatible types.  Objects have to be treated in a
distinctly different manner than primitive values.  I prefer objects over
primitives because the other commons projects we depend on are built around
objects.  For instance, 90%+ of the functionality in commons-collections is
geared towards objects and unusable by our primitive approach.  I wager we
could see a significant code reduction in the univariate classes if we could
incorporate some of those object driven routines.  Yeah.

> but rather between reals, integers,
> complex numbers and more abstract mathematical objects such as group,
> field, ring elements or elements of topological spaces with certain
> properties. To me, doubles are "natively supported reals" and these are
> by far the most important objects that any applied math package will
> ever work with.  Almost every (another little pun) real statistical
> application uses real-valued random variables, for example.

Statistical data analysis also involves dates, times, categories, etc.  None
of which can be handled by the univariate classes without converting them to
doubles before adding them to the container and reversing the conversion
when accessing metrics.  This is hardly convenient to the user.

> Brent's "rootfinding" example illustrates what I mean. If this kind of
> thing is really useful, what is useful is the notion of convergence in a
> dense linear ordering without endpoints -- moderately interesting from a
> mathematical standpoint, but not compelling, IMHO, from an engineering or
> applied math perspective.  The "vector convergence" example is
> contrived.

Are finding eigenvalues, iterative refinement of linear system solutions,
solving of linear systems, finding roots of complex equations, finding roots
of bivariate equations, etc. contrived examples?  No, they are practical
applications that could be addressed using a bisection method or other root
finding techniques.  And all of these applications could be addressed using
the same implementation using different functors.  Yeah. Yeah.
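
One possible reading of "the same implementation using different functors", sketched with hypothetical names (this is an illustration, not code from the thread or from commons-functor): the bisection loop is written once, and the sign test, the midpoint rule and the convergence test are passed in as functors over an arbitrary type T.

```java
import java.util.function.BiPredicate;
import java.util.function.BinaryOperator;
import java.util.function.ToIntFunction;

// Hypothetical sketch of a functor-driven bisection.
final class GenericBisectionSketch {

    // signOfF returns the sign (negative, zero, positive) of the target
    // function at a point; midpoint and converged supply the rest.
    static <T> T solve(ToIntFunction<T> signOfF,
                       BinaryOperator<T> midpoint,
                       BiPredicate<T, T> converged,
                       T lo, T hi) {
        int signLo = Integer.signum(signOfF.applyAsInt(lo));
        int signHi = Integer.signum(signOfF.applyAsInt(hi));
        if (signLo == 0) {
            return lo;
        }
        if (signHi == 0) {
            return hi;
        }
        if (signLo == signHi) {
            throw new IllegalArgumentException("root is not bracketed");
        }
        while (!converged.test(lo, hi)) {
            T mid = midpoint.apply(lo, hi);
            int signMid = Integer.signum(signOfF.applyAsInt(mid));
            if (signMid == 0) {
                return mid;
            }
            if (signMid == signLo) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        return midpoint.apply(lo, hi);
    }
}
```

For T = Double the three functors could be x -> (int) Math.signum(x * x - 2.0), (a, b) -> (a + b) / 2.0 and (a, b) -> Math.abs(b - a) < 1e-9. Whether meaningful sign and midpoint functors exist for other types is exactly the open question in this thread.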

> What is practically valuable in the rootfinding framework is
> rootfinding for real-valued functions of a real variable.

No one is talking about taking that away.  I would prefer the convenience
methods for the standard applications stay in place.  What I would like to
see is when a complex number data type is added to the library that the root
finding methods we have in place can be applied to this type.  If we keep
the solvers as is, we would have to write a whole new root finding framework
for complex functions.  Yuck. Yuck.

Brent Worden
http://www.brent.worden.org

