Posted to dev@commons.apache.org by Phil Steitz <st...@yahoo.com> on 2003/11/12 07:52:47 UTC

[math] re: move to Apache Commons

Here are some comments that may help with the decision on whether or not to
move commons math to Apache Commons and an alternative suggestion.  Sorry
about the length.  I really did try to edit it down...

Commons math originated from a suggestion to add some things to lang's math
package.  The original vision was really a lang-like extension to the JDK
to enable relatively simple and commonly used math and stat functions in
java applications, without requiring large libraries, JNI or other such
baggage.  It quickly became apparent that a) "natural" scope boundaries
were very hard to draw for this kind of thing and b) there was a natural
tension between "providing a simple solution" and "providing the basis for
extension and alternative implementation strategies."   What has emerged is
sort of a mix between the original lang-like extension and something more
headed in the direction of a framework for math/stat computing in Java. 
Recent comments have suggested that Java may not be suitable for numerical
computation (a view that I do not share) and there have been discussions
about various forms of extension beyond Java.  I have always maintained
that the simple lang-like extension stuff fits in Jakarta Commons, while
the math/stat framework stuff does not.  I think that it is to accommodate
the framework and non-Java development ideas that Robert is recommending
the move to Apache Commons.  I agree with him.  I would recommend, however,
that the proposal be rewritten to reflect the broader scope.

Another logical possibility is to split the project into an "Apache math
framework" in Apache Commons and "mathUtils" or some such in Jakarta
Commons. The problems with this approach are of course the need to build
and maintain separate (though there could be lots of overlap) communities,
maintaining consistency / interoperability and managing dependencies.  Of
course, all of this could happen in "Apache math" subprojects (as Robert
suggests); but there might be some advantages to keeping the "utils"
project within Jakarta Commons and allowing it to proceed independently. 
Among these might be, sacrilegious as it may sound, the ability to
duplicate numerics, embedded in direct implementations in "mathUtils" and
wrapped in frameworks, using abstract math objects, etc. in the "Apache
math framework".  I know that sounds lazy and violates the basic principles
of reuse, but in some cases independent development actually leads to
better, more focussed solutions.   

To be fair, I have to admit that the suggestion above is partly motivated
by my personal desire to contribute to (and use) the "mathUtils" stuff and
my lack of alignment with some design/refactoring decisions made some time
ago that moved things more in the "framework" direction in commons math. 
Feel free therefore to disregard these ideas as psychotic ramblings, or
just random noise from the peanut gallery ;-)

Phil

"I am the king of Spain!"
-Nikolai Gogol,  "Diary of a Madman"



---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org


Re: [math] re: move to Apache Commons

Posted by "J.Pietschmann" <j3...@yahoo.de>.
Phil Steitz wrote:
> Recent comments have suggested that Java may not be suitable for numerical
> computation (a view that I do not share)

Well, I think this should be put into context. Let's examine some
approaches to numerical computation:
1. A cleanly designed library of commonly used, somewhat low level
  functionality, like [math]. While it is relatively easy to build
  solutions for complex problems, this approach suffers from lots
  of temporary object creation and data copying. This is hard to avoid
  in Java without giving up data encapsulation and providing ample
  opportunity to users for shooting themselves in the foot. See the
  constructor of CubicSplineFunction for an example.
2. Provide solutions to higher level problems. Inline the code for
  the lower level functionality, do memory allocation less dynamically
  and weigh the usage of abstractions carefully against possible
  object proliferation. For example in a ray tracer, use the vector
  components explicitly instead of using vector objects.
  This is, in general, noticeably faster than approach 1, and a speedup
  of *two* orders of magnitude is possible, although not necessarily
  common.
  Profiled and well optimized Java code run on a HotSpot JVM can be on
  par with average C code with regard to performance.
3. Get a highly optimized C/C++/FORTRAN library (possibly including a
  compiler), which takes processor architecture, cache size and
  organization and whatnot into account. A performance improvement of
  another order of magnitude compared to approach 2 is not unheard of.
I tried an EMF simulator two years ago, and when built on a generic
Java library, very similar to RealMatrixImpl, a 1000x1000x1000 data
point simulation ran all night. Switching to approach 2 brought it
down to roughly 5min, barely enough to fetch a coffee. The really good
stuff, using C and tricky algorithms specifically designed for EMF
simulations, is nearly interactive, in the 5..10s range.
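
To make the contrast between approaches 1 and 2 concrete, here is a minimal
sketch in modern Java. It is not code from [math]; the names Vec3,
lengthSquaredObjects and lengthSquaredInline are hypothetical and exist only
for illustration. Both methods compute the squared distance from the origin
of a point on a ray. The first goes through immutable vector objects, the
second uses the components directly.

```java
class DotProductSketch {
    /** Hypothetical immutable vector: safe to share, but every operation allocates. */
    static final class Vec3 {
        final double x, y, z;

        Vec3(double x, double y, double z) {
            this.x = x;
            this.y = y;
            this.z = z;
        }

        Vec3 add(Vec3 o) { return new Vec3(x + o.x, y + o.y, z + o.z); }

        Vec3 scale(double s) { return new Vec3(x * s, y * s, z * s); }

        double dot(Vec3 o) { return x * o.x + y * o.y + z * o.z; }
    }

    /** Approach 1: p = origin + t * dir, then p.p, creating two temporary objects per call. */
    static double lengthSquaredObjects(Vec3 origin, Vec3 dir, double t) {
        Vec3 p = origin.add(dir.scale(t));
        return p.dot(p);
    }

    /** Approach 2: the same arithmetic on explicit components, with no allocation. */
    static double lengthSquaredInline(double ox, double oy, double oz,
                                      double dx, double dy, double dz, double t) {
        double px = ox + dx * t;
        double py = oy + dy * t;
        double pz = oz + dz * t;
        return px * px + py * py + pz * pz;
    }

    public static void main(String[] args) {
        Vec3 origin = new Vec3(1, 2, 3);
        Vec3 dir = new Vec3(0, 1, 0);
        System.out.println(lengthSquaredObjects(origin, dir, 2.0));
        System.out.println(lengthSquaredInline(1, 2, 3, 0, 1, 0, 2.0));
    }
}
```

The object version is easier to read and harder to misuse; the inline
version trades that away to avoid temporaries in an inner loop. Current
JIT compilers may remove some of these allocations through escape analysis,
so the actual gap should be measured rather than assumed.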

Summary: whether Java programs can be used for tackling numerical
problems depends on the problem, the problem size, how you want
to solve it, and the tradeoffs you are willing to consider.

> some design/refactoring decisions made some time
> ago that moved things more in the "framework" direction in commons math.

Hm. There are reasons that there are usually a bunch of different
algorithms for solving seemingly the same problem. Which specific
algorithm should be used can heavily depend on the higher level
problem, and a good choice can be a really huge win. And yes,
unsuspecting users are regularly bitten by stock textbook solutions
which are either much too slow or fail unexpectedly.
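
One well-known instance of a textbook solution failing unexpectedly is the
one-pass variance formula. The sketch below is an illustration chosen for
this point, not something taken from the thread or from [math]; the class
and method names are hypothetical. It compares the textbook formula with
Welford's updating algorithm on data with a large common offset, where the
true sample variance is 30.

```java
class VarianceSketch {
    /** Textbook one-pass formula: (sum(x^2) - n * mean^2) / (n - 1). */
    static double textbookVariance(double[] x) {
        double sum = 0.0;
        double sumSq = 0.0;
        for (double v : x) {
            sum += v;
            sumSq += v * v;
        }
        int n = x.length;
        double mean = sum / n;
        // Subtracting two nearly equal, very large numbers loses most digits.
        return (sumSq - n * mean * mean) / (n - 1);
    }

    /** Welford's algorithm: updates the mean and the sum of squared deviations together. */
    static double welfordVariance(double[] x) {
        double mean = 0.0;
        double m2 = 0.0;
        int n = 0;
        for (double v : x) {
            n++;
            double delta = v - mean;
            mean += delta / n;
            m2 += delta * (v - mean);
        }
        return m2 / (n - 1);
    }

    public static void main(String[] args) {
        // Deviations from the mean are -6, -3, 3, 6, so the sample variance is 90 / 3 = 30.
        double[] data = {1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16};
        System.out.println("textbook: " + textbookVariance(data));
        System.out.println("welford:  " + welfordVariance(data));
    }
}
```

With squares near 1e18, double precision cannot resolve a difference of 90,
so the textbook result is far from 30, while the updating algorithm stays
accurate on this input. Both are "the variance", which is the point about
algorithm choice above.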

J.Pietschmann




Re: [math] re: move to Apache Commons

Posted by robert burrell donkin <ro...@blueyonder.co.uk>.
On 14 Nov 2003, at 04:28, Brent Worden wrote:
>> -----Original Message-----
>> From: Phil Steitz [mailto:steitzp@yahoo.com]
>> Sent: Wednesday, November 12, 2003 12:53 AM
>> To: commons-dev@jakarta.apache.org
>> Subject: [math] re: move to Apache Commons

<snip>

>>  I think that it is to accommodate
>> the framework and non-Java development ideas that Robert is 
>> recommending
>> the move to Apache Commons.  I agree with him.
>
> Any non-Java work definitely does not belong in Jakarta, Commons or
> elsewhere.

allowing non-java code isn't the reason why i'm recommending the move. 
moving to apache commons would allow greater freedom than we can 
realistically allow here.

i'm against spinning off any more mailing lists from here. a single 
mailing list has proved very good at creating a single community (and 
prevents concerns about the quality of supervision). that the math 
community seeks a separate mailing list is (IMHO) a sign that math 
is ready to move on.

the jakarta-commons has a set of goals and rules. if math remains here, 
it would need to stay focussed on its original charter. this means a 
single, lightweight component. yes, new related components could be 
developed but the process here is (by necessity) quite a long one.

in apache commons, the math group could work on a collection of 
different products. one might be a more developed framework. another 
might be a lightweight business component (as the proposal originally 
specified). we might also consider continuing development of 
mathematical projects started elsewhere but which have run into 
difficulties (but we should ask for the support of the original authors 
first, of course - ideally, they would come on board as committers). 
there are a lot of possibilities which open up more easily there than 
here.

the links from jakarta (and from the jakarta commons) would be retained.

- robert




Re: [math] re: move to Apache Commons

Posted by Phil Steitz <ph...@steitz.com>.
Brent Worden wrote:
>>-----Original Message-----
>>From: Phil Steitz [mailto:steitzp@yahoo.com]
>>Sent: Wednesday, November 12, 2003 12:53 AM
>>To: commons-dev@jakarta.apache.org
>>Subject: [math] re: move to Apache Commons
>>
>>I have always maintained
>>that the simple lang-like extension stuff fits in Jakarta Commons, while
>>the math/stat framework stuff does not.
> 
> 
> I partially disagree with the framework comment. Mainly, because a precedent
> has been set with commons-logging for allowing such a framework as
> envisioned by the [math] members.  Quoting the [logging] home page: "The
> Logging package is an ultra-thin bridge between different logging libraries.
> Commons components may use the Logging API to remove compile-time and
> run-time dependencies on any particular logging package, and contributors
> may write Log implementations for the library of their choice."  I foresee
> the proposed [math] API as providing the same purpose; providing a
> mathematical API where contributors may write "implementations for the
> library of their choice."

Good point. I guess it really keeps coming back to a discussion of scope 
(as you point out below).

>> I think that it is to accommodate
>>the framework and non-Java development ideas that Robert is recommending
>>the move to Apache Commons.  I agree with him.
> 
> 
> Any non-Java work definitely does not belong in Jakarta, Commons or
> elsewhere.
> 
> 
>>I would recommend, however,
>>that the proposal be rewritten to reflect the broader scope.
> 
> 
> No problem with that.  I will concede, that the [math] group is, IMO, trying
> to take on too many endeavors at once and maybe a reality check is in order.
> 
> In the near-term for [math], this is what I would like to see:
> 1) a 1.0 release
> 2) expand on the 1.0 features for the next release (i.e. add more
> distributions, hypothesis tests, root finders, etc.).
> 3) add ONE new math vertical/discipline for the next release.  For instance,
> we could choose to add an FFT implementation which some people have expressed
> a desire to have.
> 4) make another release.
> 
> For the long-term, I would just keep repeating that cycle.  I think this
> would keep the [math] contributors primarily focused on the things you care
> about, the mathUtils portion, and with some attention allotted to broadening
> [math] into the package of our dreams (or nightmares depending on your point
> of view).

Sounds reasonable to me.

> 
> Brent Worden
> http://www.brent.worden.org
> 


Re: [math] re: move to Apache Commons

Posted by "J.Pietschmann" <j3...@yahoo.de>.
Brent Worden wrote:
> In the near-term for [math], this is what I would like to see:
> 1) a 1.0 release
> 2) expand on the 1.0 features for the next release (i.e. add more
> distributions, hypothesis tests, root finders, etc.).
> 3) add ONE new math vertical/discipline for the next release.  For instance,
> we could choose to add an FFT implementation which some people have expressed
> a desire to have.
> 4) make another release.

+1

J.Pietschmann

> For the long-term, I would just keep repeating that cycle.  I think this
> would keep the [math] contributors primarily focused on the things you care
> about, the mathUtils portion, and with some attention allotted to broadening
> [math] into the package of our dreams (or nightmares depending on your point
> of view).

Hehe. :-)

J.Pietschmann




Re: [math] re: move to Apache Commons

Posted by "Mark R. Diggory" <md...@latte.harvard.edu>.

Al Chou wrote:

>>2) expand on the 1.0 features for the next release (i.e. add more
>>distributions, hypothesis tests, root finders, etc.).
> 

Continue to improve and stabilize our current implementations. Continue 
a strong effort to invite other developers and other open source groups 
to consider Jakarta Math as a vehicle for solid, community based, 
algorithm development.

+1


>>3) add ONE new math vertical/discipline for the next release.  For instance,
>>we could choose to add an FFT implementation which some people have expressed
>>a desire to have.
> 

Modularization and a parent math project are key factors here, if we're 
eventually going for domain specific applications.

> 
> I think any decisions in the vein of (3) should be largely use-case-driven. 

+1

> While it's fun to design and talk about fancy functionality, I have to ask what
> use something like (not to pick on this specifically, but it's a good example
> to my mind) numerical differentiation is likely to be put to by typical
> programmers.  

I can't comment specifically on numerical differentiation.

> Because I've swung in my programming tasks from adaptive-stepsize
> numerical ODE integration to mostly automating/scripting OS tasks, I have no
> experience outside of full numerical computing of what someone would want some
> of Commons Math's current and currently discussed future functionality for.  I
> keep harping on this point, but I'd love to hear about actual usage examples
> from anyone out there.
> 

There are many emerging use cases for numerical libraries in java. My 
particular interest is beginning to lean more and more towards high 
speed computing grids. As these grids continue to evolve, you're going to 
see java become a greater and greater player, and you're going to see many 
of the efforts Apache has been putting in to Java come to fruition. 
Major grid infrastructure providers are already heavily leaning on java 
for distributed computing.

I suspect a lot of these tools' capabilities are already leaning on 
Apache Jakarta and XML technologies, especially with all the efforts of 
the JPackage group to standardize distributing Java on Linux.

Distributed computing is the epitome of "write once, run everywhere". In 
heterogeneous distributed environments, the only thing we may be 
able to count on consistently is Java. And I'm not specifically speaking 
about straight number crunching (in which different schools will always 
argue about performance capabilities), but also about monitoring and 
summarizing distributed computations in real time as well.

-Mark

-- 
Mark Diggory
Software Developer
Harvard MIT Data Center
http://www.hmdc.harvard.edu



RE: [math] re: move to Apache Commons

Posted by Al Chou <ho...@yahoo.com>.
--- Brent Worden <br...@worden.org> wrote:
> 
> > -----Original Message-----
> > From: Phil Steitz [mailto:steitzp@yahoo.com]
> > Sent: Wednesday, November 12, 2003 12:53 AM
> > To: commons-dev@jakarta.apache.org
> > Subject: [math] re: move to Apache Commons
> >
> > I have always maintained
> > that the simple lang-like extension stuff fits in Jakarta Commons, while
> > the math/stat framework stuff does not.
...
> In the near-term for [math], this is what I would like to see:
> 1) a 1.0 release
> 2) expand on the 1.0 features for the next release (i.e. add more
> distributions, hypothesis tests, root finders, etc.).

+1


> 3) add ONE new math vertical/discipline for the next release.  For instance,
> we could choose to add an FFT implementation which some people have expressed
> a desire to have.

I think any decisions in the vein of (3) should be largely use-case-driven. 
While it's fun to design and talk about fancy functionality, I have to ask what
use something like (not to pick on this specifically, but it's a good example
to my mind) numerical differentiation is likely to be put to by typical
programmers.  Because I've swung in my programming tasks from adaptive-stepsize
numerical ODE integration to mostly automating/scripting OS tasks, I have no
experience outside of full numerical computing of what someone would want some
of Commons Math's current and currently discussed future functionality for.  I
keep harping on this point, but I'd love to hear about actual usage examples
from anyone out there.


> 4) make another release.

+1 <g>  Of course I want to see us keep going!



Al




RE: [math] re: move to Apache Commons

Posted by Brent Worden <br...@worden.org>.
> -----Original Message-----
> From: Phil Steitz [mailto:steitzp@yahoo.com]
> Sent: Wednesday, November 12, 2003 12:53 AM
> To: commons-dev@jakarta.apache.org
> Subject: [math] re: move to Apache Commons
>
> I have always maintained
> that the simple lang-like extension stuff fits in Jakarta Commons, while
> the math/stat framework stuff does not.

I partially disagree with the framework comment. Mainly, because a precedent
has been set with commons-logging for allowing such a framework as
envisioned by the [math] members.  Quoting the [logging] home page: "The
Logging package is an ultra-thin bridge between different logging libraries.
Commons components may use the Logging API to remove compile-time and
run-time dependencies on any particular logging package, and contributors
may write Log implementations for the library of their choice."  I foresee
the proposed [math] API as providing the same purpose; providing a
mathematical API where contributors may write "implementations for the
library of their choice."
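
As a rough illustration of what "a mathematical API where contributors may
write implementations" could look like, here is a minimal sketch in modern
Java. Nothing here is the actual or proposed [math] API: RootFinder,
BisectionRootFinder and sqrtOfTwo are hypothetical names invented for this
example. The idea mirrors the [logging] bridge: client code depends only on
a thin interface, and the implementation behind it can be swapped.

```java
import java.util.function.DoubleUnaryOperator;

class PluggableMathSketch {
    /** Hypothetical thin API: callers depend only on this interface. */
    interface RootFinder {
        double solve(DoubleUnaryOperator f, double lo, double hi);
    }

    /** One possible implementation: plain bisection on a bracketing interval. */
    static final class BisectionRootFinder implements RootFinder {
        private final double tolerance;

        BisectionRootFinder(double tolerance) {
            this.tolerance = tolerance;
        }

        @Override
        public double solve(DoubleUnaryOperator f, double lo, double hi) {
            double fLo = f.applyAsDouble(lo);
            if (fLo * f.applyAsDouble(hi) > 0) {
                throw new IllegalArgumentException("root is not bracketed");
            }
            while (hi - lo > tolerance) {
                double mid = 0.5 * (lo + hi);
                double fMid = f.applyAsDouble(mid);
                if (fLo * fMid <= 0) {
                    hi = mid;      // sign change (or exact root) in [lo, mid]
                } else {
                    lo = mid;      // sign change in [mid, hi]
                    fLo = fMid;
                }
            }
            return 0.5 * (lo + hi);
        }
    }

    /** Client code written against the interface only. */
    static double sqrtOfTwo(RootFinder finder) {
        return finder.solve(x -> x * x - 2.0, 0.0, 2.0);
    }

    public static void main(String[] args) {
        System.out.println(sqrtOfTwo(new BisectionRootFinder(1e-12)));
    }
}
```

A contributor could supply a different RootFinder, for example one wrapping
another library, without sqrtOfTwo changing at all.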

>  I think that it is to accommodate
> the framework and non-Java development ideas that Robert is recommending
> the move to Apache Commons.  I agree with him.

Any non-Java work definitely does not belong in Jakarta, Commons or
elsewhere.

> I would recommend, however,
> that the proposal be rewritten to reflect the broader scope.

No problem with that.  I will concede, that the [math] group is, IMO, trying
to take on too many endeavors at once and maybe a reality check is in order.

In the near-term for [math], this is what I would like to see:
1) a 1.0 release
2) expand on the 1.0 features for the next release (i.e. add more
distributions, hypothesis tests, root finders, etc.).
3) add ONE new math vertical/discipline for the next release.  For instance,
we could choose to add an FFT implementation which some people have expressed
a desire to have.
4) make another release.

For the long-term, I would just keep repeating that cycle.  I think this
would keep the [math] contributors primarily focused on the things you care
about, the mathUtils portion, and with some attention allotted to broadening
[math] into the package of our dreams (or nightmares depending on your point
of view).

Brent Worden
http://www.brent.worden.org

