Posted to dev@commons.apache.org by Gilles Sadowski <gi...@gmail.com> on 2021/05/23 14:57:53 UTC

[Math][Numbers][Geometry][Statistics] Road map for next release(s)

Hi.

Following recent discussions (with too few participants), no
consensus emerged about the best way to support the [Math]
component.

I've created a multi-module[1] version of the code base with a
corresponding JIRA issue:
    https://issues.apache.org/jira/browse/MATH-1575

The new layout of the [Math] maven project is in a "git" branch
named "modularized_master":
    https://gitbox.apache.org/repos/asf?p=commons-math.git;a=shortlog;h=refs/heads/modularized_master

It already features several modules:
  * commons-math-transform
  * commons-math-neuralnet
  * commons-math-legacy

There is also
  * commons-math-examples
with "sub-modules" each with an executable application. [See also
MATH-1580]

Branch "modularized_master" is available for review.
[Help needed for the "CheckStyle" issue (MATH-1576).]

Module "commons-math-legacy" contains the code that has not yet
been refactored into dedicated modules, one per functionality.

Functionalities that were discussed relatively recently (candidates
for modularization):
  * Genetic algorithm (in "o.a.c.math4.legacy.genetics")
  * Clustering (in "o.a.c.math4.legacy.ml.clustering")
  * Regression (in "o.a.c.math4.stat.regression")
  * Alternative to JDK "Math" class (in "o.a.c.math4.util.FastMath")
  * ...

Are people (Avijit Basak, Erik Svensson, Samy Badjoudj, ...) who
expressed interest in these areas of CM still willing to contribute?
[Please start new threads for discussing the specifics of each
candidate module.]
Module "neuralnet" can serve as a template: it illustrates the
refactoring aimed at a library JAR depending on Java 8 and on truly
low-level Commons components, such as [RNG] or [Numbers],
and *not* depending on the "legacy" module.

The upcoming version of CM would depend on (non-beta) releases of
  * Commons Numbers
  * Commons Geometry
  * Commons Statistics

Any objection to have those released, and then CM v4.0, ASAP?

Regards,
Gilles

[1] This will unfortunately not fix the (design and maintenance) issues
exposed along the years.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [Math][Numbers][Geometry][Statistics] Road map for next release(s)

Posted by Erik Svensson <Er...@nasdaq.com>.
Hello!

I/We are still willing to contribute. Sadly, more urgent matters had to be handled, hence my silence.
I will look at the JIRA issue and try to formulate some sort of proposal.

/Erik

Erik Svensson
Principal Architect
Strategic Programs, Platform & Product Engineering



Re: [Math][Numbers][Geometry][Statistics] Road map for next release(s)

Posted by Gilles Sadowski <gi...@gmail.com>.
On Sun, 23 May 2021 at 22:54, Alex Herbert <al...@gmail.com> wrote:
>
> On Sun, 23 May 2021 at 15:58, Gilles Sadowski <gi...@gmail.com> wrote:
>
> >
> > I've created a multi-module[1] version of the code base with a
> > corresponding JIRA issue:
> >     https://issues.apache.org/jira/browse/MATH-1575
>
>
> Thanks. This is more maintainable going forward.
>
>
> > The upcoming version of CM would depend on (non-beta) releases of
> >   * Commons Numbers
> >   * Commons Geometry
> >   * Commons Statistics
> >
> > Any objection to have those released, and then CM v4.0, ASAP?
> >
> [...]
>
>
> Do you propose to release v4 to get a release out with all recent bug fixes

Yes.

> and then work on v5 to resolve major design issues?

This could be done incrementally in 4.x releases if new modules are added
and the corresponding legacy code marked as deprecated but not removed.
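
A minimal sketch of that incremental approach, assuming hypothetical class
names ("ArithmeticMean" for code in a new module, "LegacyMean" for its
counterpart in the "legacy" module); neither exists in the code base. The
legacy entry point stays available but is marked as deprecated and delegates
to the new code:

```java
import java.util.Arrays;

/** Hypothetical class standing in for code moved to a new, focused module. */
final class ArithmeticMean {
    private ArithmeticMean() {}

    /** Returns the arithmetic mean of the values, or NaN for an empty array. */
    static double of(double... values) {
        return Arrays.stream(values).average().orElse(Double.NaN);
    }
}

/** Hypothetical class standing in for code that stays in the "legacy" module. */
final class LegacyMean {
    private LegacyMean() {}

    /**
     * @deprecated Kept (not removed) in 4.x; use {@link ArithmeticMean#of(double...)}.
     */
    @Deprecated
    static double evaluate(double[] values) {
        // Delegate, so that a fix in the new module also reaches legacy callers.
        return ArithmeticMean.of(values);
    }
}

final class DeprecationDemo {
    public static void main(String[] args) {
        double[] data = {1, 2, 3};
        // Existing callers keep compiling (with a deprecation warning).
        System.out.println(LegacyMean.evaluate(data));
        // New callers target the new module directly.
        System.out.println(ArithmeticMean.of(data));
    }
}
```

With this arrangement, the deprecated method could be dropped in a later
major release without having forced users to migrate at 4.0.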

> Or can the
> design issues be isolated to packages and thus v4 would not include
> packages that require a major redesign?

It would not be easy to separate the code that must be released
(because some bug was fixed) from the code that should not be
(because some bug has not been fixed yet)...

> For example I recall there are
> issues with math3.stat.descriptive.moment but the attempt to move these to
> Statistics in GSOC 2019 was not completed. So for example could the entire
> stat package not be released in v4 and new code would be targeted to
> Statistics?

There are design issues (big or small) everywhere; if some "legacy" code
is not released, we would have to support 2 versions of the library
(i.e. the old v3.6.1 for all the bits not released in v4.0).

The overall problem with CM is that we effectively do not support the whole
code base (e.g. several reported bugs linger due to lack of expertise in the
area concerned).
Of course, this was the main reason for developing and releasing more focused
components that were within the area of interest of the currently active
developers.

Regards,
Gilles



Re: [Math][Numbers][Geometry][Statistics] Road map for next release(s)

Posted by Gilles Sadowski <gi...@gmail.com>.
On Sun, 23 May 2021 at 22:54, Alex Herbert <al...@gmail.com> wrote:
>
> On Sun, 23 May 2021 at 15:58, Gilles Sadowski <gi...@gmail.com> wrote:
>
> >
> > I've created a multi-module[1] version of the code base with a
> > corresponding JIRA issue:
> >     https://issues.apache.org/jira/browse/MATH-1575
>
>
> Thanks. This is more maintainable going forward.
>
> [...]
>
> Do you propose to release v4 to get a release out with all recent bug fixes
> and then work on v5 to resolve major design issues? Or can the
> design issues be isolated to packages and thus v4 would not include
> packages that require a major redesign? [...]

As an intermediate step (which Samy Badjoudj seems to have undertaken),
the "legacy" module itself could be split into several modules, e.g.:
  * commons-math-legacy-exception
  * commons-math-legacy-linear
  * ...
Those would still be "legacy": the split would primarily deal with
removing spurious dependencies, and the modules would still contain code
that IMO should be abandoned in non-"legacy" code (like the whole
exception "infrastructure").
More involved design issues (probably requiring more in-depth knowledge
of the algorithms and their various use-cases) and major tasks (refactoring
the "regression" package, for example) will be left for later.

Gilles



Re: [Math][Numbers][Geometry][Statistics] Road map for next release(s)

Posted by Alex Herbert <al...@gmail.com>.
On Sun, 23 May 2021 at 15:58, Gilles Sadowski <gi...@gmail.com> wrote:

>
> I've created a multi-module[1] version of the code base with a
> corresponding JIRA issue:
>     https://issues.apache.org/jira/browse/MATH-1575


Thanks. This is more maintainable going forward.


> The upcoming version of CM would depend on (non-beta) releases of
>   * Commons Numbers
>   * Commons Geometry
>   * Commons Statistics
>
> Any objection to have those released, and then CM v4.0, ASAP?
>

Numbers should be ready pending a few work-in-progress items.

I previously looked at Statistics when increasing the test coverage and
found nothing incorrect. It requires a more thorough review of all the
distributions to be sure. IIUC it was ported from CM so there should not be
anything wrong with the distributions given that they have been in CM for a
long time.

Matt should comment on the state of Geometry.


> [1] This will unfortunately not fix the (design and maintenance) issues
> exposed along the years.


Do you propose to release v4 to get a release out with all recent bug fixes
and then work on v5 to resolve major design issues? Or can the
design issues be isolated to packages and thus v4 would not include
packages that require a major redesign? For example I recall there are
issues with math3.stat.descriptive.moment but the attempt to move these to
Statistics in GSOC 2019 was not completed. So for example could the entire
stat package not be released in v4 and new code would be targeted to
Statistics?

Alex

Re: [Math][Numbers][Geometry][Statistics] Road map for next release(s)

Posted by Matt Juntunen <ma...@hotmail.com>.
> I've created a multi-module[1] version of the code base...

Thanks for doing that.

> Any objection to have those released, and then CM v4.0, ASAP?

+1 for releasing as soon as possible. I'm trying to wrap up the 3D Euclidean norm computation issue in NUMBERS-156 and then commons-numbers is good to go from my point of view. commons-geometry is ready to go from a feature perspective. It just needs an update to use the new Precision.DoubleEquivalence class from commons-numbers (GEOMETRY-124), a small API tweak (GEOMETRY-123), and a non-essential unit test style update (GEOMETRY-122).
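
For readers unfamiliar with why a 3D Euclidean norm needs care: the naive
sqrt(x*x + y*y + z*z) can overflow (or underflow) in the intermediate squares
even when the result itself is representable. The sketch below shows the
textbook remedy of scaling by the largest component. It is only an
illustration under that assumption; the class and method names are
hypothetical, and it is not claimed to be the approach taken in NUMBERS-156.

```java
/** Hypothetical helper; not part of commons-numbers. */
final class ScaledNorm {
    private ScaledNorm() {}

    /** Euclidean norm of (x, y, z), scaled to avoid intermediate overflow. */
    static double euclidean(double x, double y, double z) {
        double ax = Math.abs(x);
        double ay = Math.abs(y);
        double az = Math.abs(z);
        double max = Math.max(ax, Math.max(ay, az));
        // Zero, infinite and NaN maxima are returned as-is (simplified handling).
        if (max == 0 || Double.isInfinite(max) || Double.isNaN(max)) {
            return max;
        }
        // After scaling, every component lies in [0, 1], so the squares are safe.
        double sx = ax / max;
        double sy = ay / max;
        double sz = az / max;
        return max * Math.sqrt(sx * sx + sy * sy + sz * sz);
    }
}

final class ScaledNormDemo {
    public static void main(String[] args) {
        System.out.println(ScaledNorm.euclidean(3, 4, 0)); // prints 5.0
        double big = 1e200;
        // Naive formula: the squares overflow, so the result is infinite.
        System.out.println(Math.sqrt(big * big + big * big + big * big));
        // Scaled formula: the result stays finite.
        System.out.println(ScaledNorm.euclidean(big, big, big));
    }
}
```

Scaling costs three divisions and some accuracy in the last bits, which is
one reason a library implementation may prefer a more elaborate scheme.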

Regards,
Matt J