Posted to dev@commons.apache.org by Alex Herbert <al...@gmail.com> on 2022/11/29 12:51:11 UTC

[statistics] Release v1.0

I would like to release Statistics 1.0. I have recently gone over all
the distribution implementations and the test suite. This led to some
fixes in the tests and additional test data, but no bugs were found in
the distributions except in Pareto distribution sampling, which was
fixed in Statistics-59.

I think that all outstanding Jira tasks in the distribution package
have been addressed for Statistics. The remaining Jira tickets are for
items that are not required for a first release. There are many left
open from Google Summer of Code (2019). These could be resolved or
left as markers for the type of work that is yet to be done on the
project. I would recommend closing those that contain no useful
information on future work:

GSoC 2019 progress trackers

https://issues.apache.org/jira/projects/STATISTICS/issues/STATISTICS-17
https://issues.apache.org/jira/projects/STATISTICS/issues/STATISTICS-20

GSoC 2022 unfulfilled project

https://issues.apache.org/jira/projects/STATISTICS/issues/STATISTICS-54
(Note: This is covered by Statistics-15, which summarises the current
stat.descriptive package in a class diagram.)

No use case: BigDecimalStatistics

https://issues.apache.org/jira/projects/STATISTICS/issues/STATISTICS-14

Other pre-release tasks:

1. There is a recent fix for the Pareto distribution sampler that
would require RNG 1.6 to be released. However with no other changes in
the RNG codebase I do not think this is required. The current
Statistics code fixes the bug by wrapping the input RNG.

2. Create a release doc (can be based on the numbers/rng/geometry
release docs for the similar multi-module structure).

Regards,

Alex

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [statistics] Release v1.0

Posted by Alex Herbert <al...@gmail.com>.
On Wed, 30 Nov 2022 at 13:44, Gilles Sadowski <gi...@gmail.com> wrote:
>
> Hello Alex.

<-- cut -->

> >
> > I think that all outstanding Jira tasks in the distribution package
> > have been addressed for Statistics. The remaining Jira tickets are for
> > items that are not required for a first release. There are many left
> > open from Google Summer of Code (2019). These could be resolved or
> > left as markers for the type of work that is yet to be done on the
> > project. I would recommend closing those that contain no useful
> > information on future work:
> >
> > [...]
>
> +1

Done.

>
> >
> > Other pre-release tasks:
> >
> > 1. There is a recent fix for the Pareto distribution sampler that
> > would require RNG 1.6 to be released. However with no other changes in
> > the RNG codebase I do not think this is required. The current
> > Statistics code fixes the bug by wrapping the input RNG.
>
> So this code (in [Statistics]) will be simplified when v1.6 of [RNG]
> is out?

Yes. It would remove the need for the InvertedRNG inner class that
reverses the output from nextDouble. In the createSampler method:

final UniformRandomProvider wrappedRng =
    shape >= 1 ? new InvertedRNG(rng) : rng::nextLong;
return InverseTransformParetoSampler.of(wrappedRng, scale, shape)::sample;

Would revert to the standard:

return InverseTransformParetoSampler.of(rng, scale, shape)::sample;

I could have implemented the same method from RNG in the Pareto
distribution. But this requires a method to transform random bits from
a long to a double in (0, 1] (and not in [0, 1)). This is out of
scope for Statistics, so I just used 1.0 - rng.nextDouble() (with
nextDouble ensured to return a value in [0, 1)).

I think a release of RNG 1.6 would thus remove 4 lines of executable
code, mainly pass through code in the inner class. The way I have done
the fix means that a release of RNG 1.6 would work exactly the same
way with Statistics 1.0 and any later version of Statistics that drops
the internal class. This is because RNG 1.6 switches from using
nextDouble to nextLong as the source of randomness. The Statistics
inner class just passes the nextLong output directly through. It only
modifies nextDouble.

So Statistics 1.0 with RNG 1.6 would naturally pick up the fix and the
improved performance of doing it correctly using bit manipulations to
adjust the double to the range (0, 1].
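
The two ways of producing a double in (0, 1] described above can be
sketched roughly as follows. This is an illustration only, not the code
in [RNG] or [Statistics]: BitSource, InvertedSource and
UnitIntervalSketch are hypothetical names standing in for
UniformRandomProvider, the InvertedRNG inner class and the conversion
done inside the sampler, and the use of the upper 53 bits is an
assumption about how such a conversion is commonly written.

```java
import java.util.SplittableRandom;

/** Hypothetical demo class; not part of Commons RNG or Commons Statistics. */
final class UnitIntervalSketch {
    /** 2^-53, the spacing between doubles built from 53 random bits. */
    static final double TWO_POW_MINUS_53 = 0x1.0p-53;

    private UnitIntervalSketch() {}

    /** Maps 64 random bits to (0, 1]: keep the upper 53 bits, then add one step. */
    static double fromBits(long bits) {
        return ((bits >>> 11) + 1L) * TWO_POW_MINUS_53;
    }

    public static void main(String[] args) {
        SplittableRandom random = new SplittableRandom(42L);
        BitSource source = random::nextLong;
        BitSource inverted = new InvertedSource(source);

        // Workaround style: 1.0 - nextDouble() moves [0, 1) to (0, 1].
        System.out.println("1.0 - nextDouble(): " + inverted.nextDouble());
        // Bit-manipulation style: build the (0, 1] value directly from a long.
        System.out.println("from bits:          " + fromBits(source.nextLong()));
        // Extremes of the bit-based mapping: all-zero bits stay above 0,
        // all-one bits give exactly 1.
        System.out.println(fromBits(0L) > 0.0 && fromBits(-1L) == 1.0);
    }
}

/** Hypothetical stand-in for a generator whose only abstract method is nextLong. */
interface BitSource {
    long nextLong();

    /** Default conversion of the upper 53 bits to a double in [0, 1). */
    default double nextDouble() {
        return (nextLong() >>> 11) * UnitIntervalSketch.TWO_POW_MINUS_53;
    }
}

/** Hypothetical wrapper: passes nextLong through and only alters nextDouble. */
final class InvertedSource implements BitSource {
    private final BitSource delegate;

    InvertedSource(BitSource delegate) {
        this.delegate = delegate;
    }

    @Override
    public long nextLong() {
        return delegate.nextLong();
    }

    @Override
    public double nextDouble() {
        // The delegate returns u in [0, 1), so 1 - u lies in (0, 1].
        return 1.0 - delegate.nextDouble();
    }
}
```

Because the wrapper leaves nextLong untouched, a consumer that reads
only nextLong sees the same stream with or without the wrapper, which
is the property relied on above for compatibility with a later RNG
release.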

>
> >
> > 2. Create a release doc (can be based on the numbers/rng/geometry
> > release docs for the similar multi-module structure).
>
> That would be great.

Done.

I have checked all the javadoc and fixed a few inconsistencies. I will
prepare an RC1 for review.

Alex



Re: [statistics] Release v1.0

Posted by Gilles Sadowski <gi...@gmail.com>.
Hello Alex.

On Tue, 29 Nov 2022 at 13:51, Alex Herbert <al...@gmail.com> wrote:
>
> I would like to release Statistics 1.0. I have recently gone over all
> the distribution implementations and the test suite. This led to some
> fixes in the tests and additional test data, but no bugs were found in
> the distributions except in Pareto distribution sampling, which was
> fixed in Statistics-59.

Thanks a lot for all the work!

>
> I think that all outstanding Jira tasks in the distribution package
> have been addressed for Statistics. The remaining Jira tickets are for
> items that are not required for a first release. There are many left
> open from Google Summer of Code (2019). These could be resolved or
> left as markers for the type of work that is yet to be done on the
> project. I would recommend closing those that contain no useful
> information on future work:
>
> [...]

+1

>
> Other pre-release tasks:
>
> 1. There is a recent fix for the Pareto distribution sampler that
> would require RNG 1.6 to be released. However with no other changes in
> the RNG codebase I do not think this is required. The current
> Statistics code fixes the bug by wrapping the input RNG.

So this code (in [Statistics]) will be simplified when v1.6 of [RNG]
is out?

>
> 2. Create a release doc (can be based on the numbers/rng/geometry
> release docs for the similar multi-module structure).

That would be great.

Thanks,
Gilles
