Posted to commits@commons.apache.org by ah...@apache.org on 2022/03/06 22:54:21 UTC

[commons-statistics] branch master updated (a4925db -> 2238978)

This is an automated email from the ASF dual-hosted git repository.

aherbert pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/commons-statistics.git.


    from a4925db  Correct example code in the user guide
     new 834ac00  Expand the user guide
     new e56c177  Update developer guide
     new 2238978  Update issue tracking guide

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 commons-statistics-distribution/src/site/site.xml  |   2 +-
 .../src/site/xdoc/index.xml                        |   6 +-
 .../statistics/distribution/UserGuideTest.java     | 142 ++++++++++
 src/site/site.xml                                  |   5 +-
 src/site/xdoc/developers.xml                       | 114 ++++----
 src/site/xdoc/index.xml                            |  23 +-
 src/site/xdoc/issue-tracking.xml                   |   2 +-
 src/site/xdoc/userguide/distribution.xml           |  81 ------
 src/site/xdoc/userguide/index.xml                  | 294 ++++++++++++++++++++-
 9 files changed, 520 insertions(+), 149 deletions(-)
 create mode 100644 commons-statistics-distribution/src/test/java/org/apache/commons/statistics/distribution/UserGuideTest.java
 delete mode 100644 src/site/xdoc/userguide/distribution.xml

[commons-statistics] 02/03: Update developer guide

Posted by ah...@apache.org.

aherbert pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/commons-statistics.git

commit e56c177acb8663cea11dc7af44a13d9736298a29
Author: Alex Herbert <ah...@apache.org>
AuthorDate: Sun Mar 6 22:33:29 2022 +0000

    Update developer guide
    
    Remove reference to building using ant.
    
    Remove reference to non-existent wiki.
    
    Add sections about GitHub PRs.
    
    Remove git config for setting core.autocrlf. The project uses a
    .gitattributes file to define this setting for the text files in the
    repository.
    
    Use https links. Fix broken/outdated links.
---
 src/site/xdoc/developers.xml | 114 ++++++++++++++++++++++---------------------
 1 file changed, 59 insertions(+), 55 deletions(-)

diff --git a/src/site/xdoc/developers.xml b/src/site/xdoc/developers.xml
index b495a88..5969191 100644
--- a/src/site/xdoc/developers.xml
+++ b/src/site/xdoc/developers.xml
@@ -55,8 +55,7 @@
             </li>
             <li>
               Like most commons components, Commons Statistics uses Apache Maven as our
-              build tool. The sources can also be built using Ant (a working
-              Ant build.xml is included in the top level project directory).
+              build tool.
               To build Commons Statistics using Maven, you can follow the instructions for
               <a href="http://maven.apache.org/run-maven/index.html">Building a
               project with Maven</a>.
@@ -68,11 +67,6 @@
               Maven.
             </li>
             <li>
-              Have a look at the new features that users and developers have requested
-              on the <a href="http://wiki.apache.org/commons/StatisticsWishList">
-              Statistics Wish List Wiki Page.</a>
-            </li>
-            <li>
               Be sure to join the commons-dev and commons-user
               <a href="mail-lists.html">
                 email lists</a> and use them appropriately (make sure the string
@@ -89,20 +83,25 @@
               directions</a>
               for submitting bugs and search the database to
               determine if an issue exists or has already been dealt with.
-              <p>
-                See the <a href="https://commons.apache.org/statistics/issue-tracking.html">
-                Commons Statistics Issue Tracking Page</a>
-                for more information on how to
-                search for or submit bugs or enhancement requests.
-              </p>
-              <li>
-                Generating patches: The requested format for generating patches is
-                the Unified Diff format, which can be easily generated using the git
-                client or various IDEs.
-                <source>git diff -p > patch </source>
-                Run this command from the top-level project directory (where pom.xml
-                resides).
-              </li>
+              <br/>
+              See the <a href="https://commons.apache.org/statistics/issue-tracking.html">
+              Commons Statistics Issue Tracking Page</a>
+              for more information on how to
+              search for or submit bugs or enhancement requests.
+            </li>
+            <li>
+              Generating patches: The requested format for generating patches is
+              the Unified Diff format, which can be easily generated using the git
+              client or various IDEs.
+              <source>git diff -p > patch </source>
+              Run this command from the top-level project directory (where pom.xml
+              resides).
+            </li>
+            <li>
+              Pull Requests: We accept pull requests (PRs) via the GitHub repository mirror.
+              The Commons Statistics repository can be forked and changes merged via a PR.
+              See <a href="https://docs.github.com/en/pull-requests/collaborating-with-pull-requests">
+              collaborating with pull requests</a> for more information on pull requests.
             </li>
           </ol>
         </p>
@@ -116,7 +115,7 @@
             <li>Start with a post to the commons-dev mailing list, with [Statistics] at
             the beginning of the subject line, followed by a short title
             describing the new feature or enhancement;  for example, "[Statistics]
-            New cryptographically secure generator".
+            New univariate distribution".
             The body of the post should include each of the following items
             (but be <strong>as brief as possible</strong>):
             <ul>
@@ -128,21 +127,32 @@
               useful</li>
             </ul></li>
             <li>Assuming a generally favorable response to the idea on commons-dev,
-            the next step is to add an entry to the
-            <a href="http://wiki.apache.org/commons/StatisticsWishList">Statistics Wish
-            List</a> corresponding to the idea.  Include a reference to the
-            discussion thread. </li>
-            <li>Create a JIRA ticket using the the feature title as the short
+            the next step is to file a report on the issue-tracking system (JIRA).
+            Create a JIRA ticket using the feature title as the short
             description. Incorporate feedback from the initial posting in the
-            description. Add a reference to the JIRA ticket to the WishList entry.
+            description. Add a reference to the discussion thread.
             </li>
-            <li>Submit code as attachments to the JIRA ticket.  Please use one
-            ticket for each feature, adding multiple patches to the ticket
-            as necessary.  Use the git diff command to generate your patches as
-            diffs.  Please do not submit modified copies of existing java files. Be
-            patient (but not <strong>too</strong> patient) with  committers reviewing
-            patches. Post a *nudge* message to commons-dev with a reference to the
-            ticket if a patch goes more than a few days with no comment or commit.
+            <li>Submit code as:
+            <ul>
+              <li>Attachments to the JIRA ticket.  Please use one
+              ticket for each feature, adding multiple patches to the ticket
+              as necessary.  Use the git diff command to generate your patches as
+              diffs.  Please do not submit modified copies of existing java files.
+              </li>
+              <li>A pull request (PR) via GitHub.
+              To link the PR to a corresponding JIRA ticket, prefix the PR title with
+              <code>STATISTICS-xxx:</code> where <code>xxx</code> is the issue number.<br/>
+              Please include quality commit messages with a single line title of about 50
+              characters, followed by a blank line, followed by a more detailed explanation
+              of the changeset. The title should be prefixed with the JIRA ticket number if
+              applicable, e.g. <code>STATISTICS-xxx: New univariate distribution</code>.
+              See <a href="https://git-scm.com/book/en/v2/Distributed-Git-Contributing-to-a-Project">
+              contributing to a project</a> in the git book for guidelines on commit messages.
+              </li>
+            </ul>
+            Be patient (but not <strong>too</strong> patient) with  committers reviewing
+            patches/PRs. Post a *nudge* message to commons-dev with a reference to the
+            ticket if a submission goes more than a few days with no comment or commit.
             </li>
           </ol>
         </p>
@@ -150,28 +160,22 @@
 
       <subsection name='Coding Style'>
         <p>
-          Commons Statistics follows <a href="http://java.sun.com/docs/codeconv/">Code
-          Conventions for the Java Programming Language</a>. As part of the maven
+          Commons Statistics follows <a href="https://www.oracle.com/java/technologies/javase/codeconventions-contents.html">
+          Code Conventions for the Java Programming Language (circa 1999)</a>. As part of the maven
           build process, style checking is performed using the Checkstyle plugin,
-          using the properties specified in <code>checkstyle.xml</code>.
+          using the properties specified in <code>checkstyle.xml</code>. This is based on
+          the default <a href="https://github.com/checkstyle/checkstyle/blob/master/src/main/resources/sun_checks.xml">
+          sun checks</a> defined by the Checkstyle plugin using current Java best practices.
           Committed code <i>should</i> generate no Checkstyle errors.  One thing
           that Checkstyle will complain about is tabs included in the source code.
           Please make sure to set your IDE or editor to use spaces instead of tabs.
         </p>
         <p>
-          Committers should configure the <source>user.name</source>,
-          <source>user.email</source> and <source>core.autocrlf</source>
-          git repository or global settings with <source>git config</source>.
-          The first two settings define the identity and mail of the committer.
-          The third setting deals with line endings to achieve consistency
-          in line endings. Windows users should configure this setting to
-          <source>true</source> (thus forcing git to convert CR/LF line endings
-          in the workspace while maintaining LF only line endings in the repository)
-          while OS X and Linux users should configure it to <source>input</source>
-          (thus forcing git to only strip accidental CR/LF when committing into
-          the repository, but never when cheking out files from the repository). See <a
-          href="http://www.git-scm.com/book/en/Customizing-Git-Git-Configuration">Customizing
-          Git - Git Configuration</a> in the git book for explanation about how to
+          Committers should configure the <code>user.name</code> and <code>user.email</code>
+          git repository or global settings with <code>git config</code>.
+          These settings define the identity and mail of the committer. See <a
+          href="https://www.git-scm.com/book/en/v2/Customizing-Git-Git-Configuration">Customizing
+          Git - Git Configuration</a> in the git book for an explanation about how to
           configure these settings and more.
         </p>
       </subsection>
@@ -191,7 +195,7 @@
           </li>
           <li>
             Commons Statistics javadoc generation supports embedded LaTeX formulas via the
-            <a href="http://www.mathjax.org">MathJax</a> javascript display engine.
+            <a href="https://www.mathjax.org">MathJax</a> javascript display engine.
             To embed mathematical expressions formatted in LaTeX in javadoc, simply surround
             the expression to be formatted with either <code>\(</code> and <code>\)</code>
             for inline formulas (or <code>\[</code> and <code>\]</code> to have the formula
@@ -252,8 +256,8 @@
           </li>
           <li>
             All contributions must comply with the terms of the Apache
-            <a href="http://www.apache.org/licenses/cla.pdf">Contributor License
-            Agreement (CLA)</a>.
+            <a href="https://www.apache.org/licenses/contributor-agreements.html#clas">
+            Contributor License Agreement (CLA)</a>.
           </li>
           <li>
             Patches <i>must</i> be accompanied by a clear reference to a "source"
@@ -266,7 +270,7 @@
             References to source materials covered by restrictive proprietary
             licenses should be avoided.  In particular, contributions should not
             implement or include references to algorithms in
-            <a href="http://www.nr.com/">Numerical Recipes (NR)</a>.
+            <a href="http://numerical.recipes/">Numerical Recipes (NR)</a>.
             Any questions about copyright or patent issues should be raised on
             the commons-dev mailing list before contributing or committing code.
           </li>

[commons-statistics] 01/03: Expand the user guide

Posted by ah...@apache.org.

aherbert pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/commons-statistics.git

commit 834ac00066a25105fb258c40eea19435452d9b71
Author: Alex Herbert <ah...@apache.org>
AuthorDate: Sat Mar 5 08:38:20 2022 +0000

    Expand the user guide
    
    Create a single user guide document based on the template from Commons
    Geometry.
    
    Add more entries to the site 'User Guide' menu.
    
    Added a simple test class to verify the code examples used in the user
    guide.
---
 commons-statistics-distribution/src/site/site.xml  |   2 +-
 .../src/site/xdoc/index.xml                        |   6 +-
 .../statistics/distribution/UserGuideTest.java     | 142 ++++++++++
 src/site/site.xml                                  |   5 +-
 src/site/xdoc/index.xml                            |  23 +-
 src/site/xdoc/userguide/distribution.xml           |  81 ------
 src/site/xdoc/userguide/index.xml                  | 294 ++++++++++++++++++++-
 7 files changed, 460 insertions(+), 93 deletions(-)

diff --git a/commons-statistics-distribution/src/site/site.xml b/commons-statistics-distribution/src/site/site.xml
index bef97f7..cfb6c2e 100644
--- a/commons-statistics-distribution/src/site/site.xml
+++ b/commons-statistics-distribution/src/site/site.xml
@@ -28,7 +28,7 @@
       <item name="Overview" href="index.html"/>
       <item name="Latest API docs (development)"
             href="apidocs/index.html"/>
-      <!-- 
+      <!-- TODO: Uncomment for initial release
       <item name="Javadoc (1.0 release)"
             href="https://commons.apache.org/rng/commons-rng-simple/javadocs/api-1.0/index.html"/>
       -->
diff --git a/commons-statistics-distribution/src/site/xdoc/index.xml b/commons-statistics-distribution/src/site/xdoc/index.xml
index 853cfad..55c6a20 100644
--- a/commons-statistics-distribution/src/site/xdoc/index.xml
+++ b/commons-statistics-distribution/src/site/xdoc/index.xml
@@ -20,14 +20,14 @@
 <document>
 
   <properties>
-    <title>Commons Statistics Distribution</title>
+    <title>Apache Commons Statistics Distribution</title>
   </properties>
 
   <body>
 
     <section name="Apache Commons Statistics: Distribution" href="summary">
       <p>
-        Commons Statistics provides a framework and implementations for commonly used
+        Apache Commons Statistics provides a framework and implementations for commonly used
         probability distributions.
       </p>
 
@@ -48,7 +48,7 @@
 double degreesOfFreedom = 29;
 TDistribution t = TDistribution.of(degreesOfFreedom);
 double lowerTail = t.cumulativeProbability(-2.656);   // P(T(29) &lt;= -2.656)
-double upperTail = t.survivalProbability(2.75);       // P(T(29) &gt;= 2.75)
+double upperTail = t.survivalProbability(2.75);       // P(T(29) &gt; 2.75)
 </source>
 
       <p>
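Note on the comment change above: the survival probability is P(X > x), not P(X >= x). For a continuous distribution such as T(29) the two are equal, so the change only affects notation, but for a discrete distribution they differ by P(X = x). The following is a minimal, self-contained sketch of that difference. It uses a hypothetical fair six-sided die and does not call the Commons Statistics API; the class and method names are illustrative assumptions.

```java
/** Sketch: P(X > x) versus P(X >= x) for a hypothetical fair six-sided die. */
final class SurvivalDemo {
    private SurvivalDemo() {}

    /** P(X = k) for a fair die with faces 1..6; zero elsewhere. */
    static double pmf(int k) {
        return (k >= 1 && k <= 6) ? 1.0 / 6 : 0.0;
    }

    /** Survival function in the strict sense: P(X > x). */
    static double survival(int x) {
        double sum = 0;
        for (int k = 1; k <= 6; k++) {
            if (k > x) {
                sum += pmf(k);
            }
        }
        return sum;
    }

    /** P(X >= x) = P(X > x) + P(X = x). */
    static double atLeast(int x) {
        return survival(x) + pmf(x);
    }

    public static void main(String[] args) {
        System.out.println("P(X > 4)  = " + survival(4));
        System.out.println("P(X >= 4) = " + atLeast(4));
    }
}
```

Here P(X > 4) is about 1/3 while P(X >= 4) is about 1/2, so the choice of inequality matters whenever the distribution has point masses.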
diff --git a/commons-statistics-distribution/src/test/java/org/apache/commons/statistics/distribution/UserGuideTest.java b/commons-statistics-distribution/src/test/java/org/apache/commons/statistics/distribution/UserGuideTest.java
new file mode 100644
index 0000000..ccc0893
--- /dev/null
+++ b/commons-statistics-distribution/src/test/java/org/apache/commons/statistics/distribution/UserGuideTest.java
@@ -0,0 +1,142 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.statistics.distribution;
+
+import java.util.stream.IntStream;
+import org.apache.commons.rng.UniformRandomProvider;
+import org.apache.commons.rng.simple.RandomSource;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.Test;
+
+/**
+ * Test code used in the distributions section of the user guide.
+ */
+class UserGuideTest {
+    @Test
+    void testCDF() {
+        TDistribution t = TDistribution.of(29);
+        double lowerTail = t.cumulativeProbability(-2.656);   // P(T(29) <= -2.656)
+        double upperTail = t.survivalProbability(2.75);       // P(T(29) > 2.75)
+
+        Assertions.assertTrue(lowerTail > upperTail,
+            () -> String.format("Since 2.75 > |-2.656|, expected %s > %s", lowerTail, upperTail));
+    }
+
+    @Test
+    void testProbability() {
+        PoissonDistribution pd = PoissonDistribution.of(1.23);
+        double p1 = pd.probability(5);
+        double p2 = pd.probability(5, 5);
+        double p3 = pd.probability(4, 5);
+
+        Assertions.assertEquals(0, p2);
+        Assertions.assertEquals(p1, p3);
+    }
+
+    @Test
+    void testInverseCDF() {
+        NormalDistribution n = NormalDistribution.of(0, 1);
+        double x1 = n.inverseCumulativeProbability(1e-300);
+        double x2 = n.inverseSurvivalProbability(1e-300);
+
+        Assertions.assertEquals(x1, -x2);
+        Assertions.assertEquals(-37.0471, x1, 1e-3);
+    }
+
+    @Test
+    void testProperties() {
+        ChiSquaredDistribution chi2 = ChiSquaredDistribution.of(42);
+        double df = chi2.getDegreesOfFreedom();    // 42
+        double mean = chi2.getMean();              // 42
+        double var = chi2.getVariance();           // 84
+
+        CauchyDistribution cauchy = CauchyDistribution.of(1.23, 4.56);
+        double location = cauchy.getLocation();    // 1.23
+        double scale = cauchy.getScale();          // 4.56
+        double undefined1 = cauchy.getMean();      // NaN
+        double undefined2 = cauchy.getVariance();  // NaN
+
+        Assertions.assertEquals(42, df);
+        Assertions.assertEquals(42, mean);
+        Assertions.assertEquals(84, var);
+        Assertions.assertEquals(1.23, location);
+        Assertions.assertEquals(4.56, scale);
+        Assertions.assertEquals(Double.NaN, undefined1);
+        Assertions.assertEquals(Double.NaN, undefined2);
+    }
+
+    @Test
+    void testDomain() {
+        BinomialDistribution b = BinomialDistribution.of(13, 0.15);
+        int lower = b.getSupportLowerBound();  // 0
+        int upper = b.getSupportUpperBound();  // 13
+
+        Assertions.assertEquals(0, lower);
+        Assertions.assertEquals(13, upper);
+    }
+
+    @Test
+    void testSampling() {
+        // From Commons RNG Simple
+        UniformRandomProvider rng = RandomSource.KISS.create(123L);
+
+        NormalDistribution n = NormalDistribution.of(0, 1);
+        double x = n.createSampler(rng).sample();
+
+        // Generate a number of samples
+        GeometricDistribution g = GeometricDistribution.of(0.75);
+        int[] k = IntStream.generate(g.createSampler(rng)::sample).limit(100).toArray();
+
+        Assertions.assertTrue(-5 < x && x < 5, () -> Double.toString(x));
+        Assertions.assertEquals(100, k.length);
+    }
+
+    @Test
+    void testComplement() {
+        ChiSquaredDistribution chi2 = ChiSquaredDistribution.of(42);
+        double q1 = 1 - chi2.cumulativeProbability(168);
+        double q2 = chi2.survivalProbability(168);
+
+        Assertions.assertEquals(0, q1);
+        Assertions.assertNotEquals(0, q2);
+
+        // For the table. Deltas are relative: an absolute 1e-3 always passes at this scale.
+        Assertions.assertEquals(0x1.0p-53, 1.110223e-16, 0x1.0p-53 * 1e-3);
+        Assertions.assertEquals(0x1.0p-53, 1 - chi2.cumulativeProbability(166));
+        Assertions.assertEquals(0x1.0p-53, 1 - chi2.cumulativeProbability(167));
+        Assertions.assertEquals(0, 1 - chi2.cumulativeProbability(168));
+        Assertions.assertEquals(0, 1 - chi2.cumulativeProbability(200));
+        Assertions.assertEquals(1.16583e-16, chi2.survivalProbability(166), 1.16583e-16 * 1e-3);
+        Assertions.assertEquals(7.95907e-17, chi2.survivalProbability(167), 7.95907e-17 * 1e-3);
+        Assertions.assertEquals(5.42987e-17, chi2.survivalProbability(168), 5.42987e-17 * 1e-3);
+        Assertions.assertEquals(1.19056e-22, chi2.survivalProbability(200), 1.19056e-22 * 1e-3);
+    }
+
+    @Test
+    void testInverseComplement() {
+        ChiSquaredDistribution chi2 = ChiSquaredDistribution.of(42);
+        double q = 5.43e-17;
+        // Incorrect: p = 1 - q == 1.0 !!!
+        double x1 = chi2.inverseCumulativeProbability(1 - q);
+        // Correct: invert q
+        double x2 = chi2.inverseSurvivalProbability(q);
+
+        Assertions.assertEquals(Double.POSITIVE_INFINITY, x1);
+        Assertions.assertEquals(168.0, x2, 0.1);
+    }
+}
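Note on testComplement and testInverseComplement above: the behaviour they assert follows from double rounding alone. Doubles just below 1.0 are spaced 2^-53 apart, so 1 - q rounds to 1.0 once q falls below half that spacing. The sketch below reproduces this with the standard library only. The class and method names are illustrative assumptions, not part of Commons Statistics.

```java
/** Sketch: why 1 - p cannot represent a tiny upper-tail probability. */
final class ComplementDemo {
    private ComplementDemo() {}

    /** Returns what is left of q after computing p = 1 - q and then 1 - p. */
    static double naiveComplement(double q) {
        double p = 1 - q;   // rounded to the nearest double
        return 1 - p;       // exact for p this close to 1, so any error comes from the line above
    }

    public static void main(String[] args) {
        // The gap between 1.0 and the next double below it is 2^-53.
        System.out.println(1.0 - Math.nextDown(1.0) == 0x1.0p-53);
        // q below half the gap is lost entirely.
        System.out.println(naiveComplement(5.43e-17));
        // q equal to the gap survives exactly.
        System.out.println(naiveComplement(0x1.0p-53));
    }
}
```

This is why the table values computed as 1 - CDF are either 2^-53 or 0, while survivalProbability continues to return distinct small values.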
diff --git a/src/site/site.xml b/src/site/site.xml
index 3305fa1..16252d8 100644
--- a/src/site/site.xml
+++ b/src/site/site.xml
@@ -44,7 +44,10 @@
     </menu>
 
     <menu name="User Guide">
-      <item name="Contents" href="/userguide/index.html"/>
+      <item name="Contents" href="/userguide/index.html#toc"/>
+      <item name="Overview" href="/userguide/index.html#overview"/>
+      <item name="Example Modules" href="/userguide/index.html#example-modules"/>
+      <item name="Probability Distributions" href="/userguide/index.html#distributions"/>
     </menu>
   </body>
 
diff --git a/src/site/xdoc/index.xml b/src/site/xdoc/index.xml
index 83399e2..3850790 100644
--- a/src/site/xdoc/index.xml
+++ b/src/site/xdoc/index.xml
@@ -20,15 +20,32 @@
 <document>
 
   <properties>
-    <title>Commons Statistics</title>
+    <title>Apache Commons Statistics</title>
   </properties>
 
   <body>
 
     <section name="Apache Commons Statistics" href="summary">
       <p>
-        Commons Statistics provides a framework and implementations for commonly used
-        probability distributions.
+        Apache Commons Statistics provides utilities for statistical applications.
+      </p>
+
+      <p>
+        Support is provided for commonly used continuous and discrete distributions,
+        for example:
+      </p>
+
+<source class="prettyprint">
+TDistribution t = TDistribution.of(29);
+double lowerTail = t.cumulativeProbability(-2.656);   // P(T(29) &lt;= -2.656)
+double upperTail = t.survivalProbability(2.75);       // P(T(29) &gt; 2.75)
+
+PoissonDistribution p = PoissonDistribution.of(4.56);
+int x = p.inverseCumulativeProbability(0.99);
+</source>
+
+      <p>
+        For more examples and advanced usage, see the <a href="userguide/index.html">user guide</a>.
       </p>
 
     </section>
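Note on the PoissonDistribution example above: for a discrete distribution the inverse cumulative probability is the smallest integer x with P(X <= x) >= p. The sketch below illustrates that definition with a naive summation of the Poisson probability mass function. It is not the library's implementation, it is only suitable for small means, and its class and method names are illustrative assumptions.

```java
/** Sketch: naive Poisson quantile by summing the probability mass function. */
final class PoissonQuantileDemo {
    private PoissonQuantileDemo() {}

    /** P(X <= x) for a Poisson variable with the given mean (small means only). */
    static double cdf(double mean, int x) {
        double term = Math.exp(-mean);   // P(X = 0)
        double sum = term;
        for (int k = 1; k <= x; k++) {
            term *= mean / k;            // P(X = k) from P(X = k - 1)
            sum += term;
        }
        return sum;
    }

    /** Smallest x >= 0 with P(X <= x) >= p. */
    static int quantile(double mean, double p) {
        if (!(mean > 0 && mean < 700) || !(p >= 0 && p < 1)) {
            throw new IllegalArgumentException("require 0 < mean < 700 and 0 <= p < 1");
        }
        double term = Math.exp(-mean);
        double sum = term;
        int x = 0;
        // Stop if the terms underflow to zero so the loop always terminates.
        while (sum < p && term > 0) {
            x++;
            term *= mean / x;
            sum += term;
        }
        return x;
    }

    public static void main(String[] args) {
        int x = quantile(4.56, 0.99);
        System.out.println("x = " + x + ", P(X <= x) = " + cdf(4.56, x));
    }
}
```

The returned x satisfies P(X <= x) >= 0.99 while P(X <= x - 1) < 0.99, which is the property that distinguishes the discrete inverse from the continuous one.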
diff --git a/src/site/xdoc/userguide/distribution.xml b/src/site/xdoc/userguide/distribution.xml
deleted file mode 100644
index 9e9d437..0000000
--- a/src/site/xdoc/userguide/distribution.xml
+++ /dev/null
@@ -1,81 +0,0 @@
-<?xml version="1.0"?>
-
-<!--
-   Licensed to the Apache Software Foundation (ASF) under one or more
-  contributor license agreements.  See the NOTICE file distributed with
-  this work for additional information regarding copyright ownership.
-  The ASF licenses this file to You under the Apache License, Version 2.0
-  (the "License"); you may not use this file except in compliance with
-  the License.  You may obtain a copy of the License at
-
-       http://www.apache.org/licenses/LICENSE-2.0
-
-   Unless required by applicable law or agreed to in writing, software
-   distributed under the License is distributed on an "AS IS" BASIS,
-   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-   See the License for the specific language governing permissions and
-   limitations under the License.
-  -->
-
-<?xml-stylesheet type="text/xsl" href="./xdoc.xsl"?>
-<document url="distribution.html">
-  <properties>
-    <title>The Commons Math User Guide - Distributions</title>
-  </properties>
-  <body>
-    <section name="1 Probability Distributions">
-      <subsection name="1.1 Overview" href="overview">
-        <p>
-          The distributions package provides a framework and implementations for some commonly used
-          probability distributions. Continuous univariate distributions are represented by implementations of
-          the <a href="../commons-statistics-distribution/apidocs/org/apache/commons/statistics/distribution/ContinuousDistribution.html">ContinuousDistribution</a>
-          interface.  Discrete distributions implement
-          <a href="../commons-statistics-distribution/apidocs/org/apache/commons/statistics/distribution/DiscreteDistribution.html">DiscreteDistribution</a>
-          (values must be mapped to integers).
-        </p>
-      </subsection>
-      <subsection name="1.2 API" href="distributions">
-        <p>
-          The distribution framework provides the means to compute probability density,
-          probability mass and cumulative probability functions for several well-known
-          discrete (integer-valued) and continuous probability distributions.
-          The API also allows for the computation of inverse cumulative probabilities
-          and sampling from distributions.
-        </p>
-        <p>
-          For an instance <code>f</code> of a distribution <code>F</code>,
-          and a domain value, <code>x</code>, <code>f.cumulativeProbability(x)</code>
-          computes <code>P(X &lt;= x)</code> where <code>X</code> is a random variable distributed
-          as <code>F</code>.
-          For <a href="../commons-statistics-distribution/apidocs/org/apache/commons/statistics/distribution/DiscreteDistribution.html">discrete</a>
-          <code>F</code>, the probability mass function is given by <code>f.probability(x)</code>.
-          For <a href="../commons-statistics-distribution/apidocs/org/apache/commons/statistics/distribution/ContinuousDistribution.html">continuous</a>
-          <code>F</code>, the probability density function is given by <code>f.density(x)</code>.
-          Continuous distributions also implement <code>f.probability(x1, x2)</code> for computing
-          <code>P(x1 &lt;= X &lt;= x2)</code>.
-        </p>
-<source class="prettyprint">TDistribution t = TDistribution.of(29);
-double lowerTail = t.cumulativeProbability(-2.656);   // P(T(29) &lt;= -2.656)
-double upperTail = t.survivalProbability(2.75);       // P(T(29) &gt;= 2.75)</source>
-        <p>
-          All distributions implement a <code>createSampler(UniformRandomProvider rng)</code>
-          method to support random sampling from the distribution, where <code>UniformRandomProvider</code>
-          is an interface defined in <a href="https://commons.apache.org/rng">Commons RNG</a>.
-        </p>
-        <p>
-          Inverse distribution functions can be computed using the
-          <code>inverseCumulativeProbability</code> methods.  For continuous <code>f</code>
-          and <code>p</code> a probability, <code>f.inverseCumulativeProbability(p)</code> returns
-          <ul>
-            <li><code>inf{x in R | P(X &le; x) &ge; p} for 0 &lt; p &lt; 1},</code></li>
-            <li><code>inf{x in R | P(X &le; x) &gt; 0} for p = 0}.</code></li>
-          </ul>
-          where <code>X</code> is distributed as <code>F</code>.<br/>
-          For discrete <code>F</code>, the definition is the same, with <code>Z</code> (the integers)
-          in place of <code>R</code> (but note that, in the discrete case, the &ge; in the definition
-          can make a difference when <code>p</code> is an attained value of the distribution).
-        </p>
-      </subsection>
-    </section>
-  </body>
-</document>
diff --git a/src/site/xdoc/userguide/index.xml b/src/site/xdoc/userguide/index.xml
index a845ec0..51ff115 100644
--- a/src/site/xdoc/userguide/index.xml
+++ b/src/site/xdoc/userguide/index.xml
@@ -20,20 +20,306 @@
 <?xml-stylesheet type="text/xsl" href="./xdoc.xsl"?>
 <document url="index.html">
   <properties>
-    <title>The Commons Statistics User Guide - Table of Contents</title>
+    <title>Apache Commons Statistics User Guide</title>
   </properties>
 
   <body>
-    <section name="Table of Contents" href="toc">
 
+    <h1>Apache Commons Statistics User Guide</h1>
+    <section name="Contents" id="toc">
       <ul>
+        <li>
+          <a href="#overview">Overview</a>
+        </li>
+        <li>
+          <a href="#example-modules">Example Modules</a>
+        </li>
+        <li>
+          <a href="#distributions">Probability Distributions</a>
+          <ul>
+            <li>
+              <a href="#dist_overview">Overview</a>
+            </li>
+            <li>
+              <a href="#dist_api">API</a>
+            </li>
+            <li>
+              <a href="#dist_imp_details">Implementation Details</a>
+            </li>
+            <li>
+              <a href="#dist_complements">Complementary Probabilities</a>
+            </li>
+          </ul>
+        </li>
+
+      </ul>
+    </section>
 
+    <section name="Overview" id="overview">
+      <p>
+        Apache Commons Statistics provides utilities for statistical applications. The code
+        originated in the <code><a href="https://commons.apache.org/proper/commons-math/">
+        commons-math</a></code> project but was pulled out into a separate project for better
+        maintainability and has since undergone numerous improvements.
+      </p>
+
+      <p>
+        Commons Statistics is divided into a number of submodules.
+      </p>
+      <ul>
         <li>
-          <a href="distribution.html">
-          1. Distributions</a>
+          <code><a href="../commons-statistics-distribution/index.html">
+          commons-statistics-distribution</a></code> - Provides interfaces
+          and classes for probability distributions.
         </li>
       </ul>
+    </section>
+
+    <section name="Example Modules" id="example-modules">
+      <p>
+        In addition to the modules above, the Commons Statistics
+        <a href="https://commons.apache.org/statistics/download_statistics.cgi">source distribution</a>
+        contains example code demonstrating library functionality and/or providing useful
+        development utilities. These modules are not part of the public API of the library and no
+        guarantees are made concerning backwards compatibility. The
+        <a href="../commons-statistics-examples/modules.html">example module parent page</a>
+        contains a listing of available modules.
+      </p>
+    </section>
+
+    <section name="Probability Distributions" id="distributions">
+      <subsection name="Overview" id="dist_overview">
+        <p>
+          The distributions package provides a framework and implementations for some commonly used
+          probability distributions. Continuous univariate distributions are represented by
+          implementations of the
+          <a href="../commons-statistics-distribution/apidocs/org/apache/commons/statistics/distribution/ContinuousDistribution.html">ContinuousDistribution</a>
+          interface.  Discrete distributions implement
+          <a href="../commons-statistics-distribution/apidocs/org/apache/commons/statistics/distribution/DiscreteDistribution.html">DiscreteDistribution</a>
+          (values must be mapped to integers).
+        </p>
+      </subsection>
+      <subsection name="API" id="dist_api">
+        <p>
+          The distribution framework provides the means to compute probability density,
+          probability mass and cumulative probability functions for several well-known
+          discrete (integer-valued) and continuous probability distributions.
+          The API also allows for the computation of inverse cumulative probabilities
+          and sampling from distributions.
+        </p>
+        <p>
+          For an instance <code>f</code> of a distribution <code>F</code>,
+          and a domain value, <code>x</code>, <code>f.cumulativeProbability(x)</code>
+          computes <code>P(X &lt;= x)</code> where <code>X</code> is a random variable distributed
+          as <code>F</code>. The complement of the cumulative probability,
+          <code>f.survivalProbability(x)</code> computes <code>P(X &gt; x)</code>. Note that
+          the survival probability is approximately equal to <code>1 - P(X &lt;= x)</code> but
+          does not suffer from cancellation error as the cumulative probability approaches 1.
+          The cancellation error may cause a (total) loss of accuracy when
+          <code>P(X &lt;= x) ~ 1</code>
+          (see <a href="#dist_complements">complementary probabilities</a>).
+        </p>
+<source class="prettyprint">
+TDistribution t = TDistribution.of(29);
+double lowerTail = t.cumulativeProbability(-2.656);   // P(T(29) &lt;= -2.656)
+double upperTail = t.survivalProbability(2.75);       // P(T(29) &gt; 2.75)
+</source>
+        <p>
+          For <a href="../commons-statistics-distribution/apidocs/org/apache/commons/statistics/distribution/DiscreteDistribution.html">discrete</a>
+          <code>F</code>, the probability mass function is given by <code>f.probability(x)</code>.
+          For <a href="../commons-statistics-distribution/apidocs/org/apache/commons/statistics/distribution/ContinuousDistribution.html">continuous</a>
+          <code>F</code>, the probability density function is given by <code>f.density(x)</code>.
+          Distributions also implement <code>f.probability(x1, x2)</code> for computing
+          <code>P(x1 &lt;= X &lt;= x2)</code> for continuous or <code>P(x1 &lt; X &lt;= x2)</code>
+          for discrete distributions.
+        </p>
+<source class="prettyprint">
+PoissonDistribution pd = PoissonDistribution.of(1.23);
+double p1 = pd.probability(5);
+double p2 = pd.probability(5, 5);
+double p3 = pd.probability(4, 5);
+// p2 == 0
+// p1 == p3
+</source>
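+        <p>
+          The interval convention for discrete distributions can be illustrated without the
+          library. The following is a minimal sketch, not the library implementation: the class
+          <code>PoissonIntervalSketch</code> and its methods are hypothetical names, and the
+          interval probability is computed as a naive sum of the mass function over
+          <code>x1 &lt; k &lt;= x2</code>.
+        </p>

```java
public class PoissonIntervalSketch {
    /** Poisson probability mass P(X = k), built up as exp(-mean) * mean^k / k!. */
    static double pmf(int k, double mean) {
        if (k < 0) {
            return 0;
        }
        double p = Math.exp(-mean);
        for (int i = 1; i <= k; i++) {
            p *= mean / i;
        }
        return p;
    }

    /** P(x1 < X <= x2): the lower bound is excluded, the upper bound is included. */
    static double probability(int x1, int x2, double mean) {
        double sum = 0;
        for (int k = x1 + 1; k <= x2; k++) {
            sum += pmf(k, mean);
        }
        return sum;
    }

    public static void main(String[] args) {
        double mean = 1.23;
        // Empty interval (5, 5]: no integer k satisfies 5 < k <= 5
        System.out.println(probability(5, 5, mean));
        // Interval (4, 5] contains only k = 5, so this equals pmf(5)
        System.out.println(probability(4, 5, mean) == pmf(5, mean));
    }
}
```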
+        <p>
+          Inverse distribution functions can be computed using the
+          <code>inverseCumulativeProbability</code> and <code>inverseSurvivalProbability</code>
+          methods. For continuous <code>f</code> and <code>p</code> a probability,
+          <code>f.inverseCumulativeProbability(p)</code> returns
+        </p>
+        <ul>
+          <li><code>inf{x in R | P(X &le; x) &ge; p} for 0 &lt; p &le; 1</code>,</li>
+          <li><code>inf{x in R | P(X &le; x) &gt; 0} for p = 0</code></li>
+        </ul>
+        <p>
+          where <code>X</code> is distributed as <code>F</code>.<br/>
+          Likewise <code>f.inverseSurvivalProbability(p)</code> returns
+        </p>
+        <ul>
+          <li><code>inf{x in R | P(X &ge; x) &le; p} for 0 &le; p &lt; 1</code>,</li>
+          <li><code>inf{x in R | P(X &ge; x) &lt; 1} for p = 1</code>.</li>
+        </ul>
+<source class="prettyprint">
+NormalDistribution n = NormalDistribution.of(0, 1);
+double x1 = n.inverseCumulativeProbability(1e-300);
+double x2 = n.inverseSurvivalProbability(1e-300);
+// x1 == -x2 ~ -37.0471
+</source>
+        <p>
+          For discrete <code>F</code>, the definition is the same, with <code>Z</code>
+          (the integers) in place of <code>R</code> (but note that, in the discrete case,
+          the &ge; in the definition can make a difference when <code>p</code> is an attained
+          value of the distribution).
+        </p>
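+        <p>
+          The effect of the inequality at an attained value can be seen with a fair six-sided
+          die. This is a minimal sketch under stated assumptions: the class
+          <code>DiscreteInverseSketch</code> is hypothetical, it searches the support linearly,
+          and it is not how the library computes the inverse. The strict variant exists only
+          for comparison.
+        </p>

```java
public class DiscreteInverseSketch {
    /** CDF of a fair six-sided die: P(X <= k) for integer k. */
    static double cdf(int k) {
        if (k < 1) {
            return 0;
        }
        if (k > 6) {
            return 1;
        }
        return k / 6.0;
    }

    /**
     * inf{x in Z | P(X <= x) >= p} for 0 < p <= 1,
     * inf{x in Z | P(X <= x) > 0} for p = 0.
     */
    static int inverseCdf(double p) {
        if (!(p >= 0 && p <= 1)) {
            throw new IllegalArgumentException("p out of [0, 1]: " + p);
        }
        for (int k = 1; k < 6; k++) {
            if (p == 0 ? cdf(k) > 0 : cdf(k) >= p) {
                return k;
            }
        }
        return 6;
    }

    /** Comparison only: uses a strict inequality, which skips an attained value. */
    static int inverseCdfStrict(double p) {
        for (int k = 1; k < 6; k++) {
            if (cdf(k) > p) {
                return k;
            }
        }
        return 6;
    }

    public static void main(String[] args) {
        // p = 0.5 is attained: P(X <= 3) = 0.5
        System.out.println(inverseCdf(0.5));        // smallest k with cdf(k) >= 0.5
        System.out.println(inverseCdfStrict(0.5));  // smallest k with cdf(k) > 0.5
    }
}
```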
+        <p>
+          All distributions provide accessors for the parameters used to create the distribution,
+          and a mean and variance. The return value when the mean or variance
+          is undefined is noted in the class javadoc.
+        </p>
+<source class="prettyprint">
+ChiSquaredDistribution chi2 = ChiSquaredDistribution.of(42);
+double df = chi2.getDegreesOfFreedom();    // 42
+double mean = chi2.getMean();              // 42
+double var = chi2.getVariance();           // 84
+
+CauchyDistribution cauchy = CauchyDistribution.of(1.23, 4.56);
+double location = cauchy.getLocation();    // 1.23
+double scale = cauchy.getScale();          // 4.56
+double undefined1 = cauchy.getMean();      // NaN
+double undefined2 = cauchy.getVariance();  // NaN
+</source>
+        <p>
+          The supported domain of the distribution is provided by the
+          <code>getSupportLowerBound</code> and <code>getSupportUpperBound</code> methods.
+        </p>
+<source class="prettyprint">
+BinomialDistribution b = BinomialDistribution.of(13, 0.15);
+int lower = b.getSupportLowerBound();  // 0
+int upper = b.getSupportUpperBound();  // 13
+</source>
+        <p>
+          All distributions implement a <code>createSampler(UniformRandomProvider rng)</code>
+          method to support random sampling from the distribution, where <code>UniformRandomProvider</code>
+          is an interface defined in <a href="https://commons.apache.org/rng">Commons RNG</a>.
+          The sampler is a functional interface with a single <code>sample()</code> method
+          suitable for use as a <code>DoubleSupplier</code> or <code>IntSupplier</code> to
+          generate samples.
+        </p>
+<source class="prettyprint">
+// From Commons RNG Simple
+UniformRandomProvider rng = RandomSource.KISS.create(123L);
+
+NormalDistribution n = NormalDistribution.of(0, 1);
+double x = n.createSampler(rng).sample();
 
+// Generate a number of samples
+GeometricDistribution g = GeometricDistribution.of(0.75);
+int[] k = IntStream.generate(g.createSampler(rng)::sample).limit(100).toArray();
+// k.length == 100
+</source>
+        <p>
+          Note that even when distributions are immutable, the sampler is not immutable as it
+          depends on the instance of the mutable <code>UniformRandomProvider</code>. Generation of
+          many samples in a multi-threaded application should use a separate instance of
+          <code>UniformRandomProvider</code> per thread. Any synchronization should be avoided
+          for best performance.
+        </p>
+      </subsection>
+      <subsection name="Implementation Details" id="dist_imp_details">
+        <p>
+          Instances are constructed using factory methods, typically a static method in the
+          distribution class named <code>of</code>. This allows the returned instance
+          to be specialised to the distribution parameters.
+        </p>
+        <p>
+          Exceptions will be raised by the factory method when constructing the distribution
+          using invalid parameters. See the class javadoc for exception conditions.
+        </p>
+        <p>
+          Unless otherwise noted, distribution instances are immutable. This allows sharing
+          an instance between threads for computations.
+        </p>
+        <p>
+          Exceptions will not be raised by distributions for an invalid <code>x</code> argument
+          to probability functions. Typically the cumulative probability functions will return
+          0 or 1 for an out-of-domain argument, depending on which side of the domain bound
+          the argument falls, and the density or probability mass functions return 0.
+          Return values for <code>x</code> arguments when the result is
+          undefined should be documented in the class javadoc. For example the beta distribution
+          is undefined for <code>x = 0, alpha &lt; 1</code> or <code>x = 1, beta &lt; 1</code>.
+          Note: This out-of-domain behaviour may be different from distributions in the
+          <code>org.apache.commons.math3.distribution</code> package. Users upgrading from
+          <code><a href="https://commons.apache.org/proper/commons-math/">commons-math</a></code>
+          should check the appropriate class javadoc.
+        </p>
+        <p>
+          An exception will be raised by distributions for an invalid <code>p</code> argument
+          to inverse probability functions. The argument must be in the range <code>[0, 1]</code>.
+        </p>
+      </subsection>
+      <subsection name="Complementary Probabilities" id="dist_complements">
+        <p>
+          The distributions provide the cumulative probability <code>p</code> and its complement,
+          the survival probability, <code>q = 1 - p</code>. When the probability
+          <code>q</code> is small, using the cumulative probability to compute <code>q</code> can
+          result in a dramatic loss of accuracy. This is because floating-point numbers are
+          approximately
+          <a href="https://en.wikipedia.org/wiki/Reciprocal_distribution">log-uniformly</a>
+          distributed: there are far more
+          representable numbers as the probability value approaches zero than when it approaches
+          one.
+        </p>
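+        <p>
+          The imbalance can be counted directly. The sketch below does not use the library; it
+          assumes IEEE 754 <code>double</code> values, whose bit patterns are ordered like the
+          values themselves for non-negative numbers. The class name
+          <code>DoubleDensitySketch</code> is hypothetical.
+        </p>

```java
public class DoubleDensitySketch {
    /**
     * Number of double values in [lo, hi) for finite 0 <= lo <= hi.
     * For non-negative doubles the IEEE 754 bit patterns increase with the value,
     * so the count is the difference of the raw bit patterns.
     */
    static long countDoubles(double lo, double hi) {
        return Double.doubleToLongBits(hi) - Double.doubleToLongBits(lo);
    }

    public static void main(String[] args) {
        long lowerHalf = countDoubles(0.0, 0.5);
        long upperHalf = countDoubles(0.5, 1.0);
        System.out.println("doubles in [0, 0.5): " + lowerHalf);
        System.out.println("doubles in [0.5, 1): " + upperHalf);
        System.out.println("ratio: " + (lowerHalf / upperHalf));
    }
}
```

+        <p>
+          The upper half of the probability range spans a single binade, whereas the lower
+          half spans every binade below it together with the subnormal numbers.
+        </p>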
+        <p>
+          The difference is illustrated with the result of computing the upper tail of a
+          probability distribution.
+        </p>
+<source class="prettyprint">
+ChiSquaredDistribution chi2 = ChiSquaredDistribution.of(42);
+double q1 = 1 - chi2.cumulativeProbability(168);
+double q2 = chi2.survivalProbability(168);
+// q1 == 0
+// q2 != 0
+</source>
+        <p>
+          In this case the value <code>1 - p</code> has only a single bit of information as
+          <code>x</code> approaches 168. For example the value <code>1 - p(x=167)</code>
+          is <code>2<sup>-53</sup></code> (or approximately <code>1.11e-16</code>).
+          The complement <code>q</code> retains information
+          much further into the long tail as shown in the following table:
+        </p>
+        <table border="1" style="width: auto">
+          <tr><th colspan="3"><font size="+1">Chi-squared distribution, 42 degrees of freedom</font></th></tr>
+          <tr><th>x</th><th>1 - p</th><th>q</th></tr>
+          <tr><td>166</td><td>1.11e-16</td><td>1.16e-16</td></tr>
+          <tr><td>167</td><td>1.11e-16</td><td>7.96e-17</td></tr>
+          <tr><td>168</td><td>0</td><td>5.43e-17</td></tr>
+          <tr><td>200</td><td>0</td><td>1.91e-22</td></tr>
+        </table>
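+        <p>
+          The <code>1 - p</code> column can be reproduced with plain <code>double</code>
+          arithmetic. The sketch below is an illustration only and does not call the library:
+          it takes the rounded <code>q</code> values from the table as inputs, stores
+          <code>p = 1 - q</code> as a <code>double</code>, and then recovers the complement.
+          The class name <code>ComplementRoundTripSketch</code> is hypothetical.
+        </p>

```java
public class ComplementRoundTripSketch {
    /** Recover q after it has been stored as the cumulative probability p = 1 - q. */
    static double viaCumulative(double q) {
        double p = 1.0 - q;  // rounded to the nearest representable double
        return 1.0 - p;      // only multiples of 2^-53 can survive near p = 1
    }

    public static void main(String[] args) {
        double[] qs = {1.16e-16, 7.96e-17, 5.43e-17, 1.91e-22};
        for (double q : qs) {
            System.out.println("q = " + q + ", 1 - p = " + viaCumulative(q));
        }
        // Spacing between 1.0 and the next smaller double
        System.out.println("spacing below 1: " + (1.0 - Math.nextDown(1.0)));
    }
}
```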
+        <p>
+          Probability computations should use the appropriate cumulative or survival function
+          to calculate the lower or upper tail respectively. The same care should be applied
+          when inverting probability distributions. It is preferable to compute whichever of
+          <code>p &le; 0.5</code> or <code>q &le; 0.5</code> applies without loss of accuracy, and then
+          invert the cumulative probability using <code>p</code> or the survival
+          probability using <code>q</code>, respectively, to obtain <code>x</code>.
+        </p>
+<source class="prettyprint">
+ChiSquaredDistribution chi2 = ChiSquaredDistribution.of(42);
+double q = 5.43e-17;
+// Incorrect: p = 1 - q == 1.0 !!!
+double x1 = chi2.inverseCumulativeProbability(1 - q);
+// Correct: invert q
+double x2 = chi2.inverseSurvivalProbability(q);
+// x1 == +infinity
+// x2 ~ 168.0
+</source>
+        <p>
+          Note: The survival probability functions were not present in the
+          <code>org.apache.commons.math3.distribution</code> package. Users upgrading from
+          <code><a href="https://commons.apache.org/proper/commons-math/">commons-math</a></code>
+          should update usage of the cumulative probability functions where appropriate.
+        </p>
+      </subsection>
     </section>
 
   </body>

[commons-statistics] 03/03: Update issue tracking guide

Posted by ah...@apache.org.

aherbert pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/commons-statistics.git

commit 2238978d7f16272eb6c9b1ce61b2c12fa186b6de
Author: Alex Herbert <ah...@apache.org>
AuthorDate: Sun Mar 6 22:36:41 2022 +0000

    Update issue tracking guide
    
    Replace reference to subversion with git.
---
 src/site/xdoc/issue-tracking.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/site/xdoc/issue-tracking.xml b/src/site/xdoc/issue-tracking.xml
index affdff5..8295b45 100644
--- a/src/site/xdoc/issue-tracking.xml
+++ b/src/site/xdoc/issue-tracking.xml
@@ -85,7 +85,7 @@ limitations under the License.
       </p>
 
       <p>
-      For more information on subversion and creating patches see the
+      For more information on git and creating patches see the
       <a href="http://www.apache.org/dev/contributors.html">Apache Contributors Guide</a>.
       </p>