Posted to dev@commons.apache.org by Phil Steitz <ph...@steitz.com> on 2004/10/24 19:19:20 UTC

[math] Changes to distribution package

Frank has pointed out some limitations in the distribution package. 
Unfortunately, the problems require interface changes to fix, so we need 
to solve them now (i.e., before 1.0 final).  There are basically two 
problems that we need to deal with:

1) There is no way to represent a "mixed" distribution (one which is 
neither continuous nor discrete, e.g. point mass with p = 0.5 at 0, 
uniform density with p = 0.5 between 0 and 1.)

2) There is no way to represent the distribution of a discrete random 
variable that takes non-integer values.

To solve 1), we can add a ProbabilityDistribution interface like so:

public interface ProbabilityDistribution {
     /**
      * For a random variable X whose values are distributed according
      * to this distribution, this method returns P(X &le; x).  In other
      * words, this method represents the (cumulative) distribution function,
      * or CDF, for this distribution.
      *
      * @param x the value at which the distribution function is evaluated.
      * @return cumulative probability that a random variable with this
      * distribution takes a value less than or equal to <code>x</code>
      * @throws MathException if the cumulative probability can not be
      * computed due to convergence or other numerical errors.
      */
     double cumulativeProbability(double x) throws MathException;

     /**
      * For a random variable X whose values are distributed according
      * to this distribution, this method returns P(x0 &le; X &le; x1).
      * <p>
      * This method should always return the same value as
      * <code>cumulativeProbability(x1) - cumulativeProbability(x0)</code>
      *
      * @param x0 the (inclusive) lower bound
      * @param x1 the (inclusive) upper bound
      * @return the probability that a random variable with this distribution
      * will take a value between <code>x0</code> and <code>x1</code>,
      * including the endpoints
      * @throws MathException if the cumulative probability can not be
      * computed due to convergence or other numerical errors.
      * @throws IllegalArgumentException if <code>x0 > x1</code>
      */
     double cumulativeProbability(double x0, double x1) throws MathException;
}

The second method is not strictly necessary, but it is convenient. A default 
implementation could be provided in an AbstractProbabilityDistribution class.
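
To make the default implementation concrete, here is a minimal sketch of what 
AbstractProbabilityDistribution could look like. It is an illustration under 
stated assumptions, not committed code: both endpoints are taken as doubles, 
the "throws MathException" clause is dropped so the example compiles without 
commons-math on the classpath, and UniformZeroOne is a hypothetical class that 
exists only to exercise the default method.

```java
// Sketch only. Names follow the proposal in this message; the checked
// MathException is omitted so the example is self-contained.
interface ProbabilityDistribution {
    /** Returns P(X <= x). */
    double cumulativeProbability(double x);

    /** Returns the probability of falling between x0 and x1. */
    double cumulativeProbability(double x0, double x1);
}

abstract class AbstractProbabilityDistribution implements ProbabilityDistribution {
    /** Default implementation: difference of the CDF at the two endpoints. */
    @Override
    public double cumulativeProbability(double x0, double x1) {
        if (x0 > x1) {
            throw new IllegalArgumentException(
                "lower endpoint (" + x0 + ") must not exceed upper endpoint (" + x1 + ")");
        }
        return cumulativeProbability(x1) - cumulativeProbability(x0);
    }
}

// Hypothetical concrete class: uniform distribution on [0, 1].
class UniformZeroOne extends AbstractProbabilityDistribution {
    @Override
    public double cumulativeProbability(double x) {
        if (x <= 0.0) {
            return 0.0;
        }
        if (x >= 1.0) {
            return 1.0;
        }
        return x;
    }
}
```

One caveat on the design: the difference of CDF values is exactly 
P(x0 < X <= x1). It agrees with the inclusive P(x0 <= X <= x1) only when the 
distribution puts no mass at x0, so subclasses with point masses would need to 
override the default.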

Then the natural thing to do would be to have ContinuousDistribution and 
DiscreteDistribution interfaces extend ProbabilityDistribution and the 
Abstract*Distribution classes extend AbstractProbabilityDistribution.
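
As an illustration of problem 1), the mixed distribution mentioned above (point 
mass of 0.5 at 0, uniform density carrying the other 0.5 on (0, 1)) could 
implement the new interface directly. This is a sketch: the interface is 
trimmed to the single-argument method, the MathException clause is left out, 
and the class name PointMassUniformMixture is made up for the example.

```java
// Sketch only: trimmed version of the proposed interface.
interface ProbabilityDistribution {
    /** Returns P(X <= x). */
    double cumulativeProbability(double x);
}

// Hypothetical mixed distribution: P(X = 0) = 0.5, and with the remaining
// probability 0.5 X is uniformly distributed on (0, 1).
class PointMassUniformMixture implements ProbabilityDistribution {
    @Override
    public double cumulativeProbability(double x) {
        if (x < 0.0) {
            return 0.0;            // below the point mass
        }
        if (x >= 1.0) {
            return 1.0;            // all mass accounted for
        }
        return 0.5 + 0.5 * x;      // jump of 0.5 at 0, then linear growth
    }
}
```

The CDF jumps from 0 to 0.5 at x = 0 and then rises linearly, so 
cumulativeProbability(0.0) is 0.5 and cumulativeProbability(0.5) is 0.75. 
Neither a density nor a probability mass function alone describes this, which 
is why only the CDF lives in the common parent interface.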

To solve 2), I think we need to do something like this:

ProbabilityDistribution (as above)
|
-DiscreteDistribution (adds *only* probability(double x))
  |
  -IntegerDistribution (methods now in DiscreteDistribution)
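
To show what the middle layer buys us, here is a rough sketch of a discrete 
distribution over non-integer values written against DiscreteDistribution. The 
interfaces are trimmed (no MathException, no two-argument CDF), and 
FiniteDiscreteDistribution is a hypothetical class, not something in the 
package. It does not validate that the masses sum to 1.

```java
import java.util.TreeMap;

// Sketch only: trimmed versions of the proposed interfaces.
interface ProbabilityDistribution {
    /** Returns P(X <= x). */
    double cumulativeProbability(double x);
}

interface DiscreteDistribution extends ProbabilityDistribution {
    /** Returns P(X = x). */
    double probability(double x);
}

// Hypothetical distribution with finitely many (possibly non-integer) values.
class FiniteDiscreteDistribution implements DiscreteDistribution {
    // Maps each support point to its probability mass, sorted by value.
    private final TreeMap<Double, Double> masses = new TreeMap<>();

    FiniteDiscreteDistribution(double[] values, double[] probabilities) {
        if (values.length != probabilities.length) {
            throw new IllegalArgumentException("values and probabilities differ in length");
        }
        for (int i = 0; i < values.length; i++) {
            // Repeated values have their masses added together.
            masses.merge(values[i], probabilities[i], Double::sum);
        }
    }

    @Override
    public double probability(double x) {
        return masses.getOrDefault(x, 0.0);
    }

    @Override
    public double cumulativeProbability(double x) {
        double sum = 0.0;
        // Sum the masses of all support points less than or equal to x.
        for (double p : masses.headMap(x, true).values()) {
            sum += p;
        }
        return sum;
    }
}
```

With values {0.5, 1.5, 2.5} and masses {0.25, 0.5, 0.25}, probability(1.5) is 
0.5, probability(1.0) is 0, and cumulativeProbability(1.5) is 0.75.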

AbstractDiscreteDistribution would become AbstractIntegerDistribution and 
would add default implementations for the inherited methods taking doubles 
as arguments. Might be tricky for probability(double), but the CDF 
functions should be OK using floors and ceils.
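
A rough sketch of how those defaults might look, assuming IntegerDistribution 
keeps probability(int) and cumulativeProbability(int) from the current 
DiscreteDistribution. The interfaces are trimmed, MathException is omitted, 
the two-argument CDF is left out, and arguments outside the int range are not 
handled. FairDie is a hypothetical class used only to exercise the defaults.

```java
// Sketch only: trimmed versions of the proposed interfaces.
interface ProbabilityDistribution {
    double cumulativeProbability(double x);
}

interface DiscreteDistribution extends ProbabilityDistribution {
    double probability(double x);
}

interface IntegerDistribution extends DiscreteDistribution {
    double probability(int x);
    double cumulativeProbability(int x);
}

abstract class AbstractIntegerDistribution implements IntegerDistribution {
    /** X only takes integer values, so P(X <= x) = P(X <= floor(x)). */
    @Override
    public double cumulativeProbability(double x) {
        return cumulativeProbability((int) Math.floor(x));
    }

    /** P(X = x) is zero unless x is itself an integer. */
    @Override
    public double probability(double x) {
        if (Math.floor(x) == x) {
            return probability((int) x);
        }
        return 0.0;
    }
}

// Hypothetical concrete class: a fair six-sided die.
class FairDie extends AbstractIntegerDistribution {
    @Override
    public double probability(int x) {
        return (x >= 1 && x <= 6) ? 1.0 / 6.0 : 0.0;
    }

    @Override
    public double cumulativeProbability(int x) {
        if (x < 1) {
            return 0.0;
        }
        if (x >= 6) {
            return 1.0;
        }
        return x / 6.0;
    }
}
```

Here cumulativeProbability(3.7) delegates to cumulativeProbability(3), and 
probability(2.5) returns 0 without touching the integer method. The 
probability(double) case turns out to be simple once the argument is tested 
for being a whole number.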

Any objections to these changes?  Any better ideas?

One more thing.  Frank has suggested that we introduce a Probability class 
to represent the return values of the various distribution and probability 
functions. I have been -0 to introducing this, though I understand the 
value. What do others think?

Phil


---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org


Re: [math] Changes to distribution package

Posted by Phil Steitz <ph...@steitz.com>.
I have made the changes proposed in my earlier message, with the exception 
that I renamed "ProbabilityDistribution" to "Distribution". I want to give folks an 
opportunity to see the changed code before I commit, so I have uploaded 
the new source xref and apidocs here:

http://www.apache.org/~psteitz/commons-math-1.0-RC2/

The relevant changes are in o.a.c.m.distribution. If I don't hear any 
objections in the next couple of days, I will commit and cut RC2.

-Phil



