You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by "Christian Winter (Created) (JIRA)" <ji...@apache.org> on 2011/10/18 20:03:10 UTC

[jira] [Created] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Cumulative probability and inverse cumulative probability inconsistencies
-------------------------------------------------------------------------

                 Key: MATH-692
                 URL: https://issues.apache.org/jira/browse/MATH-692
             Project: Commons Math
          Issue Type: Bug
    Affects Versions: 2.2, 2.1, 2.0, 1.2, 1.1, 1.0, 1.3, 2.2.1, 3.0
            Reporter: Christian Winter
             Fix For: 3.0


There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.

Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.

On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:

Distribution:
- cumulativeProbability(double x): returns P(X <= x)
- cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
- inverseCumulativeProbability(double p):
  returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]

1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
- This definition simplifies inverse transform sampling (as mentioned in the other issue).
- It is the standard textbook definition for the quantile function.
- For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.

ContinuousDistribution:
nothing to be added regarding (inverse) cumulative probability functions

IntegerDistribution:
- cumulativeProbability(int x): returns P(X <= x)
- cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Christian Winter (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13138795#comment-13138795 ] 

Christian Winter commented on MATH-692:
---------------------------------------

Hi Sébastien,

the problem with the plateau is indeed one issue which needs to be solved.

Additionally, AbstractDistribution will need an implementation of inverseCumulativeProbability. In fact both implementations should be the same except for the solver to be used. Thus inverseCumulativeProbability should be implemented just once in AbstractDistribution, and invoking the solver should be put to a separate procedure so that it can be overridden in AbstractContinuousDistribution.

A third point is the choice of the solvers. For AbstractDistribution we need a solver which works even for discontinuous cdfs (BisectionSolver can do the job, but maybe the implementations of the faster IllinoisSolver, PegasusSolver, BrentSolver, or another solver can cope with discontinuities, too). For AbstractContinuousDistribution it would be beneficial to use a DifferentiableUnivariateRealSolver. However, the NewtonSolver cannot be used due to uncertainty of convergence and an alternative doesn't seem to exist by now. So we have to choose one of the other solvers for now.

As all these points are interdependent, I guess it's best to solve them as a whole. If you like, you can do this.

Best Regards,
Christian
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Phil Steitz (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13147115#comment-13147115 ] 

Phil Steitz commented on MATH-692:
----------------------------------

Hey, that's how it always works :)  

I don't know about "wise" but I certainly qualify as "old" by any standard, so will have a look once you have reviewed and committed.

Thanks!
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: Math-692_realDomain_patch1.patch
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Phil Steitz (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131245#comment-13131245 ] 

Phil Steitz commented on MATH-692:
----------------------------------

Thanks for raising this issue, Christian - especially now as we finalize the 3.0 API.

I am +1 for these changes.  I agree that the inf-based definition of inverse cum is more standard and we are in a position now make the change, so I say lets do it.  I am also +1 on the move of this up to the distribution interface.  The reason we did not include it there originally was that we thought we might implement distributions for which we could not define inverses.  That has not happened in the last 8 years, so I think its safe enough to push it up.

The code, test, user guide and doc changes for this have to be done carefully.  Patches most welcome.

Is everyone else OK with this change?
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Sébastien Brisard (Commented JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133594#comment-13133594 ] 

Sébastien Brisard commented on MATH-692:
----------------------------------------

Happy to help on the inverse cumulative probability. You will have to be patient and forgieving with me, though, as I discover this part of CM.

As for the definition, I think that one of the bounds should be excluded, so that these cumulative probabilities can be summed
P(a < X <= c) = P(a < X <= b) + P(b < X <= c),
even in the case of discrete PDFs.

Whether the lower or upper bound should be excluded is another matter. I usually work with continuous pdfs, so I don't know if there is a common practice in the probability community. If there is none, I would tend to chose the following definition
P(x0 <= X < x1)
(sorry Phil!), because it would be consistent with the way things are usually indexed in java (a[0].. a[a.length-1]). See also {{org.apache.commons.math.util.MultidimensionalCounter}}. Although this type of consistency is not an absolute requirement, I think it is nice for the user to have such simple principle: "lower bound always included, upper bound always excluded". Appart from this small point, I really have no objection to any choice.
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Sébastien Brisard (Commented JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13136853#comment-13136853 ] 

Sébastien Brisard commented on MATH-692:
----------------------------------------

Hi Christian,
I've started looking into this issue. As I said, you will have to be patient with me ;).
I can see there already is a default implementation of {{AbstractContinuousDistribution.inverseCumulativeProbability}}. So what exactly would you like me to do? Is this implementation fragile? Would you like me to improve robustness? Provide full testing?

I think there might be issues when the PDF falls down to zero in a range (in which case the cum exhibits a plateau). The returned value might differ from the mathematical definition you proposed. Is this what you want me to work on? Have you already identified other issues?

Best regards,
Sébastien
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Sébastien Brisard (Commented JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173481#comment-13173481 ] 

Sébastien Brisard commented on MATH-692:
----------------------------------------

Hi Christian,
thanks for this contribution. I am away for a few days, but am very happy to commit this patch as soon as I am back, if you are not in too much of a hurry.
Thanks again,
Sébastien
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: MATH-692_integerDomain_patch1.patch, Math-692_realDomain_patch1.patch
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Christian Winter (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133718#comment-13133718 ] 

Christian Winter commented on MATH-692:
---------------------------------------

Good, the definition of cum(x0,x1) will be P(x0 < X <= x1). Phil, you are right: cum(x0,x1) in AbstractDistribution was a reason for raising this issue. Another reason was cum(int x0, int x1) in AbstractIntegerDistribution.

The idea behind probability(double x) is in fact to define it as p(X = x) and to return 0 for continuous distributions. This function would be useful for discrete distributions not inheriting from IntergerDistribution and for distributions being composed of discrete and continuous parts.
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Sébastien Brisard (Updated JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sébastien Brisard updated MATH-692:
-----------------------------------

    Comment: was deleted

(was: The committed patch actually causes failure of {{Well1024Test}} in {{o.a.c.m.random}}.)
    
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: MATH-692_integerDomain_patch1.patch, Math-692_realDomain_patch1.patch
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Updated] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Christian Winter (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Winter updated MATH-692:
----------------------------------

    Attachment: MATH-692_integerDomain_patch1.patch

This is the patch which adjusts the integer distributions to the agreements above.

The changes to the test cases for the random generators may be unexpected. But these changes initially were triggered by adjusting {{RandomDataTest.checkNextPoissonConsistency(double)}} to the new contract for integer distributions. Then some random generator tests failed due to chance. While adjusting their seeds, I found some other tests with a high failure probability. Thus I also set some failure probabilities to 0.01 in order to find suitable seeds more quickly.

My next task on this issue is to adjust the user guid.
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: MATH-692_integerDomain_patch1.patch, Math-692_realDomain_patch1.patch
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Christian Winter (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Winter updated MATH-692:
----------------------------------

    Attachment: Math-692_realDomain_patch1.patch

Here is the first patch for this issue (unfortunately with some delay). It adjusts the distributions with real domain to the definitions in this issue, and it mainly changes documentations.

I could not move inverseCumulativeProbability(double) up to Distribution because there would be a conflict with IntegerDistribution.inverseCumulativeProbability(double): This method returns int. This problem will be removed by solving issue MATH-703.

The implementation of inverseCumulativeProbability(double) is not changed as Sébastien is working on this.

I will provide the patch for the integer distributions as soon as I have adjusted the test data to the new inequalities and reverified the adjusted test data.
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: Math-692_realDomain_patch1.patch
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Sébastien Brisard (Commented JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144630#comment-13144630 ] 

Sébastien Brisard commented on MATH-692:
----------------------------------------

Please note that MATH-699 has been created specifically to handle plateaux.

Sébastien
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Sébastien Brisard (Commented JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178048#comment-13178048 ] 

Sébastien Brisard commented on MATH-692:
----------------------------------------

{quote}
I see you already changed the seed in Well1024aTest.
{quote}

Yes I did, but is this really how we want {{Well2004aTest}} to pass?
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: MATH-692_integerDomain_patch1.patch, Math-692_realDomain_patch1.patch
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Sébastien Brisard (Commented JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177903#comment-13177903 ] 

Sébastien Brisard commented on MATH-692:
----------------------------------------

Well, we've recently run into some troubles with SVN, but it seems everything is working fine again. Patch {{MATH-692_integerDomain_patch1.patch}} (with minor checkstyle changes) committed in revision {{1226041}}.

Please do not forget to run {{mvn clean; mvn site:site}} and check the reports (in particular, {{checkstyle}}) prior to submitting a patch!

Thanks for this contribution.
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: MATH-692_integerDomain_patch1.patch, Math-692_realDomain_patch1.patch
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Sébastien Brisard (Commented JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177915#comment-13177915 ] 

Sébastien Brisard commented on MATH-692:
----------------------------------------

The committed patch actually causes failure of {{Well1024Test}} in {{o.a.c.m.random}}.
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: MATH-692_integerDomain_patch1.patch, Math-692_realDomain_patch1.patch
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Christian Winter (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162487#comment-13162487 ] 

Christian Winter commented on MATH-692:
---------------------------------------

As mentioned by Sébastien in MATH-699, the implementation of {{IntegerDistribution.inverseCumulativeProbability(double p)}} can benefit from the ideas which came up for {{RealDistribution.inverseCumulativeProbability(double p)}} in that thread.

Thus I will remove {{getDomainLowerBound(double p)}} and {{getDomainUpperBound(double p)}} from the integer distributions. I checked that all current implementations of the lower/upper bound methods provide the whole support of the distribution as starting bracket. This means that using {{getSupportLowerBound()}} and {{getSupportUpperBound()}} for the starting bracket won't degrade the performance of the current distribution implementations. However, a user might want the improve the performance of his distribution implementations by providing a more targeted starting bracket for probability {{p}}. Thus I will swap the solving step to a protected function {{solveInverseCumulativeProbability(double p, int lower, int upper)}}, so that it gets easy to override {{inverseCumulativeProbability}} with an implementation which finds a better starting bracket.

Furthermore, Phil's idea with Chebyshev's inequality can be applied to the generic implementation of {{inverseCumulativeProbability}} in order to get a better starting bracket.
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: Math-692_realDomain_patch1.patch
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Christian Winter (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13138835#comment-13138835 ] 

Christian Winter commented on MATH-692:
---------------------------------------

Another point for discussion:
I'd like to introduce
getDomainBracket(double p): returns double[]
to AbstractDistribution as helper function for inverseCumulativeProbability. This allows to avoid searching a bracket where a bracket can be specified directly.
The function getDomainBracket could be made abstract (which means to remove getInitialDomain, getDomainLowerBound, and getDomainUpperBound as these functions aren't needed any more), or it could have a default implementation (according to the corresponding part of the current implementation of inverseCumulativeProbability) which uses getInitialDomain, getDomainLowerBound, and getDomainUpperBound. However, getInitialDomain, getDomainLowerBound, and getDomainUpperBound should not be abstract in the latter case. Otherwise a derived class would be forced to implement something it potentially doesn't use. Thus the functions getInitialDomain, getDomainLowerBound, and getDomainUpperBound should have default implementations which either return default values (0, -infinity, +infinity) or throw an exception saying something like "has to be implemented".
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Sébastien Brisard (Commented JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13147514#comment-13147514 ] 

Sébastien Brisard commented on MATH-692:
----------------------------------------

Patch {{Math-692_realDomain_patch1.patch}} (20111108) applied in rev 1200179, with minor modifications (mostly checkstyle fixes).
Thanks Christian!

                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: Math-692_realDomain_patch1.patch
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Issue Comment Edited] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Phil Steitz (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133665#comment-13133665 ] 

Phil Steitz edited comment on MATH-692 at 10/23/11 3:23 PM:
------------------------------------------------------------

Have a look at the default implementation of cum(x0,x1) now in AbstractDistribution.  I think the incorrectness in the documentation there may have been what triggered Christian to raise this issue.  The equation cum(a,b) = F(b) - F(a) where F is the distribution function is natural and what the impl there is trying to do.  In the discrete case, this equation fails, however, unless you define the cum to exclude the *lower* endpoint.  That's why P(x0 < X <= x1) is a better definition.
                
      was (Author: psteitz):
    Have a look at the default implementation of cum(-,-) now in AbstractDistribution.  I think the incorrectness in the documentation there may have been what triggered Christian to raise this issue.  The equation cum(a,b) = F(b) - F(a) where F is the distribution function is natural and what the impl there is trying to do.  In the discrete case, this equation fails, however, unless you define the cum to exclude the *lower* endpoint.  That's why P(x0 < X <= x1) is a better definition.
                  
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Christian Winter (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Winter updated MATH-692:
----------------------------------

    Priority: Minor  (was: Major)
    
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Sébastien Brisard (Commented JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131402#comment-13131402 ] 

Sébastien Brisard commented on MATH-692:
----------------------------------------

I have neither used nor developed this part of CM, so my view on this is of but little value. Having said that, anything improving consistency can only be desirable, especially at this stage. So I'm all for it, and will be soon available (when I'm done on SYMMLQ) for an (novice on these issues) help.

Sébastien
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Phil Steitz (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133665#comment-13133665 ] 

Phil Steitz commented on MATH-692:
----------------------------------

Have a look at the default implementation of cum(-,-) now in AbstractDistribution.  I think the incorrectness in the documentation there may have been what triggered Christian to raise this issue.  The equation cum(a,b) = F(b) - F(a) where F is the distribution function is natural and what the impl there is trying to do.  In the discrete case, this equation fails, however, unless you define the cum to exclude the *lower* endpoint.  That's why P(x0 < X <= x1) is a better definition.
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Phil Steitz (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133464#comment-13133464 ] 

Phil Steitz commented on MATH-692:
----------------------------------

I am not sure it is really makes sense to add probability(double x) to the Distribution interface.  It would have to be defined as density (referring to the distribution function) to make sense in the continuous case, since defined as p(X = x) it would in most cases be identically 0 for continuous distributions.

Regarding the cum definition, I am fine with P(x0 < X <= x1).
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Sébastien Brisard (Commented JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13139105#comment-13139105 ] 

Sébastien Brisard commented on MATH-692:
----------------------------------------

Hi Christian,

{quote}
Hi Sébastien,

the problem with the plateau is indeed one issue which needs to be solved.
{quote}
I'm working on it...

{quote}
Additionally, AbstractDistribution will need an implementation of inverseCumulativeProbability. In fact both implementations should be the same except for the solver to be used. Thus inverseCumulativeProbability should be implemented just once in AbstractDistribution, and invoking the solver should be put to a separate procedure so that it can be overridden in AbstractContinuousDistribution.
{quote}
OK, for now, I'm concentrating on making the current impl in {{AbstractContinuousDistribution}} more robust. The other impl should be easier.

{quote}
A third point is the choice of the solvers. For AbstractDistribution we need a solver which works even for discontinuous cdfs (BisectionSolver can do the job, but maybe the implementations of the faster IllinoisSolver, PegasusSolver, BrentSolver, or another solver can cope with discontinuities, too). For AbstractContinuousDistribution it would be beneficial to use a DifferentiableUnivariateRealSolver. However, the NewtonSolver cannot be used due to uncertainty of convergence and an alternative doesn't seem to exist by now. So we have to choose one of the other solvers for now.
{quote}
The current implementation uses a Brent solver. I think the solver itself is only one side of the issue. The other point is the algorithm used to bracket the solution, in order to ensure that the result is consistent with the definition of the cumprob. As for the {{DifferentiableUnivariateRealSolver}}, I'm not too sure. I guess it depends on what is meant by "continuous distribution". For me, it means that the random variable takes values in a continuous set, and possibly its distribution is defined by a density. However, in my view, nothing prevents occurences of Dirac functions, in which case the cum sum is only piecewise C1. It's all a matter of definition, of course, and I'll ask the question on the forum to check whether or not people want to allow for such a situation.

{quote}
As all these points are interdependent, I guess it's best to solve them as a whole. If you like, you can do this.

Best Regards,
Christian
{quote}
Yes, I'm very interested.

Best regards,
Sébastien
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Christian Winter (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178047#comment-13178047 ] 

Christian Winter commented on MATH-692:
---------------------------------------

Thanks for committing the patch, Sébastien. I see you already changed the seed in {{Well1024aTest}}. This hopefully removes the failure.

I'll have a look into Maven to prepare a better patch next time. :-)
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: MATH-692_integerDomain_patch1.patch, Math-692_realDomain_patch1.patch
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Sébastien Brisard (Commented JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178508#comment-13178508 ] 

Sébastien Brisard commented on MATH-692:
----------------------------------------

{quote}
Thus difficulties in finding a suitable seed are an indicator for problems in the code.
{quote}

That's exactly the point I've raised on the mailing-list: out of three seeds (100, 1000 and 1001), only one works. Of course, I would not dare to call that representative statistics, but I'm wondering whether or not we should be worried...
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: MATH-692_integerDomain_patch1.patch, Math-692_realDomain_patch1.patch
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Phil Steitz (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133745#comment-13133745 ] 

Phil Steitz commented on MATH-692:
----------------------------------

I guess I am OK with pushing p(x) up.  See related post to follow in commons-dev. 
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Sébastien Brisard (Commented JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133668#comment-13133668 ] 

Sébastien Brisard commented on MATH-692:
----------------------------------------

OK, Phil, it makes perfect sense.
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Christian Winter (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133440#comment-13133440 ] 

Christian Winter commented on MATH-692:
---------------------------------------

Thanks for the feedback to all. Sébastien, thanks for offering your help. If you like and find time for it, you could implement AbstractDistribution.inverseCumulativeProbability(double p).

I will provide some patches next week, but adjusting AbstractContinuousDistribution.inverseCumulativeProbability(double p) will take some more time.

After thinking a little more about the structure of the interfaces, I'd like to put the function probability(double x) to Distribution anyway (independently of the thought in point 1) above).

Are there any preferences on P(x0 <= X <= x1) or P(x0 < X <= x1) for cumulativeProbability(double x0, double x1)?
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Christian Winter (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13170588#comment-13170588 ] 

Christian Winter commented on MATH-692:
---------------------------------------

Hi Sébastien,

my changes in the integer distributions don't solve MATH-718. Instead I found a probably related problem with the Pascal distribution.

The integer distribution patch for this issue still isn't ready. I will provide it next week.

Christian
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: Math-692_realDomain_patch1.patch
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Mikkel Meyer Andersen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131407#comment-13131407 ] 

Mikkel Meyer Andersen commented on MATH-692:
--------------------------------------------

+1
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Sébastien Brisard (Commented JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162639#comment-13162639 ] 

Sébastien Brisard commented on MATH-692:
----------------------------------------

Hi Christian,
If you agree with that, I suggest that you also take care of MATH-718, as the two issues seem to be very much connected.
Sébastien
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: Math-692_realDomain_patch1.patch
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Resolved] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Sébastien Brisard (Resolved JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sébastien Brisard resolved MATH-692.
------------------------------------

    Resolution: Fixed

The issue about selection of an appropriate seed has been raised elsewhere. No definitive answer has been provided so far, so I suggest we consider this issue as solved for the time being.
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: MATH-692_integerDomain_patch1.patch, Math-692_realDomain_patch1.patch
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Christian Winter (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178481#comment-13178481 ] 

Christian Winter commented on MATH-692:
---------------------------------------

I guess there is no alternative to this way of making probabilistic test cases pass. However, I understand your bad feeling with this kind of failure fixing. The problem is that probabilistic tests are quiet fuzzy: Neither a passed test nor a failed test provides a clear answer whether something is right or wrong in the implementation. There is just a high chance to pass such a test with a correct implementation. The chance for failure increases with an erroneous implementation due to systematic deviations in the generated data. These chances tell whether it is easy to find a seed which passes the tests or not. Thus difficulties in finding a suitable seed are an indicator for problems in the code.
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: MATH-692_integerDomain_patch1.patch, Math-692_realDomain_patch1.patch
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Sébastien Brisard (Commented JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177916#comment-13177916 ] 

Sébastien Brisard commented on MATH-692:
----------------------------------------

The committed patch actually causes failure of {{Well1024Test}} in {{o.a.c.m.random}}.
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: MATH-692_integerDomain_patch1.patch, Math-692_realDomain_patch1.patch
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (MATH-692) Cumulative probability and inverse cumulative probability inconsistencies

Posted by "Sébastien Brisard (Commented JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13146823#comment-13146823 ] 

Sébastien Brisard commented on MATH-692:
----------------------------------------

All,
since I'm already working on this package, I'm happy to commit the patch on behalf of Christian. However, since I'm a relatively new committer, I would feel more confident if one of the "old, wise committers" could double check the svn log afterwards.

Best,
Sébastien
                
> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: Math-692_realDomain_patch1.patch
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function does neither exist for Distribution nor for ContinuosDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
>   returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An aternative definition could be P(x0 <= X <= x1). But this requires to put the function probability(double x) or another cumulative probability function into the interface Distribution in order be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefere the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the intergers as sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira