You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by "Alex Herbert (Jira)" <ji...@apache.org> on 2022/11/11 12:46:00 UTC

[jira] [Commented] (STATISTICS-25) T Distribution Inverse Cumulative Probability Function gives the Wrong Answer

    [ https://issues.apache.org/jira/browse/STATISTICS-25?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17632319#comment-17632319 ] 

Alex Herbert commented on STATISTICS-25:
----------------------------------------

Currently the T-distribution CDF switches to using a normal distribution at degrees of freedom (v) = 4.5e15 (2^52). The entire distribution switches to a Normal distribution when the variance is 1; this occurs at v=3.60e16. So there is a small window where the PDF is using the computation for the T-distribution and the CDF is using the computation for the normal distribution for v in [4.5e15, 3.6e16].

The PDF and CDF computations have been improved in the underlying gamma package in Commons Numbers. Here is a re-listing of the absolute and relative difference between the two distributions with no thresholds imposed in the T-distribution:
{noformat}
a = normal(0, 1).cdf/pdf
b = t(v).cdf/pdf
error = a - b
relative error = |a - b| / max(|a|, |b|)
{noformat}
||v||z||cdf error||pdf relative error||pdf error||pdf relative error||
|1.0e+03|-15.0|-2.5051e-46|0.99999|-3.0809e-45|0.99998|
|1.0e+03|-3.0|-3.3456e-05|0.024185|-6.8776e-05|0.015281|
|1.0e+03|-2.0|-0.00013504|0.0059008|-9.4207e-05|0.0017418|
|1.0e+03|-1.0|-0.00012096|0.00076180|0.00012093|0.00049979|
|1.0e+03|-0.5|-5.5002e-05|0.00017823|0.00012650|0.00035930|
|1.0e+03|-0.125|-1.2562e-05|2.7898e-05|0.00010201|0.00025772|
|1.0e+03|-0.025|-2.4939e-06|5.0892e-06|9.9816e-05|0.00025028|
|1.0e+04|-15.0|-9.2456e-51|0.71579|-1.3501e-49|0.70940|
|1.0e+04|-3.0|-3.3261e-06|0.0024579|-6.8702e-06|0.0015478|
|1.0e+04|-2.0|-1.3498e-05|0.00059298|-9.4456e-06|0.00017492|
|1.0e+04|-1.0|-1.2098e-05|7.6249e-05|1.2098e-05|4.9998e-05|
|1.0e+04|-0.5|-5.5009e-06|1.7829e-05|1.2652e-05|3.5937e-05|
|1.0e+04|-0.125|-1.2563e-06|2.7902e-06|1.0203e-05|2.5775e-05|
|1.0e+04|-0.025|-2.4941e-07|5.0898e-07|9.9828e-06|2.5031e-05|
|1.0e+05|-15.0|-4.9917e-52|0.11970|-7.3795e-51|0.11772|
|1.0e+05|-3.0|-3.3241e-07|0.00024619|-6.8694e-07|0.00015498|
|1.0e+05|-2.0|-1.3498e-06|5.9327e-05|-9.4481e-07|1.7499e-05|
|1.0e+05|-1.0|-1.2099e-06|7.6256e-06|1.2098e-06|5.0000e-06|
|1.0e+05|-0.5|-5.5010e-07|1.7829e-06|1.2652e-06|3.5937e-06|
|1.0e+05|-0.125|-1.2563e-07|2.7902e-07|1.0203e-06|2.5775e-06|
|1.0e+05|-0.025|-2.4942e-08|5.0898e-08|9.9829e-07|2.5031e-06|
|1.0e+06|-15.0|-4.7166e-53|0.012685|-6.9801e-52|0.012463|
|1.0e+06|-3.0|-3.3239e-08|2.4623e-05|-6.8694e-08|1.5500e-05|
|1.0e+06|-2.0|-1.3498e-07|5.9330e-06|-9.4484e-08|1.7500e-06|
|1.0e+06|-1.0|-1.2099e-07|7.6257e-07|1.2099e-07|5.0000e-07|
|1.0e+06|-0.5|-5.5010e-08|1.7829e-07|1.2652e-07|3.5937e-07|
|1.0e+06|-0.125|-1.2563e-08|2.7902e-08|1.0203e-07|2.5775e-07|
|1.0e+06|-0.025|-2.4942e-09|5.0899e-09|9.9829e-08|2.5031e-07|
|1.0e+07|-15.0|-4.6902e-54|0.0012760|-6.9417e-53|0.0012535|
|1.0e+07|-3.0|-3.3239e-09|2.4623e-06|-6.8694e-09|1.5500e-06|
|1.0e+07|-2.0|-1.3498e-08|5.9330e-07|-9.4484e-09|1.7500e-07|
|1.0e+07|-1.0|-1.2099e-08|7.6257e-08|1.2099e-08|5.0000e-08|
|1.0e+07|-0.5|-5.5010e-09|1.7829e-08|1.2652e-08|3.5937e-08|
|1.0e+07|-0.125|-1.2563e-09|2.7902e-09|1.0203e-08|2.5775e-08|
|1.0e+07|-0.025|-2.4942e-10|5.0899e-10|9.9829e-09|2.5031e-08|
|1.0e+08|-15.0|-4.6876e-55|0.00012768|-6.9379e-54|0.00012543|
|1.0e+08|-3.0|-3.3239e-10|2.4623e-07|-6.8694e-10|1.5500e-07|
|1.0e+08|-2.0|-1.3498e-09|5.9330e-08|-9.4484e-10|1.7500e-08|
|1.0e+08|-1.0|-1.2099e-09|7.6257e-09|1.2099e-09|5.0000e-09|
|1.0e+08|-0.5|-5.5010e-10|1.7829e-09|1.2652e-09|3.5938e-09|
|1.0e+08|-0.125|-1.2563e-10|2.7902e-10|1.0203e-09|2.5775e-09|
|1.0e+08|-0.025|-2.4942e-11|5.0899e-11|9.9829e-10|2.5031e-09|
|1.0e+09|-15.0|-4.6873e-56|1.2768e-05|-6.9375e-55|1.2543e-05|
|1.0e+09|-3.0|-3.3239e-11|2.4623e-08|-6.8694e-11|1.5500e-08|
|1.0e+09|-2.0|-1.3498e-10|5.9330e-09|-9.4484e-11|1.7500e-09|
|1.0e+09|-1.0|-1.2099e-10|7.6257e-10|1.2099e-10|5.0000e-10|
|1.0e+09|-0.5|-5.5010e-11|1.7829e-10|1.2652e-10|3.5937e-10|
|1.0e+09|-0.125|-1.2563e-11|2.7902e-11|1.0203e-10|2.5775e-10|
|1.0e+09|-0.025|-2.4942e-12|5.0899e-12|9.9829e-11|2.5031e-10|
|1.0e+10|-15.0|-4.6873e-57|1.2768e-06|-6.9374e-56|1.2543e-06|
|1.0e+10|-3.0|-3.3239e-12|2.4623e-09|-6.8694e-12|1.5500e-09|
|1.0e+10|-2.0|-1.3498e-11|5.9330e-10|-9.4484e-12|1.7500e-10|
|1.0e+10|-1.0|-1.2099e-11|7.6257e-11|1.2099e-11|5.0000e-11|
|1.0e+10|-0.5|-5.5012e-12|1.7830e-11|1.2652e-11|3.5937e-11|
|1.0e+10|-0.125|-1.2564e-12|2.7903e-12|1.0203e-11|2.5775e-11|
|1.0e+10|-0.025|-2.4941e-13|5.0897e-13|9.9828e-12|2.5031e-11|
|1.0e+11|-15.0|-4.6873e-58|1.2769e-07|-6.9374e-57|1.2544e-07|
|1.0e+11|-3.0|-3.3239e-13|2.4623e-10|-6.8694e-13|1.5500e-10|
|1.0e+11|-2.0|-1.3498e-12|5.9330e-11|-9.4487e-13|1.7501e-11|
|1.0e+11|-1.0|-1.2098e-12|7.6256e-12|1.2097e-12|4.9995e-12|
|1.0e+11|-0.5|-5.5017e-13|1.7832e-12|1.2651e-12|3.5934e-12|
|1.0e+11|-0.125|-1.2573e-13|2.7924e-13|1.0201e-12|2.5771e-12|
|1.0e+11|-0.025|-2.4925e-14|5.0863e-14|9.9809e-13|2.5026e-12|
|1.0e+12|-15.0|-4.6873e-59|1.2768e-08|-6.9374e-58|1.2544e-08|
|1.0e+12|-3.0|-3.3239e-14|2.4624e-11|-6.8694e-14|1.5500e-11|
|1.0e+12|-2.0|-1.3495e-13|5.9319e-12|-9.4487e-14|1.7501e-12|
|1.0e+12|-1.0|-1.2096e-13|7.6240e-13|1.2099e-13|5.0000e-13|
|1.0e+12|-0.5|-5.5123e-14|1.7866e-13|1.2657e-13|3.5949e-13|
|1.0e+12|-0.125|-1.2712e-14|2.8233e-14|1.0203e-13|2.5776e-13|
|1.0e+12|-0.025|-2.4980e-15|5.0977e-15|9.9809e-14|2.5026e-13|
|1.0e+13|-15.0|-4.6873e-60|1.2768e-09|-6.9374e-59|1.2543e-09|
|1.0e+13|-3.0|-3.3235e-15|2.4620e-12|-6.8686e-15|1.5498e-12|
|1.0e+13|-2.0|-1.3475e-14|5.9232e-13|-9.4508e-15|1.7504e-13|
|1.0e+13|-1.0|-1.2074e-14|7.6100e-14|1.2074e-14|4.9897e-14|
|1.0e+13|-0.5|-5.5511e-15|1.7992e-14|1.2657e-14|3.5949e-14|
|1.0e+13|-0.125|-1.3323e-15|2.9589e-15|1.0214e-14|2.5804e-14|
|1.0e+13|-0.025|-2.7756e-16|5.6641e-16|9.9365e-15|2.4915e-14|
|1.0e+14|-15.0|-4.6864e-61|1.2766e-10|-6.9375e-60|1.2544e-10|
|1.0e+14|-3.0|-3.3133e-16|2.4545e-13|-6.9042e-16|1.5579e-13|
|1.0e+14|-2.0|-1.3357e-15|5.8713e-14|-9.4369e-16|1.7479e-14|
|1.0e+14|-1.0|-1.1935e-15|7.5225e-15|1.1935e-15|4.9324e-15|
|1.0e+14|-0.5|-6.6613e-16|2.1590e-15|1.2768e-15|3.6265e-15|
|1.0e+14|-0.125|-2.2204e-16|4.9315e-16|1.0547e-15|2.6645e-15|
|1.0e+14|-0.025|-5.5511e-17|1.1328e-16|9.4369e-16|2.3662e-15|
|1.0e+15|-15.0|-4.6685e-62|1.2717e-11|-6.9400e-61|1.2548e-11|
|1.0e+15|-3.0|-3.0791e-17|2.2810e-14|-7.1124e-17|1.6048e-14|
|1.0e+15|-2.0|-1.3184e-16|5.7951e-15|-9.0206e-17|1.6708e-15|
|1.0e+15|-1.0|-8.3267e-17|5.2483e-16|5.5511e-17|2.2941e-16|
|1.0e+15|-0.5|-2.2204e-16|7.1967e-16|5.5511e-17|1.5767e-16|
|1.0e+15|-0.125|-1.1102e-16|2.4657e-16|0.0000|0.0000|
|1.0e+15|-0.025|0.0000|0.0000|0.0000|0.0000|
|1.0e+16|-15.0|-4.4481e-63|1.2117e-12|-6.9954e-62|1.2648e-12|
|1.0e+16|-3.0|-3.9031e-18|2.8914e-15|-8.6736e-18|1.9571e-15|
|1.0e+16|-2.0|-1.3878e-17|6.1001e-16|-2.7756e-17|5.1408e-16|
|1.0e+16|-1.0|-2.7756e-17|1.7494e-16|0.0000|0.0000|
|1.0e+16|-0.5|-1.1102e-16|3.5983e-16|0.0000|0.0000|
|1.0e+16|-0.125|-1.1102e-16|2.4657e-16|0.0000|0.0000|
|1.0e+16|-0.025|0.0000|0.0000|0.0000|0.0000|
|1.0e+17|-15.0|-4.1721e-64|1.1365e-13|-7.0742e-63|1.2791e-13|
|1.0e+17|-3.0|-2.1684e-19|1.6063e-16|0.0000|0.0000|
|1.0e+17|-2.0|2.0817e-17|9.1501e-16|0.0000|0.0000|
|1.0e+17|-1.0|2.7756e-17|1.7494e-16|0.0000|0.0000|
|1.0e+17|-0.5|-1.1102e-16|3.5983e-16|0.0000|0.0000|
|1.0e+17|-0.125|-1.1102e-16|2.4657e-16|0.0000|0.0000|
|1.0e+17|-0.025|0.0000|0.0000|0.0000|0.0000|
|1.0e+18|-15.0|1.3887e-64|3.7830e-14|-7.9763e-64|1.4422e-14|
|1.0e+18|-3.0|0.0000|0.0000|-8.6736e-19|1.9571e-16|
|1.0e+18|-2.0|1.7347e-17|7.6251e-16|-1.3878e-17|2.5704e-16|
|1.0e+18|-1.0|5.5511e-17|3.4989e-16|-2.7756e-17|1.1471e-16|
|1.0e+18|-0.5|-1.1102e-16|3.5983e-16|-5.5511e-17|1.5767e-16|
|1.0e+18|-0.125|-1.1102e-16|2.4657e-16|-5.5511e-17|1.4024e-16|
|1.0e+18|-0.025|0.0000|0.0000|-1.1102e-16|2.7838e-16|
|1.0e+19|-15.0|1.3947e-64|3.7992e-14|-9.4956e-66|1.7169e-16|
|1.0e+19|-3.0|0.0000|0.0000|-8.6736e-19|1.9571e-16|
|1.0e+19|-2.0|2.0817e-17|9.1501e-16|-1.3878e-17|2.5704e-16|
|1.0e+19|-1.0|5.5511e-17|3.4989e-16|-2.7756e-17|1.1471e-16|
|1.0e+19|-0.5|-1.1102e-16|3.5983e-16|-5.5511e-17|1.5767e-16|
|1.0e+19|-0.125|-1.1102e-16|2.4657e-16|-5.5511e-17|1.4024e-16|
|1.0e+19|-0.025|0.0000|0.0000|-1.1102e-16|2.7838e-16|
|1.0e+20|-15.0|1.3947e-64|3.7992e-14|0.0000|0.0000|
|1.0e+20|-3.0|0.0000|0.0000|0.0000|0.0000|
|1.0e+20|-2.0|2.7756e-17|1.2200e-15|0.0000|0.0000|
|1.0e+20|-1.0|8.3267e-17|5.2483e-16|0.0000|0.0000|
|1.0e+20|-0.5|-1.1102e-16|3.5983e-16|0.0000|0.0000|
|1.0e+20|-0.125|-1.1102e-16|2.4657e-16|0.0000|0.0000|
|1.0e+20|-0.025|0.0000|0.0000|0.0000|0.0000|

The CDF computation is now stable as the degrees of freedom increases. The computation does compute a different value in the tails (x=-15) than the normal distribution at degrees of freedom above 3.6e16. In the centre (|x|<3) the difference is close to machine epsilon.

I would suggest removing the small window where the CDF switches to the normal distribution before the entire distribution switches to the normal distribution.

> T Distribution Inverse Cumulative Probability Function gives the Wrong Answer
> -----------------------------------------------------------------------------
>
>                 Key: STATISTICS-25
>                 URL: https://issues.apache.org/jira/browse/STATISTICS-25
>             Project: Commons Statistics
>          Issue Type: Bug
>          Components: distribution
>            Reporter: Andreas Stefik
>            Priority: Major
>
> Hi There,
> Given code like this:
>  
> import org.apache.commons.math3.analysis.UnivariateFunction;
> import org.apache.commons.math3.analysis.solvers.BrentSolver;
> import org.apache.commons.math3.distribution.TDistribution;
> public class Main {
>  public static void main(String[] args) {
>  double df = 1E38;
>  double t = 0.975;
>  TDistribution dist = new TDistribution(df);
>  
>  double prob = dist.inverseCumulativeProbability(1.0 - t);
>  
>  System.out.println("Prob: " + prob);
>  }
> }
>  
> It is possible I am misunderstanding, but that seems equivalent to:
>  
> scipy.stats.t.cdf(1.0 - 0.975, 1e38)
>  
> In Python. They give different answers. Python gives 0.509972518193, which seems correct, whereas Apache Commons gives  Prob: -6.462184036284304E-10. That's a huge difference.
> My hunch is that as you get closer to infinity it begins to fail, but I haven't checked carefully. For calls with much smaller degrees of freedom, you get answers that are basically the same as Python or online calculators.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)