Posted to issues@commons.apache.org by "Christopher Nix (JIRA)" <ji...@apache.org> on 2011/06/09 01:07:58 UTC

[jira] [Commented] (MATH-582) Percentile does not work as described in API

    [ https://issues.apache.org/jira/browse/MATH-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046272#comment-13046272 ] 

Christopher Nix commented on MATH-582:
--------------------------------------

I believe the implementation of percentiles within the library is in accordance with the NIST definition of percentiles.  To address your examples separately:

1.  The API description of the implementation omits one step: "If pos < 1 then return the smallest element in the array".  In your first example pos = 25*(2+1)/100 = 0.75 < 1, so the returned value of 0.0 is indeed correct for this implementation.

2.  In this definition of percentiles, pos is the position in the sorted array at which to interpolate, with array indices starting at 1 rather than 0.  So with pos = 1.25, floor(pos) and ceil(pos) refer to the 1st and 2nd array values (0d and 1d), and the value returned is correctly a quarter of the way between them: 0 + 0.25*(1 - 0) = 0.25.
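
To make the two points above concrete, here is a minimal sketch of the steps as they are described in this thread (pos = p*(n+1)/100, one-based positions, smallest element when pos < 1).  The class and method names are hypothetical and this is not the Commons Math source; returning the largest element when pos >= n is my assumption for the upper end, which the thread does not spell out.

```java
import java.util.Arrays;

public class PercentileSketch {

    // Hypothetical helper illustrating the definition discussed above;
    // it is not the library's implementation.
    public static double nistPercentile(double[] values, double p) {
        if (values.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need non-empty data and 0 < p <= 100");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        if (n == 1) {
            return sorted[0];
        }
        // Position in the sorted array, counted from 1 (not from 0).
        double pos = p * (n + 1) / 100.0;
        if (pos < 1) {
            return sorted[0];          // the step missing from the API description
        }
        if (pos >= n) {
            return sorted[n - 1];      // assumption: clamp to the largest element
        }
        int k = (int) Math.floor(pos); // one-based position of the lower value
        double d = pos - k;            // fractional part used for interpolation
        double lower = sorted[k - 1];  // shift by one for Java's zero-based arrays
        double upper = sorted[k];
        return lower + d * (upper - lower);
    }

    public static void main(String[] args) {
        // pos = 0.75 < 1, so the smallest element is returned: 0.0
        System.out.println(nistPercentile(new double[]{0d, 1d}, 25));
        // pos = 1.25, between the 1st (0d) and 2nd (1d) values: 0.25
        System.out.println(nistPercentile(new double[]{0d, 1d, 1d, 1d}, 25));
    }
}
```

Both calls reproduce the values reported in the issue (0.0 and 0.25), which is what one would expect if the one-based reading of pos is the right one.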

Percentile definitions tend to give unintuitive results on small datasets.  Other definitions, for example one with pos = 1+p*(n-1)/100 (as in MS Excel), may suit the datasets above better, but not so well with medium-sized ones.  With large datasets, the two definitions converge.
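
For comparison, here is a rough sketch of the alternative position formula pos = 1+p*(n-1)/100 applied to the same two datasets.  It follows only the formula quoted in this comment; the class and method names are hypothetical, and I have not checked it against Excel's actual behaviour.

```java
import java.util.Arrays;

public class ExcelStylePercentileSketch {

    // Hypothetical helper: linear interpolation at the one-based
    // position pos = 1 + p*(n-1)/100 in the sorted array.
    public static double excelStylePercentile(double[] values, double p) {
        if (values.length == 0 || p < 0 || p > 100) {
            throw new IllegalArgumentException("need non-empty data and 0 <= p <= 100");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        double pos = 1 + p * (n - 1) / 100.0; // always within [1, n]
        int k = (int) Math.floor(pos);        // one-based position of the lower value
        if (k >= n) {
            return sorted[n - 1];             // p = 100 (or n = 1): no upper neighbour
        }
        double d = pos - k;
        return sorted[k - 1] + d * (sorted[k] - sorted[k - 1]);
    }

    public static void main(String[] args) {
        // pos = 1.25 for {0, 1}: prints 0.25
        System.out.println(excelStylePercentile(new double[]{0d, 1d}, 25));
        // pos = 1.75 for {0, 1, 1, 1}: prints 0.75
        System.out.println(excelStylePercentile(new double[]{0d, 1d, 1d, 1d}, 25));
    }
}
```

With this formula pos never drops below 1, so the first dataset gives 0.25, the value the reporter expected, rather than 0.0.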

Hope this helps,

Chris N

> Percentile does not work as described in API
> --------------------------------------------
>
>                 Key: MATH-582
>                 URL: https://issues.apache.org/jira/browse/MATH-582
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 2.2
>            Reporter: Andre Herbst
>
> example call:
> StatUtils.percentile(new double[]{0d, 1d}, 25)   returns 0.0
> The API says that there is a position being computed:  p*(n+1)/100 -> we have p=25 and n=2
> I would expect position 0.75 as result. Next step according to the API is: interpolation between both values at floor(0.25) and at ceil(0.25). Those values are 0d and 1d ... so lower + d * (upper - lower) should give 0d + 0.25*(1d - 0d) = 0.25
> But the above call returns 0 as result. This does not make sense to me.
> another example where I think the result is not correct:
> StatUtils.percentile(new double[]{0d, 1d, 1d, 1d}, 25)   returns 0.25
> we have pos = 25*5/100 = 1.25  ... so d = 0.25
> values at position floor(1.25) and ceil(1.25) are 1d and 1d. How comes that the result is not between 1d?
