You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by "Adam Stelmaszczyk (JIRA)" <ji...@apache.org> on 2014/11/13 23:54:33 UTC

[jira] [Updated] (MATH-1169) Faster Percentile

     [ https://issues.apache.org/jira/browse/MATH-1169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Stelmaszczyk updated MATH-1169:
------------------------------------
    Attachment: PercentileFloydRivest.java
                Main.java
                FloydRivest.java

Main class should be run, it needs Caliper JAR: https://code.google.com/p/caliper/downloads/detail?name=caliper-1.0-beta-1-all.jar

FloydRivest contatins implementation basing on http://en.wikipedia.org/wiki/Floyd%E2%80%93Rivest_algorithm pseudocode.

PercentileFloydRivest.java is Percentile.java with modified evaluate(), so it calls FloydRivest.select() instead of typical select(). pivotsHeap was removed for simplicity (maybe it can improve efficiency even more).

> Faster Percentile
> -----------------
>
>                 Key: MATH-1169
>                 URL: https://issues.apache.org/jira/browse/MATH-1169
>             Project: Commons Math
>          Issue Type: Improvement
>            Reporter: Adam Stelmaszczyk
>            Priority: Minor
>         Attachments: FloydRivest.java, Main.java, PercentileFloydRivest.java
>
>
> Percentile class is now using Hoare selection algorithm (aka Quickselect) with median of 3 for choosing a pivot and caching pivots. Quickselect expected runtime is about 3.4n + o(n).
> The constant can be improved to 1.5n by a bit complicated pivot strategy, yielding the Floyd–Rivest algorithm, which has average complexity of 1.5n + O(n^0.5) for median, with other k being faster.
> I implemented Floyd–Rivest algorithm without caching pivots following wikipedia and benchmarked it with Caliper (https://code.google.com/p/caliper/).
> Array size - runtime in seconds for Floyd-Rivest - current Percentile - % faster
> 100000 - 6.907 - 7.083 - 2.5%
> 1000000 - 70.3 - 75.4 - 6.8%
> 10000000 - 708 - 868 - 18.4% (https://microbenchmarks.appspot.com/runs/c0f65e35-44c0-458e-b6e0-634b4e6fa68b#r:scenario.benchmarkSpec.methodName)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)