Posted to issues@hbase.apache.org by "Andrew Wang (JIRA)" <ji...@apache.org> on 2012/06/27 03:07:44 UTC

[jira] [Updated] (HBASE-6261) Better approximate high-percentile percentile latency metrics

     [ https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wang updated HBASE-6261:
-------------------------------

    Attachment: Latencyestimation.pdf

I've written up a comparison of what I think are all the available options. It really just comes down to a couple of questions:

- Do we care about bounded error?
- Do we want sliding windows (more memory), or are we okay with just snapshotting and starting anew every interval?
- Do we care about strictly bounded memory usage, or is O(few MBs) good enough?

I'm hoping that we want bounded error, are okay with snapshotting, and are okay with O(few MBs). I've implemented the algorithm for this case and am testing it to make sure it meets the performance requirements.
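
As a rough illustration of the snapshotting option in the second question (not the implementation mentioned above), here is a minimal Python sketch. The class name SnapshottingQuantiles and its methods are hypothetical. For simplicity it keeps every sample and sorts at snapshot time, so it is exact but not memory-bounded; a real version would replace the list with a bounded-error summary.

```python
class SnapshottingQuantiles:
    """Hypothetical helper: collect one interval of samples, snapshot, start anew."""

    def __init__(self, percentiles=(90, 95, 99)):
        # Percentiles are whole numbers so the rank math stays in integers.
        self.percentiles = percentiles
        # Placeholder storage; unbounded, unlike a real streaming summary.
        self.samples = []

    def insert(self, latency_ms):
        self.samples.append(latency_ms)

    def snapshot(self):
        """Return {percentile: value} for the finished interval and reset."""
        ordered = sorted(self.samples)
        self.samples = []  # start anew for the next interval
        if not ordered:
            return {}
        n = len(ordered)
        result = {}
        for p in self.percentiles:
            # Nearest-rank: ceil(p * n / 100), computed with integer arithmetic.
            rank = max(1, (p * n + 99) // 100)
            result[p] = ordered[rank - 1]
        return result
```

A caller would invoke insert() on every operation and snapshot() once per reporting interval. A sliding window would instead have to retain enough state to expire old samples, which is where the extra memory comes from.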
                
> Better approximate high-percentile percentile latency metrics
> -------------------------------------------------------------
>
>                 Key: HBASE-6261
>                 URL: https://issues.apache.org/jira/browse/HBASE-6261
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Andrew Wang
>              Labels: metrics
>         Attachments: Latencyestimation.pdf
>
>
> The existing reservoir-sampling based latency metrics in HBase are not well-suited for providing accurate estimates of high-percentile (e.g. 90th, 95th, or 99th) latency. This is a well-studied problem in the literature (see [1] and [2]); the question is which methods best suit our needs, and then implementing them.
> Ideally, we should be able to estimate these high percentiles with minimal memory and CPU usage as well as minimal error (e.g. 1% error on the 90th, or 0.1% on the 99th). It's also desirable to provide this over different time-based sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour.
> I'll note that this would also be useful in HDFS, or really anywhere latency metrics are kept.
> [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf
> [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf
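
To make the reservoir-sampling concern in the description concrete: a uniform reservoir of k samples holds only about k * (1 - q) samples above the q-th quantile, so the 99th percentile of a 1000-sample reservoir rests on roughly 10 values. The Python sketch below uses the standard Algorithm R; the function names, reservoir size, and synthetic stream are assumptions for illustration and are not taken from HBase.

```python
import random


def reservoir_sample(stream, k, rng):
    """Algorithm R: keep a uniform random sample of k items from the stream."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            # randint is inclusive on both ends, so j is uniform over 0..i.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


def samples_above(reservoir, threshold):
    """Count how many retained samples lie above the given value."""
    return sum(1 for x in reservoir if x > threshold)


rng = random.Random(42)      # fixed seed so the run is repeatable
stream = range(1, 100001)    # synthetic "latencies" 1..100000
reservoir = reservoir_sample(stream, 1000, rng)

# The true 99th percentile of this stream is 99000; only the few
# retained samples above it inform the reservoir's tail estimate.
tail = samples_above(reservoir, 99000)
print("samples above the true p99:", tail)
```

With so few samples in the tail, the estimated rank of a high percentile can be off by a large fraction of the tail itself, which is the gap the error targets in the description (1% at the 90th, 0.1% at the 99th, each being one tenth of the remaining tail) are meant to close.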
