Posted to issues@hbase.apache.org by "Gary Helmling (JIRA)" <ji...@apache.org> on 2010/11/02 01:38:24 UTC

[jira] Commented: (HBASE-1956) Export HDFS read and write latency as a metric

    [ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927237#action_12927237 ] 

Gary Helmling commented on HBASE-1956:
--------------------------------------

I believe the reason for incrementing plain counters in HFile and HLog, and then calling MetricsTimeVaryingRate.inc() only with the accumulated totals for the polling period in RegionServerMetrics.doUpdates(), was to avoid contention on the synchronized MetricsTimeVaryingRate.inc() call in the critical read and write paths.  It seems worth doing some profiling as part of HBASE-3129 to see what it would actually cost to call MetricsTimeVaryingRate.inc() per operation instead.

But batching the updates this way definitely makes the FS metrics less useful, because folding a whole polling period into one update squashes the real outliers.
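
To make the trade-off concrete, here is a minimal sketch in Java. It is not the HBase code: RateMetric is a simplified, hypothetical stand-in for a synchronized rate metric that tracks a maximum time per operation, and BatchedLatency and LatencyDemo are names invented for this illustration. It feeds the same latencies through a per-operation update and through a single batched update, then compares the maximum each one reports.

```java
import java.util.concurrent.atomic.AtomicLong;

public class LatencyDemo {

    /** Hypothetical stand-in for a synchronized rate metric (not the real HBase/Hadoop class). */
    static class RateMetric {
        private long ops = 0;
        private long totalMs = 0;
        private long maxMs = 0;

        /** Records numOps operations that took timeMs in total; max is tracked per call. */
        synchronized void inc(long numOps, long timeMs) {
            if (numOps <= 0) {
                return; // nothing happened in this period
            }
            ops += numOps;
            totalMs += timeMs;
            long perOp = timeMs / numOps; // average time per op for this call
            if (perOp > maxMs) {
                maxMs = perOp;
            }
        }

        synchronized long getMax() {
            return maxMs;
        }
    }

    /** Lock-free counters bumped on the hot path and flushed once per polling period. */
    static class BatchedLatency {
        private final AtomicLong ops = new AtomicLong();
        private final AtomicLong timeMs = new AtomicLong();

        void record(long latencyMs) {
            ops.incrementAndGet();
            timeMs.addAndGet(latencyMs);
        }

        /** Called by the poller. The two resets are not atomic as a pair; acceptable for a sketch. */
        void flushTo(RateMetric metric) {
            long n = ops.getAndSet(0);
            long t = timeMs.getAndSet(0);
            metric.inc(n, t);
        }
    }

    /** Returns {max seen with per-op updates, max seen with one batched update}. */
    static long[] compare(long[] latenciesMs) {
        RateMetric perOp = new RateMetric();
        RateMetric batched = new RateMetric();
        BatchedLatency counters = new BatchedLatency();
        for (long latency : latenciesMs) {
            perOp.inc(1, latency);    // synchronized call on every operation
            counters.record(latency); // cheap atomic adds on the hot path
        }
        counters.flushTo(batched);    // one synchronized call per polling period
        return new long[] {perOp.getMax(), batched.getMax()};
    }

    public static void main(String[] args) {
        // Nine fast writes and one 4-second spike within a single polling period.
        long[] result = compare(new long[] {5, 5, 5, 5, 5, 5, 5, 5, 5, 4000});
        System.out.println("per-op max = " + result[0] + " ms"); // 4000
        System.out.println("batched max = " + result[1] + " ms"); // 404 (4045 / 10)
    }
}
```

In this sketch the batched path reports a maximum of 404 ms for a period that contained a 4000 ms write, which is the kind of flattening described above. Whether the per-operation synchronized call is actually too expensive is exactly what profiling would need to show.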

In any case, the bug that omitted fsSyncLatency from the min/max value reset was fixed with HBASE-3102, and HBASE-3129 addresses improving the min/max values reported, so I think we can close this issue again.

> Export HDFS read and write latency as a metric
> ----------------------------------------------
>
>                 Key: HBASE-1956
>                 URL: https://issues.apache.org/jira/browse/HBASE-1956
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>            Priority: Minor
>             Fix For: 0.90.0
>
>         Attachments: HBASE-1956.patch, HBASE-1956.patch
>
>
> HDFS write latency spikes especially are an indicator of general cluster overloading. We see this where the WAL writer complains about writes taking > 1 second, sometimes > 4, etc.  If for example the average write latency over the monitoring period is exported as a metric, then this can feed into alerting for or automatic provisioning of additional cluster hardware. While we're at it, export read side metrics as well.
