Posted to dev@hbase.apache.org by "Andrew Purtell (JIRA)" <ji...@apache.org> on 2009/11/05 10:16:32 UTC

[jira] Created: (HBASE-1956) Export HDFS read and write latency as a metric

Export HDFS read and write latency as a metric
----------------------------------------------

                 Key: HBASE-1956
                 URL: https://issues.apache.org/jira/browse/HBASE-1956
             Project: Hadoop HBase
          Issue Type: Improvement
            Reporter: Andrew Purtell
            Priority: Minor
             Fix For: 0.20.2, 0.21.0


HDFS write latency spikes in particular are an indicator of general cluster overloading. We see this when the WAL writer complains about writes taking > 1 second, sometimes > 4 seconds, etc. If, for example, the average write latency over the monitoring period is exported as a metric, it can feed into alerting or into automatic provisioning of additional cluster hardware. While we're at it, export read-side metrics as well.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1956) Export HDFS read and write latency as a metric

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774184#action_12774184 ] 

Andrew Purtell commented on HBASE-1956:
---------------------------------------

Yes, metrics from the HBase DFS client. No dependencies on HDFS timetable or release schedule, please.


[jira] Commented: (HBASE-1956) Export HDFS read and write latency as a metric

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12828852#action_12828852 ] 

ryan rawson commented on HBASE-1956:
------------------------------------

I'm using the file context and the count never seems to get reset, so the average is over all time rather than over a heartbeat interval (10s in the config).

Is this expected?

According to docs on the 'net, volatile does not make operations atomic, so maybe I'm always hitting a race condition and the counter is never reset.
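
A minimal sketch of the concern above, assuming (hypothetically) that the metric keeps a running sum and count that the metrics context reads once per reporting interval. A volatile field guarantees visibility but not an atomic read-and-reset, so the read and the reset have to happen under one lock or in one atomic operation. The class and method names are illustrative and are not taken from the attached patch.

```java
// Hypothetical interval-average metric; not the code from HBASE-1956.patch.
class IntervalAverage {
    // Sum of observed latencies (ms) and number of samples in the current interval.
    private long sumMs = 0;
    private long count = 0;

    // Called by writer threads after each timed operation.
    synchronized void record(long latencyMs) {
        sumMs += latencyMs;
        count++;
    }

    // Called once per reporting interval. Reads and resets both fields under
    // the same lock, so no sample is lost and none leaks into the next interval.
    synchronized double snapshotAndReset() {
        double avg = (count == 0) ? 0.0 : (double) sumMs / count;
        sumMs = 0;
        count = 0;
        return avg;
    }

    public static void main(String[] args) {
        IntervalAverage m = new IntervalAverage();
        m.record(10);
        m.record(20);
        System.out.println(m.snapshotAndReset()); // prints 15.0 (first interval)
        m.record(100);
        System.out.println(m.snapshotAndReset()); // prints 100.0 (second interval only)
    }
}
```

With this shape, the second reading reflects only the samples recorded since the previous one, which is the per-interval behavior asked about here.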


[jira] Commented: (HBASE-1956) Export HDFS read and write latency as a metric

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774066#action_12774066 ] 

stack commented on HBASE-1956:
------------------------------

@Eli: My guess is that this would be an hbase metric. We measure the time an append to our WAL file takes. If it takes more than a second, we log it (we started doing this recently) as the canary for pending hdfs issues. I'm guessing Andrew is suggesting we add this to our list of hbase metrics.
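
As a rough illustration of the measurement described above, here is a sketch that times an operation on the client side and flags it when it exceeds one second. The class, the method names, and the empty stand-in for the append are assumptions made for the example; they are not HBase APIs.

```java
// Hypothetical helper; names are illustrative and not HBase APIs.
class TimedAppend {
    // The "> 1 second" canary threshold mentioned in the thread.
    static final long SLOW_THRESHOLD_MS = 1000;

    // Runs the given operation and returns the elapsed wall-clock time in ms.
    static long timeMs(Runnable append) {
        long start = System.currentTimeMillis();
        append.run();
        return System.currentTimeMillis() - start;
    }

    // True when the elapsed time is strictly above the threshold.
    static boolean isSlow(long elapsedMs) {
        return elapsedMs > SLOW_THRESHOLD_MS;
    }

    public static void main(String[] args) {
        long elapsed = timeMs(() -> { /* stand-in for the WAL append */ });
        if (isSlow(elapsed)) {
            System.err.println("slow append: " + elapsed + " ms");
        }
        System.out.println("append took " + elapsed + " ms");
    }
}
```

The same elapsed value that drives the log line could also be fed to a metric, which is the addition being proposed in this issue.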


[jira] Commented: (HBASE-1956) Export HDFS read and write latency as a metric

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795538#action_12795538 ] 

stack commented on HBASE-1956:
------------------------------

Ok.  +1.  Thanks Andy.


[jira] Updated: (HBASE-1956) Export HDFS read and write latency as a metric

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-1956:
----------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

Committed to trunk and 0.20 branch.


[jira] Commented: (HBASE-1956) Export HDFS read and write latency as a metric

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774010#action_12774010 ] 

Eli Collins commented on HBASE-1956:
------------------------------------

+1 

By the way, I've found monitoring await in Ganglia useful, and have seen average waits of over 1000ms on a busy large cluster. Having both the host await and a similar dfs metric would be useful for debugging.


[jira] Commented: (HBASE-1956) Export HDFS read and write latency as a metric

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774059#action_12774059 ] 

Eli Collins commented on HBASE-1956:
------------------------------------

Forgot to ask: are you proposing that HDFS expose write latency as a dfs metric? That's what I was assuming. If so, we should re-categorize this under HDFS or file a separate jira.


[jira] Updated: (HBASE-1956) Export HDFS read and write latency as a metric

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1956:
-------------------------

    Fix Version/s:     (was: 0.20.2)
                       (was: 0.21.0)
                   0.20.3


[jira] Updated: (HBASE-1956) Export HDFS read and write latency as a metric

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1956:
-------------------------


Moving out of 0.20.2.  It's a new feature, so it shouldn't really get in the way of the release.  I'll put it in the new 0.20.3 bucket.


[jira] Updated: (HBASE-1956) Export HDFS read and write latency as a metric

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-1956:
----------------------------------

    Attachment: HBASE-1956.patch

I tried a couple of options. The attached patch is the least invasive. Will commit unless voted down.


[jira] Commented: (HBASE-1956) Export HDFS read and write latency as a metric

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795536#action_12795536 ] 

Andrew Purtell commented on HBASE-1956:
---------------------------------------

bq. Do you think we need to do metrics on hfile as well as hlog? 

It depends on what kind of coverage we want these metrics to have.

bq. Would hlog writes be sufficient canary-in-the-mine indicator that hdfs is slow?

Yes, for writes.

bq. I'd be just a little worried about all those System.currentTimeMillis calls.

I don't disagree. 

On HFile read path, only the block read is timed. 

As to the overhead of the call itself, in the OpenJDK source this is a call to gettimeofday through the JNI. On modern Linux systems, this is a read out of a page mapped into high address space as a direct memory access, not a syscall. 

On the write side, could we time only the writes of an end-of-block marker?
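
The per-call cost of System.currentTimeMillis discussed above can be estimated with a rough loop like the one below. This is a sketch rather than a rigorous benchmark (one warm-up pass, no JIT isolation), the class name is hypothetical, and the result varies with the JVM, the OS, and the clock source.

```java
// Rough estimate of System.currentTimeMillis() cost; hypothetical class, not part of HBase.
class ClockCost {
    // Calls currentTimeMillis() `calls` times and returns the average nanoseconds per call.
    static double nanosPerCall(int calls) {
        long sink = 0; // accumulate the results so the JIT cannot drop the calls
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            sink += System.currentTimeMillis();
        }
        long elapsed = System.nanoTime() - start;
        if (sink == 42) {
            System.out.print(""); // consume sink; this branch is practically never taken
        }
        return (double) elapsed / calls;
    }

    public static void main(String[] args) {
        nanosPerCall(1_000_000); // warm-up pass so the loop is likely compiled
        System.out.printf("~%.1f ns per call%n", nanosPerCall(10_000_000));
    }
}
```

Comparing the reported figure with the typical duration of a block read or a WAL sync gives a sense of whether the timing calls matter on a given system.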


[jira] Reopened: (HBASE-1956) Export HDFS read and write latency as a metric

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell reopened HBASE-1956:
-----------------------------------


The metric currently averages over all time, which is not correct.


[jira] Commented: (HBASE-1956) Export HDFS read and write latency as a metric

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774539#action_12774539 ] 

Eli Collins commented on HBASE-1956:
------------------------------------

I double checked and there's already a dfs metric (writeBlockOp_avg_time) that you could use to monitor write latency.


[jira] Assigned: (HBASE-1956) Export HDFS read and write latency as a metric

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell reassigned HBASE-1956:
-------------------------------------

    Assignee: Andrew Purtell


[jira] Commented: (HBASE-1956) Export HDFS read and write latency as a metric

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795411#action_12795411 ] 

Jean-Daniel Cryans commented on HBASE-1956:
-------------------------------------------

Are you working on this for 0.20.3 Andrew?


[jira] Commented: (HBASE-1956) Export HDFS read and write latency as a metric

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795531#action_12795531 ] 

stack commented on HBASE-1956:
------------------------------

Looks fine Andrew.  I'd be just a little worried about all those System.currentTimeMillis calls.  They add up and are inline with the write.

Do you think we need to do metrics on hfile as well as hlog?  Would hlog writes be sufficient canary-in-the-mine indicator that hdfs is slow?

Good stuff.


[jira] Commented: (HBASE-1956) Export HDFS read and write latency as a metric

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774541#action_12774541 ] 

Andrew Purtell commented on HBASE-1956:
---------------------------------------

@Eli: Yes, but that is a datanode server-side metric. It is not exactly the latency experienced by the client.


[jira] Updated: (HBASE-1956) Export HDFS read and write latency as a metric

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-1956:
----------------------------------

    Attachment: HBASE-1956.patch

Updated patch is about as minimal as it gets. It times HLog syncs and writes. It times HFile block reads and only end-of-block writes.


[jira] Commented: (HBASE-1956) Export HDFS read and write latency as a metric

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795423#action_12795423 ] 

Andrew Purtell commented on HBASE-1956:
---------------------------------------

Yes. I haven't gotten to it yet. Will work on it today at some point...


[jira] Updated: (HBASE-1956) Export HDFS read and write latency as a metric

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-1956:
----------------------------------

    Fix Version/s: 0.21.0
           Status: Patch Available  (was: Open)


[jira] Commented: (HBASE-1956) Export HDFS read and write latency as a metric

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774030#action_12774030 ] 

Jean-Daniel Cryans commented on HBASE-1956:
-------------------------------------------

+1
