Posted to issues@hbase.apache.org by "Alex Newman (JIRA)" <ji...@apache.org> on 2013/08/21 10:32:51 UTC
[jira] [Commented] (HBASE-9286) ageOfLastShippedOp replication
metric doesn't update if the slave regionserver is stalled
[ https://issues.apache.org/jira/browse/HBASE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13745871#comment-13745871 ]
Alex Newman commented on HBASE-9286:
------------------------------------
Or does it make more sense to focus on 0.95.0 for now?
> ageOfLastShippedOp replication metric doesn't update if the slave regionserver is stalled
> -----------------------------------------------------------------------------------------
>
> Key: HBASE-9286
> URL: https://issues.apache.org/jira/browse/HBASE-9286
> Project: HBase
> Issue Type: Bug
> Reporter: Alex Newman
> Assignee: Alex Newman
> Attachments: 0001-HBASE-9286.-ageOfLastShippedOp-replication-metric-do.patch
>
>
> In replicationmanager:
> HRegionInterface rrs = getRS();
> rrs.replicateLogEntries(Arrays.copyOf(this.entriesArray, currentNbEntries));
> ....
> this.metrics.setAgeOfLastShippedOp(
> this.entriesArray[currentNbEntries-1].getKey().getWriteTime());
> break;
> This looks reasonable, but it is wrong. The problem is that rrs.replicateLogEntries will block for a very long time if the slave server is suspended or unavailable but not down, so the metric is not updated while the call is blocked.
> However, this is easy to fix. We just need to call refreshAgeOfLastShippedOp() on a regular basis, in a different thread. I've attached a patch that fixes this for cdh4. I can make one for trunk and the like as well if you need me to, but it's a small change.
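
For illustration only, the fix described above (refreshing the metric periodically from a separate thread, so it keeps growing while the shipping call is blocked) might be sketched as follows. This is not the attached patch and not the HBase code: the classes AgeOfLastShippedOpGauge and MetricRefresher, the explicit "now" parameters, and the refresh period are all assumptions made up for this sketch. Only the method names setAgeOfLastShippedOp and refreshAgeOfLastShippedOp come from the description.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the replication metrics object (not the HBase class). */
class AgeOfLastShippedOpGauge {
    // Write time (ms) of the last edit that was actually shipped; -1 means none yet.
    private long lastShippedWriteTime = -1L;
    // Most recently published age in ms.
    private long ageMs = 0L;

    /** Called by the shipping thread once a batch has been replicated. */
    synchronized void setAgeOfLastShippedOp(long writeTime, long now) {
        lastShippedWriteTime = writeTime;
        ageMs = now - writeTime;
    }

    /** Recomputes the age from the last shipped write time; callable from any thread. */
    synchronized void refreshAgeOfLastShippedOp(long now) {
        if (lastShippedWriteTime >= 0) {
            ageMs = now - lastShippedWriteTime;
        }
    }

    synchronized long getAgeMs() {
        return ageMs;
    }
}

/** Hypothetical helper that refreshes the gauge on its own daemon thread. */
class MetricRefresher {
    static ScheduledExecutorService start(AgeOfLastShippedOpGauge gauge, long periodMs) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "age-of-last-shipped-op-refresher");
            t.setDaemon(true); // do not keep the JVM alive just for metrics
            return t;
        });
        // Runs independently of the shipping thread, so a blocked
        // replicateLogEntries call does not freeze the reported age.
        ses.scheduleAtFixedRate(
            () -> gauge.refreshAgeOfLastShippedOp(System.currentTimeMillis()),
            periodMs, periodMs, TimeUnit.MILLISECONDS);
        return ses;
    }
}
```

With something like this, the shipping thread would still call setAgeOfLastShippedOp after each successful batch, while the refresher thread keeps the age increasing between batches. The caller would be expected to call shutdownNow() on the returned executor when the replication source stops.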