Posted to issues@hbase.apache.org by "Andrew Purtell (JIRA)" <ji...@apache.org> on 2018/01/25 00:21:00 UTC
[jira] [Updated] (HBASE-18549) Unclaimed replication queues can go undetected
[ https://issues.apache.org/jira/browse/HBASE-18549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Purtell updated HBASE-18549:
-----------------------------------
Fix Version/s: 1.4.2 (was: 1.4.1)
> Unclaimed replication queues can go undetected
> ----------------------------------------------
>
> Key: HBASE-18549
> URL: https://issues.apache.org/jira/browse/HBASE-18549
> Project: HBase
> Issue Type: Bug
> Components: Replication
> Reporter: Ashu Pachauri
> Priority: Critical
> Fix For: 1.3.2, 1.5.0, 1.4.2
>
>
> We have repeatedly come across situations where a ZooKeeper issue causes NodeFailoverWorker to silently fail to pick up the replication queue of a dead region server. One example is when the znode size for a particular queue exceeds the jute.maxBuffer value.
> Other situations may lead to the same failure and also go undetected. We need a metric for the number of unclaimed replication queues. Alerting on that metric will help mitigate the problem and identify the underlying issues.
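As a rough illustration of what such a metric could count, here is a minimal sketch. It assumes the caller has already read, by whatever means, a map from each region server name to the replication queue ids stored under it, plus the set of currently live region servers. The class `UnclaimedQueues` and the method `countUnclaimed` are hypothetical names chosen for this example and are not part of HBase.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for illustration only; not an HBase class. */
final class UnclaimedQueues {
    private UnclaimedQueues() {}

    /**
     * Counts replication queues whose owning region server is not live.
     *
     * @param queuesByOwner region server name -> queue ids stored under that server (assumed input)
     * @param liveServers   names of region servers currently alive (assumed input)
     * @return number of queues that belong to a server absent from liveServers
     */
    static int countUnclaimed(Map<String, List<String>> queuesByOwner, Set<String> liveServers) {
        int unclaimed = 0;
        for (Map.Entry<String, List<String>> entry : queuesByOwner.entrySet()) {
            // A queue still parked under a dead server has not been claimed by any survivor.
            if (!liveServers.contains(entry.getKey())) {
                unclaimed += entry.getValue().size();
            }
        }
        return unclaimed;
    }
}
```

A value that stays above zero for longer than a normal failover takes would be the condition to alert on. How the two inputs are gathered, and what counts as "too long", are left open here.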
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)