Posted to dev@hbase.apache.org by "Sandeep Pal (Jira)" <ji...@apache.org> on 2020/07/10 20:22:00 UTC

[jira] [Created] (HBASE-24716) Do the error handling for replication admin failures

Sandeep Pal created HBASE-24716:
-----------------------------------

             Summary: Do the error handling for replication admin failures
                 Key: HBASE-24716
                 URL: https://issues.apache.org/jira/browse/HBASE-24716
             Project: HBase
          Issue Type: Improvement
          Components: Replication
            Reporter: Sandeep Pal
            Assignee: Sandeep Pal


[listPeerConfigs()|https://git.soma.salesforce.com/bigdata-packaging/hbase/blob/1.6.0-sfdc-1/hbase-client/src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java#L295], which returns the list of replication peers along with their configurations, is not a reliable API.

It is not robust to errors: it logs at FATAL level and swallows the [exceptions|https://github.com/apache/hbase/blob/branch-1/hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationPeersZKImpl.java#L254].

 

Snippet:

catch (KeeperException e) {
  this.abortable.abort("Cannot get the list of peers ", e);
} catch (ReplicationException e) {
  this.abortable.abort("Cannot get the list of peers ", e);
}
return peers;

 


The abortable (the connection, in this case) does not abort the region server either; it only logs the error. As a result, upstream callers believe that nothing is wrong and proceed without taking any action, which hides the failure.

 

 
{code:java}
2020-07-07 23:11:37,857 FATAL [14774961,peer_id] client.ConnectionManager$HConnectionImplementation - Cannot get the list of peers
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/replication/peers
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:130)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
    at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1549)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getChildren(RecoverableZooKeeper.java:312)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.listChildrenNoWatch(ZKUtil.java:513)
    at org.apache.hadoop.hbase.replication.ReplicationPeersZKImpl.getAllPeerConfigs(ReplicationPeersZKImpl.java:249)
    at org.apache.hadoop.hbase.client.replication.ReplicationAdmin.listPeerConfigs(ReplicationAdmin.java:332)
{code}
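
One possible direction, shown only as a rough sketch: propagate the failure to the caller instead of logging it and returning whatever was collected. The types below (PeerListingSketch, PeerSource, PeerListingException) are hypothetical stand-ins and not HBase classes, and IOException stands in for the KeeperException and ReplicationException caught in the snippet above.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; none of these types exist in HBase.
class PeerListingSketch {

    /** Stand-in for the ZooKeeper-backed lookup, which may fail (e.g. on session expiry). */
    interface PeerSource {
        Map<String, String> fetchPeerConfigs() throws IOException;
    }

    /** Checked exception, so that callers are forced to handle a failed listing. */
    static class PeerListingException extends Exception {
        PeerListingException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Returns the peer configs, or throws if they could not be read.
     * A failure is rethrown with its cause rather than logged and swallowed.
     */
    static Map<String, String> listPeerConfigs(PeerSource source) throws PeerListingException {
        try {
            return new HashMap<>(source.fetchPeerConfigs());
        } catch (IOException e) {
            throw new PeerListingException("Cannot get the list of peers", e);
        }
    }
}
```

With a signature like this, an upstream caller can tell "no peers configured" apart from "the peer list could not be read" and decide whether to retry or fail.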
 

 


