Posted to dev@hbase.apache.org by "Sujit P (JIRA)" <ji...@apache.org> on 2018/09/17 11:47:00 UTC
[jira] [Created] (HBASE-21201) Support to run VerifyReplication MR tool without peerid
Sujit P created HBASE-21201:
-------------------------------
Summary: Support to run VerifyReplication MR tool without peerid
Key: HBASE-21201
URL: https://issues.apache.org/jira/browse/HBASE-21201
Project: HBase
Issue Type: Brainstorming
Components: hbase-operator-tools
Affects Versions: 3.0.0, 2.2.0
Reporter: Sujit P
In some use cases, HBase clients write to tables on separate clusters (probably in different datacenters) for redundancy. As an administrator/application architect, I would like to find out whether the tables on both clusters are in the same state (cell by cell). One readily available tool for this is VerifyReplication, which is part of replication.
However, it requires a peerId to be set up on at least one of the clusters involved. A peerId is unnecessary in this scenario and could cause unintended consequences, as the clusters aren't really replication peers, nor do we want them to be.
Looking at the code, the tool only attempts to get the clusterKey, which is essentially the ZooKeeper quorum URL:
{code:java}
//VerifyReplication.java
private static Pair<ReplicationPeerConfig, Configuration> getPeerQuorumConfig(
    final Configuration conf, String peerId)
  .
  .
  return Pair.newPair(peerConfig,
      ReplicationUtils.getPeerClusterConfiguration(peerConfig, conf));

//ReplicationUtils.java
public static Configuration getPeerClusterConfiguration(
    ReplicationPeerConfig peerConfig, Configuration baseConf) throws ReplicationException {
  Configuration otherConf;
  try {
    otherConf = HBaseConfiguration.createClusterConf(baseConf, peerConfig.getClusterKey());
{code}
So I would like to propose updating the tool to accept the remote cluster's ZooKeeper quorum as an argument (e.g. --peerQuorumAddress clusterBzk1,clusterBzk2,clusterBzk3:2181/hbase-secure ) and use it directly, without depending on a replication peerId, similar to peerFSAddress. There are certain advantages in doing so:
* Reduce the development/maintenance of separate tool for above scenario
* Allow the tool to be more useful for other scenarios as well such as
** validating backups in remote cluster HBASE-19106
** comparing a cloned tableA with the original tableA in the same or a remote cluster after a user error, before restoring a snapshot to the original table, to find the records that need to be added or that are invalid, missing, etc.
** allowing backup operators who are not HBase admins (and who shouldn't be adding a peerId) to run the tool, since currently only the HBase superuser can add a peerId for reasons discussed in HBASE-21163.
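To make the proposed argument more concrete, here is a minimal sketch of how a value shaped like the --peerQuorumAddress example above could be turned into an HBase cluster key (quorum:clientPort:znodeParent). The class PeerQuorumAddress and its methods are hypothetical names used only for illustration; they are not part of HBase, and the exact argument syntax would still need to be agreed on.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; it is not part of HBase.
final class PeerQuorumAddress {
    final List<String> hosts;   // ZooKeeper quorum hosts
    final int port;             // ZooKeeper client port
    final String znodeParent;   // parent znode, e.g. "/hbase-secure"

    private PeerQuorumAddress(List<String> hosts, int port, String znodeParent) {
        this.hosts = hosts;
        this.port = port;
        this.znodeParent = znodeParent;
    }

    // Parses the assumed "host1,host2,host3:port/znode" shape from the example.
    static PeerQuorumAddress parse(String address) {
        int slash = address.indexOf('/');
        if (slash < 0) {
            throw new IllegalArgumentException("Missing znode parent: " + address);
        }
        String quorumAndPort = address.substring(0, slash);
        int colon = quorumAndPort.lastIndexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("Missing quorum or port: " + address);
        }
        int port = Integer.parseInt(quorumAndPort.substring(colon + 1));
        List<String> hosts = Arrays.asList(quorumAndPort.substring(0, colon).split(","));
        return new PeerQuorumAddress(hosts, port, address.substring(slash));
    }

    // Renders the cluster key form "quorum:clientPort:znodeParent".
    String toClusterKey() {
        return String.join(",", hosts) + ":" + port + ":" + znodeParent;
    }

    public static void main(String[] args) {
        PeerQuorumAddress peer =
            parse("clusterBzk1,clusterBzk2,clusterBzk3:2181/hbase-secure");
        System.out.println(peer.toClusterKey());
    }
}
```

With something along these lines, the tool could pass the resulting key to HBaseConfiguration.createClusterConf(baseConf, clusterKey), the same call shown in the excerpt above, instead of reading it from peerConfig.getClusterKey().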
Please post your comments
Thanks
cc: [~clayb], [~brfrn169] , [~vrodionov] , [~rashidaligee]