Posted to commits@hbase.apache.org by bh...@apache.org on 2020/06/12 21:59:27 UTC
[hbase] branch master updated: HBASE-24535: Tweak the master
registry docs for branch-2 (#1890)
This is an automated email from the ASF dual-hosted git repository.
bharathv pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hbase.git
The following commit(s) were added to refs/heads/master by this push:
new fd5002d HBASE-24535: Tweak the master registry docs for branch-2 (#1890)
fd5002d is described below
commit fd5002d0da39c66727e5bb20ab71e8c5222e7407
Author: Bharath Vissapragada <bh...@apache.org>
AuthorDate: Fri Jun 12 14:59:04 2020 -0700
HBASE-24535: Tweak the master registry docs for branch-2 (#1890)
Updated to include changes in HBASE-24265 and some rewording
to make it version agnostic.
Signed-off-by: Nick Dimiduk <nd...@apache.org>
---
src/main/asciidoc/_chapters/architecture.adoc | 54 ++++++++++++++++++---------
1 file changed, 37 insertions(+), 17 deletions(-)
diff --git a/src/main/asciidoc/_chapters/architecture.adoc b/src/main/asciidoc/_chapters/architecture.adoc
index 80e20a9..218e674 100644
--- a/src/main/asciidoc/_chapters/architecture.adoc
+++ b/src/main/asciidoc/_chapters/architecture.adoc
@@ -261,8 +261,8 @@ For region name, we only accept `byte[]` as the parameter type and it may be a f
Information on non-Java clients and custom protocols is covered in <<external_apis>>
[[client.masterregistry]]
-=== Master registry (new as of release 3.0.0)
+=== Master Registry (new as of 2.3.0)
Client internally works with a _connection registry_ to fetch the metadata needed by connections.
This connection registry implementation is responsible for fetching the following metadata.
@@ -271,18 +271,18 @@ This connection registry implementation is responsible for fetching the followin
* Cluster ID (unique to this cluster)
This information is needed as a part of various client operations like connection set up, scans,
-gets etc. Up until releases 2.x.y, the default connection registry is based on ZooKeeper as the
-source of truth and the the clients fetched the metadata from zookeeper znodes. As of release 3.0.0,
-the default implementation for connection registry has been switched to a master based
-implementation. With this change, the clients now fetch the required metadata from master RPC end
-points directly. This change was done for the following reasons.
+gets, etc. Traditionally, the connection registry implementation has been based on ZooKeeper as the
+source of truth and clients fetched the metadata directly from the ZooKeeper quorum. HBase 2.3.0
+introduces a new connection registry implementation based on direct communication with the Masters.
+With this implementation, clients now fetch required metadata via master RPC end points instead of
+maintaining connections to ZooKeeper. This change was done for the following reasons.
* Reduce load on ZooKeeper since that is critical for cluster operation.
* Holistic client timeout and retry configurations since the new registry brings all the client
operations under HBase rpc framework.
* Remove the ZooKeeper client dependency on HBase client library.
-This means that
+This means:
* At least a single active or stand by master is needed for cluster connection setup. Refer to
<<master.runtime>> for more details.
@@ -293,22 +293,42 @@ HMasters instead of ZooKeeper ensemble`
To reduce hot-spotting on a single master, all the masters (active & stand-by) expose the needed
service to fetch the connection metadata. This lets the client connect to any master (not just active).
+Both ZooKeeper- and Master-based connection registry implementations are available in 2.3+. For
+2.3 and earlier, the ZooKeeper-based implementation remains the default configuration.
+The Master-based implementation becomes the default in 3.0.0.
-==== RPC hedging
+Change the connection registry implementation by updating the value configured for
+`hbase.client.registry.impl`. To explicitly enable the ZooKeeper-based registry, use
-This feature also implements an new RPC channel that can hedge requests to multiple masters. This
-lets the client make the same request to multiple servers and which ever responds first is returned
-back to the client and the other other in-flight requests are canceled. This improves the
-performance, especially when a subset of servers are under load. The hedging fan out size is
-configurable, meaning the number of requests that are hedged in a single attempt, using the
-configuration key _hbase.rpc.hedged.fanout_ in the client configuration. It defaults to 2. With this
-default, the RPCs are tried in batches of 2. The hedging policy is still primitive and does not
+[source, xml]
+<property>
+ <name>hbase.client.registry.impl</name>
+ <value>org.apache.hadoop.hbase.client.ZKConnectionRegistry</value>
+ </property>
+
+To explicitly enable the Master-based registry, use
+
+[source, xml]
+<property>
+ <name>hbase.client.registry.impl</name>
+ <value>org.apache.hadoop.hbase.client.MasterRegistry</value>
+ </property>
+
+==== MasterRegistry RPC hedging
+
+MasterRegistry implements hedging of connection registry RPCs across active and stand-by masters.
+This lets the client make the same request to multiple servers, and whichever responds first is
+returned to the client immediately. This improves performance, especially when a subset of
+servers is under load. The hedging fan out size is configurable, meaning the number of requests
+that are hedged in a single attempt, using the configuration key
+_hbase.client.master_registry.hedged.fanout_ in the client configuration. It defaults to 2. With
+this default, the RPCs are tried in batches of 2. The hedging policy is still primitive and does not
adapt to any sort of live rpc performance metrics.
==== Additional Notes
-* Clients hedge the requests in a randomized order to avoid hot-spotting a single server.
-* Cluster internal connections (master<->regionservers) still use ZooKeeper based connection
+* Clients hedge the requests in a randomized order to avoid hot-spotting a single master.
+* Cluster internal connections (masters <-> regionservers) still use ZooKeeper based connection
registry.
* Cluster internal state is still tracked in ZooKeeper, hence ZK availability requirements are the same
as before.
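
The hedging behaviour described in the "MasterRegistry RPC hedging" section (randomized order, batches of the fan out size, first response wins) can be illustrated with a small standalone sketch. This is not the HBase implementation and uses no HBase APIs: the class `HedgedRegistrySketch`, the methods `hedge`, `firstSuccess` and `simulatedRpc`, and the master names are all hypothetical, and the sketch does not cancel the slower in-flight requests.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Illustrative only: a simplified model of hedged registry lookups.
public class HedgedRegistrySketch {

  // Shuffles the masters, then tries them in batches of `fanout`.
  // Returns the first successful answer; moves to the next batch only
  // when every request in the current batch has failed.
  static <T> T hedge(List<String> masters, int fanout,
                     Function<String, CompletableFuture<T>> call) {
    if (fanout < 1) {
      throw new IllegalArgumentException("fanout must be >= 1");
    }
    List<String> shuffled = new ArrayList<>(masters);
    Collections.shuffle(shuffled); // randomized order avoids hot-spotting one master
    Throwable last = null;
    for (int start = 0; start < shuffled.size(); start += fanout) {
      int end = Math.min(start + fanout, shuffled.size());
      try {
        return firstSuccess(shuffled.subList(start, end), call);
      } catch (CompletionException e) {
        last = e; // whole batch failed, fall through to the next batch
      }
    }
    throw new IllegalStateException("no master answered", last);
  }

  // Sends the same request to every master in the batch and waits for
  // the first success, or for all of them to fail.
  static <T> T firstSuccess(List<String> batch,
                            Function<String, CompletableFuture<T>> call) {
    CompletableFuture<T> winner = new CompletableFuture<>();
    AtomicInteger failures = new AtomicInteger();
    for (String master : batch) {
      call.apply(master).whenComplete((value, error) -> {
        if (error == null) {
          winner.complete(value); // only the first completion takes effect
        } else if (failures.incrementAndGet() == batch.size()) {
          winner.completeExceptionally(error);
        }
      });
    }
    return winner.join();
  }

  // Hypothetical stand-in for a registry RPC: masters whose name starts
  // with "down-" fail, all others return a fake cluster ID.
  static CompletableFuture<String> simulatedRpc(String master) {
    CompletableFuture<String> f = new CompletableFuture<>();
    if (master.startsWith("down-")) {
      f.completeExceptionally(new IOException(master + " unreachable"));
    } else {
      f.complete("cluster-id-from-" + master);
    }
    return f;
  }

  public static void main(String[] args) {
    List<String> masters = Arrays.asList("down-1", "down-2", "up-3");
    System.out.println(hedge(masters, 2, HedgedRegistrySketch::simulatedRpc));
  }
}
```

With a fan out of 2 and only one healthy master in the list, the call returns that master's answer whichever way the shuffle falls: either the healthy master is in the first batch, or the first batch fails entirely and the second batch is tried.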