Posted to issues@hbase.apache.org by "Mohammad Arshad (Jira)" <ji...@apache.org> on 2020/08/03 05:43:00 UTC
[jira] [Updated] (HBASE-24211) Create table is slow in large cluster when AccessController is enabled.
[ https://issues.apache.org/jira/browse/HBASE-24211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mohammad Arshad updated HBASE-24211:
------------------------------------
Component/s: Performance
> Create table is slow in large cluster when AccessController is enabled.
> -----------------------------------------------------------------------
>
> Key: HBASE-24211
> URL: https://issues.apache.org/jira/browse/HBASE-24211
> Project: HBase
> Issue Type: Bug
> Components: Performance
> Affects Versions: 1.3.6, master, 2.2.4
> Reporter: Mohammad Arshad
> Assignee: Mohammad Arshad
> Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0, 1.7.0
>
>
> *Problem:*
> In a large HBase 1.3.x performance-test cluster (100 RS, 60k tables, 600k regions), a simple table creation takes around 150 seconds. The time taken varies, but it is always long.
> *Analysis:*
> 1. When HBase creates a table, it calls AssignmentManager#assign(final ServerName destination, final List<HRegionInfo> regions).
> AssignmentManager#assign calls asyncSetOfflineInZooKeeper(state, cb, destination) and then waits in the loop below for 2 minutes.
> {code:java}
> if (useZKForAssignment) {
>   // Wait until all unassigned nodes have been put up and watchers set.
>   int total = states.size();
>   for (int oldCounter = 0; !server.isStopped();) {
>     int count = counter.get();
>     if (oldCounter != count) {
>       LOG.debug(destination.toString() + " unassigned znodes=" + count +
>         " of total=" + total + "; oldCounter=" + oldCounter);
>       oldCounter = count;
>     }
>     if (count >= total) break;
>     Thread.sleep(5);
>   }
> }
> {code}
> 2. asyncSetOfflineInZooKeeper creates a znode under /hbase/region-in-transition/ and calls exists to ensure that the znode is created. This is a simple operation and should not take much time. So where is the time spent?
> 3. The ZooKeeper client API processes watcher notifications and async API responses through a queue, one by one.
> If the client (in this case HBase) is slow to process any watcher notification or response, the processing of all other responses is delayed. It then appears as if the API call itself took longer.
> The same thing happens in this issue.
> Watcher processing for znode creation under /hbase/acl took most of the time and delayed the processing of the /hbase/region-in-transition/region znode creation. This is why the wait in the loop was so long.
> 4. Watcher processing for znode creation under /hbase/acl/ calls ZKPermissionWatcher#nodeChildrenChanged, which internally calls ZKUtil.getChildDataAndWatchForNewChildren,
> *which in this use case calls ZooKeeper's getData API 60k times, and this takes most of the time.*
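
The effect described in points 3 and 4 can be reproduced without ZooKeeper. The following is a minimal sketch, not HBase or ZooKeeper code: a single-threaded executor stands in for the client's event queue, and the class name, method names, and the 1 ms cost per simulated getData call are all assumptions made for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class EventQueueDelayDemo {

    // Hypothetical stand-in for one blocking getData() round trip (assumed ~1 ms).
    static void simulatedGetData() throws InterruptedException {
        Thread.sleep(1);
    }

    /**
     * Queues a slow watcher handler followed by a trivial callback on one
     * event thread and returns how long (in ms) the callback took to complete.
     */
    static long measureCallbackDelayMillis(int aclChildren) {
        // One thread drains the queue in order, like a client-side event thread.
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        CountDownLatch callbackDone = new CountDownLatch(1);
        long start = System.nanoTime();
        try {
            // Slow handler: one simulated read per child znode, done inline.
            eventThread.execute(() -> {
                try {
                    for (int i = 0; i < aclChildren; i++) {
                        simulatedGetData();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // Cheap, unrelated callback queued right behind the slow handler.
            eventThread.execute(callbackDone::countDown);
            callbackDone.await();
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            eventThread.shutdown();
        }
    }

    public static void main(String[] args) {
        long delay = measureCallbackDelayMillis(200);
        System.out.println("trivial callback completed after " + delay + " ms");
    }
}
```

The callback itself does almost no work, yet its completion time is dominated by the handler queued ahead of it. This is the same shape as the region-in-transition callback waiting behind the /hbase/acl watcher.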
> *Solution:*
> Move the getChildDataAndWatchForNewChildren call into the async code block in ZKPermissionWatcher#nodeChildrenChanged.
>
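
The general idea behind the proposed change can be sketched as follows. This is an illustration under stated assumptions, not the actual patch: the executors, the refreshAllPermissions helper, and the event labels are hypothetical, and the expensive refresh is simulated with short sleeps. The watcher handler only schedules the refresh on a separate thread and returns, so callbacks queued behind it are not held up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class OffloadedWatcherDemo {

    // Hypothetical stand-in for re-reading the data of every child znode.
    static void refreshAllPermissions(int children) {
        try {
            for (int i = 0; i < children; i++) {
                Thread.sleep(1); // one simulated getData() round trip
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns the labels of the two pieces of work in the order they finished. */
    static List<String> completionOrder(int children) {
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        ExecutorService refreshThread = Executors.newSingleThreadExecutor();
        List<String> completed = new CopyOnWriteArrayList<>();
        CountDownLatch allDone = new CountDownLatch(2);
        try {
            // Watcher handler: hands the expensive refresh to another thread
            // and returns immediately, freeing the event thread.
            eventThread.execute(() -> refreshThread.execute(() -> {
                refreshAllPermissions(children);
                completed.add("acl-refresh");
                allDone.countDown();
            }));
            // Unrelated callback queued behind the watcher handler.
            eventThread.execute(() -> {
                completed.add("rit-callback");
                allDone.countDown();
            });
            allDone.await();
            return new ArrayList<>(completed);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            eventThread.shutdown();
            refreshThread.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("completion order: " + completionOrder(200));
    }
}
```

One trade-off of this approach is that the permission refresh now finishes some time after the notification is delivered, so code that reads the cached permissions has to tolerate that lag.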