Posted to issues@hbase.apache.org by "star (Jira)" <ji...@apache.org> on 2020/05/01 11:47:00 UTC

[jira] [Created] (HBASE-24298) Reduce cpu cost of locating region especially in batch mode.

star created HBASE-24298:
----------------------------

             Summary: Reduce cpu cost of locating region especially in batch mode.
                 Key: HBASE-24298
                 URL: https://issues.apache.org/jira/browse/HBASE-24298
             Project: HBase
          Issue Type: Bug
            Reporter: star


Binary search is used to speed up the process of locating a region. It is already fast, but the CPU of the HBase client still becomes the bottleneck when running the YCSB benchmark. We can make region location faster and reduce CPU load in a special case, which is nevertheless the common case in our production environment. The case is:

        1. Predefined splits in uniform distribution.
 
        2. Load data in batch mode.

The optimization is very simple: narrow the range of the binary search. Initially, record the startIndex and endIndex of the region keys for each value of the first one or two bytes. When a key arrives, look up the narrowed startIndex and endIndex for its leading byte(s), then run the normal binary search within that startIndex and endIndex.

Ideally, this reduces the CPU cost of locating a region to 1/8 with a 1-byte prefix or 1/16 with a 2-byte prefix.
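
The narrowing step described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the HBase client code: the class PrefixNarrowedLocator and its methods are hypothetical names, region start keys are assumed to be held in a sorted byte[][] compared as unsigned bytes, and only the 1-byte variant is shown.

```java
import java.util.Arrays;

// Hypothetical sketch: narrows a floor binary search over sorted region
// start keys by first indexing on the leading byte of each key.
final class PrefixNarrowedLocator {
    private final byte[][] startKeys;               // sorted, unsigned byte order
    private final int[] bucketStart = new int[257]; // first index whose leading byte >= b

    PrefixNarrowedLocator(byte[][] sortedStartKeys) {
        this.startKeys = sortedStartKeys;
        int i = 0;
        for (int b = 0; b <= 256; b++) {
            while (i < startKeys.length && firstByte(startKeys[i]) < b) {
                i++;
            }
            bucketStart[b] = i;
        }
    }

    // An empty key sorts before every other key, so treat its leading byte as -1.
    private static int firstByte(byte[] key) {
        return key.length == 0 ? -1 : (key[0] & 0xFF);
    }

    // Index of the last start key <= row, searching only the narrowed range.
    int locate(byte[] row) {
        int lo = 0;
        int hi = startKeys.length - 1;
        if (row.length > 0) {
            int b = row[0] & 0xFF;
            // The answer is either inside bucket b or the last key before it.
            lo = Math.max(0, bucketStart[b] - 1);
            hi = bucketStart[b + 1] - 1;
        }
        return floor(row, lo, hi);
    }

    // Same lookup over the full key range, for comparison.
    int locateFull(byte[] row) {
        return floor(row, 0, startKeys.length - 1);
    }

    // Plain binary search for the last index in [lo, hi] with key <= row; -1 if none.
    private int floor(byte[] row, int lo, int hi) {
        int result = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (Arrays.compareUnsigned(startKeys[mid], row) <= 0) {
                result = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return result;
    }
}
```

The range starts one slot before the bucket because a key may belong to the last region that begins under a smaller leading byte. Both locate and locateFull return the same index; the former simply starts from a smaller range when the keys are spread over many leading bytes.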



--
This message was sent by Atlassian Jira
(v8.3.4#803005)