Posted to issues@hbase.apache.org by "zhuobin zheng (Jira)" <ji...@apache.org> on 2021/01/15 04:48:00 UTC

[jira] [Commented] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens

    [ https://issues.apache.org/jira/browse/HBASE-25510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17265688#comment-17265688 ] 

zhuobin zheng commented on HBASE-25510:
---------------------------------------

Added MR links and benchmark attachments.

Benchmark Result Explanation:
 # testStr calls TableName.valueOf(String name).
 # testBB calls TableName.valueOf(ByteBuffer namespace, ByteBuffer qualifier).
 # The number after testStr or testBB is the number of distinct table names; for example, 1000 means 1000 different table names.

Before optimization:

 
{code:java}
Benchmark                        Mode  Cnt       Score       Error   Units
TestTableNameJMH.testBB1        thrpt   10   36132.014 ±  1628.381  ops/ms
TestTableNameJMH.testBB10       thrpt   10   14056.243 ±   638.379  ops/ms
TestTableNameJMH.testBB100      thrpt   10    2215.671 ±    49.759  ops/ms
TestTableNameJMH.testBB1000     thrpt   10     224.802 ±     4.253  ops/ms
TestTableNameJMH.testBB10000    thrpt   10      22.476 ±     4.729  ops/ms
TestTableNameJMH.testBB100000   thrpt   10       1.931 ±     0.578  ops/ms
TestTableNameJMH.testStr1       thrpt   10  147905.572 ± 20777.681  ops/ms
TestTableNameJMH.testStr10      thrpt   10   44597.261 ±  6346.679  ops/ms
TestTableNameJMH.testStr100     thrpt   10    5464.205 ±  1442.556  ops/ms
TestTableNameJMH.testStr1000    thrpt   10     360.183 ±   127.615  ops/ms
TestTableNameJMH.testStr10000   thrpt   10      45.338 ±     3.545  ops/ms
TestTableNameJMH.testStr100000  thrpt   10       1.927 ±     0.831  ops/ms
{code}
After optimization:
{code:java}
Benchmark                        Mode  Cnt       Score       Error   Units
TestTableNameJMH.testBB1        thrpt   10   21585.408 ±  2519.495  ops/ms
TestTableNameJMH.testBB10       thrpt   10   23474.278 ±   175.576  ops/ms
TestTableNameJMH.testBB100      thrpt   10   20600.624 ±  4035.725  ops/ms
TestTableNameJMH.testBB1000     thrpt   10   18349.054 ±   313.875  ops/ms
TestTableNameJMH.testBB10000    thrpt   10   15981.688 ±   836.096  ops/ms
TestTableNameJMH.testBB100000   thrpt   10   14276.288 ±   201.779  ops/ms
TestTableNameJMH.testStr1       thrpt   10  239837.152 ± 10767.013  ops/ms
TestTableNameJMH.testStr10      thrpt    4  236578.812 ± 57640.770  ops/ms
TestTableNameJMH.testStr100     thrpt    5  227980.174 ± 44822.292  ops/ms
TestTableNameJMH.testStr1000    thrpt   10  131935.073 ±  4495.644  ops/ms
TestTableNameJMH.testStr10000   thrpt   10   81979.448 ±  3230.575  ops/ms
TestTableNameJMH.testStr100000  thrpt   10   61054.516 ± 10613.181  ops/ms
{code}
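
To make the two result tables easier to compare, the following is a small sketch that divides the optimized score by the original score for a few rows. The class name SpeedupSketch and its speedup method are hypothetical helpers introduced only for this illustration; the scores are copied from the tables above. Note that a ratio below 1 means the optimized run was slower, which is the case for testBB1.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of HBase: compares the JMH scores reported above.
final class SpeedupSketch {
    // Ratio of optimized throughput to original throughput (both in ops/ms).
    static double speedup(double originalOpsPerMs, double optimizedOpsPerMs) {
        return optimizedOpsPerMs / originalOpsPerMs;
    }

    public static void main(String[] args) {
        // Each entry holds {original score, optimized score} from the two tables.
        Map<String, double[]> scores = new LinkedHashMap<>();
        scores.put("testBB1", new double[] {36132.014, 21585.408});
        scores.put("testBB1000", new double[] {224.802, 18349.054});
        scores.put("testBB100000", new double[] {1.931, 14276.288});
        scores.put("testStr1", new double[] {147905.572, 239837.152});
        scores.put("testStr1000", new double[] {360.183, 131935.073});
        scores.put("testStr100000", new double[] {1.927, 61054.516});

        // Print one speedup factor per benchmark row.
        for (Map.Entry<String, double[]> e : scores.entrySet()) {
            double[] s = e.getValue();
            System.out.printf("%-14s %10.1fx%n", e.getKey(), speedup(s[0], s[1]));
        }
    }
}
```

The ratios ignore the error margins in the Error column, so they should be read as rough indicators only.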
 

> Optimize TableName.valueOf from O(n) to O(1).  We can get benefits when the number of tables in the cluster is greater than dozens
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-25510
>                 URL: https://issues.apache.org/jira/browse/HBASE-25510
>             Project: HBase
>          Issue Type: Improvement
>          Components: master, Replication
>    Affects Versions: 1.2.12, 1.4.13, 2.4.1
>            Reporter: zhuobin zheng
>            Priority: Major
>         Attachments: optimiz_benchmark, origin_benchmark
>
>
> Currently, TableName.valueOf searches the cache for a matching TableName object linearly (code shown below), so it is too slow when the cluster has thousands of tables.
> {code:java}
> for (TableName tn : tableCache) {
>   if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), bns)) {
>     return tn;
>   }
> }{code}
> I propose storing the objects in a hash table so that lookups are faster. The code looks like this:
> {code:java}
> TableName oldTable = tableCache.get(nameAsStr);{code}
>  
> Our cluster has tens of thousands of tables (most of them are KYLIN tables).
> We found that in the following two cases the TableName.valueOf method severely restricts our performance.
>
> Common premise: tens of thousands of tables in the cluster.
> Cause: TableName.valueOf performs poorly because it traverses the entire cache linearly.
>
> Case 1: Replication
> Premise 1: one table receives writes at high QPS with small values in non-batched requests, which produces too many WAL entries.
> Premise 2: deserializing a WAL entry includes a call to TableName.valueOf.
> Result: replication gets stuck and many WAL files pile up.
>
> Case 2: Active master startup
> NamespaceStateManager initialization has to initialize every RegionInfo, and each RegionInfo initialization calls TableName.valueOf. Startup takes longer if TableName.valueOf is slow.
>   
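
The difference between the linear cache scan and the hash-table lookup described in the quoted issue can be sketched as follows. This is a minimal, self-contained illustration and not the actual HBase patch: NameSketch and NameCacheSketch are hypothetical stand-ins for TableName and its cache, and the "namespace:qualifier" string key is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a cached name object; not the real HBase TableName.
final class NameSketch {
    final String namespace;
    final String qualifier;

    NameSketch(String namespace, String qualifier) {
        this.namespace = namespace;
        this.qualifier = qualifier;
    }
}

// Hypothetical cache that offers both lookup strategies side by side.
final class NameCacheSketch {
    private final List<NameSketch> listCache = new ArrayList<>();
    private final Map<String, NameSketch> mapCache = new ConcurrentHashMap<>();

    // O(n): scans every cached entry, mirroring the loop quoted in the issue.
    // Not thread-safe; kept simple for illustration.
    NameSketch valueOfLinear(String namespace, String qualifier) {
        for (NameSketch n : listCache) {
            if (n.qualifier.equals(qualifier) && n.namespace.equals(namespace)) {
                return n;
            }
        }
        NameSketch created = new NameSketch(namespace, qualifier);
        listCache.add(created);
        return created;
    }

    // Expected O(1): a single hash lookup keyed by the full name string.
    NameSketch valueOfHashed(String namespace, String qualifier) {
        String key = namespace + ":" + qualifier; // assumed key format
        return mapCache.computeIfAbsent(key, k -> new NameSketch(namespace, qualifier));
    }
}
```

Both methods return the same cached instance for repeated calls with the same arguments; only the cost of finding that instance differs as the number of cached names grows.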


