Posted to user@hbase.apache.org by y_...@tsmc.com on 2010/01/20 10:12:47 UTC

HBase reading performance

Hi,

There are two tables, table1 and table2, in my cluster.

table1 with two column families, 40 qualifiers, 147 regions
table2 with two column families, 34 qualifiers,  77 regions

With no caching, on 4 region servers with 2G RAM:

Get table1: 1285 rows, took 133 sec, avg: 0.103 sec/row
Get table2: 1279 rows, took  48 sec, avg: 0.037 sec/row
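
As a quick check on the figures above, the per-row averages can be recomputed from the totals. This is only a small arithmetic sketch; the class and method names are made up for illustration. It also prints the ratio of the two averages next to the ratio of the region counts: by these numbers table1 is about 2.8 times slower per row while having about 1.9 times as many regions.

```java
public class GetLatency {
    /** Average seconds per row for a batch of Gets. */
    public static double avgSecondsPerRow(double totalSeconds, int rows) {
        return totalSeconds / rows;
    }

    public static void main(String[] args) {
        double t1 = avgSecondsPerRow(133, 1285); // table1, 147 regions
        double t2 = avgSecondsPerRow(48, 1279);  // table2,  77 regions
        System.out.printf("table1: %.4f sec/row%n", t1);
        System.out.printf("table2: %.4f sec/row%n", t2);
        // Compare how much slower table1 is with how many more regions it has.
        System.out.printf("latency ratio: %.2f, region ratio: %.2f%n",
                t1 / t2, 147.0 / 77.0);
    }
}
```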

Based on the above results, the table with more regions took more time to
get each row.
If we scale the region servers out to 15 machines (4 cores, 12G RAM),
will it be possible to lower the average time for table1?
Thanks

Fleming Chiu(邱宏明)
707-6128
y_823910@tsmc.com
Meat-free Mondays: eat vegetarian, save the Earth (Meat Free Monday Taiwan)


 --------------------------------------------------------------------------- 
                                                         TSMC PROPERTY       
 This email communication (and any attachments) is proprietary information   
 for the sole use of its                                                     
 intended recipient. Any unauthorized review, use or distribution by anyone  
 other than the intended                                                     
 recipient is strictly prohibited.  If you are not the intended recipient,   
 please notify the sender by                                                 
 replying to this email, and then delete this email and any copies of it     
 immediately. Thank you.                                                     
 --------------------------------------------------------------------------- 




Re: HBase reading performance

Posted by Jean-Daniel Cryans <jd...@apache.org>.
I would guess that if your test is done on a cold client, then for the
first table it has to fetch more rows from .META. to populate its region
location cache. To test that, run both tests twice in the same JVM (one
after the other), possibly even using the same HTable.

J-D
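
A minimal sketch of the "run it twice in the same JVM" test suggested above. It is deliberately independent of the HBase client: the `fetch` callback is a hypothetical stand-in for whatever performs the real Get, and the class and method names are illustrative only. If the first pass is much slower than the second, the gap is more likely client warm-up (such as region lookups) than steady-state read cost.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

public class WarmColdTiming {
    /** Calls fetch once per row key and returns the elapsed seconds. */
    public static double timeRun(List<String> rowKeys, Consumer<String> fetch) {
        long start = System.nanoTime();
        for (String key : rowKeys) {
            fetch.accept(key);
        }
        return (System.nanoTime() - start) / 1e9;
    }

    /** Runs the same workload twice; returns {firstPassSeconds, secondPassSeconds}. */
    public static double[] coldThenWarm(List<String> rowKeys, Consumer<String> fetch) {
        double cold = timeRun(rowKeys, fetch); // first pass pays any one-time client costs
        double warm = timeRun(rowKeys, fetch); // second pass reuses whatever the client cached
        return new double[] {cold, warm};
    }

    public static void main(String[] args) {
        // Placeholder for a real Get; replace with a call into your own client code.
        Set<String> seen = new HashSet<>();
        Consumer<String> fakeGet = seen::add;

        double[] t = coldThenWarm(Arrays.asList("row1", "row2", "row3"), fakeGet);
        System.out.printf("cold: %.6f sec, warm: %.6f sec%n", t[0], t[1]);
    }
}
```

Running the same harness for both tables in one process would show whether the per-row gap between them persists once the client is warm.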


Re: HBase reading performance

Posted by stack <st...@duboce.net>.
2010/1/20 <y_...@tsmc.com>

> Hi,
>
> There are two tables, table1 and table2, in my cluster.
>
> table1 with two column families, 40 qualifiers, 147 regions
> table2 with two column families, 34 qualifiers,  77 regions
>
> With no caching, on 4 region servers with 2G RAM:
>
> Get table1: 1285 rows, took 133 sec, avg: 0.103 sec/row
> Get table2: 1279 rows, took  48 sec, avg: 0.037 sec/row
>
More RAM makes HBase go faster, presuming some data locality.

Can you figure out what the difference between the two tables is?  Is it
that the first has more store files (try doing a -lsr on the hbase
directory in HDFS)?  I presume the two tables are scattered over the same
4-node cluster?  So, it's not a hardware difference.

St.Ack
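
One rough way to compare store file counts is to save the recursive listing to a file and count the file entries per table. The sketch below does that under two assumptions that should be checked against your own listing: that store files sit at <root>/<table>/<region>/<family>/<file>, and that the path is the last whitespace-separated field of each line. The class name and the sample lines are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class StoreFileCount {
    /**
     * Counts file entries per table in recursive-listing output.
     * ASSUMPTION: store files live at root/table/region/family/file, and
     * the path is the last whitespace-separated field on each line.
     */
    public static Map<String, Integer> countPerTable(List<String> lsrLines, String root) {
        Map<String, Integer> counts = new TreeMap<>();
        String prefix = root.endsWith("/") ? root : root + "/";
        for (String line : lsrLines) {
            String trimmed = line.trim();
            // Skip blank lines and directory entries (permissions start with 'd').
            if (trimmed.isEmpty() || trimmed.startsWith("d")) continue;
            String[] fields = trimmed.split("\\s+");
            String path = fields[fields.length - 1];
            if (!path.startsWith(prefix)) continue;
            // Expected components below the root: table, region, family, file.
            String[] parts = path.substring(prefix.length()).split("/");
            if (parts.length == 4) counts.merge(parts[0], 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical listing lines, only to show the expected shape.
        List<String> sample = Arrays.asList(
            "drwxr-xr-x   - hbase supergroup       0 2010-01-20 10:00 /hbase/table1/1028785192/cf1",
            "-rw-r--r--   3 hbase supergroup 1048576 2010-01-20 10:00 /hbase/table1/1028785192/cf1/111",
            "-rw-r--r--   3 hbase supergroup 2097152 2010-01-20 10:00 /hbase/table1/1028785192/cf1/222",
            "-rw-r--r--   3 hbase supergroup     512 2010-01-20 10:00 /hbase/table1/1028785192/.regioninfo",
            "-rw-r--r--   3 hbase supergroup 1048576 2010-01-20 10:00 /hbase/table2/2044556677/cf1/333");
        System.out.println(countPerTable(sample, "/hbase"));
    }
}
```

Dividing each table's count by its number of regions gives store files per region, which is the figure worth comparing between the two tables.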



> Based on the above results, the table with more regions took more time to
> get each row.
> If we scale the region servers out to 15 machines (4 cores, 12G RAM),
> will it be possible to lower the average time for table1?