Posted to user@hbase.apache.org by "ac@hsk.hk" <ac...@hsk.hk> on 2012/11/22 13:49:12 UTC

HBASE Benchmarking

Hi,

I tried to write one million records into a 5-node HBase cluster (HBase 0.94.2 on Hadoop 1.0.4).

1. Method: sequentialWrite
2. From the log, I found that the process had to sleep 3 times (4012 ms in total).
3. It scanned .META. for max=10 rows.

Any idea why it used max=10 rows? Will this "max=10 rows" parameter slow down the process?
And why did it have to sleep for a while?

Thanks



12/11/22 20:36:30 DEBUG client.HConnectionManager$HConnectionImplementation: Removed TestTable,0000055872,1353587583943.b1541003edf820761ffe29683fe02df4. for tableName=TestTable from cache because of 0000765900
12/11/22 20:36:30 DEBUG client.HConnectionManager$HConnectionImplementation: Retry 1, sleep for 1003ms!

12/11/22 20:36:31 DEBUG client.MetaScanner: Scanning .META. starting at row=TestTable,0000055872,00000000000000 for max=10 rows using org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2b5356d5
12/11/22 20:36:31 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for TestTable,0000055872,1353587583943.b1541003edf820761ffe29683fe02df4. is m147:60020

12/11/22 20:36:31 DEBUG client.HConnectionManager$HConnectionImplementation: Removed TestTable,0000055872,1353587583943.b1541003edf820761ffe29683fe02df4. for tableName=TestTable from cache because of 0000765900
12/11/22 20:36:31 DEBUG client.HConnectionManager$HConnectionImplementation: Retry 2, sleep for 1004ms!

12/11/22 20:36:32 DEBUG client.MetaScanner: Scanning .META. starting at row=TestTable,0000055872,00000000000000 for max=10 rows using org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2b5356d5
12/11/22 20:36:32 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for TestTable,0000055872,1353587583943.b1541003edf820761ffe29683fe02df4. is m147:60020

12/11/22 20:36:33 DEBUG client.HConnectionManager$HConnectionImplementation: Removed TestTable,0000055872,1353587583943.b1541003edf820761ffe29683fe02df4. for tableName=TestTable from cache because of 0000765900
12/11/22 20:36:33 DEBUG client.HConnectionManager$HConnectionImplementation: Retry 3, sleep for 2005ms!

12/11/22 20:36:35 DEBUG client.MetaScanner: Scanning .META. starting at row=TestTable,0000192832,00000000000000 for max=10 rows using org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2b5356d5

12/11/22 20:36:35 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for TestTable,0000192832,1353587767402.0770fb8c83b64620f29a1e186aa8addb. is m147:60020

Re: HBASE Benchmarking

Posted by lars hofhansl <lh...@yahoo.com>.

Making some wild guesses here.
If your IO system cannot keep up with the write load, eventually it has to block the writers.
For a while your writes are buffered in the memstore(s), but at some point they need to be flushed to disk. Many small files will lead to bad read performance, so these smaller files are compacted into larger files. When the number of files for a region is too large, clients are blocked.
Watch the compaction queue size and the number of storefiles to see if that is the case.
If read performance is not critical, you can increase the number of files allowed before clients are blocked. You can also increase the memstore multiplier, which temporarily allows a memstore to grow larger than its configured size. Only do this if your writes are bursty. None of this helps when you have a sustained write load that is higher than what the aggregate IO capacity of your cluster can absorb.
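
To make the two blocking conditions above concrete, here is a minimal sketch in plain Java. It is a simplified model, not HBase's actual code: the class and method names are hypothetical, and the threshold values are illustrative only. As far as I know the corresponding settings are hbase.hregion.memstore.flush.size, hbase.hregion.memstore.block.multiplier and hbase.hstore.blockingStoreFiles, but check the defaults for your version.

```java
/** Simplified, hypothetical model of when writes to a region stall. */
final class WriteBlockSketch {
    // Illustrative values only (assumptions, not read from any cluster).
    static final long FLUSH_SIZE_BYTES = 128L * 1024 * 1024; // memstore flush size
    static final int BLOCK_MULTIPLIER = 2;                   // memstore multiplier
    static final int BLOCKING_STORE_FILES = 7;               // allowed store files

    /** The memstore may grow past the flush size, but only up to flushSize * multiplier. */
    static boolean memstoreTooLarge(long memstoreBytes, long flushSize, int multiplier) {
        return memstoreBytes > flushSize * multiplier;
    }

    /** Too many store files means compactions are not keeping up with flushes. */
    static boolean tooManyStoreFiles(int storeFiles, int blockingStoreFiles) {
        return storeFiles > blockingStoreFiles;
    }

    /** In this model, writers stall if either condition holds. */
    static boolean writesStall(long memstoreBytes, int storeFiles,
                               long flushSize, int multiplier, int blockingStoreFiles) {
        return memstoreTooLarge(memstoreBytes, flushSize, multiplier)
            || tooManyStoreFiles(storeFiles, blockingStoreFiles);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        // A burst that overshoots the flush size but stays under the multiplier limit.
        System.out.println(writesStall(200 * mb, 3,
            FLUSH_SIZE_BYTES, BLOCK_MULTIPLIER, BLOCKING_STORE_FILES));
        // Sustained load: store files pile up faster than compactions remove them.
        System.out.println(writesStall(200 * mb, 9,
            FLUSH_SIZE_BYTES, BLOCK_MULTIPLIER, BLOCKING_STORE_FILES));
    }
}
```

Raising either limit only moves the point at which the second case is reached; it does not change how fast the disks can absorb flushes and compactions.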


The .META. scan with max=10 rows is probably due to the fact that, by default, HBase prefetches at most 10 regions of a table.
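
As a rough illustration of what such a prefetch does, the sketch below models .META. as a sorted map from region start key to server and collects up to a fixed number of region locations, starting with the region that contains the requested row. Everything here is an assumption for illustration: the class, the method names, the region boundaries and the server names are made up, and the real client logic is more involved. I believe the 0.94 client reads the limit from hbase.client.prefetch.limit, but verify that for your version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical model of prefetching region locations from a sorted catalog. */
final class PrefetchSketch {

    /** Builds a made-up catalog: n regions with evenly spaced, zero-padded start keys. */
    static TreeMap<String, String> sampleRegions(int n) {
        TreeMap<String, String> regions = new TreeMap<>();
        for (int i = 0; i < n; i++) {
            // The first region of a table has an empty start key.
            String startKey = (i == 0) ? "" : String.format("%010d", i * 40000L);
            regions.put(startKey, "server-" + (i % 5)); // hypothetical server names
        }
        return regions;
    }

    /** Returns up to `limit` locations, beginning with the region that holds `row`. */
    static List<String> prefetch(TreeMap<String, String> regionsByStartKey,
                                 String row, int limit) {
        List<String> cached = new ArrayList<>();
        // The region containing `row` has the greatest start key <= row.
        String first = regionsByStartKey.floorKey(row);
        if (first == null) {
            return cached;
        }
        for (Map.Entry<String, String> e
                : regionsByStartKey.tailMap(first, true).entrySet()) {
            if (cached.size() >= limit) {
                break; // stop once the prefetch limit is reached
            }
            cached.add(e.getKey() + " -> " + e.getValue());
        }
        return cached;
    }

    public static void main(String[] args) {
        TreeMap<String, String> meta = sampleRegions(25);
        for (String location : prefetch(meta, "0000055872", 10)) {
            System.out.println(location);
        }
    }
}
```

In this model the limit only caps how many locations are cached per lookup, so it affects how often the catalog is consulted rather than how fast each write is.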

-- Lars


