Posted to user@hbase.apache.org by hm...@tsmc.com on 2011/07/19 02:21:26 UTC

HBase reading performance


Hi there,
HBase read performance is fine most of the time,
but yesterday it took a very long time (over 500 seconds) to scan a few
records (it takes 2 seconds when HBase is behaving normally).
I did not know what to do at the time; the problem lasted about an hour,
and then HBase returned to normal.
So my question is: is there any log or tool I can use to find out what was
wrong with HBase at that time?
I am afraid that compaction could have that large an impact on read/write
performance. If so, is there any tuning I can do?
Thank you.

Our cluster has 20 machines:
Hadoop           20 machines
ZooKeeper        3
RegionServer     10
Hadoop version   hadoop-0.20.2-cdh3u0
HBase version    hbase-0.90.1-cdh3u0



Fleming Chiu(邱宏明)
Ext: 707-2260
Be Veg, Go Green, Save the Planet!
 ---------------------------------------------------------------------------
                               TSMC PROPERTY
 This email communication (and any attachments) is proprietary information
 for the sole use of its intended recipient. Any unauthorized review, use or
 distribution by anyone other than the intended recipient is strictly
 prohibited. If you are not the intended recipient, please notify the sender
 by replying to this email, and then delete this email and any copies of it
 immediately. Thank you.
 ---------------------------------------------------------------------------

RE: HBase reading performance

Posted by Michael Segel <mi...@hotmail.com>.

Are you doing scans or are you doing get() with a known key?

There's a big difference and scans are very expensive.

You also don't talk about your hardware. How much memory do you have, how many cores per node, and how do you have your m/r configured? (Even if you're not running an m/r job, you still have to account for it.)

There are so many things to look at....



Re: HBase reading performance

Posted by Doug Meil <do...@explorysmedical.com>.

By default, it's going to kick off major compactions every 24 hours.  You
probably want to bump that up to the max interval and manage your major
compactions yourself (i.e., kick them off exactly when you want them).


I need to make that a little more obvious in the book.
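
One way to act on the suggestion above is to lengthen the time-based major
compaction interval in hbase-site.xml on the region servers. The fragment
below is only a sketch: the property name hbase.hregion.majorcompaction and
its millisecond unit are HBase's own, but the 30-day value is an illustrative
assumption rather than a figure from this thread, so check the documentation
for your release before relying on it.

```xml
<!-- hbase-site.xml fragment (sketch, not a tested configuration). -->
<!-- Interval between automatic major compactions, in milliseconds. -->
<!-- The value below is 30 days and is an illustrative assumption. -->
<!-- Later HBase documentation describes 0 as disabling time-based major -->
<!-- compactions; verify whether that holds for 0.90.1 before using it. -->
<property>
  <name>hbase.hregion.majorcompaction</name>
  <value>2592000000</value>
</property>
```

With the automatic interval pushed out, a major compaction can be requested
during a quiet period from the HBase shell with major_compact 'mytable',
where mytable is a placeholder table name.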




Re: HBase reading performance

Posted by hm...@tsmc.com.


Hi,

Thanks for your help.

Our machines are Dell 1U servers with 4 cores and 12 GB of memory.
Most of our requests are scans of a table or of a secondary index table
(Transactional component).
Yes, we use the default compaction interval.
Sometimes bulk-load map/reduce jobs are running while some
clients are doing scans.
We wondered whether two cancelled m/r jobs were the root cause, but the
problem persisted after we killed those two processes.

We are now trying to dump Oracle data (about 1 TB) into HBase using bulk load,
and then run map/reduce to build the secondary index.
Maybe we can modify the compaction interval. Any ideas?
Thank you.

Fleming Chiu(邱宏明)
Ext: 707-2260
Be Veg, Go Green, Save the Planet!

From: Doug Meil <doug.meil@explorysmedical.com>
To: "user@hbase.apache.org" <us...@hbase.apache.org>
Date: 2011/07/19 10:06 AM
Subject: Re: HBase reading performance
Please respond to: user@hbase.apache.org

Hi there-

Just taking a stab at something...


http://hbase.apache.org/book.html#disable.splitting

... what is your compaction interval?  Is it the default?





Re: HBase reading performance

Posted by Doug Meil <do...@explorysmedical.com>.

Hi there-

Just taking a stab at something...


http://hbase.apache.org/book.html#disable.splitting

... what is your compaction interval?  Is it the default?



