Posted to user@hbase.apache.org by Jeff Whiting <je...@qualtrics.com> on 2012/02/17 23:38:58 UTC

Slow Get performance, is there a way to profile a Get?

Is there a way to profile a specific get request to see where the time is spent (e.g. checking 
memstore, reading from hdfs, etc.)?

We are running into a problem where a get after a delete goes really slow.  We have a row that has 
between 100 and 256 MB of data in it across a couple hundred columns.  After putting the data we can 
get the data out quickly (< 100 ms); a get on "info:name" takes ~0.05110 seconds according to 
the hbase shell.  We then delete the entire row (e.g. htable.delete(new Delete(rowkey))).  Most of the 
time, after deleting the row, the exact same get on "info:name" becomes significantly slower 
(1.9400 to 3.1840 seconds).  Putting data back into "info:name" still results in the same slow 
performance.  I was hoping to profile the get to see where the time is going and to see what we can 
do to tune how we are using hbase.
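
For reference, the sequence looks roughly like this (0.90-era HTable client API; the table name, 
row key, and value below are placeholders, not our real data, which is a couple hundred columns 
totalling 100-256 MB):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Delete;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class GetAfterDeleteTiming {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable");        // placeholder table name
        byte[] row = Bytes.toBytes("wide-row");            // placeholder row key
        byte[] family = Bytes.toBytes("info");
        byte[] qualifier = Bytes.toBytes("name");

        // Put a single small value (the real row holds a couple hundred columns).
        Put put = new Put(row);
        put.add(family, qualifier, Bytes.toBytes("some value"));
        table.put(put);

        timeGet(table, row, family, qualifier);            // fast: < 100 ms

        // Delete the entire row.
        table.delete(new Delete(row));

        timeGet(table, row, family, qualifier);            // slow afterwards: ~2-3 s

        table.close();
      }

      private static void timeGet(HTable table, byte[] row, byte[] family, byte[] qualifier)
          throws Exception {
        Get get = new Get(row);
        get.addColumn(family, qualifier);
        long start = System.nanoTime();
        Result result = table.get(get);
        System.out.printf("get took %.4f s, empty=%b%n",
            (System.nanoTime() - start) / 1e9, result.isEmpty());
      }
    }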

Thanks,
~Jeff


Re: Slow Get performance, is there a way to profile a Get?

Posted by Stack <st...@duboce.net>.
On Fri, Feb 17, 2012 at 2:38 PM, Jeff Whiting <je...@qualtrics.com> wrote:
> Is there a way to profile a specific get request to see where the time is
> spent (e.g. checking memstore, reading from hdfs, etc.)?
>
> We are running into a problem where a get after a delete goes really slow.
>  We have a row that has between 100 and 256 MB of data in it across a couple
> hundred columns.  After putting the data we can get the data out quickly
> (< 100 ms); a get on "info:name" takes ~0.05110 seconds according to the
> hbase shell.  We then delete the entire row (e.g. htable.delete(new
> Delete(rowkey))).  Most of the time, after deleting the row, the exact same
> get on "info:name" becomes significantly slower (1.9400 to 3.1840
> seconds).  Putting data back into "info:name" still results in the same slow
> performance.  I was hoping to profile the get to see where the time is going
> and to see what we can do to tune how we are using hbase.
>

Can you write yourself a unit test, Jeff, with data that looks like yours?
I'd think that a row delete would be OK, that you'd not pay too much
for it (as opposed to deleting each individual entry, which would
add a tombstone for every member of the row).
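
For illustration, the two delete shapes being contrasted look roughly like this (0.90-era client
API; the table handle, row key, and qualifier list are placeholders):

    // Whole-row delete: a single Delete for the row, as in the original report.
    table.delete(new Delete(rowkey));

    // Per-column deletes: one tombstone per column touched, which adds up when
    // the row spans a couple hundred columns.
    Delete perColumn = new Delete(rowkey);
    for (byte[] qualifier : qualifiers) {
      perColumn.deleteColumns(Bytes.toBytes("info"), qualifier);
    }
    table.delete(perColumn);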

St.Ack

Re: Slow Get performance, is there a way to profile a Get?

Posted by Jeff Whiting <je...@qualtrics.com>.
We are still on 0.90.x.  Since it is just a development cluster, we may upgrade to 0.92.x to get the 
slow query facility and see if the condition persists in the newer version.  If we don't upgrade, 
we'll try what you are suggesting and pull the content out and do some standalone testing to see if 
we can reproduce the slowness.

Thanks,
~Jeff

On 2/18/2012 11:32 AM, Stack wrote:
> On Fri, Feb 17, 2012 at 2:38 PM, Jeff Whiting <je...@qualtrics.com> wrote:
>> Is there a way to profile a specific get request to see where the time is
>> spent (e.g. checking memstore, reading from hdfs, etc.)?
>>
> In 0.92, there is a slow query facility that dumps out context when
> queries take longer than some configured time.  I presume you are on 0.90.x.
>
>> We are running into a problem where a get after a delete goes really slow.
>>  We have a row that has between 100 and 256 MB of data in it across a couple
>> hundred columns.  After putting the data we can get the data out quickly
>> (< 100 ms); a get on "info:name" takes ~0.05110 seconds according to the
>> hbase shell.  We then delete the entire row (e.g. htable.delete(new
>> Delete(rowkey))).  Most of the time, after deleting the row, the exact same
>> get on "info:name" becomes significantly slower (1.9400 to 3.1840
>> seconds).  Putting data back into "info:name" still results in the same slow
>> performance.  I was hoping to profile the get to see where the time is going
>> and to see what we can do to tune how we are using hbase.
>>
> If you flush the region -- you can do this from the shell -- is it
> still slow?  If so, the slowness is coming from accessing hfiles.  Try
> copying the region content out and rigging up a little harness to bring
> the region up in a context free of the running cluster.  See TestHRegion
> for sample code on how to stand up an HRegion instance.
>
> St.Ack

-- 
Jeff Whiting
Qualtrics Senior Software Engineer
jeffw@qualtrics.com


Re: Slow Get performance, is there a way to profile a Get?

Posted by Stack <st...@duboce.net>.
On Fri, Feb 17, 2012 at 2:38 PM, Jeff Whiting <je...@qualtrics.com> wrote:
> Is there a way to profile a specific get request to see where the time is
> spent (e.g. checking memstore, reading from hdfs, etc.)?
>

In 0.92, there is a slow query facility that dumps out context when
queries take longer than some configured time.  I presume you are on 0.90.x.
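
For example, a sketch of lowering that threshold -- the property name below is taken from the
0.92 slow query log and should be double-checked against your version's defaults:

    <!-- hbase-site.xml on the region servers: dump context for any operation
         that takes longer than 1000 ms. -->
    <property>
      <name>hbase.ipc.warn.response.time</name>
      <value>1000</value>
    </property>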

> We are running into a problem where a get after a delete goes really slow.
>  We have a row that has between 100 and 256 MB of data in it across a couple
> hundred columns.  After putting the data we can get the data out quickly
> (< 100 ms); a get on "info:name" takes ~0.05110 seconds according to the
> hbase shell.  We then delete the entire row (e.g. htable.delete(new
> Delete(rowkey))).  Most of the time, after deleting the row, the exact same
> get on "info:name" becomes significantly slower (1.9400 to 3.1840
> seconds).  Putting data back into "info:name" still results in the same slow
> performance.  I was hoping to profile the get to see where the time is going
> and to see what we can do to tune how we are using hbase.
>

If you flush the region -- you can do this from the shell -- is it
still slow?  If so, the slowness is coming from accessing hfiles.  Try
copying the region content out and rigging up a little harness to bring
the region up in a context free of the running cluster.  See TestHRegion
for sample code on how to stand up an HRegion instance.
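
A rough sketch of that kind of harness, modeled on the region-setup pattern in 0.90's TestHRegion
(the table name, family, local path, and row key below are placeholders, and exact constructors
and signatures differ between versions):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HRegionInfo;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.regionserver.HRegion;
    import org.apache.hadoop.hbase.util.Bytes;

    public class StandaloneRegionHarness {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        byte[] family = Bytes.toBytes("info");

        HTableDescriptor htd = new HTableDescriptor(Bytes.toBytes("t"));
        htd.addFamily(new HColumnDescriptor(family));
        HRegionInfo info = new HRegionInfo(htd, null, null, false);

        // Stands up a fresh region under a local path (the TestHRegion pattern).
        // To test against the copied-out region content, open the existing
        // region directory instead -- see HRegion in your version for the open path.
        HRegion region = HRegion.createHRegion(info, new Path("/tmp/standalone-region"), conf);

        Get get = new Get(Bytes.toBytes("wide-row"));
        get.addColumn(family, Bytes.toBytes("name"));
        long start = System.nanoTime();
        Result result = region.get(get, null);   // 0.90-style get(Get, lockid)
        System.out.printf("region get took %.4f s, empty=%b%n",
            (System.nanoTime() - start) / 1e9, result.isEmpty());

        region.close();
      }
    }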

St.Ack