Posted to user@hbase.apache.org by Jean-Daniel Cryans <jd...@apache.org> on 2011/05/02 21:14:46 UTC

Re: Performance test results

It might be the slow memstore issue... after inserting your dataset
issue a flush on your table in the shell, wait a few seconds, then
start reading. Someone else on the mailing list recently saw this type
of issue.
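For reference, a flush like the one described above can be issued from the HBase shell; 'mytable' below is a placeholder for your table name (the shell's flush command also accepts a full region name):

```
$ hbase shell
hbase> flush 'mytable'
# The flush is asynchronous -- wait a few seconds for it to complete
# before starting the read test.
```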

Regarding the block caching logging, here's what I see in my logs:

2011-05-02 10:05:38,718 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
eviction started; Attempting to free 303.77 MB of total=2.52 GB
2011-05-02 10:05:38,751 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
eviction completed; freed=303.8 MB, total=2.22 GB, single=755.67 MB,
multi=1.76 GB, memory=0 KB
2011-05-02 10:07:18,737 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=2.27
GB, free=718.03 MB, max=2.97 GB, blocks=36450, accesses=1056364760,
hits=939002423, hitRatio=88.88%%, cachingAccesses=967172747,
cachingHits=932095548, cachingHitsRatio=96.37%%, evictions=7801,
evicted=35040749, evictedPerRun=4491.8276367187

Keep in mind that currently we don't have a moving average for
the percentages, so over time those numbers are effectively set in stone...
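To illustrate why lifetime percentages stop moving, here is a sketch in Python (not HBase code) using the numbers from the LRU Stats line above: after a billion accesses, even millions of fresh misses barely change a cumulative ratio, which is exactly what a windowed moving average would catch.

```python
# Why a lifetime (cumulative) hit ratio "sets in stone": history dominates.
def cumulative_ratio(hits, accesses):
    return hits / accesses

# Numbers from the LRU Stats log line above.
hits, accesses = 939_002_423, 1_056_364_760
print(f"lifetime hit ratio: {cumulative_ratio(hits, accesses):.2%}")

# Simulate 10 million consecutive misses on top of that history;
# the lifetime ratio moves by less than one percentage point.
print(f"after 10M misses:   {cumulative_ratio(hits, accesses + 10_000_000):.2%}")
```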

The handler config is only good if you are using a ton of clients,
which doesn't seem to be the case (at least now).
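For reference, the handler count in question is set in hbase-site.xml; 128 is the value Eran reports trying elsewhere in the thread:

```xml
<!-- hbase-site.xml: number of RPC handler threads per region server.
     Per the note above, only worth raising with many concurrent clients. -->
<property>
  <name>hbase.regionserver.handler.count</name>
  <value>128</value>
</property>
```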

J-D

On Wed, Apr 27, 2011 at 6:42 AM, Eran Kutner <eran@> wrote:
> I must say the more I play with it the more baffled I am with the
> results. I ran the read test again today after not touching the
> cluster for a couple of days and now I'm getting the same high read
> numbers (10-11K reads/sec per server with some server reaching even
> 15K r/s) if I read 1, 10, 100 or even 1000 rows from every key space,
> however 5000 rows yielded a read rate of only 3K rows per second, even
> after a very long time. Just to be clear, I'm always randomly reading a
> single row in every request; the number of rows I'm talking about is the
> range of rows within each key space from which I'm randomly selecting
> my keys.
>
> St.Ack - to answer your questions:
>
> Writing from two machines increased the total number of writes per
> second by about 10%, maybe less. Reads showed a 15-20% increase when run
> from 2 machines.
>
> I already had most of the performance tuning recommendations
> implemented (garbage collection, using the new memory slabs feature,
> using LZO) when I ran my previous test. The only config I didn't have
> was "hbase.regionserver.handler.count"; I changed it to 128, or 16
> threads per core, which seems like a reasonable number, and tried
> inserting to the same key ranges as before. It didn't seem to make
> any difference in the total performance.
>
> My keys are about 15 bytes long.
>
> As for caching I can't find those cache hit ratio numbers in my logs,
> do they require a special parameter to enable them? That said, my
> calculations show that the entire data set I'm randomly reading should
> easily fit in the servers memory. Each row has 15 bytes of key + 128
> bytes of data + overhead - let's say 200 bytes. If I'm reading 5000
> rows from each key space and have a total of 100 key spaces that's
> 100*5000*200=100000000B=100MB. This is spread across 5 servers with
> 16GB of RAM, out of which 12.5GB are allocated to the region servers.
>
> -eran
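Eran's back-of-envelope working-set estimate above can be written out as follows (a sketch; the 200-byte figure is his rough guess including overhead, and the true in-cache size per cell is somewhat larger once family, qualifier, and timestamp are counted):

```python
# 100 key spaces x 5000 rows each, ~200 bytes per row
# (15-byte key + 128-byte value + estimated overhead).
key_spaces = 100
rows_per_space = 5000
bytes_per_row = 200

working_set = key_spaces * rows_per_space * bytes_per_row
print(working_set)           # 100000000 bytes
print(working_set / 10**6)   # 100.0 (MB)
```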

Re: Performance test results

Posted by Eran Kutner <er...@gigya.com>.
I tried flushing the table, not a specific region.

-eran



On Mon, May 9, 2011 at 20:03, Stack <st...@duboce.net> wrote:

> On Mon, May 9, 2011 at 9:31 AM, Eran Kutner <er...@gigya.com> wrote:
> > OK, I tried it, truncated the table and ran inserts for about a day. Now
> I
> > tried flushing the table but I get a "Region is not online" error,
> although
> > all the servers are up, no regions are in transition and as far as I can
> > tell all the regions seem up.
>
> You will get this message if you incorrectly specified the region name.
> Is that possible?
>
> > I can even read rows which are supposedly in
> > the offline region (I'm assuming the region name indicates the first key
> in
> > the region).
> >
>
> The middle portion of the regionname is indeed its startkey.  Scan
> '.META.' in shell and it will dump out info that includes start and
> end keys.
>
> St.Ack
>

Re: Performance test results

Posted by Stack <st...@duboce.net>.
On Mon, May 9, 2011 at 9:31 AM, Eran Kutner <er...@gigya.com> wrote:
> OK, I tried it, truncated the table and ran inserts for about a day. Now I
> tried flushing the table but I get a "Region is not online" error, although
> all the servers are up, no regions are in transition and as far as I can
> tell all the regions seem up.

You will get this message if you incorrectly specified the region name.
Is that possible?

> I can even read rows which are supposedly in
> the offline region (I'm assuming the region name indicates the first key in
> the region).
>

The middle portion of the regionname is indeed its startkey.  Scan
'.META.' in shell and it will dump out info that includes start and
end keys.
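A sketch of what Stack means (the region name below is hypothetical, and the exact format varies across HBase versions; newer releases append an encoded-name suffix):

```python
# A 0.90-era region name is roughly "tablename,startkey,regionid";
# the middle field is the startkey referred to above.
region_name = "mytable,user1234,1304452343432"

table, startkey, region_id = region_name.split(",", 2)
print(table)     # mytable
print(startkey)  # user1234
```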

St.Ack

Re: Performance test results

Posted by Eran Kutner <er...@gigya.com>.
OK, I tried it, truncated the table and ran inserts for about a day. Now I
tried flushing the table but I get a "Region is not online" error, although
all the servers are up, no regions are in transition and as far as I can
tell all the regions seem up. I can even read rows which are supposedly in
the offline region (I'm assuming the region name indicates the first key in
the region).

-eran



On Wed, May 4, 2011 at 15:20, Eran Kutner <er...@gigya.com> wrote:

> J-D,
> I'll try what you suggest, but it is worth pointing out that my data set
> has over 300M rows; in my read test, however, I am randomly reading out of
> a subset that contains only 0.5M rows (5000 rows in each of the 100 key
> ranges in the table).
>
> -eran
>
>
>
> On Tue, May 3, 2011 at 23:29, Jean-Daniel Cryans <jd...@apache.org> wrote:
>
>> On Tue, May 3, 2011 at 6:20 AM, Eran Kutner <er...@gigya.com> wrote:
>> > Flushing, at least when I try it now, long after I stopped writing,
>> doesn't
>> > seem to have any effect.
>>
>> Bummer.
>>
>> >
>> > In my log I see this:
>> > 2011-05-03 08:57:55,384 DEBUG
>> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=3.39
>> GB,
>> > free=897.87 MB, max=4.27 GB, blocks=54637, accesses=89411811,
>> hits=75769916,
>> > hitRatio=84.74%%, cachingAccesses=83656318, cachingHits=75714473,
>> > cachingHitsRatio=90.50%%, evictions=1135, evicted=7887205,
>> > evictedPerRun=6949.0791015625
>> >
>> > and every 30 seconds or so something like this:
>> > 2011-05-03 08:58:07,900 DEBUG
>> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
>> > started; Attempting to free 436.92 MB of total=3.63 GB
>> > 2011-05-03 08:58:07,947 DEBUG
>> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
>> > completed; freed=436.95 MB, total=3.2 GB, single=931.65 MB, multi=2.68
>> GB,
>> > memory=3.69 KB
>> >
>> > Now, if the entire working set I'm reading is 100MB in size, why would
>> it
>> > have to evict 436MB just to get it filled back in 30 seconds?
>>
>> I was about to ask the same question... from what I can tell from
>> this log, it seems that your working dataset is much larger than 3GB
>> (the fact that it's evicting means it could be a lot more) and that's
>> only on that region server.
>>
>> The first reason that comes to mind for why it would be so much bigger is
>> that you would have uploaded your dataset more than once and since
>> HBase keeps versions of the data, it could accumulate. That doesn't
>> explain how it would grow into GBs since by default a family only
>> keeps 3 versions... unless you set that higher than the default or you
>> uploaded the same data tens of times within 24 hours and the major
>> compactions didn't kick in.
>>
>> In any case, it would be interesting if you:
>>
>>  - truncate the table
>>  - re-import the data
>>  - force a flush
>>  - wait a bit until the flushes are done (should take 2-3 seconds if
>> your dataset is really 100MB)
>>  - do a "hadoop dfs -dus" on the table's directory (should be under /hbase)
>>  - if the number is way out of whack, review how you are inserting
>> your data. Either way, please report back.
>>
>> >
>> > Also, what is a good value for hfile.block.cache.size? (I have it now
>> > at .35, but with 12.5GB of RAM available for the region servers it seems
>> > I should be able to set it much higher.)
>>
>> Depends, you also have to account for the MemStores which by default
>> can use up to 40% of the heap
>> (hbase.regionserver.global.memstore.upperLimit) leaving currently for
>> you only 100-40-35=25% of the heap to do stuff like serving requests,
>> compacting, flushing, etc. It's hard to give a good number for what
>> should be left to the rest of HBase tho...
>>
>
>

Re: Performance test results

Posted by Eran Kutner <er...@gigya.com>.
J-D,
I'll try what you suggest but it is worth pointing out that my data set has
over 300M rows, however in my read test I am random reading out of a subset
that contains only 0.5M rows (5000 rows in each of the 100 key ranges in the
table).

-eran



On Tue, May 3, 2011 at 23:29, Jean-Daniel Cryans <jd...@apache.org> wrote:

> On Tue, May 3, 2011 at 6:20 AM, Eran Kutner <er...@gigya.com> wrote:
> > Flushing, at least when I try it now, long after I stopped writing,
> doesn't
> > seem to have any effect.
>
> Bummer.
>
> >
> > In my log I see this:
> > 2011-05-03 08:57:55,384 DEBUG
> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=3.39 GB,
> > free=897.87 MB, max=4.27 GB, blocks=54637, accesses=89411811,
> hits=75769916,
> > hitRatio=84.74%%, cachingAccesses=83656318, cachingHits=75714473,
> > cachingHitsRatio=90.50%%, evictions=1135, evicted=7887205,
> > evictedPerRun=6949.0791015625
> >
> > and every 30 seconds or so something like this:
> > 2011-05-03 08:58:07,900 DEBUG
> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> > started; Attempting to free 436.92 MB of total=3.63 GB
> > 2011-05-03 08:58:07,947 DEBUG
> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> > completed; freed=436.95 MB, total=3.2 GB, single=931.65 MB, multi=2.68
> GB,
> > memory=3.69 KB
> >
> > Now, if the entire working set I'm reading is 100MB in size, why would it
> > have to evict 436MB just to get it filled back in 30 seconds?
>
> I was about to ask the same question... from what I can tell from
> this log, it seems that your working dataset is much larger than 3GB
> (the fact that it's evicting means it could be a lot more) and that's
> only on that region server.
>
> The first reason that comes to mind for why it would be so much bigger is
> that you would have uploaded your dataset more than once and since
> HBase keeps versions of the data, it could accumulate. That doesn't
> explain how it would grow into GBs since by default a family only
> keeps 3 versions... unless you set that higher than the default or you
> uploaded the same data tens of times within 24 hours and the major
> compactions didn't kick in.
>
> In any case, it would be interesting if you:
>
>  - truncate the table
>  - re-import the data
>  - force a flush
>  - wait a bit until the flushes are done (should take 2-3 seconds if
> your dataset is really 100MB)
>  - do a "hadoop dfs -dus" on the table's directory (should be under /hbase)
>  - if the number is way out of whack, review how you are inserting
> your data. Either way, please report back.
>
> >
> > Also, what is a good value for hfile.block.cache.size? (I have it now
> > at .35, but with 12.5GB of RAM available for the region servers it seems
> > I should be able to set it much higher.)
>
> Depends, you also have to account for the MemStores which by default
> can use up to 40% of the heap
> (hbase.regionserver.global.memstore.upperLimit) leaving currently for
> you only 100-40-35=25% of the heap to do stuff like serving requests,
> compacting, flushing, etc. It's hard to give a good number for what
> should be left to the rest of HBase tho...
>

Re: Performance test results

Posted by Jean-Daniel Cryans <jd...@apache.org>.
On Tue, May 3, 2011 at 6:20 AM, Eran Kutner <er...@gigya.com> wrote:
> Flushing, at least when I try it now, long after I stopped writing, doesn't
> seem to have any effect.

Bummer.

>
> In my log I see this:
> 2011-05-03 08:57:55,384 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=3.39 GB,
> free=897.87 MB, max=4.27 GB, blocks=54637, accesses=89411811, hits=75769916,
> hitRatio=84.74%%, cachingAccesses=83656318, cachingHits=75714473,
> cachingHitsRatio=90.50%%, evictions=1135, evicted=7887205,
> evictedPerRun=6949.0791015625
>
> and every 30 seconds or so something like this:
> 2011-05-03 08:58:07,900 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> started; Attempting to free 436.92 MB of total=3.63 GB
> 2011-05-03 08:58:07,947 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> completed; freed=436.95 MB, total=3.2 GB, single=931.65 MB, multi=2.68 GB,
> memory=3.69 KB
>
> Now, if the entire working set I'm reading is 100MB in size, why would it
> have to evict 436MB just to get it filled back in 30 seconds?

I was about to ask the same question... from what I can tell from
this log, it seems that your working dataset is much larger than 3GB
(the fact that it's evicting means it could be a lot more) and that's
only on that region server.

The first reason that comes to mind for why it would be so much bigger is
that you would have uploaded your dataset more than once and since
HBase keeps versions of the data, it could accumulate. That doesn't
explain how it would grow into GBs since by default a family only
keeps 3 versions... unless you set that higher than the default or you
uploaded the same data tens of times within 24 hours and the major
compactions didn't kick in.

In any case, it would be interesting if you:

 - truncate the table
 - re-import the data
 - force a flush
 - wait a bit until the flushes are done (should take 2-3 seconds if
your dataset is really 100MB)
 - do a "hadoop dfs -dus" on the table's directory (should be under /hbase)
 - if the number is way out of whack, review how you are inserting
your data. Either way, please report back.
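For reference, the size check in the last two steps looks like this ('mytable' is a placeholder; -dus prints the path followed by its total size in bytes):

```
$ hadoop dfs -dus /hbase/mytable
# Prints something like: <path>    <total size in bytes>
# If the dataset is really ~100MB, a much larger number here suggests
# accumulated versions or repeated uploads.
```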

>
> Also, what is a good value for hfile.block.cache.size? (I have it now at
> .35, but with 12.5GB of RAM available for the region servers it seems I
> should be able to set it much higher.)

Depends; you also have to account for the MemStores, which by default
can use up to 40% of the heap
(hbase.regionserver.global.memstore.upperLimit), currently leaving you
only 100-40-35=25% of the heap to do stuff like serving requests,
compacting, flushing, etc. It's hard to give a good number for what
should be left to the rest of HBase, though...
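Plugging in Eran's numbers makes the budget concrete (a sketch assuming a 12.5GB heap, block cache at 0.35, and the default MemStore upper limit of 0.40):

```python
# Rough heap budget for one region server.
heap_gb = 12.5

block_cache_gb = 0.35 * heap_gb   # hfile.block.cache.size ceiling, ~4.4 GB
memstore_gb = 0.40 * heap_gb      # global MemStore upper limit, ~5.0 GB
everything_else_gb = heap_gb - block_cache_gb - memstore_gb  # ~3.1 GB

print(f"block cache:     {block_cache_gb:.2f} GB")
print(f"memstores:       {memstore_gb:.2f} GB")
print(f"everything else: {everything_else_gb:.2f} GB (25% of heap)")
```

Raising the block cache fraction eats directly into that last 25%, which is why it cannot simply be pushed much higher without also lowering the MemStore limit.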

Re: Performance test results

Posted by Eran Kutner <er...@gigya.com>.
Flushing, at least when I try it now, long after I stopped writing, doesn't
seem to have any effect.

In my log I see this:
2011-05-03 08:57:55,384 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=3.39 GB,
free=897.87 MB, max=4.27 GB, blocks=54637, accesses=89411811, hits=75769916,
hitRatio=84.74%%, cachingAccesses=83656318, cachingHits=75714473,
cachingHitsRatio=90.50%%, evictions=1135, evicted=7887205,
evictedPerRun=6949.0791015625

and every 30 seconds or so something like this:
2011-05-03 08:58:07,900 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
started; Attempting to free 436.92 MB of total=3.63 GB
2011-05-03 08:58:07,947 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
completed; freed=436.95 MB, total=3.2 GB, single=931.65 MB, multi=2.68 GB,
memory=3.69 KB

Now, if the entire working set I'm reading is 100MB in size, why would it
have to evict 436MB just to get it filled back in 30 seconds?

Also, what is a good value for hfile.block.cache.size? I have it now at
.35, but with 12.5GB of RAM available for the region servers it seems I
should be able to set it much higher.

-eran




On Mon, May 2, 2011 at 22:14, Jean-Daniel Cryans <jd...@apache.org>
wrote:
> It might be the slow memstore issue... after inserting your dataset
> issue a flush on your table in the shell, wait a few seconds, then
> start reading. Someone else on the mailing list recently saw this type
> of issue.
>
> Regarding the block caching logging, here's what I see in my logs:
>
> 2011-05-02 10:05:38,718 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> eviction started; Attempting to free 303.77 MB of total=2.52 GB
> 2011-05-02 10:05:38,751 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> eviction completed; freed=303.8 MB, total=2.22 GB, single=755.67 MB,
> multi=1.76 GB, memory=0 KB
> 2011-05-02 10:07:18,737 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=2.27
> GB, free=718.03 MB, max=2.97 GB, blocks=36450, accesses=1056364760,
> hits=939002423, hitRatio=88.88%%, cachingAccesses=967172747,
> cachingHits=932095548, cachingHitsRatio=96.37%%, evictions=7801,
> evicted=35040749, evictedPerRun=4491.8276367187
>
> Keep in mind that currently we don't have a moving average for
> the percentages, so over time those numbers are effectively set in stone...
>
> The handler config is only good if you are using a ton of clients,
> which doesn't seem to be the case (at least now).
>
> J-D
>
> On Wed, Apr 27, 2011 at 6:42 AM, Eran Kutner <eran@> wrote:
>> I must say the more I play with it the more baffled I am with the
>> results. I ran the read test again today after not touching the
>> cluster for a couple of days and now I'm getting the same high read
>> numbers (10-11K reads/sec per server with some server reaching even
>> 15K r/s) if I read 1, 10, 100 or even 1000 rows from every key space,
>> however 5000 rows yielded a read rate of only 3K rows per second, even
>> after a very long time. Just to be clear, I'm always randomly reading a
>> single row in every request; the number of rows I'm talking about is the
>> range of rows within each key space from which I'm randomly selecting
>> my keys.
>>
>> St.Ack - to answer your questions:
>>
>> Writing from two machines increased the total number of writes per
>> second by about 10%, maybe less. Reads showed a 15-20% increase when run
>> from 2 machines.
>>
>> I already had most of the performance tuning recommendations
>> implemented (garbage collection, using the new memory slabs feature,
>> using LZO) when I ran my previous test. The only config I didn't have
>> was "hbase.regionserver.handler.count"; I changed it to 128, or 16
>> threads per core, which seems like a reasonable number, and tried
>> inserting to the same key ranges as before. It didn't seem to make
>> any difference in the total performance.
>>
>> My keys are about 15 bytes long.
>>
>> As for caching I can't find those cache hit ratio numbers in my logs,
>> do they require a special parameter to enable them? That said, my
>> calculations show that the entire data set I'm randomly reading should
>> easily fit in the servers memory. Each row has 15 bytes of key + 128
>> bytes of data + overhead - let's say 200 bytes. If I'm reading 5000
>> rows from each key space and have a total of 100 key spaces that's
>> 100*5000*200=100000000B=100MB. This is spread across 5 servers with
>> 16GB of RAM, out of which 12.5GB are allocated to the region servers.
>>
>> -eran
>