Posted to dev@hbase.apache.org by Jean-Daniel Cryans <jd...@apache.org> on 2013/07/01 18:55:37 UTC

Re: Poor HBase random read performance

You might also be interested in this benchmark I ran 3 months ago:
https://docs.google.com/spreadsheet/pub?key=0Ao87IrzZJSaydENaem5USWg4TlRKcHl0dEtTS2NBOUE&output=html

J-D

On Sat, Jun 29, 2013 at 12:13 PM, Varun Sharma <va...@pinterest.com> wrote:
> Hi,
>
> I was doing some tests on how good HBase random reads are. The setup
> consists of a one-node cluster with dfs.replication set to 1. Short-circuit
> local reads and HBase checksums are enabled. The data set is small enough
> to be largely cached in the filesystem cache - 10G on a 60G machine.
>
> The client sends out multi-get operations in batches of 10, and I try to
> measure throughput.
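>
> Roughly, the client loop looks like this (a sketch against the 0.94 client
> API, single-threaded here for brevity; the table name and the row key space
> are placeholders, not my actual schema):
>
>   import java.util.ArrayList;
>   import java.util.List;
>   import java.util.Random;
>   import org.apache.hadoop.conf.Configuration;
>   import org.apache.hadoop.hbase.HBaseConfiguration;
>   import org.apache.hadoop.hbase.client.Get;
>   import org.apache.hadoop.hbase.client.HTable;
>   import org.apache.hadoop.hbase.client.Result;
>   import org.apache.hadoop.hbase.util.Bytes;
>
>   public class MultiGetBench {
>     public static void main(String[] args) throws Exception {
>       Configuration conf = HBaseConfiguration.create();
>       HTable table = new HTable(conf, "test_table");
>       Random rand = new Random();
>       long ops = 0;
>       long deadline = System.currentTimeMillis() + 120 * 1000L;
>       while (System.currentTimeMillis() < deadline) {
>         // Build one batch of 10 Gets on random row keys.
>         List<Get> batch = new ArrayList<Get>(10);
>         for (int i = 0; i < 10; i++) {
>           batch.add(new Get(Bytes.toBytes("row" + rand.nextInt(10000000))));
>         }
>         // One multi-get RPC per batch; each Result counts as one read op.
>         Result[] results = table.get(batch);
>         ops += results.length;
>       }
>       System.out.println("read ops in 120s: " + ops);
>       table.close();
>     }
>   }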
>
> Test #1
>
> All Data was cached in the block cache.
>
> Test Time = 120 seconds
> Num Read Ops = 12M
>
> Throughput = 100K per second
>
> Test #2
>
> I disable the block cache, but all the data is still in the file system
> cache. I verify this by making sure that IOPS on the disk drive stay at 0
> during the test. I run the same test with batched ops.
>
> Test Time = 120 seconds
> Num Read Ops = 0.6M
> Throughput = 5K per second
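>
> For reference, a rough sketch of one way to turn off data block caching,
> using the 0.94 admin API (table and family names here are placeholders;
> the cache can also be disabled globally by setting hfile.block.cache.size
> to 0):
>
>   import org.apache.hadoop.conf.Configuration;
>   import org.apache.hadoop.hbase.HBaseConfiguration;
>   import org.apache.hadoop.hbase.HColumnDescriptor;
>   import org.apache.hadoop.hbase.HTableDescriptor;
>   import org.apache.hadoop.hbase.client.HBaseAdmin;
>   import org.apache.hadoop.hbase.util.Bytes;
>
>   public class DisableBlockCache {
>     public static void main(String[] args) throws Exception {
>       Configuration conf = HBaseConfiguration.create();
>       HBaseAdmin admin = new HBaseAdmin(conf);
>       byte[] tableName = Bytes.toBytes("test_table");
>       HTableDescriptor desc = admin.getTableDescriptor(tableName);
>       HColumnDescriptor family = desc.getFamily(Bytes.toBytes("d"));
>       // With BLOCKCACHE=false on the family, data blocks are read through
>       // HDFS (and the OS page cache) on every get instead of being served
>       // from the LRU block cache.
>       family.setBlockCacheEnabled(false);
>       admin.disableTable(tableName);
>       admin.modifyColumn(tableName, family);
>       admin.enableTable(tableName);
>       admin.close();
>     }
>   }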
>
> Test #3
>
> I saw that all the threads were now stuck in idLock.lockEntry(), so I run
> this test with that lock disabled and the block cache disabled.
>
> Test Time = 120 seconds
> Num Read Ops = 1.2M
> Throughput = 10K per second
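>
> For context, idLock here is, as far as I can tell, the per-offset lock the
> HFile reader takes so that two handlers hitting the same block don't both
> read and cache it. A rough sketch of the pattern (not the literal 0.94
> code; loadBlock() stands in for the cache-check-then-HDFS-read logic):
>
>   import java.io.IOException;
>   import org.apache.hadoop.hbase.util.IdLock;
>
>   public class OffsetLockSketch {
>     private final IdLock offsetLock = new IdLock();
>
>     byte[] readBlock(long offset) throws IOException {
>       // All readers of the same block offset serialize here.
>       IdLock.Entry lockEntry = offsetLock.lockEntry(offset);
>       try {
>         // Under the lock: re-check the block cache and, on a miss, read
>         // the block from HDFS and insert it. With the block cache disabled
>         // every get still pays for taking and releasing this lock.
>         return loadBlock(offset);
>       } finally {
>         offsetLock.releaseLockEntry(lockEntry);
>       }
>     }
>
>     // Placeholder for the real cache/HDFS read path.
>     private byte[] loadBlock(long offset) {
>       return new byte[0];
>     }
>   }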
>
> Test #4
>
> I re-enable the block cache and this time hack HBase to cache only index
> and bloom blocks; data blocks still come from the file system cache.
>
> Test Time = 120 seconds
> Num Read Ops = 1.6M
> Throughput = 13K per second
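>
> The hack itself is small: at the point where a block that was just read
> from HDFS would be cached, check its category and only keep index and
> bloom blocks. A sketch of the idea (not the actual patch; the helper name
> is made up):
>
>   import org.apache.hadoop.hbase.io.hfile.BlockType.BlockCategory;
>
>   public class CacheOnlyIndexAndBloom {
>     // Decide whether a freshly read block should go into the LRU cache.
>     // Data blocks are skipped, so they rely on the OS page cache instead.
>     static boolean shouldCacheOnRead(BlockCategory category) {
>       return category == BlockCategory.INDEX
>           || category == BlockCategory.BLOOM;
>     }
>   }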
>
> So I wonder why there is such a massive drop in throughput. I know that the
> HDFS code adds a lot of overhead, but this seems pretty high to me. I am
> using HBase 0.94.7 and CDH 4.2.0.
>
> Thanks
> Varun

Re: Poor HBase random read performance

Posted by Varun Sharma <va...@pinterest.com>.
Yeah, that is a very interesting benchmark. I ran mine on hi1.4xlarge, which
has almost 4X more CPU than m1.xlarge.

In your tests, block cache performance looks close to SCR + the OS page
cache from a latency standpoint. I did not find throughput numbers in your
benchmark.

So I just lowered the block size further to 4K to see if there are more
gains, but I found that throughput remains the same at roughly 100K - maybe
slightly higher. However, when I re-enable the block cache to cache all the
data blocks too, the throughput jumps to 250K+.
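
For reference, the 4K block size is just a column family setting followed by
a rewrite of the store files; a rough sketch with the 0.94 admin API (table
and family names are placeholders):

  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.HColumnDescriptor;
  import org.apache.hadoop.hbase.client.HBaseAdmin;
  import org.apache.hadoop.hbase.util.Bytes;

  public class ShrinkBlockSize {
    public static void main(String[] args) throws Exception {
      HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
      byte[] tableName = Bytes.toBytes("test_table");
      HColumnDescriptor family =
          admin.getTableDescriptor(tableName).getFamily(Bytes.toBytes("d"));
      // Drop the HFile block size from the 64 KB default to 4 KB so each
      // random read touches (and checksums) less data per block.
      family.setBlocksize(4 * 1024);
      admin.disableTable(tableName);
      admin.modifyColumn(tableName, family);
      admin.enableTable(tableName);
      // Major compact so the store files are actually rewritten with 4 KB
      // blocks; the setting only applies to newly written HFiles.
      admin.majorCompact(tableName);
      admin.close();
    }
  }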

I hope to generate some more data for this table so that I can test with
some real data on SSDs.

Thanks
Varun


On Mon, Jul 1, 2013 at 9:55 AM, Jean-Daniel Cryans <jd...@apache.org> wrote:

> You might also be interested in this benchmark I ran 3 months ago:
>
> https://docs.google.com/spreadsheet/pub?key=0Ao87IrzZJSaydENaem5USWg4TlRKcHl0dEtTS2NBOUE&output=html
>
> J-D