Posted to user@hbase.apache.org by Schubert Zhang <zs...@gmail.com> on 2009/08/17 23:02:53 UTC

HBase-0.20.0 Performance Evaluation

We have just done a performance evaluation of HBase-0.20.0.
See:
http://docloud.blogspot.com/2009/08/hbase-0200-performance-evaluation.html

Re: HBase-0.20.0 Performance Evaluation

Posted by Schubert Zhang <zs...@gmail.com>.
The link was corrupt; please use this one:
http://issues.apache.org/jira/browse/HBASE-1778

Re: HBase-0.20.0 Performance Evaluation

Posted by Schubert Zhang <zs...@gmail.com>.
The patch of the performance evaluation for 0.20.0 is available at
http://issues.apache.org/jira/browse/HBASE-1778.
Based on the comments from stack, JD, JG, Ryan, and others, we generated a new
test report; you can also download it from the above JIRA link. Please have a
review and give your comments.
Schubert

Re: HBase-0.20.0 Performance Evaluation

Posted by Ryan Rawson <ry...@gmail.com>.
Sounds like you are running into RAM issues. Remember, 4GB of RAM is
what I have in my consumer MacBook (white).  I would personally like
to outfit machines with 2-4GB per core.

Jgray is right on here: the Java CMS GC trades time for memory, and
thus it requires more RAM to keep GC pauses low. If you are allocating
half your RAM to HBase, then you have precious little left for the datanode
and any buffer cache you might need.

As one option, try not running datanodes and regionservers on the same
machines. You could buy different machine configurations, one with
large disks, one with less. Or go with modern 8-core, 16GB-RAM machines.

good luck,
-ryan

Re: HBase-0.20.0 Performance Evaluation

Posted by Schubert Zhang <zs...@gmail.com>.
@JG and @stack

Helpful!

Running the RS with 2GB is because we have a heterogeneous node (slave-5),
which has only 4GB of RAM.
I have temporarily removed this node from the cluster, and now we get ~2ms
random reads. It is fine now.

Thank you very much.

Re: HBase-0.20.0 Performance Evaluation

Posted by Jonathan Gray <jl...@streamy.com>.
As stack says, but more strongly: if you have 4+ cores, then you
definitely want to turn off incremental mode.  Is there a reason you're
running your RS with 2GB given that you have 8GB of total memory?  I'd
up it to 4GB; after I did that on our production cluster, things ran much
more smoothly with CMS.

I'd also drop your swappiness to 0; I've not heard a good argument for
when we would ever want to swap on an HBase/Hadoop cluster.  If you end up
swapping, you're going to start seeing some weird behavior and very slow
GC runs, and likely regionservers getting killed off as ZK times out and
assumes the RS is dead.
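
For reference, a minimal sketch of this advice as configuration (the heap size,
flag set, and paths below are illustrative, assuming 0.20-era hbase-env.sh
conventions; they are not settings posted in this thread):

  # conf/hbase-env.sh: 4GB regionserver heap, CMS without incremental mode
  export HBASE_HEAPSIZE=4000
  export HBASE_OPTS="-XX:+UseConcMarkSweepGC"

  # discourage swapping on HBase/Hadoop nodes (persist in /etc/sysctl.conf)
  sysctl -w vm.swappiness=0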

Re: HBase-0.20.0 Performance Evaluation

Posted by stack <st...@duboce.net>.
"-XX:+CMSIncrementalMode" is our default, but it's for nodes with 2 or fewer
CPUs according to
http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html.  You
might try without this.


> But I am surprised that node (5), which has 8 CPU cores, 4GB RAM, and 6
> SATA disks in RAID1, has a problem.
>
> avg-cpu:  %user   %nice  %system  %iowait  %steal   %idle
>            7.46    0.00     3.28    23.11    0.00   66.15
>
> Device:  rrqm/s  wrqm/s     r/s    w/s    rsec/s   wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
> sda       84.83   25.12  485.57   2.49  53649.75   220.90    110.38      9.20  18.85   2.04  99.53
> dm-0       0.00    0.00    0.00  25.12      0.00   201.00      8.00      0.01   0.27   0.01   0.02
> dm-1       0.00    0.00  570.90   2.49  53655.72    19.90     93.61     10.74  18.72   1.74  99.53
>
> It seems the disk I/O is very busy.
>

Yeah.  What's writing?  Can you tell?  Is it the NN or a ZK node?

St.Ack

Re: HBase-0.20.0 Performance Evaluation

Posted by Schubert Zhang <zs...@gmail.com>.
My GC options:  -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode
Yes, it is >8ms now per random read.

In my previous report, the random-read evaluation was started soon after the
sequential-write (from empty) evaluation. The result in that case is
still good.

We have Ganglia, but I cannot access it since I am accessing my cluster
remotely.  And swap is set to vm.swappiness=20 to minimize swapping.

I just checked things more carefully with commands (top, free, iostat, etc.)
and found the following issue:

On my slave nodes (1-4), which have 4 CPU cores, 8GB RAM, and 2 SATA disks, the
memory and regionserver heap are both adequate, and the CPU is not very busy.
Everything seems OK.

 PID   USER      PR  NI  VIRT   RES   SHR   S  %CPU  %MEM     TIME+    COMMAND
 4398  schubert  23   0  2488m  2.0g  9.9m  S    38  26.2   59:06.21   java (the region server)


But I am surprised that node (5), which has 8 CPU cores, 4GB RAM, and 6 SATA
disks in RAID1, has a problem.

avg-cpu:  %user   %nice  %system  %iowait  %steal   %idle
           7.46    0.00     3.28    23.11    0.00   66.15

Device:  rrqm/s  wrqm/s     r/s    w/s    rsec/s   wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda       84.83   25.12  485.57   2.49  53649.75   220.90    110.38      9.20  18.85   2.04  99.53
dm-0       0.00    0.00    0.00  25.12      0.00   201.00      8.00      0.01   0.27   0.01   0.02
dm-1       0.00    0.00  570.90   2.49  53655.72    19.90     93.61     10.74  18.72   1.74  99.53

It seems the disk I/O is very busy.
And top:
  PID   USER      PR  NI  VIRT   RES  SHR  S  %CPU  %MEM     TIME+    COMMAND
21037  schubert  20   0  2628m  1.6g  10m  S    48  41.3   71:24.74   java

I will check more when I go to the office in the morning.
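
For reference, the kind of commands behind the numbers above (illustrative
invocations; the interval and sample count are arbitrary):

  iostat -x 2 5    # extended per-device statistics, 2s interval, 5 samples
  free -m          # memory and swap usage in MB
  top              # per-process CPU/memory view, as quoted above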

Re: HBase-0.20.0 Performance Evaluation

Posted by stack <st...@duboce.net>.
What do you have for GC config, Schubert?  Is it now 8ms per random read?
St.Ack

Re: HBase-0.20.0 Performance Evaluation

Posted by Jonathan Gray <jl...@streamy.com>.
Schubert,

I can't think of any reason your random reads would get slower after 
inserting more data, besides GC issues.

Do you have GC logging and JVM metrics logging turned on?  I would 
inspect those to see if you have any long-running GC pauses, or just 
lots and lots of GC going on.
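
A minimal sketch of turning GC logging on for the regionserver (standard
HotSpot flags of this era; the log path is an arbitrary example), via
conf/hbase-env.sh:

  export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails \
      -XX:+PrintGCTimeStamps -Xloggc:/tmp/hbase-regionserver-gc.log"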

If I recall, you are running on 4GB nodes with a 2GB RS heap, and cohosted
DataNodes and TaskTrackers.  We ran for a long time on a similar setup,
but once we moved to 0.20 (and to the CMS garbage collector), we really
needed to add more memory to the nodes and increase the RS heap to 4 or 5GB.
The CMS GC is less memory-efficient, but given sufficient resources it is
much better for overall performance/throughput.

Also, do you have Ganglia set up?  Are you seeing swapping on your RS
nodes?  Is there high IO-wait CPU usage?

JG

Schubert Zhang wrote:
> Addition.
> Only random-reads have become very slow; scans and sequential-reads are OK.
> 
> 
> On Tue, Aug 18, 2009 at 6:02 PM, Schubert Zhang <zs...@gmail.com> wrote:
> 
>> stack and J-G, thank you very much for your helpful comments.
>>
>> But now, we have found a critical issue with random reads.
>> I used sequential-writes to insert 5GB of data into our HBase table from
>> empty, and ~30 regions are generated. Then the random-reads takes about 30
>> minutes to complete. And then, I run the sequentical-writes again. Thus,
>> another version of each cell are inserted, thus ~60 regions are generated.
>> But, we I ran the random-reads again to this table, it always take long time
>> (more than 2 hours).
>>
>> I checked the heap usage and other metrics, but did not find the reason.
>>
>> Below is the status of one region server:
>> request=0.0, regions=13, stores=13, storefiles=14, storefileIndexSize=2,
>> memstoreSize=0, usedHeap=1126, maxHeap=1991, blockCacheSize=338001080,
>> blockCacheFree=79686056, blockCacheCount=5014, blockCacheHitRatio=55
>>
>> Schubert
>>
>>
>> On Tue, Aug 18, 2009 at 5:02 AM, Schubert Zhang <zs...@gmail.com> wrote:
>>
>>> We have just done a Performance Evaluation on HBase-0.20.0.
>>> Refers to:
>>> http://docloud.blogspot.com/2009/08/hbase-0200-performance-evaluation.html
>>>
>>
> 

Re: HBase-0.20.0 randomRead

Posted by Jonathan Gray <jl...@streamy.com>.
Yup, what you experienced is a known issue with RC1 that is fixed in RC2.

Re: HBase-0.20.0 randomRead

Posted by "Murali Krishna. P" <mu...@yahoo.com>.
Hi Jonathan,
    I am using RC1, and the issue happens when I upload as a mapred job. With --nomapred, it worked fine. Currently I am uploading 100 million rows; I will definitely upgrade to RC2 once it finishes.

Thanks,
Murali Krishna
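
For context, a sketch of the two upload modes being compared (an illustrative
invocation of the stock 0.20-era PE driver; the client count is arbitrary):

  # clients run in-process, no MapReduce (the mode that worked fine)
  bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred sequentialWrite 2

  # clients run as a MapReduce job (the mode that hit the RC1 bug)
  bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 2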

Re: HBase-0.20.0 randomRead

Posted by Jonathan Gray <jl...@streamy.com>.
Murali,

Which version of HBase are you running?

There was a fix that was just committed a few days ago for a bug that 
manifested as null/empty HRI.

It has been fixed in RC2, so I recommend upgrading to that and trying 
your upload again.

JG

Re: HBase-0.20.0 randomRead

Posted by "Murali Krishna. P" <mu...@yahoo.com>.
Thanks for the clarification. I changed the ROW_LENGTH as you suggested and used the SequenceWrite + randomRead combination to benchmark. The initial result was impressive, even though I would like to see the last column improved.
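
A sketch of what that change amounts to in the 0.20-era PE source (the original
1000-byte default is an assumption, not a value quoted in this thread):

  // PerformanceEvaluation.java: size of the value written per row
  private static final int ROW_LENGTH = 4096;  // assumed previous default: 1000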

randomRead (average latency per row)
====================================
totalrows \ nclients (--rows=10000)      5          50         100        1000
800k                                   0.4ms      3.5ms      6.5ms      55ms
2.3m                                   0.45ms     3.5ms      6.6ms      56ms

The only change in the config was that the handler count was increased to 1000. I think there are some parameters that can be tweaked to improve this further?

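A sketch of that handler-count change (assuming the 0.20-era property name
hbase.regionserver.handler.count in conf/hbase-site.xml):

  <property>
    <name>hbase.regionserver.handler.count</name>
    <value>1000</value>
  </property>
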
My goal is to test it for 10 million rows with this box. For some reason the SequenceWrite job with 5000000 rows + 2 clients failed, with the following exception:

09/08/19 00:34:07 INFO mapred.LocalJobRunner: 2000000/2050000/2500000
09/08/19 00:50:38 WARN mapred.LocalJobRunner: job_local_0001
org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server for region , row '0002076131', but failed after 11 attempts.
Exceptions:
java.io.IOException: HRegionInfo was null or empty in .META.
java.io.IOException: HRegionInfo was null or empty in .META.
java.io.IOException: HRegionInfo was null or empty in .META.
java.io.IOException: HRegionInfo was null or empty in .META.
java.io.IOException: HRegionInfo was null or empty in .META.
java.io.IOException: HRegionInfo was null or empty in .META.
java.io.IOException: HRegionInfo was null or empty in .META.
java.io.IOException: HRegionInfo was null or empty in .META.
java.io.IOException: HRegionInfo was null or empty in .META.
java.io.IOException: HRegionInfo was null or empty in .META.

        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionLocationForRowWithRetries(HConnectionManager.java:995)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1064)
        at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:584)
        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:450)
        at org.evaluation.hbase.PerformanceEvaluation$SequentialWriteTest.testRow(PerformanceEvaluation.java:736)
        at org.evaluation.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:571)
        at org.evaluation.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:804)
        at org.evaluation.hbase.PerformanceEvaluation$EvaluationMapTask.map(PerformanceEvaluation.java:350)
        at org.evaluation.hbase.PerformanceEvaluation$EvaluationMapTask.map(PerformanceEvaluation.java:326)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:518)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:303)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:176)


 

From the region server log:
2009-08-19 00:47:22,740 WARN org.apache.hadoop.hbase.regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.StoreScanner@1e458ae5
2009-08-19 00:48:22,741 WARN org.apache.hadoop.hbase.regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.StoreScanner@4b28029c
2009-08-19 00:49:22,743 WARN org.apache.hadoop.hbase.regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.StoreScanner@33fb11e0
2009-08-19 00:50:22,745 WARN org.apache.hadoop.hbase.regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.StoreScanner@6ceccc3b
2009-08-19 00:51:22,746 WARN org.apache.hadoop.hbase.regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.StoreScanner@44a3b5b4

Thanks,
Murali Krishna

Re: HBase-0.20.0 randomRead

Posted by Jonathan Gray <jl...@streamy.com>.
With all that memory, you're likely seeing such good performance because
of filesystem caching.  As you say, 2ms is extraordinarily fast for a
disk read, but since your rows are relatively small, you are loading up
all that data into memory (not only the FS cache, but also HBase's block
cache, which makes it even faster).

JG
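
[For context: a minimal sketch of checking the block cache share JG mentions, assuming the 0.20-era "hfile.block.cache.size" property (a fraction of the RegionServer heap). The property name and its 0.2 default are assumptions, not something stated in this thread.]

    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class BlockCacheShare {
      public static void main(String[] args) {
        HBaseConfiguration conf = new HBaseConfiguration();
        // Fraction of the RegionServer heap reserved for the block cache.
        float fraction = conf.getFloat("hfile.block.cache.size", 0.2f);
        long heapMb = 8 * 1024;  // the 8G RegionServer heap mentioned above
        System.out.println("block cache ~ " + (long) (fraction * heapMb)
            + " MB of a " + heapMb + " MB heap");
      }
    }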

Jean-Daniel Cryans wrote:
> Well, it seems there's something wrong with the way you modified PE. It
> is not really testing your table unless the row keys are built the
> same way as TestTable's. It seems to me that you are testing only
> 20000 rows, so caching is easy. A better test would just be to use PE
> the way it currently is, but with ROW_LENGTH = 4k.
> 
> WRT Jetty, make sure you optimized it with
> http://jetty.mortbay.org/jetty5/doc/optimization.html
> 
> J-D
> 
> On Tue, Aug 18, 2009 at 12:08 PM, Murali Krishna.
> P<mu...@yahoo.com> wrote:
>> Ah, my mistake, I just took it as seconds.
>>
>> Now I wonder whether it can really be that fast. Won't it take at least 2ms for a disk read? (I have given 8G of heap space to the RegionServer; is it caching that much?) Has anyone seen numbers like these?
>>
>>
>> Actually, my initial problem was that I have a Jetty in front of this HBase to serve these 4k values, and when benchmarked it took 200+ milliseconds per record with 100 clients. That is why I decided to benchmark without Jetty first.
>>
>> Thanks,
>> Murali Krishna
>>
>>
>>
>>
>> ________________________________
>> From: Jean-Daniel Cryans <jd...@apache.org>
>> To: hbase-user@hadoop.apache.org
>> Sent: Tuesday, 18 August, 2009 9:13:40 PM
>> Subject: Re: HBase-0.20.0 randomRead
>>
>> Murali,
>>
>> I'm not reading the same thing as you.
>>
>> client-0 Finished randomRead in 2867ms at offset 0 for 10000 rows
>>
>> That means 2867 / 10000 = 0.2867ms per row. It's kinda fast.
>>
>> J-D
>>
>> On Tue, Aug 18, 2009 at 11:35 AM, Murali Krishna.
>> P<mu...@yahoo.com> wrote:
>>> Hi all,
>>>  (Saw a related thread on performance, but starting a different one because my setup is slightly different).
>>>
>>> I have a one-node setup with hbase-0.20 (alpha). It has around 11 million rows in ~250 regions. Each row has a ~20-byte key and a ~4k value.
>>> Since my primary concern is randomRead, I modified the PerformanceEvaluation code to read from this particular table. The randomRead test gave the following result.
>>>
>>> 09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: client-1 Finished randomRead in 2813ms at offset 10000 for 10000 rows
>>> 09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: Finished 1 in 2813ms writing 10000 rows
>>> 09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: client-0 Finished randomRead in 2867ms at offset 0 for 10000 rows
>>> 09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: Finished 0 in 2867ms writing 10000 rows
>>>
>>>
>>> So it looks like it is taking around 280ms per record. Looking at the latest HBase performance claims, I was expecting it to be below 10ms. Am I doing something basically wrong, given such a huge difference? :( Please help me fix the latency.
>>>
>>> The machine config is:
>>> Processors:    2 x Xeon L5420 2.50GHz (8 cores)
>>> Memory:        13.7GB
>>> 12 Disks of 1TB each.
>>>
>>> Let me know if you need any more details.
>>>
>>> Thanks,
>>> Murali Krishna
> 

Re: HBase-0.20.0 randomRead

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Well, it seems there's something wrong with the way you modified PE. It
is not really testing your table unless the row keys are built the
same way as TestTable's. It seems to me that you are testing only
20000 rows, so caching is easy. A better test would just be to use PE
the way it currently is, but with ROW_LENGTH = 4k.

WRT Jetty, make sure you optimized it with
http://jetty.mortbay.org/jetty5/doc/optimization.html

J-D
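
[To illustrate the ROW_LENGTH suggestion: a sketch of PE-style keys and 4k values. The zero-padded decimal key format is an assumption about how PE builds TestTable rows; check PerformanceEvaluation.java in your 0.20 tree before relying on it.]

    import java.util.Random;

    public class PeStyleRows {
      // Value size, raised to 4k per J-D's suggestion (PE's stock value differs).
      static final int ROW_LENGTH = 4 * 1024;
      static final int TOTAL_ROWS = 1048576;
      static final Random rand = new Random();

      // PE-style row key: a zero-padded decimal, so a random key in
      // [0, TOTAL_ROWS) always lands on a row that actually exists.
      static byte[] format(int number) {
        return String.format("%010d", number).getBytes();
      }

      static byte[] generateValue() {
        byte[] v = new byte[ROW_LENGTH];
        rand.nextBytes(v);
        return v;
      }

      public static void main(String[] args) {
        byte[] key = format(rand.nextInt(TOTAL_ROWS));
        System.out.println(new String(key) + " -> "
            + generateValue().length + "-byte value");
      }
    }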

On Tue, Aug 18, 2009 at 12:08 PM, Murali Krishna.
P<mu...@yahoo.com> wrote:
> Ah, my mistake, I just took it as seconds.
>
> Now I wonder whether it can really be that fast. Won't it take at least 2ms for a disk read? (I have given 8G of heap space to the RegionServer; is it caching that much?) Has anyone seen numbers like these?
>
>
> Actually, my initial problem was that I have a Jetty in front of this HBase to serve these 4k values, and when benchmarked it took 200+ milliseconds per record with 100 clients. That is why I decided to benchmark without Jetty first.
>
> Thanks,
> Murali Krishna
>
>
>
>
> ________________________________
> From: Jean-Daniel Cryans <jd...@apache.org>
> To: hbase-user@hadoop.apache.org
> Sent: Tuesday, 18 August, 2009 9:13:40 PM
> Subject: Re: HBase-0.20.0 randomRead
>
> Murali,
>
> I'm not reading the same thing as you.
>
> client-0 Finished randomRead in 2867ms at offset 0 for 10000 rows
>
> That means 2867 / 10000 = 0.2867ms per row. It's kinda fast.
>
> J-D
>
> On Tue, Aug 18, 2009 at 11:35 AM, Murali Krishna.
> P<mu...@yahoo.com> wrote:
>> Hi all,
>>  (Saw a related thread on performance, but starting a different one because my setup is slightly different).
>>
>> I have a one-node setup with hbase-0.20 (alpha). It has around 11 million rows in ~250 regions. Each row has a ~20-byte key and a ~4k value.
>> Since my primary concern is randomRead, I modified the PerformanceEvaluation code to read from this particular table. The randomRead test gave the following result.
>>
>> 09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: client-1 Finished randomRead in 2813ms at offset 10000 for 10000 rows
>> 09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: Finished 1 in 2813ms writing 10000 rows
>> 09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: client-0 Finished randomRead in 2867ms at offset 0 for 10000 rows
>> 09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: Finished 0 in 2867ms writing 10000 rows
>>
>>
>> So it looks like it is taking around 280ms per record. Looking at the latest HBase performance claims, I was expecting it to be below 10ms. Am I doing something basically wrong, given such a huge difference? :( Please help me fix the latency.
>>
>> The machine config is:
>> Processors:    2 x Xeon L5420 2.50GHz (8 cores)
>> Memory:        13.7GB
>> 12 Disks of 1TB each.
>>
>> Let me know if you need any more details.
>>
>> Thanks,
>> Murali Krishna
>

Re: HBase-0.20.0 randomRead

Posted by "Murali Krishna. P" <mu...@yahoo.com>.
Ah, my mistake, I just took it as seconds.

Now I wonder whether it can really be that fast. Won't it take at least 2ms for a disk read? (I have given 8G of heap space to the RegionServer; is it caching that much?) Has anyone seen numbers like these?


Actually, my initial problem was that I have a Jetty in front of this HBase to serve these 4k values, and when benchmarked it took 200+ milliseconds per record with 100 clients. That is why I decided to benchmark without Jetty first.

Thanks,
Murali Krishna
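
[A quick back-of-the-envelope check on the caching question; the row and value counts come from the PE run quoted below, and the rest is plain arithmetic.]

    public class WorkingSetFit {
      public static void main(String[] args) {
        long rows = 2 * 10000;        // two PE clients x 10000 rows each
        long valueBytes = 4 * 1024;   // ~4k values
        long touchedMb = rows * valueBytes / (1024 * 1024);
        System.out.println("working set ~ " + touchedMb + " MB");
        // ~78 MB fits easily in an 8G heap's block cache (and in the OS page
        // cache), so sub-millisecond reads are plausible without disk seeks.
      }
    }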




________________________________
From: Jean-Daniel Cryans <jd...@apache.org>
To: hbase-user@hadoop.apache.org
Sent: Tuesday, 18 August, 2009 9:13:40 PM
Subject: Re: HBase-0.20.0 randomRead

Murali,

I'm not reading the same thing as you.

client-0 Finished randomRead in 2867ms at offset 0 for 10000 rows

That means 2867 / 10000 = 0.2867ms per row. It's kinda fast.

J-D

On Tue, Aug 18, 2009 at 11:35 AM, Murali Krishna.
P<mu...@yahoo.com> wrote:
> Hi all,
>  (Saw a related thread on performance, but starting a different one because my setup is slightly different).
>
> I have a one-node setup with hbase-0.20 (alpha). It has around 11 million rows in ~250 regions. Each row has a ~20-byte key and a ~4k value.
> Since my primary concern is randomRead, I modified the PerformanceEvaluation code to read from this particular table. The randomRead test gave the following result.
>
> 09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: client-1 Finished randomRead in 2813ms at offset 10000 for 10000 rows
> 09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: Finished 1 in 2813ms writing 10000 rows
> 09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: client-0 Finished randomRead in 2867ms at offset 0 for 10000 rows
> 09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: Finished 0 in 2867ms writing 10000 rows
>
>
> So it looks like it is taking around 280ms per record. Looking at the latest HBase performance claims, I was expecting it to be below 10ms. Am I doing something basically wrong, given such a huge difference? :( Please help me fix the latency.
>
> The machine config is:
> Processors:    2 x Xeon L5420 2.50GHz (8 cores)
> Memory:        13.7GB
> 12 Disks of 1TB each.
>
> Let me know if you need any more details.
>
> Thanks,
> Murali Krishna

Re: HBase-0.20.0 randomRead

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Murali,

I'm not reading the same thing as you.

client-0 Finished randomRead in 2867ms at offset 0 for 10000 rows

That means 2867 / 10000 = 0.2867ms per row. It's kinda fast.

J-D

On Tue, Aug 18, 2009 at 11:35 AM, Murali Krishna.
P<mu...@yahoo.com> wrote:
> Hi all,
>  (Saw a related thread on performance, but starting a different one because my setup is slightly different).
>
> I have a one-node setup with hbase-0.20 (alpha). It has around 11 million rows in ~250 regions. Each row has a ~20-byte key and a ~4k value.
> Since my primary concern is randomRead, I modified the PerformanceEvaluation code to read from this particular table. The randomRead test gave the following result.
>
> 09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: client-1 Finished randomRead in 2813ms at offset 10000 for 10000 rows
> 09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: Finished 1 in 2813ms writing 10000 rows
> 09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: client-0 Finished randomRead in 2867ms at offset 0 for 10000 rows
> 09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: Finished 0 in 2867ms writing 10000 rows
>
>
> So it looks like it is taking around 280ms per record. Looking at the latest HBase performance claims, I was expecting it to be below 10ms. Am I doing something basically wrong, given such a huge difference? :( Please help me fix the latency.
>
> The machine config is:
> Processors:    2 x Xeon L5420 2.50GHz (8 cores)
> Memory:        13.7GB
> 12 Disks of 1TB each.
>
> Let me know if you need any more details.
>
> Thanks,
> Murali Krishna

HBase-0.20.0 randomRead

Posted by "Murali Krishna. P" <mu...@yahoo.com>.
Hi all,
 (Saw a related thread on performance, but starting a different one because my setup is slightly different).

I have a one-node setup with hbase-0.20 (alpha). It has around 11 million rows in ~250 regions. Each row has a ~20-byte key and a ~4k value.
Since my primary concern is randomRead, I modified the PerformanceEvaluation code to read from this particular table. The randomRead test gave the following result.

09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: client-1 Finished randomRead in 2813ms at offset 10000 for 10000 rows
09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: Finished 1 in 2813ms writing 10000 rows
09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: client-0 Finished randomRead in 2867ms at offset 0 for 10000 rows
09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: Finished 0 in 2867ms writing 10000 rows

 
So it looks like it is taking around 280ms per record. Looking at the latest HBase performance claims, I was expecting it to be below 10ms. Am I doing something basically wrong, given such a huge difference? :( Please help me fix the latency.

The machine config is:
Processors:    2 x Xeon L5420 2.50GHz (8 cores)
Memory:        13.7GB
12 Disks of 1TB each.

Let me know if you need any more details.

Thanks,
Murali Krishna
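
[A minimal sketch of the kind of random-read loop a modified PE would run, using the 0.20 client API. The table name and key scheme below are placeholders, not Murali's actual code.]

    import java.util.Random;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class RandomReadBench {
      public static void main(String[] args) throws Exception {
        HTable table = new HTable(new HBaseConfiguration(), "MyTable");
        Random rand = new Random();
        int rows = 10000;
        int totalRows = 11000000;  // ~11 million rows in the table
        long start = System.currentTimeMillis();
        for (int i = 0; i < rows; i++) {
          // Pick a random existing key; substitute the table's real key scheme.
          Get get = new Get(Bytes.toBytes(String.format("%010d", rand.nextInt(totalRows))));
          Result r = table.get(get);
        }
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("Finished randomRead in " + elapsed + "ms for " + rows + " rows");
      }
    }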

Re: HBase-0.20.0 Performance Evaluation

Posted by Schubert Zhang <zs...@gmail.com>.
An addition: only the random-reads become very slow; scans and sequential-reads are OK.


On Tue, Aug 18, 2009 at 6:02 PM, Schubert Zhang <zs...@gmail.com> wrote:

> stack and J-G, thank you very much for your helpful comments.
>
> But now we have found a critical issue with random reads.
> I used sequential-writes to insert 5GB of data into our HBase table from
> empty, and ~30 regions were generated. The random-reads then took about 30
> minutes to complete. Then I ran the sequential-writes again, so another
> version of each cell was inserted and ~60 regions were generated. But when
> I ran the random-reads against this table again, they always took a long
> time (more than 2 hours).
>
> I checked the heap usage and other metrics, but did not find the reason.
>
> Below is the status of one region server:
> request=0.0, regions=13, stores=13, storefiles=14, storefileIndexSize=2,
> memstoreSize=0, usedHeap=1126, maxHeap=1991, blockCacheSize=338001080,
> blockCacheFree=79686056, blockCacheCount=5014, blockCacheHitRatio=55
>
> Schubert
>
>
> On Tue, Aug 18, 2009 at 5:02 AM, Schubert Zhang <zs...@gmail.com> wrote:
>
>> We have just done a Performance Evaluation on HBase-0.20.0.
>> Refers to:
>> http://docloud.blogspot.com/2009/08/hbase-0200-performance-evaluation.html
>>
>
>

Re: HBase-0.20.0 Performance Evaluation

Posted by Schubert Zhang <zs...@gmail.com>.
stack and J-G, thank you very much for your helpful comments.

But now we have found a critical issue with random reads.
I used sequential-writes to insert 5GB of data into our HBase table from
empty, and ~30 regions were generated. The random-reads then took about 30
minutes to complete. Then I ran the sequential-writes again, so another
version of each cell was inserted and ~60 regions were generated. But when
I ran the random-reads against this table again, they always took a long
time (more than 2 hours).

I checked the heap usage and other metrics, but did not find the reason.

Below is the status of one region server:
request=0.0, regions=13, stores=13, storefiles=14, storefileIndexSize=2,
memstoreSize=0, usedHeap=1126, maxHeap=1991, blockCacheSize=338001080,
blockCacheFree=79686056, blockCacheCount=5014, blockCacheHitRatio=55

Schubert
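
[A rough reading of those region server numbers as plain arithmetic. The 5GB-written-twice figure comes from the message above; the interpretation is a guess, not a diagnosis.]

    public class CacheCoverage {
      public static void main(String[] args) {
        long blockCacheBytes = 338001080L;          // from the RS status above
        long dataBytes = 10L * 1024 * 1024 * 1024;  // ~5GB inserted twice
        double pct = 100.0 * blockCacheBytes / dataBytes;
        System.out.println(String.format("block cache covers ~%.1f%% of the data", pct));
        // At ~3% coverage, blockCacheHitRatio=55 means nearly half of the
        // random reads go to disk, and until a major compaction merges the
        // two write passes, a get may have to check more than one store file.
      }
    }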


On Tue, Aug 18, 2009 at 5:02 AM, Schubert Zhang <zs...@gmail.com> wrote:

> We have just done a Performance Evaluation on HBase-0.20.0.
> Refers to:
> http://docloud.blogspot.com/2009/08/hbase-0200-performance-evaluation.html
>