Posted to user@hbase.apache.org by ijanitran <ta...@yahoo.com> on 2012/04/06 17:17:36 UTC

Speeding up HBase read response

I have a 4-node HBase v0.90.4-cdh3u3 cluster deployed on Amazon XLarge
instances (16 GB RAM, 4 CPU cores), with an 8 GB heap (-Xmx) allocated to the
HRegion servers and 2 GB to the datanodes. HMaster/ZK/Namenode run on a
separate XLarge instance. The target dataset is 100 million records (each
record is 10 fields of 100 bytes). Benchmarking is performed concurrently
from 100 parallel threads.

I'm confused by the read latency I'm getting compared to what the YCSB team
achieved and showed in their YCSB paper. They achieved throughput of up to
7000 ops/sec with a latency of 15 ms (page 10, read latency chart). I can't
get throughput higher than 2000 ops/sec on a 90% read / 10% write workload.
Writes are really fast with auto commit disabled (response within a few ms),
while read latency doesn't go below 70 ms on average.

These are some HBase settings I used:

    hbase.regionserver.handler.count=50
    hfile.block.cache.size=0.4
    hbase.hregion.max.filesize=1073741824
    hbase.regionserver.codecs=lzo
    hbase.hregion.memstore.mslab.enabled=true
    hfile.min.blocksize.size=16384
    hbase.hregion.memstore.block.multiplier=4
    hbase.regionserver.global.memstore.upperLimit=0.35
    hbase.zookeeper.property.maxClientCnxns=100 
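
For reference, these are hbase-site.xml entries on each region server; a
minimal sketch of how a couple of the values above look there (standard
conf/hbase-site.xml layout assumed):

    <property>
      <name>hfile.block.cache.size</name>
      <value>0.4</value>
    </property>
    <property>
      <name>hbase.regionserver.handler.count</name>
      <value>50</value>
    </property>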

Which settings do you recommend looking at/tuning to speed up reads with
HBase?

-- 
View this message in context: http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33635226.html
Sent from the HBase User mailing list archive at Nabble.com.


Re: Speeding up HBase read response

Posted by Michael Segel <mi...@hotmail.com>.
Was the YCSB test also run on Amazon?

Sent from my iPhone

On Apr 6, 2012, at 10:18 AM, "ijanitran" <ta...@yahoo.com> wrote:

> 
> I have 4 nodes HBase v0.90.4-cdh3u3 cluster deployed on Amazon XLarge
> instances (16Gb RAM, 4 cores CPU) with 8Gb heap -Xmx allocated for HRegion
> servers, 2Gb for datanodes. HMaster\ZK\Namenode is on the separate XLarge
> instance. Target dataset is 100 millions records (each record is 10 fields
> by 100 bytes). Benchmarking performed concurrently from parallel 100
> threads.
> 
> I'm confused with a read latency I got, comparing to what YCSB team achieved
> and showed in their YCSB paper. They achieved throughput of up to 7000
> ops/sec with a latency of 15 ms (page 10, read latency chart). I can't get
> throughput higher than 2000 ops/sec on 90% reads/10% writes workload. Writes
> are really fast with auto commit disabled (response within a few ms), while
> read latency doesn't go lower than 70 ms in average.
> 
> These are some HBase settings I used:
> 
>    hbase.regionserver.handler.count=50
>    hfile.block.cache.size=0.4
>    hbase.hregion.max.filesize=1073741824
>    hbase.regionserver.codecs=lzo
>    hbase.hregion.memstore.mslab.enabled=true
>    hfile.min.blocksize.size=16384
>    hbase.hregion.memstore.block.multiplier=4
>    hbase.regionserver.global.memstore.upperLimit=0.35
>    hbase.zookeeper.property.maxClientCnxns=100 
> 
> Which settings do you recommend to look at\tune to speed up reads with
> HBase?
> 
> -- 
> View this message in context: http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33635226.html
> Sent from the HBase User mailing list archive at Nabble.com.
> 

Re: Speeding up HBase read response

Posted by Michael Segel <mi...@hotmail.com>.
No need for pardon.
I mean, it's good to hear about the changes to help improve performance. :-)

I just wanted to try and answer the OP's question and set a realistic expectation. 

-Mike

On Apr 12, 2012, at 1:14 AM, Andrew Purtell wrote:

> Pardon yes that is probably true. I hijacked this thread anyway. /eot
> 
> Best regards,
> 
>    - Andy
> 
> 
> On Apr 11, 2012, at 11:04 PM, Michael Segel <mi...@hotmail.com> wrote:
> 
>> Uhm, 
>> Lets take a look back at the original post :
>> "I'm confused with a read latency I got, comparing to what YCSB team achieved
>> and showed in their YCSB paper. They achieved throughput of up to 7000
>> ops/sec with a latency of 15 ms (page 10, read latency chart). I can't get
>> throughput higher than 2000 ops/sec on 90% reads/10% writes workload. Writes
>> are really fast with auto commit disabled (response within a few ms), while
>> read latency doesn't go lower than 70 ms in average.
>> "
>> While its great to look at how to reduce the read latency, something that you will have to consider that you won't get the same low latency if you have drives local to your node. 
>> 
>> So I have to ask if there's an unrealistic expectation on the part of the OP?
>> 
>> On Apr 12, 2012, at 12:40 AM, Andrew Purtell wrote:
>> 
>>> Hi Otis,
>>> 
>>>> You mention "Linux AMI (2012.03.1)", but which AMI is that?  Is this some specific AMI prepared by Amazon? 
>>> 
>>> Yes.
>>> 
>>> $ ec2-describe-images -a | grep amzn | grep 2012.03.1
>>> 
>>> should give you results. Use this as your base, install Hadoop etc on top. 
>>> 
>>> I used the pv x86_64 variant and tested the direct attached instance store devices. 
>>> 
>>> Unfortunately I'm at the airport now and don't have an instance handy to get you the command output you want.
>>> 
>>> For comparison I launched m1.xlarge instances, our usual for testing, all in the same region of us-west-1. They should be roughly comparable. I ran each test three times each with a new instance and warmed up the instance devices with a preliminary FIO run. 
>>> 
>>> As you know EC2 isn't really good for performance benchmarking, the variability is quite high. However I did take the basic steps above to try and get a useful (albeit unscientific) result. 
>>> 
>>> It would be interesting if someone else finds similar results, or not, as the case may be. 
>>> 
>>> Best regards,
>>> 
>>>  - Andy
>>> 
>>> 
>>> On Apr 11, 2012, at 2:31 PM, Otis Gospodnetic <ot...@yahoo.com> wrote:
>>> 
>>>> Hi Andy,
>>>> 
>>>> This email must have caught attention of a number of people...
>>>> You mention "Linux AMI (2012.03.1)", but which AMI is that?  Is this some specific AMI prepared by Amazon?  Or some AMI that somebody like Cloudera prepared?  Or are you saying it's just "some Linux" AMI that somebody built on 2012-03-01 and that you found in AWS?
>>>> 
>>>> Could you please share the outputs of:
>>>> 
>>>> $ cat /etc/*release
>>>> $ uname -a
>>>> 
>>>> $ df -T
>>>> 
>>>> Also, could it be that your old EC2 instance was unlucky and had a very noisy neighbour, while the new EC2 instance does not?  Not sure how one could run tests to get around this - perhaps by terminating the instance and restarting it a few times in order to get it hosted on different physical hosts?
>>>> 
>>>> Thanks,
>>>> Otis 
>>>> ----
>>>> Performance Monitoring SaaS for HBase - http://sematext.com/spm/hbase-performance-monitoring/index.html
>>>> 
>>>> 
>>>> 
>>>>> ________________________________
>>>>> From: Andrew Purtell <ap...@apache.org>
>>>>> To: "user@hbase.apache.org" <us...@hbase.apache.org> 
>>>>> Cc: Jack Levin <ma...@gmail.com>; "hbase-user@hadoop.apache.org" <hb...@hadoop.apache.org> 
>>>>> Sent: Tuesday, April 10, 2012 2:14 PM
>>>>> Subject: Re: Speeding up HBase read response
>>>>> 
>>>>> What AMI are you using as your base?
>>>>> 
>>>>> I recently started using the new Linux AMI (2012.03.1) and noticed what looks like significant improvement over what I had been using before (2011.02 IIRC). I ran four simple tests repeated three times with FIO: a read bandwidth test, a write bandwidth test, a read IOPS test, and a write IOPS test. The write IOPS test was inconclusive but for the others there was a consistent difference: reduced disk op latency (shorter tail) and increased device bandwidth. I don't run anything in production in EC2 so this was the extent of my curiosity.
>>>>> 
>>>>> 
>>>>> Best regards,
>>>>> 
>>>>>  - Andy
>>>>> 
>>>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
>>>>> 
>>>>> 
>>>>> 
>>>>> ----- Original Message -----
>>>>>> From: Jeff Whiting <je...@qualtrics.com>
>>>>>> To: user@hbase.apache.org
>>>>>> Cc: Jack Levin <ma...@gmail.com>; hbase-user@hadoop.apache.org
>>>>>> Sent: Tuesday, April 10, 2012 11:03 AM
>>>>>> Subject: Re: Speeding up HBase read response
>>>>>> 
>>>>>> Do you have bloom filters enabled?  And compression?  Both of those can help 
>>>>>> reduce disk io load 
>>>>>> which seems to be the main issue you are having on the ec2 cluster.
>>>>>> 
>>>>>> ~Jeff
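
As an aside, a minimal sketch of what that change looks like in the HBase
shell (0.90-era syntax; the table and column family names below are made up
for illustration):

    hbase> disable 'usertable'
    hbase> alter 'usertable', {NAME => 'data', BLOOMFILTER => 'ROW', COMPRESSION => 'LZO'}
    hbase> enable 'usertable'
    hbase> major_compact 'usertable'   # rewrite existing HFiles with the new settings
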
>>>>>> 
>>>>>> On 4/9/2012 8:28 AM, Jack Levin wrote:
>>>>>>> Yes, from  %util you can see that your disks are working at 100%
>>>>>>> pretty much.  Which means you can't push them go any faster.   So the
>>>>>>> solution is to add more disks, add faster disks, add nodes and disks.
>>>>>>> This type of overload should not be related to HBASE, but rather to
>>>>>>> your hardware setup.
>>>>>>> 
>>>>>>> -Jack
>>>>>>> 
>>>>>>> On Mon, Apr 9, 2012 at 2:29 AM, ijanitran<ta...@yahoo.com>  wrote:
>>>>>>>> Hi, results of iostat are pretty much very similar on all nodes:
>>>>>>>> 
>>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>>>>>>> xvdap1            0.00     0.00  294.00    0.00     9.27     0.00    64.54    21.97   75.44   3.40 100.10
>>>>>>>> 
>>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>>>>>>> xvdap1            0.00     4.00  286.00    8.00     9.11     0.27    65.33     7.16   25.32   2.88  84.70
>>>>>>>> 
>>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>>>>>>> xvdap1            0.00     0.00  283.00    0.00     8.29     0.00    59.99    10.31   35.43   2.97  84.10
>>>>>>>> 
>>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>>>>>>> xvdap1            0.00     0.00  320.00    0.00     9.12     0.00    58.38    12.32   39.56   2.79  89.40
>>>>>>>> 
>>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>>>>>>> xvdap1            0.00     0.00  336.63    0.00     9.18     0.00    55.84    10.67   31.42   2.78  93.47
>>>>>>>> 
>>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>>>>>>> xvdap1            0.00     0.00  312.00    0.00    10.00     0.00    65.62    11.07   35.49   2.91  90.70
>>>>>>>> 
>>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>>>>>>> xvdap1            0.00     0.00  356.00    0.00    10.72     0.00    61.66     9.38   26.63   2.57  91.40
>>>>>>>> 
>>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>>>>>>> xvdap1            0.00     0.00  258.00    0.00     8.20     0.00    65.05    13.37   51.24   3.64  93.90
>>>>>>>> 
>>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>>>>>>> xvdap1            0.00     0.00  246.00    0.00     7.31     0.00    60.88     5.87   24.53   3.14  77.30
>>>>>>>> 
>>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>>>>>>> xvdap1            0.00     2.00  297.00    3.00     9.11     0.02    62.29    13.02   42.40   3.12  93.60
>>>>>>>> 
>>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>>>>>>> xvdap1            0.00     0.00  292.00    0.00     9.60     0.00    67.32    11.30   39.51   3.36  98.00
>>>>>>>> 
>>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>>>>>>> xvdap1            0.00     4.00  261.00    8.00     7.84     0.27    61.74    16.07   55.72   3.39  91.30
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Jack Levin wrote:
>>>>>>>>> Please email iostat -xdm 1, run for one minute during load on each 
>>>>>> node
>>>>>>>>> --
>>>>>>>>> Sent from my Android phone with K-9 Mail. Please excuse my brevity.
>>>>>>>>> 
>>>>>>>>> ijanitran<ta...@yahoo.com> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I have 4 nodes HBase v0.90.4-cdh3u3 cluster deployed on Amazon 
>>>>>> XLarge
>>>>>>>>> instances (16Gb RAM, 4 cores CPU) with 8Gb heap -Xmx allocated for 
>>>>>> HRegion
>>>>>>>>> servers, 2Gb for datanodes. HMaster\ZK\Namenode is on the 
>>>>>> separate XLarge
>>>>>>>>> instance. Target dataset is 100 millions records (each record is 10 
>>>>>> fields
>>>>>>>>> by 100 bytes). Benchmarking performed concurrently from parallel 
>>>>>> 100
>>>>>>>>> threads.
>>>>>>>>> 
>>>>>>>>> I'm confused with a read latency I got, comparing to what YCSB 
>>>>>> team
>>>>>>>>> achieved
>>>>>>>>> and showed in their YCSB paper. They achieved throughput of up to 
>>>>>> 7000
>>>>>>>>> ops/sec with a latency of 15 ms (page 10, read latency chart). I 
>>>>>> can't get
>>>>>>>>> throughput higher than 2000 ops/sec on 90% reads/10% writes 
>>>>>> workload.
>>>>>>>>> Writes
>>>>>>>>> are really fast with auto commit disabled (response within a few 
>>>>>> ms),
>>>>>>>>> while
>>>>>>>>> read latency doesn't go lower than 70 ms in average.
>>>>>>>>> 
>>>>>>>>> These are some HBase settings I used:
>>>>>>>>> 
>>>>>>>>> hbase.regionserver.handler.count=50
>>>>>>>>> hfile.block.cache.size=0.4
>>>>>>>>> hbase.hregion.max.filesize=1073741824
>>>>>>>>> hbase.regionserver.codecs=lzo
>>>>>>>>> hbase.hregion.memstore.mslab.enabled=true
>>>>>>>>> hfile.min.blocksize.size=16384
>>>>>>>>> hbase.hregion.memstore.block.multiplier=4
>>>>>>>>> hbase.regionserver.global.memstore.upperLimit=0.35
>>>>>>>>> hbase.zookeeper.property.maxClientCnxns=100
>>>>>>>>> 
>>>>>>>>> Which settings do you recommend to look at\tune to speed up 
>>>>>> reads with
>>>>>>>>> HBase?
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> View this message in context:
>>>>>>>>> 
>>>>>> http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33635226.html
>>>>>>>>> Sent from the HBase User mailing list archive at Nabble.com.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> --
>>>>>>>> View this message in context: 
>>>>>> http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33654666.html
>>>>>>>> Sent from the HBase User mailing list archive at Nabble.com.
>>>>>>>> 
>>>>>> 
>>>>>> -- 
>>>>>> Jeff Whiting
>>>>>> Qualtrics Senior Software Engineer
>>>>>> jeffw@qualtrics.com
>>>>>> 
>>>>> 
>>>>> 
>>> 
>> 
> 


Re: Speeding up HBase read response

Posted by Andrew Purtell <ap...@yahoo.com>.
Pardon, yes, that is probably true. I hijacked this thread anyway. /eot

Best regards,

    - Andy


On Apr 11, 2012, at 11:04 PM, Michael Segel <mi...@hotmail.com> wrote:

> Uhm, 
> Lets take a look back at the original post :
> "I'm confused with a read latency I got, comparing to what YCSB team achieved
> and showed in their YCSB paper. They achieved throughput of up to 7000
> ops/sec with a latency of 15 ms (page 10, read latency chart). I can't get
> throughput higher than 2000 ops/sec on 90% reads/10% writes workload. Writes
> are really fast with auto commit disabled (response within a few ms), while
> read latency doesn't go lower than 70 ms in average.
> "
> While its great to look at how to reduce the read latency, something that you will have to consider that you won't get the same low latency if you have drives local to your node. 
> 
> So I have to ask if there's an unrealistic expectation on the part of the OP?
> 
> On Apr 12, 2012, at 12:40 AM, Andrew Purtell wrote:
> 
>> Hi Otis,
>> 
>>> You mention "Linux AMI (2012.03.1)", but which AMI is that?  Is this some specific AMI prepared by Amazon? 
>> 
>> Yes.
>> 
>> $ ec2-describe-images -a | grep amzn | grep 2012.03.1
>> 
>> should give you results. Use this as your base, install Hadoop etc on top. 
>> 
>> I used the pv x86_64 variant and tested the direct attached instance store devices. 
>> 
>> Unfortunately I'm at the airport now and don't have an instance handy to get you the command output you want.
>> 
>> For comparison I launched m1.xlarge instances, our usual for testing, all in the same region of us-west-1. They should be roughly comparable. I ran each test three times each with a new instance and warmed up the instance devices with a preliminary FIO run. 
>> 
>> As you know EC2 isn't really good for performance benchmarking, the variability is quite high. However I did take the basic steps above to try and get a useful (albeit unscientific) result. 
>> 
>> It would be interesting if someone else finds similar results, or not, as the case may be. 
>> 
>> Best regards,
>> 
>>   - Andy
>> 
>> 
>> On Apr 11, 2012, at 2:31 PM, Otis Gospodnetic <ot...@yahoo.com> wrote:
>> 
>>> Hi Andy,
>>> 
>>> This email must have caught attention of a number of people...
>>> You mention "Linux AMI (2012.03.1)", but which AMI is that?  Is this some specific AMI prepared by Amazon?  Or some AMI that somebody like Cloudera prepared?  Or are you saying it's just "some Linux" AMI that somebody built on 2012-03-01 and that you found in AWS?
>>> 
>>> Could you please share the outputs of:
>>> 
>>> $ cat /etc/*release
>>> $ uname -a
>>> 
>>> $ df -T
>>> 
>>> Also, could it be that your old EC2 instance was unlucky and had a very noisy neighbour, while the new EC2 instance does not?  Not sure how one could run tests to get around this - perhaps by terminating the instance and restarting it a few times in order to get it hosted on different physical hosts?
>>> 
>>> Thanks,
>>> Otis 
>>> ----
>>> Performance Monitoring SaaS for HBase - http://sematext.com/spm/hbase-performance-monitoring/index.html
>>> 
>>> 
>>> 
>>>> ________________________________
>>>> From: Andrew Purtell <ap...@apache.org>
>>>> To: "user@hbase.apache.org" <us...@hbase.apache.org> 
>>>> Cc: Jack Levin <ma...@gmail.com>; "hbase-user@hadoop.apache.org" <hb...@hadoop.apache.org> 
>>>> Sent: Tuesday, April 10, 2012 2:14 PM
>>>> Subject: Re: Speeding up HBase read response
>>>> 
>>>> What AMI are you using as your base?
>>>> 
>>>> I recently started using the new Linux AMI (2012.03.1) and noticed what looks like significant improvement over what I had been using before (2011.02 IIRC). I ran four simple tests repeated three times with FIO: a read bandwidth test, a write bandwidth test, a read IOPS test, and a write IOPS test. The write IOPS test was inconclusive but for the others there was a consistent difference: reduced disk op latency (shorter tail) and increased device bandwidth. I don't run anything in production in EC2 so this was the extent of my curiosity.
>>>> 
>>>> 
>>>> Best regards,
>>>> 
>>>>   - Andy
>>>> 
>>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
>>>> 
>>>> 
>>>> 
>>>> ----- Original Message -----
>>>>> From: Jeff Whiting <je...@qualtrics.com>
>>>>> To: user@hbase.apache.org
>>>>> Cc: Jack Levin <ma...@gmail.com>; hbase-user@hadoop.apache.org
>>>>> Sent: Tuesday, April 10, 2012 11:03 AM
>>>>> Subject: Re: Speeding up HBase read response
>>>>> 
>>>>> Do you have bloom filters enabled?  And compression?  Both of those can help 
>>>>> reduce disk io load 
>>>>> which seems to be the main issue you are having on the ec2 cluster.
>>>>> 
>>>>> ~Jeff
>>>>> 
>>>>> On 4/9/2012 8:28 AM, Jack Levin wrote:
>>>>>> Yes, from  %util you can see that your disks are working at 100%
>>>>>> pretty much.  Which means you can't push them go any faster.   So the
>>>>>> solution is to add more disks, add faster disks, add nodes and disks.
>>>>>> This type of overload should not be related to HBASE, but rather to
>>>>>> your hardware setup.
>>>>>> 
>>>>>> -Jack
>>>>>> 
>>>>>> On Mon, Apr 9, 2012 at 2:29 AM, ijanitran<ta...@yahoo.com>  wrote:
>>>>>>> Hi, results of iostat are pretty much very similar on all nodes:
>>>>>>> 
>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>>> avgrq-sz
>>>>>>> avgqu-sz   await  svctm  %util
>>>>>>> xvdap1            0.00     0.00  294.00    0.00     9.27     0.00    
>>>>> 64.54
>>>>>>> 21.97   75.44   3.40 100.10
>>>>>>> 
>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>>> avgrq-sz
>>>>>>> avgqu-sz   await  svctm  %util
>>>>>>> xvdap1            0.00     4.00  286.00    8.00     9.11     0.27    
>>>>> 65.33
>>>>>>> 7.16 25.32 2.88  84.70
>>>>>>> 
>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>>> avgrq-sz
>>>>>>> avgqu-sz   await  svctm  %util
>>>>>>> xvdap1            0.00     0.00  283.00    0.00     8.29     0.00    
>>>>> 59.99
>>>>>>> 10.31   35.43   2.97  84.10
>>>>>>> 
>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>>> avgrq-sz
>>>>>>> avgqu-sz   await  svctm  %util
>>>>>>> xvdap1            0.00     0.00  320.00    0.00     9.12     0.00    
>>>>> 58.38
>>>>>>> 12.32   39.56   2.79  89.40
>>>>>>> 
>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>>> avgrq-sz
>>>>>>> avgqu-sz   await  svctm  %util
>>>>>>> xvdap1            0.00     0.00  336.63    0.00     9.18     0.00    
>>>>> 55.84
>>>>>>> 10.67   31.42   2.78  93.47
>>>>>>> 
>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>>> avgrq-sz
>>>>>>> avgqu-sz   await  svctm  %util
>>>>>>> xvdap1            0.00     0.00  312.00    0.00    10.00     0.00    
>>>>> 65.62
>>>>>>> 11.07   35.49   2.91  90.70
>>>>>>> 
>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>>> avgrq-sz
>>>>>>> avgqu-sz   await  svctm  %util
>>>>>>> xvdap1            0.00     0.00  356.00    0.00    10.72     0.00    
>>>>> 61.66
>>>>>>> 9.38 26.63 2.57  91.40
>>>>>>> 
>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>>> avgrq-sz
>>>>>>> avgqu-sz   await  svctm  %util
>>>>>>> xvdap1            0.00     0.00  258.00    0.00     8.20     0.00    
>>>>> 65.05
>>>>>>> 13.37   51.24   3.64  93.90
>>>>>>> 
>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>>> avgrq-sz
>>>>>>> avgqu-sz   await  svctm  %util
>>>>>>> xvdap1            0.00     0.00  246.00    0.00     7.31     0.00    
>>>>> 60.88
>>>>>>> 5.87   24.53   3.14  77.30
>>>>>>> 
>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>>> avgrq-sz
>>>>>>> avgqu-sz   await  svctm  %util
>>>>>>> xvdap1            0.00     2.00  297.00    3.00     9.11     0.02    
>>>>> 62.29
>>>>>>> 13.02   42.40   3.12  93.60
>>>>>>> 
>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>>> avgrq-sz
>>>>>>> avgqu-sz   await  svctm  %util
>>>>>>> xvdap1            0.00     0.00  292.00    0.00     9.60     0.00    
>>>>> 67.32
>>>>>>> 11.30   39.51   3.36  98.00
>>>>>>> 
>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>>> avgrq-sz
>>>>>>> avgqu-sz   await  svctm  %util
>>>>>>> xvdap1            0.00     4.00  261.00    8.00     7.84     0.27    
>>>>> 61.74
>>>>>>> 16.07   55.72   3.39  91.30
>>>>>>> 
>>>>>>> 
>>>>>>> Jack Levin wrote:
>>>>>>>> Please email iostat -xdm 1, run for one minute during load on each 
>>>>> node
>>>>>>>> --
>>>>>>>> Sent from my Android phone with K-9 Mail. Please excuse my brevity.
>>>>>>>> 
>>>>>>>> ijanitran<ta...@yahoo.com> wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> I have 4 nodes HBase v0.90.4-cdh3u3 cluster deployed on Amazon 
>>>>> XLarge
>>>>>>>> instances (16Gb RAM, 4 cores CPU) with 8Gb heap -Xmx allocated for 
>>>>> HRegion
>>>>>>>> servers, 2Gb for datanodes. HMaster\ZK\Namenode is on the 
>>>>> separate XLarge
>>>>>>>> instance. Target dataset is 100 millions records (each record is 10 
>>>>> fields
>>>>>>>> by 100 bytes). Benchmarking performed concurrently from parallel 
>>>>> 100
>>>>>>>> threads.
>>>>>>>> 
>>>>>>>> I'm confused with a read latency I got, comparing to what YCSB 
>>>>> team
>>>>>>>> achieved
>>>>>>>> and showed in their YCSB paper. They achieved throughput of up to 
>>>>> 7000
>>>>>>>> ops/sec with a latency of 15 ms (page 10, read latency chart). I 
>>>>> can't get
>>>>>>>> throughput higher than 2000 ops/sec on 90% reads/10% writes 
>>>>> workload.
>>>>>>>> Writes
>>>>>>>> are really fast with auto commit disabled (response within a few 
>>>>> ms),
>>>>>>>> while
>>>>>>>> read latency doesn't go lower than 70 ms in average.
>>>>>>>> 
>>>>>>>> These are some HBase settings I used:
>>>>>>>> 
>>>>>>>> hbase.regionserver.handler.count=50
>>>>>>>> hfile.block.cache.size=0.4
>>>>>>>> hbase.hregion.max.filesize=1073741824
>>>>>>>> hbase.regionserver.codecs=lzo
>>>>>>>> hbase.hregion.memstore.mslab.enabled=true
>>>>>>>> hfile.min.blocksize.size=16384
>>>>>>>> hbase.hregion.memstore.block.multiplier=4
>>>>>>>> hbase.regionserver.global.memstore.upperLimit=0.35
>>>>>>>> hbase.zookeeper.property.maxClientCnxns=100
>>>>>>>> 
>>>>>>>> Which settings do you recommend to look at\tune to speed up 
>>>>> reads with
>>>>>>>> HBase?
>>>>>>>> 
>>>>>>>> --
>>>>>>>> View this message in context:
>>>>>>>> 
>>>>> http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33635226.html
>>>>>>>> Sent from the HBase User mailing list archive at Nabble.com.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> --
>>>>>>> View this message in context: 
>>>>> http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33654666.html
>>>>>>> Sent from the HBase User mailing list archive at Nabble.com.
>>>>>>> 
>>>>> 
>>>>> -- 
>>>>> Jeff Whiting
>>>>> Qualtrics Senior Software Engineer
>>>>> jeffw@qualtrics.com
>>>>> 
>>>> 
>>>> 
>> 
> 

Re: Speeding up HBase read response

Posted by Michael Segel <mi...@hotmail.com>.
Uhm, 
Let's take a look back at the original post:
"I'm confused with a read latency I got, comparing to what YCSB team achieved
and showed in their YCSB paper. They achieved throughput of up to 7000
ops/sec with a latency of 15 ms (page 10, read latency chart). I can't get
throughput higher than 2000 ops/sec on 90% reads/10% writes workload. Writes
are really fast with auto commit disabled (response within a few ms), while
read latency doesn't go lower than 70 ms in average.
"
While it's great to look at how to reduce the read latency, something you will have to consider is that you won't get the same low latency as you would if you had drives local to your node. 

So I have to ask if there's an unrealistic expectation on the part of the OP?

On Apr 12, 2012, at 12:40 AM, Andrew Purtell wrote:

> Hi Otis,
> 
>> You mention "Linux AMI (2012.03.1)", but which AMI is that?  Is this some specific AMI prepared by Amazon? 
> 
> Yes.
> 
> $ ec2-describe-images -a | grep amzn | grep 2012.03.1
> 
> should give you results. Use this as your base, install Hadoop etc on top. 
> 
> I used the pv x86_64 variant and tested the direct attached instance store devices. 
> 
> Unfortunately I'm at the airport now and don't have an instance handy to get you the command output you want.
> 
> For comparison I launched m1.xlarge instances, our usual for testing, all in the same region of us-west-1. They should be roughly comparable. I ran each test three times each with a new instance and warmed up the instance devices with a preliminary FIO run. 
> 
> As you know EC2 isn't really good for performance benchmarking, the variability is quite high. However I did take the basic steps above to try and get a useful (albeit unscientific) result. 
> 
> It would be interesting if someone else finds similar results, or not, as the case may be. 
> 
> Best regards,
> 
>    - Andy
> 
> 
> On Apr 11, 2012, at 2:31 PM, Otis Gospodnetic <ot...@yahoo.com> wrote:
> 
>> Hi Andy,
>> 
>> This email must have caught attention of a number of people...
>> You mention "Linux AMI (2012.03.1)", but which AMI is that?  Is this some specific AMI prepared by Amazon?  Or some AMI that somebody like Cloudera prepared?  Or are you saying it's just "some Linux" AMI that somebody built on 2012-03-01 and that you found in AWS?
>> 
>> Could you please share the outputs of:
>> 
>> $ cat /etc/*release
>> $ uname -a
>> 
>> $ df -T
>> 
>> Also, could it be that your old EC2 instance was unlucky and had a very noisy neighbour, while the new EC2 instance does not?  Not sure how one could run tests to get around this - perhaps by terminating the instance and restarting it a few times in order to get it hosted on different physical hosts?
>> 
>> Thanks,
>> Otis 
>> ----
>> Performance Monitoring SaaS for HBase - http://sematext.com/spm/hbase-performance-monitoring/index.html
>> 
>> 
>> 
>>> ________________________________
>>> From: Andrew Purtell <ap...@apache.org>
>>> To: "user@hbase.apache.org" <us...@hbase.apache.org> 
>>> Cc: Jack Levin <ma...@gmail.com>; "hbase-user@hadoop.apache.org" <hb...@hadoop.apache.org> 
>>> Sent: Tuesday, April 10, 2012 2:14 PM
>>> Subject: Re: Speeding up HBase read response
>>> 
>>> What AMI are you using as your base?
>>> 
>>> I recently started using the new Linux AMI (2012.03.1) and noticed what looks like significant improvement over what I had been using before (2011.02 IIRC). I ran four simple tests repeated three times with FIO: a read bandwidth test, a write bandwidth test, a read IOPS test, and a write IOPS test. The write IOPS test was inconclusive but for the others there was a consistent difference: reduced disk op latency (shorter tail) and increased device bandwidth. I don't run anything in production in EC2 so this was the extent of my curiosity.
>>> 
>>> 
>>> Best regards,
>>> 
>>>    - Andy
>>> 
>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
>>> 
>>> 
>>> 
>>> ----- Original Message -----
>>>> From: Jeff Whiting <je...@qualtrics.com>
>>>> To: user@hbase.apache.org
>>>> Cc: Jack Levin <ma...@gmail.com>; hbase-user@hadoop.apache.org
>>>> Sent: Tuesday, April 10, 2012 11:03 AM
>>>> Subject: Re: Speeding up HBase read response
>>>> 
>>>> Do you have bloom filters enabled?  And compression?  Both of those can help 
>>>> reduce disk io load 
>>>> which seems to be the main issue you are having on the ec2 cluster.
>>>> 
>>>> ~Jeff
>>>> 
>>>> On 4/9/2012 8:28 AM, Jack Levin wrote:
>>>>>  Yes, from  %util you can see that your disks are working at 100%
>>>>>  pretty much.  Which means you can't push them go any faster.   So the
>>>>>  solution is to add more disks, add faster disks, add nodes and disks.
>>>>>  This type of overload should not be related to HBASE, but rather to
>>>>>  your hardware setup.
>>>>> 
>>>>>  -Jack
>>>>> 
>>>>>  On Mon, Apr 9, 2012 at 2:29 AM, ijanitran<ta...@yahoo.com>  wrote:
>>>>>>  Hi, results of iostat are pretty much very similar on all nodes:
>>>>>> 
>>>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>> avgrq-sz
>>>>>>  avgqu-sz   await  svctm  %util
>>>>>>  xvdap1            0.00     0.00  294.00    0.00     9.27     0.00    
>>>> 64.54
>>>>>>  21.97   75.44   3.40 100.10
>>>>>> 
>>>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>> avgrq-sz
>>>>>>  avgqu-sz   await  svctm  %util
>>>>>>  xvdap1            0.00     4.00  286.00    8.00     9.11     0.27    
>>>> 65.33
>>>>>>  7.16 25.32 2.88  84.70
>>>>>> 
>>>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>> avgrq-sz
>>>>>>  avgqu-sz   await  svctm  %util
>>>>>>  xvdap1            0.00     0.00  283.00    0.00     8.29     0.00    
>>>> 59.99
>>>>>>  10.31   35.43   2.97  84.10
>>>>>> 
>>>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>> avgrq-sz
>>>>>>  avgqu-sz   await  svctm  %util
>>>>>>  xvdap1            0.00     0.00  320.00    0.00     9.12     0.00    
>>>> 58.38
>>>>>>  12.32   39.56   2.79  89.40
>>>>>> 
>>>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>> avgrq-sz
>>>>>>  avgqu-sz   await  svctm  %util
>>>>>>  xvdap1            0.00     0.00  336.63    0.00     9.18     0.00    
>>>> 55.84
>>>>>>  10.67   31.42   2.78  93.47
>>>>>> 
>>>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>> avgrq-sz
>>>>>>  avgqu-sz   await  svctm  %util
>>>>>>  xvdap1            0.00     0.00  312.00    0.00    10.00     0.00    
>>>> 65.62
>>>>>>  11.07   35.49   2.91  90.70
>>>>>> 
>>>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>> avgrq-sz
>>>>>>  avgqu-sz   await  svctm  %util
>>>>>>  xvdap1            0.00     0.00  356.00    0.00    10.72     0.00    
>>>> 61.66
>>>>>>  9.38 26.63 2.57  91.40
>>>>>> 
>>>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>> avgrq-sz
>>>>>>  avgqu-sz   await  svctm  %util
>>>>>>  xvdap1            0.00     0.00  258.00    0.00     8.20     0.00    
>>>> 65.05
>>>>>>  13.37   51.24   3.64  93.90
>>>>>> 
>>>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>> avgrq-sz
>>>>>>  avgqu-sz   await  svctm  %util
>>>>>>  xvdap1            0.00     0.00  246.00    0.00     7.31     0.00    
>>>> 60.88
>>>>>>  5.87   24.53   3.14  77.30
>>>>>> 
>>>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>> avgrq-sz
>>>>>>  avgqu-sz   await  svctm  %util
>>>>>>  xvdap1            0.00     2.00  297.00    3.00     9.11     0.02    
>>>> 62.29
>>>>>>  13.02   42.40   3.12  93.60
>>>>>> 
>>>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>> avgrq-sz
>>>>>>  avgqu-sz   await  svctm  %util
>>>>>>  xvdap1            0.00     0.00  292.00    0.00     9.60     0.00    
>>>> 67.32
>>>>>>  11.30   39.51   3.36  98.00
>>>>>> 
>>>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>>> avgrq-sz
>>>>>>  avgqu-sz   await  svctm  %util
>>>>>>  xvdap1            0.00     4.00  261.00    8.00     7.84     0.27    
>>>> 61.74
>>>>>>  16.07   55.72   3.39  91.30
>>>>>> 
>>>>>> 
>>>>>>  Jack Levin wrote:
>>>>>>>  Please email iostat -xdm 1, run for one minute during load on each 
>>>> node
>>>>>>>  --
>>>>>>>  Sent from my Android phone with K-9 Mail. Please excuse my brevity.
>>>>>>> 
>>>>>>>  ijanitran<ta...@yahoo.com>  wrote:
>>>>>>> 
>>>>>>> 
>>>>>>>  I have 4 nodes HBase v0.90.4-cdh3u3 cluster deployed on Amazon 
>>>> XLarge
>>>>>>>  instances (16Gb RAM, 4 cores CPU) with 8Gb heap -Xmx allocated for 
>>>> HRegion
>>>>>>>  servers, 2Gb for datanodes. HMaster\ZK\Namenode is on the 
>>>> separate XLarge
>>>>>>>  instance. Target dataset is 100 millions records (each record is 10 
>>>> fields
>>>>>>>  by 100 bytes). Benchmarking performed concurrently from parallel 
>>>> 100
>>>>>>>  threads.
>>>>>>> 
>>>>>>>  I'm confused with a read latency I got, comparing to what YCSB 
>>>> team
>>>>>>>  achieved
>>>>>>>  and showed in their YCSB paper. They achieved throughput of up to 
>>>> 7000
>>>>>>>  ops/sec with a latency of 15 ms (page 10, read latency chart). I 
>>>> can't get
>>>>>>>  throughput higher than 2000 ops/sec on 90% reads/10% writes 
>>>> workload.
>>>>>>>  Writes
>>>>>>>  are really fast with auto commit disabled (response within a few 
>>>> ms),
>>>>>>>  while
>>>>>>>  read latency doesn't go lower than 70 ms in average.
>>>>>>> 
>>>>>>>  These are some HBase settings I used:
>>>>>>> 
>>>>>>>  hbase.regionserver.handler.count=50
>>>>>>>  hfile.block.cache.size=0.4
>>>>>>>  hbase.hregion.max.filesize=1073741824
>>>>>>>  hbase.regionserver.codecs=lzo
>>>>>>>  hbase.hregion.memstore.mslab.enabled=true
>>>>>>>  hfile.min.blocksize.size=16384
>>>>>>>  hbase.hregion.memstore.block.multiplier=4
>>>>>>>  hbase.regionserver.global.memstore.upperLimit=0.35
>>>>>>>  hbase.zookeeper.property.maxClientCnxns=100
>>>>>>> 
>>>>>>>  Which settings do you recommend to look at\tune to speed up 
>>>> reads with
>>>>>>>  HBase?
>>>>>>> 
>>>>>>>  --
>>>>>>>  View this message in context:
>>>>>>> 
>>>> http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33635226.html
>>>>>>>  Sent from the HBase User mailing list archive at Nabble.com.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>  --
>>>>>>  View this message in context: 
>>>> http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33654666.html
>>>>>>  Sent from the HBase User mailing list archive at Nabble.com.
>>>>>> 
>>>> 
>>>> -- 
>>>> Jeff Whiting
>>>> Qualtrics Senior Software Engineer
>>>> jeffw@qualtrics.com
>>>> 
>>> 
>>> 
> 


Re: Speeding up HBase read response

Posted by Andrew Purtell <ap...@yahoo.com>.
Hi Otis,

> You mention "Linux AMI (2012.03.1)", but which AMI is that?  Is this some specific AMI prepared by Amazon? 

Yes.

$ ec2-describe-images -a | grep amzn | grep 2012.03.1

should give you results. Use this as your base and install Hadoop etc. on top. 

I used the pv x86_64 variant and tested the direct attached instance store devices. 

Unfortunately I'm at the airport now and don't have an instance handy to get you the command output you want.

For comparison I launched m1.xlarge instances, our usual choice for testing, all in the same region, us-west-1. They should be roughly comparable. I ran each test three times, each time with a new instance, and warmed up the instance devices with a preliminary FIO run. 
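
(For anyone wanting to reproduce this, a minimal sketch of a read-IOPS style
fio run against an instance store volume; the flags and the path below are
illustrative, not the exact invocation I used:)

    $ fio --name=randread --filename=/mnt/fio.test --size=2g \
          --ioengine=libaio --direct=1 --rw=randread --bs=4k \
          --numjobs=4 --runtime=60 --time_based --group_reporting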

As you know, EC2 isn't really good for performance benchmarking; the variability is quite high. However, I did take the basic steps above to try and get a useful (albeit unscientific) result. 

It would be interesting if someone else finds similar results, or not, as the case may be. 

Best regards,

    - Andy


On Apr 11, 2012, at 2:31 PM, Otis Gospodnetic <ot...@yahoo.com> wrote:

> Hi Andy,
> 
> This email must have caught attention of a number of people...
> You mention "Linux AMI (2012.03.1)", but which AMI is that?  Is this some specific AMI prepared by Amazon?  Or some AMI that somebody like Cloudera prepared?  Or are you saying it's just "some Linux" AMI that somebody built on 2012-03-01 and that you found in AWS?
> 
> Could you please share the outputs of:
> 
> $ cat /etc/*release
> $ uname -a
> 
> $ df -T
> 
> Also, could it be that your old EC2 instance was unlucky and had a very noisy neighbour, while the new EC2 instance does not?  Not sure how one could run tests to get around this - perhaps by terminating the instance and restarting it a few times in order to get it hosted on different physical hosts?
> 
> Thanks,
> Otis 
> ----
> Performance Monitoring SaaS for HBase - http://sematext.com/spm/hbase-performance-monitoring/index.html
> 
> 
> 
>> ________________________________
>> From: Andrew Purtell <ap...@apache.org>
>> To: "user@hbase.apache.org" <us...@hbase.apache.org> 
>> Cc: Jack Levin <ma...@gmail.com>; "hbase-user@hadoop.apache.org" <hb...@hadoop.apache.org> 
>> Sent: Tuesday, April 10, 2012 2:14 PM
>> Subject: Re: Speeding up HBase read response
>> 
>> What AMI are you using as your base?
>> 
>> I recently started using the new Linux AMI (2012.03.1) and noticed what looks like significant improvement over what I had been using before (2011.02 IIRC). I ran four simple tests repeated three times with FIO: a read bandwidth test, a write bandwidth test, a read IOPS test, and a write IOPS test. The write IOPS test was inconclusive but for the others there was a consistent difference: reduced disk op latency (shorter tail) and increased device bandwidth. I don't run anything in production in EC2 so this was the extent of my curiosity.
>> 
>> 
>> Best regards,
>> 
>>     - Andy
>> 
>> Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
>> 
>> 
>> 
>> ----- Original Message -----
>>> From: Jeff Whiting <je...@qualtrics.com>
>>> To: user@hbase.apache.org
>>> Cc: Jack Levin <ma...@gmail.com>; hbase-user@hadoop.apache.org
>>> Sent: Tuesday, April 10, 2012 11:03 AM
>>> Subject: Re: Speeding up HBase read response
>>> 
>>> Do you have bloom filters enabled?  And compression?  Both of those can help 
>>> reduce disk io load 
>>> which seems to be the main issue you are having on the ec2 cluster.
>>> 
>>> ~Jeff
>>> 
>>> On 4/9/2012 8:28 AM, Jack Levin wrote:
>>>>   Yes, from  %util you can see that your disks are working at 100%
>>>>   pretty much.  Which means you can't push them go any faster.   So the
>>>>   solution is to add more disks, add faster disks, add nodes and disks.
>>>>   This type of overload should not be related to HBASE, but rather to
>>>>   your hardware setup.
>>>> 
>>>>   -Jack
>>>> 
>>>>   On Mon, Apr 9, 2012 at 2:29 AM, ijanitran<ta...@yahoo.com>  wrote:
>>>>>   Hi, results of iostat are pretty much very similar on all nodes:
>>>>> 
>>>>>   Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>> avgrq-sz
>>>>>   avgqu-sz   await  svctm  %util
>>>>>   xvdap1            0.00     0.00  294.00    0.00     9.27     0.00    
>>> 64.54
>>>>>   21.97   75.44   3.40 100.10
>>>>> 
>>>>>   Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>> avgrq-sz
>>>>>   avgqu-sz   await  svctm  %util
>>>>>   xvdap1            0.00     4.00  286.00    8.00     9.11     0.27    
>>> 65.33
>>>>>   7.16 25.32 2.88  84.70
>>>>> 
>>>>>   Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>> avgrq-sz
>>>>>   avgqu-sz   await  svctm  %util
>>>>>   xvdap1            0.00     0.00  283.00    0.00     8.29     0.00    
>>> 59.99
>>>>>   10.31   35.43   2.97  84.10
>>>>> 
>>>>>   Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>> avgrq-sz
>>>>>   avgqu-sz   await  svctm  %util
>>>>>   xvdap1            0.00     0.00  320.00    0.00     9.12     0.00    
>>> 58.38
>>>>>   12.32   39.56   2.79  89.40
>>>>> 
>>>>>   Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>> avgrq-sz
>>>>>   avgqu-sz   await  svctm  %util
>>>>>   xvdap1            0.00     0.00  336.63    0.00     9.18     0.00    
>>> 55.84
>>>>>   10.67   31.42   2.78  93.47
>>>>> 
>>>>>   Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>> avgrq-sz
>>>>>   avgqu-sz   await  svctm  %util
>>>>>   xvdap1            0.00     0.00  312.00    0.00    10.00     0.00    
>>> 65.62
>>>>>   11.07   35.49   2.91  90.70
>>>>> 
>>>>>   Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>> avgrq-sz
>>>>>   avgqu-sz   await  svctm  %util
>>>>>   xvdap1            0.00     0.00  356.00    0.00    10.72     0.00    
>>> 61.66
>>>>>   9.38 26.63 2.57  91.40
>>>>> 
>>>>>   Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>> avgrq-sz
>>>>>   avgqu-sz   await  svctm  %util
>>>>>   xvdap1            0.00     0.00  258.00    0.00     8.20     0.00    
>>> 65.05
>>>>>   13.37   51.24   3.64  93.90
>>>>> 
>>>>>   Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>> avgrq-sz
>>>>>   avgqu-sz   await  svctm  %util
>>>>>   xvdap1            0.00     0.00  246.00    0.00     7.31     0.00    
>>> 60.88
>>>>>   5.87   24.53   3.14  77.30
>>>>> 
>>>>>   Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>> avgrq-sz
>>>>>   avgqu-sz   await  svctm  %util
>>>>>   xvdap1            0.00     2.00  297.00    3.00     9.11     0.02    
>>> 62.29
>>>>>   13.02   42.40   3.12  93.60
>>>>> 
>>>>>   Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>> avgrq-sz
>>>>>   avgqu-sz   await  svctm  %util
>>>>>   xvdap1            0.00     0.00  292.00    0.00     9.60     0.00    
>>> 67.32
>>>>>   11.30   39.51   3.36  98.00
>>>>> 
>>>>>   Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>>> avgrq-sz
>>>>>   avgqu-sz   await  svctm  %util
>>>>>   xvdap1            0.00     4.00  261.00    8.00     7.84     0.27    
>>> 61.74
>>>>>   16.07   55.72   3.39  91.30
>>>>> 
>>>>> 
>>>>>   Jack Levin wrote:
>>>>>>   Please email iostat -xdm 1, run for one minute during load on each 
>>> node
>>>>>>   --
>>>>>>   Sent from my Android phone with K-9 Mail. Please excuse my brevity.
>>>>>> 
>>>>>>   ijanitran<ta...@yahoo.com>  wrote:
>>>>>> 
>>>>>> 
>>>>>>   I have 4 nodes HBase v0.90.4-cdh3u3 cluster deployed on Amazon 
>>> XLarge
>>>>>>   instances (16Gb RAM, 4 cores CPU) with 8Gb heap -Xmx allocated for 
>>> HRegion
>>>>>>   servers, 2Gb for datanodes. HMaster\ZK\Namenode is on the 
>>> separate XLarge
>>>>>>   instance. Target dataset is 100 millions records (each record is 10 
>>> fields
>>>>>>   by 100 bytes). Benchmarking performed concurrently from parallel 
>>> 100
>>>>>>   threads.
>>>>>> 
>>>>>>   I'm confused with a read latency I got, comparing to what YCSB 
>>> team
>>>>>>   achieved
>>>>>>   and showed in their YCSB paper. They achieved throughput of up to 
>>> 7000
>>>>>>   ops/sec with a latency of 15 ms (page 10, read latency chart). I 
>>> can't get
>>>>>>   throughput higher than 2000 ops/sec on 90% reads/10% writes 
>>> workload.
>>>>>>   Writes
>>>>>>   are really fast with auto commit disabled (response within a few 
>>> ms),
>>>>>>   while
>>>>>>   read latency doesn't go lower than 70 ms in average.
>>>>>> 
>>>>>>   These are some HBase settings I used:
>>>>>> 
>>>>>>   hbase.regionserver.handler.count=50
>>>>>>   hfile.block.cache.size=0.4
>>>>>>   hbase.hregion.max.filesize=1073741824
>>>>>>   hbase.regionserver.codecs=lzo
>>>>>>   hbase.hregion.memstore.mslab.enabled=true
>>>>>>   hfile.min.blocksize.size=16384
>>>>>>   hbase.hregion.memstore.block.multiplier=4
>>>>>>   hbase.regionserver.global.memstore.upperLimit=0.35
>>>>>>   hbase.zookeeper.property.maxClientCnxns=100
>>>>>> 
>>>>>>   Which settings do you recommend to look at\tune to speed up 
>>> reads with
>>>>>>   HBase?
>>>>>> 
>>>>>>   --
>>>>>>   View this message in context:
>>>>>> 
>>> http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33635226.html
>>>>>>   Sent from the HBase User mailing list archive at Nabble.com.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>   --
>>>>>   View this message in context: 
>>> http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33654666.html
>>>>>   Sent from the HBase User mailing list archive at Nabble.com.
>>>>> 
>>> 
>>> -- 
>>> Jeff Whiting
>>> Qualtrics Senior Software Engineer
>>> jeffw@qualtrics.com
>>> 
>> 
>> 

Re: Speeding up HBase read response

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hi Andy,

This email must have caught the attention of a number of people...
You mention "Linux AMI (2012.03.1)", but which AMI is that?  Is this some specific AMI prepared by Amazon?  Or some AMI that somebody like Cloudera prepared?  Or are you saying it's just "some Linux" AMI that somebody built on 2012-03-01 and that you found in AWS?

Could you please share the outputs of:

$ cat /etc/*release
$ uname -a

$ df -T

Also, could it be that your old EC2 instance was unlucky and had a very noisy neighbour, while the new EC2 instance does not?  Not sure how one could run tests to get around this - perhaps by terminating the instance and restarting it a few times in order to get it hosted on different physical hosts?

Thanks,
Otis 
----
Performance Monitoring SaaS for HBase - http://sematext.com/spm/hbase-performance-monitoring/index.html



>________________________________
> From: Andrew Purtell <ap...@apache.org>
>To: "user@hbase.apache.org" <us...@hbase.apache.org> 
>Cc: Jack Levin <ma...@gmail.com>; "hbase-user@hadoop.apache.org" <hb...@hadoop.apache.org> 
>Sent: Tuesday, April 10, 2012 2:14 PM
>Subject: Re: Speeding up HBase read response
> 
>What AMI are you using as your base?
>
>I recently started using the new Linux AMI (2012.03.1) and noticed what looks like significant improvement over what I had been using before (2011.02 IIRC). I ran four simple tests repeated three times with FIO: a read bandwidth test, a write bandwidth test, a read IOPS test, and a write IOPS test. The write IOPS test was inconclusive but for the others there was a consistent difference: reduced disk op latency (shorter tail) and increased device bandwidth. I don't run anything in production in EC2 so this was the extent of my curiosity.
>
>
>Best regards,
>
>    - Andy
>
>Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
>
>
>
>----- Original Message -----
>> From: Jeff Whiting <je...@qualtrics.com>
>> To: user@hbase.apache.org
>> Cc: Jack Levin <ma...@gmail.com>; hbase-user@hadoop.apache.org
>> Sent: Tuesday, April 10, 2012 11:03 AM
>> Subject: Re: Speeding up HBase read response
>> 
>> Do you have bloom filters enabled?  And compression?  Both of those can help 
>> reduce disk io load 
>> which seems to be the main issue you are having on the ec2 cluster.
>> 
>> ~Jeff
>> 
>> On 4/9/2012 8:28 AM, Jack Levin wrote:
>>>  Yes, from  %util you can see that your disks are working at 100%
>>>  pretty much.  Which means you can't push them go any faster.   So the
>>>  solution is to add more disks, add faster disks, add nodes and disks.
>>>  This type of overload should not be related to HBASE, but rather to
>>>  your hardware setup.
>>> 
>>>  -Jack
>>> 
>>>  On Mon, Apr 9, 2012 at 2:29 AM, ijanitran<ta...@yahoo.com>  wrote:
>>>>  Hi, results of iostat are pretty much very similar on all nodes:
>>>> 
>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>> avgrq-sz
>>>>  avgqu-sz   await  svctm  %util
>>>>  xvdap1            0.00     0.00  294.00    0.00     9.27     0.00    
>> 64.54
>>>>  21.97   75.44   3.40 100.10
>>>> 
>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>> avgrq-sz
>>>>  avgqu-sz   await  svctm  %util
>>>>  xvdap1            0.00     4.00  286.00    8.00     9.11     0.27    
>> 65.33
>>>>  7.16 25.32 2.88  84.70
>>>> 
>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>> avgrq-sz
>>>>  avgqu-sz   await  svctm  %util
>>>>  xvdap1            0.00     0.00  283.00    0.00     8.29     0.00    
>> 59.99
>>>>  10.31   35.43   2.97  84.10
>>>> 
>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>> avgrq-sz
>>>>  avgqu-sz   await  svctm  %util
>>>>  xvdap1            0.00     0.00  320.00    0.00     9.12     0.00    
>> 58.38
>>>>  12.32   39.56   2.79  89.40
>>>> 
>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>> avgrq-sz
>>>>  avgqu-sz   await  svctm  %util
>>>>  xvdap1            0.00     0.00  336.63    0.00     9.18     0.00    
>> 55.84
>>>>  10.67   31.42   2.78  93.47
>>>> 
>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>> avgrq-sz
>>>>  avgqu-sz   await  svctm  %util
>>>>  xvdap1            0.00     0.00  312.00    0.00    10.00     0.00    
>> 65.62
>>>>  11.07   35.49   2.91  90.70
>>>> 
>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>> avgrq-sz
>>>>  avgqu-sz   await  svctm  %util
>>>>  xvdap1            0.00     0.00  356.00    0.00    10.72     0.00    
>> 61.66
>>>>  9.38 26.63 2.57  91.40
>>>> 
>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>> avgrq-sz
>>>>  avgqu-sz   await  svctm  %util
>>>>  xvdap1            0.00     0.00  258.00    0.00     8.20     0.00    
>> 65.05
>>>>  13.37   51.24   3.64  93.90
>>>> 
>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>> avgrq-sz
>>>>  avgqu-sz   await  svctm  %util
>>>>  xvdap1            0.00     0.00  246.00    0.00     7.31     0.00    
>> 60.88
>>>>  5.87   24.53   3.14  77.30
>>>> 
>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>> avgrq-sz
>>>>  avgqu-sz   await  svctm  %util
>>>>  xvdap1            0.00     2.00  297.00    3.00     9.11     0.02    
>> 62.29
>>>>  13.02   42.40   3.12  93.60
>>>> 
>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>> avgrq-sz
>>>>  avgqu-sz   await  svctm  %util
>>>>  xvdap1            0.00     0.00  292.00    0.00     9.60     0.00    
>> 67.32
>>>>  11.30   39.51   3.36  98.00
>>>> 
>>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
>> avgrq-sz
>>>>  avgqu-sz   await  svctm  %util
>>>>  xvdap1            0.00     4.00  261.00    8.00     7.84     0.27    
>> 61.74
>>>>  16.07   55.72   3.39  91.30
>>>> 
>>>> 
>>>>  Jack Levin wrote:
>>>>>  Please email iostat -xdm 1, run for one minute during load on each 
>> node
>>>>>  --
>>>>>  Sent from my Android phone with K-9 Mail. Please excuse my brevity.
>>>>> 
>>>>>  ijanitran<ta...@yahoo.com>  wrote:
>>>>> 
>>>>> 
>>>>>  I have 4 nodes HBase v0.90.4-cdh3u3 cluster deployed on Amazon 
>> XLarge
>>>>>  instances (16Gb RAM, 4 cores CPU) with 8Gb heap -Xmx allocated for 
>> HRegion
>>>>>  servers, 2Gb for datanodes. HMaster\ZK\Namenode is on the 
>> separate XLarge
>>>>>  instance. Target dataset is 100 millions records (each record is 10 
>> fields
>>>>>  by 100 bytes). Benchmarking performed concurrently from parallel 
>> 100
>>>>>  threads.
>>>>> 
>>>>>  I'm confused with a read latency I got, comparing to what YCSB 
>> team
>>>>>  achieved
>>>>>  and showed in their YCSB paper. They achieved throughput of up to 
>> 7000
>>>>>  ops/sec with a latency of 15 ms (page 10, read latency chart). I 
>> can't get
>>>>>  throughput higher than 2000 ops/sec on 90% reads/10% writes 
>> workload.
>>>>>  Writes
>>>>>  are really fast with auto commit disabled (response within a few 
>> ms),
>>>>>  while
>>>>>  read latency doesn't go lower than 70 ms in average.
>>>>> 
>>>>>  These are some HBase settings I used:
>>>>> 
>>>>>  hbase.regionserver.handler.count=50
>>>>>  hfile.block.cache.size=0.4
>>>>>  hbase.hregion.max.filesize=1073741824
>>>>>  hbase.regionserver.codecs=lzo
>>>>>  hbase.hregion.memstore.mslab.enabled=true
>>>>>  hfile.min.blocksize.size=16384
>>>>>  hbase.hregion.memstore.block.multiplier=4
>>>>>  hbase.regionserver.global.memstore.upperLimit=0.35
>>>>>  hbase.zookeeper.property.maxClientCnxns=100
>>>>> 
>>>>>  Which settings do you recommend to look at\tune to speed up 
>> reads with
>>>>>  HBase?
>>>>> 
>>>>>  --
>>>>>  View this message in context:
>>>>> 
>> http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33635226.html
>>>>>  Sent from the HBase User mailing list archive at Nabble.com.
>>>>> 
>>>>> 
>>>>> 
>>>>  --
>>>>  View this message in context: 
>> http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33654666.html
>>>>  Sent from the HBase User mailing list archive at Nabble.com.
>>>> 
>> 
>> -- 
>> Jeff Whiting
>> Qualtrics Senior Software Engineer
>> jeffw@qualtrics.com
>>
>
>
>

Re: Speeding up HBase read response

Posted by Andrew Purtell <ap...@apache.org>.
What AMI are you using as your base?

I recently started using the new Linux AMI (2012.03.1) and noticed what looks like significant improvement over what I had been using before (2011.02 IIRC). I ran four simple tests repeated three times with FIO: a read bandwidth test, a write bandwidth test, a read IOPS test, and a write IOPS test. The write IOPS test was inconclusive but for the others there was a consistent difference: reduced disk op latency (shorter tail) and increased device bandwidth. I don't run anything in production in EC2 so this was the extent of my curiosity.


Best regards,

    - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)



----- Original Message -----
> From: Jeff Whiting <je...@qualtrics.com>
> To: user@hbase.apache.org
> Cc: Jack Levin <ma...@gmail.com>; hbase-user@hadoop.apache.org
> Sent: Tuesday, April 10, 2012 11:03 AM
> Subject: Re: Speeding up HBase read response
> 
> Do you have bloom filters enabled?  And compression?  Both of those can help 
> reduce disk io load 
> which seems to be the main issue you are having on the ec2 cluster.
> 
> ~Jeff
> 
> On 4/9/2012 8:28 AM, Jack Levin wrote:
>>  Yes, from  %util you can see that your disks are working at 100%
>>  pretty much.  Which means you can't push them go any faster.   So the
>>  solution is to add more disks, add faster disks, add nodes and disks.
>>  This type of overload should not be related to HBASE, but rather to
>>  your hardware setup.
>> 
>>  -Jack
>> 
>>  On Mon, Apr 9, 2012 at 2:29 AM, ijanitran<ta...@yahoo.com>  wrote:
>>>  Hi, results of iostat are pretty much very similar on all nodes:
>>> 
>>>  Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>>  xvdap1            0.00     0.00  294.00    0.00     9.27     0.00    64.54    21.97   75.44   3.40 100.10
>>>  xvdap1            0.00     4.00  286.00    8.00     9.11     0.27    65.33     7.16   25.32   2.88  84.70
>>>  xvdap1            0.00     0.00  283.00    0.00     8.29     0.00    59.99    10.31   35.43   2.97  84.10
>>>  xvdap1            0.00     0.00  320.00    0.00     9.12     0.00    58.38    12.32   39.56   2.79  89.40
>>>  xvdap1            0.00     0.00  336.63    0.00     9.18     0.00    55.84    10.67   31.42   2.78  93.47
>>>  xvdap1            0.00     0.00  312.00    0.00    10.00     0.00    65.62    11.07   35.49   2.91  90.70
>>>  xvdap1            0.00     0.00  356.00    0.00    10.72     0.00    61.66     9.38   26.63   2.57  91.40
>>>  xvdap1            0.00     0.00  258.00    0.00     8.20     0.00    65.05    13.37   51.24   3.64  93.90
>>>  xvdap1            0.00     0.00  246.00    0.00     7.31     0.00    60.88     5.87   24.53   3.14  77.30
>>>  xvdap1            0.00     2.00  297.00    3.00     9.11     0.02    62.29    13.02   42.40   3.12  93.60
>>>  xvdap1            0.00     0.00  292.00    0.00     9.60     0.00    67.32    11.30   39.51   3.36  98.00
>>>  xvdap1            0.00     4.00  261.00    8.00     7.84     0.27    61.74    16.07   55.72   3.39  91.30
>>> 
>>> 
>>>  Jack Levin wrote:
>>>>  Please email iostat -xdm 1, run for one minute during load on each node
>>>>  --
>>>>  Sent from my Android phone with K-9 Mail. Please excuse my brevity.
>>>> 
>>>>  ijanitran<ta...@yahoo.com>  wrote:
>>>> 
>>>> 
>>>>  I have 4 nodes HBase v0.90.4-cdh3u3 cluster deployed on Amazon XLarge
>>>>  instances (16Gb RAM, 4 cores CPU) with 8Gb heap -Xmx allocated for HRegion
>>>>  servers, 2Gb for datanodes. HMaster\ZK\Namenode is on the separate XLarge
>>>>  instance. Target dataset is 100 millions records (each record is 10 fields
>>>>  by 100 bytes). Benchmarking performed concurrently from parallel 100
>>>>  threads.
>>>> 
>>>>  I'm confused with a read latency I got, comparing to what YCSB team achieved
>>>>  and showed in their YCSB paper. They achieved throughput of up to 7000
>>>>  ops/sec with a latency of 15 ms (page 10, read latency chart). I can't get
>>>>  throughput higher than 2000 ops/sec on 90% reads/10% writes workload. Writes
>>>>  are really fast with auto commit disabled (response within a few ms), while
>>>>  read latency doesn't go lower than 70 ms in average.
>>>> 
>>>>  These are some HBase settings I used:
>>>> 
>>>>  hbase.regionserver.handler.count=50
>>>>  hfile.block.cache.size=0.4
>>>>  hbase.hregion.max.filesize=1073741824
>>>>  hbase.regionserver.codecs=lzo
>>>>  hbase.hregion.memstore.mslab.enabled=true
>>>>  hfile.min.blocksize.size=16384
>>>>  hbase.hregion.memstore.block.multiplier=4
>>>>  hbase.regionserver.global.memstore.upperLimit=0.35
>>>>  hbase.zookeeper.property.maxClientCnxns=100
>>>> 
>>>>  Which settings do you recommend to look at\tune to speed up reads with
>>>>  HBase?
>>>> 
>>>>  --
>>>>  View this message in context:
>>>>  http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33635226.html
>>>>  Sent from the HBase User mailing list archive at Nabble.com.
>>>> 
>>>> 
>>>> 
>>>  --
>>>  View this message in context:
>>>  http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33654666.html
>>>  Sent from the HBase User mailing list archive at Nabble.com.
>>> 
> 
> -- 
> Jeff Whiting
> Qualtrics Senior Software Engineer
> jeffw@qualtrics.com
> 

Re: Speeding up HBase read response

Posted by Jeff Whiting <je...@qualtrics.com>.
Do you have bloom filters enabled?  And compression?  Both of those can help reduce disk io load 
which seems to be the main issue you are having on the ec2 cluster.
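
If they aren't set, both can be turned on per column family from the HBase shell; roughly something like the following (the table and family names here are just placeholders for whatever your benchmark uses, and on 0.90 the table has to be disabled before altering it):

    disable 'usertable'
    alter 'usertable', {NAME => 'family', BLOOMFILTER => 'ROW', COMPRESSION => 'LZO'}
    enable 'usertable'
    major_compact 'usertable'

The major compaction rewrites the existing HFiles so the bloom filters and compression actually apply to data that is already on disk.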

~Jeff

On 4/9/2012 8:28 AM, Jack Levin wrote:
> Yes, from %util you can see that your disks are working at pretty much
> 100%, which means you can't push them to go any faster. So the solution
> is to add more disks, add faster disks, or add more nodes (and disks).
> This type of overload is not really an HBase issue, but rather a
> limitation of your hardware setup.
>
> -Jack
>
> On Mon, Apr 9, 2012 at 2:29 AM, ijanitran<ta...@yahoo.com>  wrote:
>> Hi, results of iostat are pretty much very similar on all nodes:
>>
>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>> xvdap1            0.00     0.00  294.00    0.00     9.27     0.00    64.54    21.97   75.44   3.40 100.10
>> xvdap1            0.00     4.00  286.00    8.00     9.11     0.27    65.33     7.16   25.32   2.88  84.70
>> xvdap1            0.00     0.00  283.00    0.00     8.29     0.00    59.99    10.31   35.43   2.97  84.10
>> xvdap1            0.00     0.00  320.00    0.00     9.12     0.00    58.38    12.32   39.56   2.79  89.40
>> xvdap1            0.00     0.00  336.63    0.00     9.18     0.00    55.84    10.67   31.42   2.78  93.47
>> xvdap1            0.00     0.00  312.00    0.00    10.00     0.00    65.62    11.07   35.49   2.91  90.70
>> xvdap1            0.00     0.00  356.00    0.00    10.72     0.00    61.66     9.38   26.63   2.57  91.40
>> xvdap1            0.00     0.00  258.00    0.00     8.20     0.00    65.05    13.37   51.24   3.64  93.90
>> xvdap1            0.00     0.00  246.00    0.00     7.31     0.00    60.88     5.87   24.53   3.14  77.30
>> xvdap1            0.00     2.00  297.00    3.00     9.11     0.02    62.29    13.02   42.40   3.12  93.60
>> xvdap1            0.00     0.00  292.00    0.00     9.60     0.00    67.32    11.30   39.51   3.36  98.00
>> xvdap1            0.00     4.00  261.00    8.00     7.84     0.27    61.74    16.07   55.72   3.39  91.30
>>
>>
>> Jack Levin wrote:
>>> Please email iostat -xdm 1, run for one minute during load on each node
>>> --
>>> Sent from my Android phone with K-9 Mail. Please excuse my brevity.
>>>
>>> ijanitran<ta...@yahoo.com>  wrote:
>>>
>>>
>>> I have 4 nodes HBase v0.90.4-cdh3u3 cluster deployed on Amazon XLarge
>>> instances (16Gb RAM, 4 cores CPU) with 8Gb heap -Xmx allocated for HRegion
>>> servers, 2Gb for datanodes. HMaster\ZK\Namenode is on the separate XLarge
>>> instance. Target dataset is 100 millions records (each record is 10 fields
>>> by 100 bytes). Benchmarking performed concurrently from parallel 100
>>> threads.
>>>
>>> I'm confused with a read latency I got, comparing to what YCSB team
>>> achieved
>>> and showed in their YCSB paper. They achieved throughput of up to 7000
>>> ops/sec with a latency of 15 ms (page 10, read latency chart). I can't get
>>> throughput higher than 2000 ops/sec on 90% reads/10% writes workload.
>>> Writes
>>> are really fast with auto commit disabled (response within a few ms),
>>> while
>>> read latency doesn't go lower than 70 ms in average.
>>>
>>> These are some HBase settings I used:
>>>
>>> hbase.regionserver.handler.count=50
>>> hfile.block.cache.size=0.4
>>> hbase.hregion.max.filesize=1073741824
>>> hbase.regionserver.codecs=lzo
>>> hbase.hregion.memstore.mslab.enabled=true
>>> hfile.min.blocksize.size=16384
>>> hbase.hregion.memstore.block.multiplier=4
>>> hbase.regionserver.global.memstore.upperLimit=0.35
>>> hbase.zookeeper.property.maxClientCnxns=100
>>>
>>> Which settings do you recommend to look at\tune to speed up reads with
>>> HBase?
>>>
>>> --
>>> View this message in context:
>>> http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33635226.html
>>> Sent from the HBase User mailing list archive at Nabble.com.
>>>
>>>
>>>
>> --
>> View this message in context: http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33654666.html
>> Sent from the HBase User mailing list archive at Nabble.com.
>>

-- 
Jeff Whiting
Qualtrics Senior Software Engineer
jeffw@qualtrics.com


Re: Speeding up HBase read response

Posted by Jack Levin <ma...@gmail.com>.
Yes, from %util you can see that your disks are working at pretty much
100%, which means you can't push them to go any faster. So the solution
is to add more disks, add faster disks, or add more nodes (and disks).
This type of overload is not really an HBase issue, but rather a
limitation of your hardware setup.
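
On EC2 that usually means attaching several EBS volumes (or using the ephemeral instance-store drives) on each node and pointing the datanode at all of them, for example something like this in hdfs-site.xml (the mount points are just examples):

    dfs.data.dir=/mnt/vol1/dfs/data,/mnt/vol2/dfs/data,/mnt/vol3/dfs/data,/mnt/vol4/dfs/data

HDFS spreads new blocks across the listed directories, so the read load gets distributed over more spindles instead of hammering a single device.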

-Jack

On Mon, Apr 9, 2012 at 2:29 AM, ijanitran <ta...@yahoo.com> wrote:
>
> Hi, results of iostat are pretty much very similar on all nodes:
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
> xvdap1            0.00     0.00  294.00    0.00     9.27     0.00    64.54    21.97   75.44   3.40 100.10
> xvdap1            0.00     4.00  286.00    8.00     9.11     0.27    65.33     7.16   25.32   2.88  84.70
> xvdap1            0.00     0.00  283.00    0.00     8.29     0.00    59.99    10.31   35.43   2.97  84.10
> xvdap1            0.00     0.00  320.00    0.00     9.12     0.00    58.38    12.32   39.56   2.79  89.40
> xvdap1            0.00     0.00  336.63    0.00     9.18     0.00    55.84    10.67   31.42   2.78  93.47
> xvdap1            0.00     0.00  312.00    0.00    10.00     0.00    65.62    11.07   35.49   2.91  90.70
> xvdap1            0.00     0.00  356.00    0.00    10.72     0.00    61.66     9.38   26.63   2.57  91.40
> xvdap1            0.00     0.00  258.00    0.00     8.20     0.00    65.05    13.37   51.24   3.64  93.90
> xvdap1            0.00     0.00  246.00    0.00     7.31     0.00    60.88     5.87   24.53   3.14  77.30
> xvdap1            0.00     2.00  297.00    3.00     9.11     0.02    62.29    13.02   42.40   3.12  93.60
> xvdap1            0.00     0.00  292.00    0.00     9.60     0.00    67.32    11.30   39.51   3.36  98.00
> xvdap1            0.00     4.00  261.00    8.00     7.84     0.27    61.74    16.07   55.72   3.39  91.30
>
>
> Jack Levin wrote:
>>
>> Please email iostat -xdm 1, run for one minute during load on each node
>> --
>> Sent from my Android phone with K-9 Mail. Please excuse my brevity.
>>
>> ijanitran <ta...@yahoo.com> wrote:
>>
>>
>> I have 4 nodes HBase v0.90.4-cdh3u3 cluster deployed on Amazon XLarge
>> instances (16Gb RAM, 4 cores CPU) with 8Gb heap -Xmx allocated for HRegion
>> servers, 2Gb for datanodes. HMaster\ZK\Namenode is on the separate XLarge
>> instance. Target dataset is 100 millions records (each record is 10 fields
>> by 100 bytes). Benchmarking performed concurrently from parallel 100
>> threads.
>>
>> I'm confused with a read latency I got, comparing to what YCSB team
>> achieved
>> and showed in their YCSB paper. They achieved throughput of up to 7000
>> ops/sec with a latency of 15 ms (page 10, read latency chart). I can't get
>> throughput higher than 2000 ops/sec on 90% reads/10% writes workload.
>> Writes
>> are really fast with auto commit disabled (response within a few ms),
>> while
>> read latency doesn't go lower than 70 ms in average.
>>
>> These are some HBase settings I used:
>>
>> hbase.regionserver.handler.count=50
>> hfile.block.cache.size=0.4
>> hbase.hregion.max.filesize=1073741824
>> hbase.regionserver.codecs=lzo
>> hbase.hregion.memstore.mslab.enabled=true
>> hfile.min.blocksize.size=16384
>> hbase.hregion.memstore.block.multiplier=4
>> hbase.regionserver.global.memstore.upperLimit=0.35
>> hbase.zookeeper.property.maxClientCnxns=100
>>
>> Which settings do you recommend to look at\tune to speed up reads with
>> HBase?
>>
>> --
>> View this message in context:
>> http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33635226.html
>> Sent from the HBase User mailing list archive at Nabble.com.
>>
>>
>>
>
> --
> View this message in context: http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33654666.html
> Sent from the HBase User mailing list archive at Nabble.com.
>

Re: Speeding up HBase read response

Posted by ijanitran <ta...@yahoo.com>.
Hi, results of iostat are pretty much very similar on all nodes:

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
xvdap1            0.00     0.00  294.00    0.00     9.27     0.00    64.54    21.97   75.44   3.40 100.10
xvdap1            0.00     4.00  286.00    8.00     9.11     0.27    65.33     7.16   25.32   2.88  84.70
xvdap1            0.00     0.00  283.00    0.00     8.29     0.00    59.99    10.31   35.43   2.97  84.10
xvdap1            0.00     0.00  320.00    0.00     9.12     0.00    58.38    12.32   39.56   2.79  89.40
xvdap1            0.00     0.00  336.63    0.00     9.18     0.00    55.84    10.67   31.42   2.78  93.47
xvdap1            0.00     0.00  312.00    0.00    10.00     0.00    65.62    11.07   35.49   2.91  90.70
xvdap1            0.00     0.00  356.00    0.00    10.72     0.00    61.66     9.38   26.63   2.57  91.40
xvdap1            0.00     0.00  258.00    0.00     8.20     0.00    65.05    13.37   51.24   3.64  93.90
xvdap1            0.00     0.00  246.00    0.00     7.31     0.00    60.88     5.87   24.53   3.14  77.30
xvdap1            0.00     2.00  297.00    3.00     9.11     0.02    62.29    13.02   42.40   3.12  93.60
xvdap1            0.00     0.00  292.00    0.00     9.60     0.00    67.32    11.30   39.51   3.36  98.00
xvdap1            0.00     4.00  261.00    8.00     7.84     0.27    61.74    16.07   55.72   3.39  91.30


Jack Levin wrote:
> 
> Please email iostat -xdm 1, run for one minute during load on each node
> -- 
> Sent from my Android phone with K-9 Mail. Please excuse my brevity.
> 
> ijanitran <ta...@yahoo.com> wrote:
> 
> 
> I have 4 nodes HBase v0.90.4-cdh3u3 cluster deployed on Amazon XLarge
> instances (16Gb RAM, 4 cores CPU) with 8Gb heap -Xmx allocated for HRegion
> servers, 2Gb for datanodes. HMaster\ZK\Namenode is on the separate XLarge
> instance. Target dataset is 100 millions records (each record is 10 fields
> by 100 bytes). Benchmarking performed concurrently from parallel 100
> threads.
> 
> I'm confused with a read latency I got, comparing to what YCSB team
> achieved
> and showed in their YCSB paper. They achieved throughput of up to 7000
> ops/sec with a latency of 15 ms (page 10, read latency chart). I can't get
> throughput higher than 2000 ops/sec on 90% reads/10% writes workload.
> Writes
> are really fast with auto commit disabled (response within a few ms),
> while
> read latency doesn't go lower than 70 ms in average.
> 
> These are some HBase settings I used:
> 
> hbase.regionserver.handler.count=50
> hfile.block.cache.size=0.4
> hbase.hregion.max.filesize=1073741824
> hbase.regionserver.codecs=lzo
> hbase.hregion.memstore.mslab.enabled=true
> hfile.min.blocksize.size=16384
> hbase.hregion.memstore.block.multiplier=4
> hbase.regionserver.global.memstore.upperLimit=0.35
> hbase.zookeeper.property.maxClientCnxns=100 
> 
> Which settings do you recommend to look at\tune to speed up reads with
> HBase?
> 
> -- 
> View this message in context:
> http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33635226.html
> Sent from the HBase User mailing list archive at Nabble.com.
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33654666.html
Sent from the HBase User mailing list archive at Nabble.com.


Re: Speeding up HBase read response

Posted by Jack Levin <ma...@gmail.com>.
Please email iostat -xdm 1, run for one minute during load on each node
-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

ijanitran <ta...@yahoo.com> wrote:


I have 4 nodes HBase v0.90.4-cdh3u3 cluster deployed on Amazon XLarge
instances (16Gb RAM, 4 cores CPU) with 8Gb heap -Xmx allocated for HRegion
servers, 2Gb for datanodes. HMaster\ZK\Namenode is on the separate XLarge
instance. Target dataset is 100 millions records (each record is 10 fields
by 100 bytes). Benchmarking performed concurrently from parallel 100
threads.

I'm confused with a read latency I got, comparing to what YCSB team achieved
and showed in their YCSB paper. They achieved throughput of up to 7000
ops/sec with a latency of 15 ms (page 10, read latency chart). I can't get
throughput higher than 2000 ops/sec on 90% reads/10% writes workload. Writes
are really fast with auto commit disabled (response within a few ms), while
read latency doesn't go lower than 70 ms in average.

These are some HBase settings I used:

hbase.regionserver.handler.count=50
hfile.block.cache.size=0.4
hbase.hregion.max.filesize=1073741824
hbase.regionserver.codecs=lzo
hbase.hregion.memstore.mslab.enabled=true
hfile.min.blocksize.size=16384
hbase.hregion.memstore.block.multiplier=4
hbase.regionserver.global.memstore.upperLimit=0.35
hbase.zookeeper.property.maxClientCnxns=100 

Which settings do you recommend to look at\tune to speed up reads with
HBase?

-- 
View this message in context: http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33635226.html
Sent from the HBase User mailing list archive at Nabble.com.