Posted to user@hbase.apache.org by Asaf Mesika <as...@gmail.com> on 2013/11/01 09:27:01 UTC

Re: HBase Random Read latency > 100ms

How many parallel GC threads were you using?
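
For reference, a minimal sketch of where that gets set - hbase-env.sh, with
placeholder values rather than a recommendation:

    # hbase-env.sh - illustrative GC settings only; tune to your heap and core count
    export HBASE_OPTS="$HBASE_OPTS -XX:+UseParNewGC -XX:+UseConcMarkSweepGC"
    export HBASE_OPTS="$HBASE_OPTS -XX:ParallelGCThreads=8"  # 8 is a placeholder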

Regarding block cache - just to check that I understood this right: if you are
doing a massive read in HBase, it's better to turn off block caching through
the Scan attribute?
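
For the archives, a minimal sketch of what I mean (0.94-era client API; the
table name and caching value are illustrative):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;

    public class NoCacheScan {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable");  // hypothetical table name
        Scan scan = new Scan();
        scan.setCacheBlocks(false);  // keep one-off scan data out of the block cache
        scan.setCaching(1000);       // rows fetched per RPC; value is illustrative
        ResultScanner scanner = table.getScanner(scan);
        try {
          for (Result r : scanner) {
            // process r ...
          }
        } finally {
          scanner.close();
          table.close();
        }
      }
    }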

On Thursday, October 10, 2013, Otis Gospodnetic wrote:

> Hi Ramu,
>
> I think I saw mentions of this possibly being a GC issue.... though
> now it seems it may be a disk IO issue?
>
> 3 things:
> 1) http://blog.sematext.com/2013/06/24/g1-cms-java-garbage-collector/
> - our G1 experience, with HBase specifically
> 2) If you can share some of your performance graphs (GC, disk IO, JVM
> memory pools, HBase specific ones, etc.) people will likely be able to
> provide better help
> 3) You can do 2) with SPM (see sig), and actually you can send email
> to this ML with your graphs directly from SPM. :)
>
> Otis
> --
> Solr & ElasticSearch Support -- http://sematext.com/
> Performance Monitoring -- http://sematext.com/spm
>
>
>
> On Wed, Oct 9, 2013 at 3:11 AM, Ramu M S <ra...@gmail.com> wrote:
> > Hi All,
> >
> > Sorry, there was a mistake in the tests (clients were not reduced; I
> > forgot to change the parameter before running the tests).
> >
> > With 8 Clients and,
> >
> > SCR Enabled : Average Latency is 25 ms, IO Wait % is around 8
> > SCR Disabled: Average Latency is 10 ms, IO Wait % is around 2
> >
> > Still, SCR disabled gives better results, which confuses me. Can anyone
> > clarify?
> >
> > Also, I tried setting the parameter Lars suggested
> > (hbase.regionserver.checksum.verify = true) with SCR disabled.
> > Average latency is around 9.8 ms, a fraction lower.
> >
> > Thanks
> > Ramu
> >
> >
> > On Wed, Oct 9, 2013 at 3:32 PM, Ramu M S <ra...@gmail.com> wrote:
> >
> >> Hi All,
> >>
> >> I just ran only 8 parallel clients,
> >>
> >> With SCR Enabled : Average Latency is 80 ms, IO Wait % is around 8
> >> With SCR Disabled: Average Latency is 40 ms, IO Wait % is around 2
> >>
> >> I always thought that SCR, when enabled, allows a client co-located with
> >> the DataNode to read HDFS file blocks directly, which gives a performance
> >> boost to distributed clients that are aware of locality.
> >>
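> >> For context, "SCR enabled" here means something like this in
> >> hdfs-site.xml (Hadoop 2.x-style short-circuit reads; the socket path is
> >> site-specific):
> >>
> >>   <property>
> >>     <name>dfs.client.read.shortcircuit</name>
> >>     <value>true</value>
> >>   </property>
> >>   <property>
> >>     <name>dfs.domain.socket.path</name>
> >>     <value>/var/lib/hadoop-hdfs/dn_socket</value>
> >>   </property>
> >>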
> >> Is my understanding wrong, or does it not apply to my scenario?
> >>
> >> Meanwhile, I will try setting the parameter suggested by Lars and post
> >> the results.
> >>
> >> Thanks,
> >> Ramu
> >>
> >>
> >> On Wed, Oct 9, 2013 at 2:29 PM, lars hofhansl <la...@apache.org> wrote:
> >>
> >>> Good call.
> >>> You could try enabling hbase.regionserver.checksum.verify, which causes
> >>> HBase to do its own checksums rather than relying on HDFS (and saves
> >>> 1 IO per block get).
> >>>
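> >>> A sketch of the hbase-site.xml entry I mean (assuming an HBase recent
> >>> enough to write its own block checksums):
> >>>
> >>>   <property>
> >>>     <name>hbase.regionserver.checksum.verify</name>
> >>>     <value>true</value>
> >>>   </property>
> >>>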
> >>> I do think you can expect the index blocks to be cached at all times.
> >>>
> >>> -- Lars
> >>> ________________________________
> >>> From: Vladimir Rodionov <vr...@carrieriq.com>
> >>> To: "user@hbase.apache.org" <us...@hbase.apache.org>
> >>> Sent: Tuesday, October 8, 2013 8:44 PM
> >>> Subject: RE: HBase Random Read latency > 100ms
> >>>
> >>>
> >>> Upd.
> >>>
> >>> Each HBase Get = 2 HDFS read IOs (index block + data block) = 4 file IOs
> >>> (data + .crc) in the worst case. I think that if Bloom filters are
> >>> enabled, it is going to be 6 file IOs in the worst case (large data set),
> >>> so you will have not 5 IO requests in the queue but up to 20-30.
> >>> This definitely explains the > 100ms avg latency.
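> >>>
> >>> A back-of-the-envelope sketch of that arithmetic (all numbers taken
> >>> from this thread; purely illustrative):
> >>>
> >>>   public class WorstCaseIo {
> >>>     public static void main(String[] args) {
> >>>       int getsPerNode = 40 / 8;        // 40 parallel requests over 8 RS/DN boxes
> >>>       int hdfsReadsPerGet = 2;         // index block + data block
> >>>       int fileIosPerHdfsRead = 2;      // data file + .crc file
> >>>       int bloomFileIos = 2;            // bloom block + its .crc, worst case
> >>>       int fileIosPerGet = hdfsReadsPerGet * fileIosPerHdfsRead + bloomFileIos; // 6
> >>>       System.out.println("queued file IOs per disk: "
> >>>           + getsPerNode * fileIosPerGet);  // 30, matching the 20-30 estimate
> >>>     }
> >>>   }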
> >>>
> >>>
> >>>
> >>> Best regards,
> >>> Vladimir Rodionov
> >>> Principal Platform Engineer
> >>> Carrier IQ, www.carrieriq.com
> >>> e-mail: vrodionov@carrieriq.com
> >>>
> >>> ________________________________________
> >>>
> >>> From: Vladimir Rodionov
> >>> Sent: Tuesday, October 08, 2013 7:24 PM
> >>> To: user@hbase.apache.org
> >>> Subject: RE: HBase Random Read latency > 100ms
> >>>
> >>> Ramu,
> >>>
> >>> You have 8 server boxes and 10 clients. You have 40 requests in
> >>> parallel - 5 per RS/DN?
> >>>
> >>> You have 5 random-read requests in the IO queue of your single RAID1.
> >>> With an avg read latency of 10 ms, 5 requests in the queue will give us
> >>> ~30 ms on average. Add some overhead from HDFS + HBase and you will
> >>> probably have your issue explained?
> >>>
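> >>> A tiny sketch of that queueing arithmetic (assumed FIFO queue, numbers
> >>> from above):
> >>>
> >>>   public class QueueLatency {
> >>>     public static void main(String[] args) {
> >>>       int queueDepth = 5;     // concurrent random reads per disk
> >>>       double readMs = 10.0;   // avg latency of a single disk read
> >>>       double total = 0;
> >>>       for (int pos = 1; pos <= queueDepth; pos++) {
> >>>         total += pos * readMs;  // the request at position pos waits for pos reads
> >>>       }
> >>>       System.out.println("avg latency: " + total / queueDepth + " ms"); // 30.0
> >>>     }
> >>>   }
> >>>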
> >>> Your bottleneck is your disk system, I think. When you serve most of
>