Posted to user@hbase.apache.org by Dmitriy Lyubimov <dl...@apache.org> on 2011/04/20 01:46:06 UTC

0.90 latency performance, cdh3b4

Hi,

I would like to see how I can attack HBase performance.

Right now I am shooting scans returning between 3 and 40 rows and,
regardless of data size, getting approximately 400-500 QPS. The data tables
are almost empty and in-memory, so they should easily fit in the 40% of
heap dedicated to them.

My local 1-node test shows read times between 1 and 2 ms. Great.

As soon as I go to our 10-node cluster, the response times jump to
25ms per scan, regardless of the number of records. I set the scan caching
to 100 (rows?), otherwise I was getting outrageous numbers reaching as
far out as 300-400ms.

It's my understanding that the timing should still be much closer
to my local tests than to 25ms.

So... how do I attack this? Increase the regionserver handler count? What
latency should I be able to reach for extremely small data records
(<=200 bytes)?

(CDH3b4). HBase debug logging switched off.

Thanks in advance.
-Dmitriy
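
For reference, scanner caching (the number of rows fetched per next() call) is set per Scan in the 0.90-era Java client, and the regionserver handler count mentioned above is tuned via hbase.regionserver.handler.count in hbase-site.xml. A minimal sketch of a client issuing this kind of short scan, assuming the stock 0.90 client API; the table name, column family, and row keys are hypothetical:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class ShortScanExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "test_table");           // hypothetical table name

    Scan scan = new Scan(Bytes.toBytes("row-000"), Bytes.toBytes("row-040"));
    scan.addFamily(Bytes.toBytes("f"));                      // hypothetical column family
    scan.setCaching(100);                                    // rows returned per next() RPC

    ResultScanner scanner = table.getScanner(scan);
    try {
      for (Result r : scanner) {
        // consume the small rows; timing this loop gives the client-observed latency
      }
    } finally {
      scanner.close();
      table.close();
    }
  }
}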

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Yes, this is the only stress test running on the cluster, nothing else.
And it all goes to the block cache, as evidenced by the metrics.

2011-04-20 12:28:48,375 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=55732209,
hits=55732083, hitRatio=99.99%%, cachingAccesses=55732195,
cachingHits=55732083, cachingHitsRatio=99.99%%, evictions=0,
evicted=0, evictedPerRun=NaN


On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
>> On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov <dl...@apache.org> wrote:
>>> Right now i am shooting scans returning between 3 and 40 rows and
>>> regardless of data size, approximately 500-400 QPS. The data tables
>>> are almost empty and in-memory, so they surely should fit in those 40%
>>> heap dedicated to them.
>>>
>>
>> How many clients are going against the cluster?  If you use less, do
>> your numbers improve?
>>
>
> And all these clients are going against a single 40 row table?
> St.Ack
>

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Yes, that was closer to my expectations, too. I am scratching my head
as well, but I don't have time to figure this out any longer. In
reality I won't have a 500 QPS stream between a single client and a single
region, so I don't care much.

On Thu, Apr 21, 2011 at 11:08 PM, Ted Dunning <td...@maprtech.com> wrote:
> This actually sounds like there is a problem with concurrency either on the
> client or the server side.  TCP is plenty fast for this and having a
> dedicated TCP connection over which multiple requests can be multiplexed is
> probably much better than UDP because you would have to implement your own
> windowing and loss recovery anyway.   Having a long-lived TCP channel lets you
> benefit from the decades of research in how to make that work right.
>
> Hadoop rpc allows multiple outstanding requests at once so that isn't
> inherently the problem either.  I feel like I have a memory of null requests
> taking < 1 ms with Hadoop RPC, but I can't place where that memory might
> have come from.
>
> Also, I can push > 20,000 transactions per second through 20 threads in YCSB
> and average latencies on those threads are often < 5 ms and sometimes near
> 1ms.
>
> My first suspicion would be a concurrency limit somewhere that is
> artificially throttling things down.  Why it would be sooo extreme, I cannot
> imagine.
>
> On Thu, Apr 21, 2011 at 10:49 PM, Dmitriy Lyubimov <dl...@gmail.com>wrote:
>
>> So of course this test is stupid because in reality nobody would scan
>> a table with 40 rows. But since all the traffic goes to a single region
>> server, with relatively low stress we can get an idea of how the
>> rest of the cluster would behave under proportionally higher load.
>>
>> Anyway. For a million requests shot at a region server at various
>> speeds between 300 and 500 QPS the picture is not pretty. The RPC metrics
>> are actually good -- no more than 1ms average per next() and 0 per
>> get(). So the region server is lightning fast.
>>
>> What doesn't seem so fast is the RPC path. As I reported before, I was getting
>> 25ms TTLB under these circumstances. In this case all the traffic to the
>> node goes through the same client (but in reality of course the node's
>> portion per client would be much less). All that traffic is using a
>> single regionserver RPC queue, as HConnection will not open more
>> than one socket to the same region server. And TCP doesn't seem to perform
>> very well for some reason in this scenario.
>>
>> So, it seems to help to actually open multiple HBase connections and
>> round-robin scans across them. That way, even though we waste more
>> ZooKeeper connections, we also have more than one RPC channel open for
>> the high-traffic region as well. A little coding and it brings us down
>> from 25ms to 18ms average at 500 QPS per region with 3 pooled HBase
>> connections. Perhaps normally it is not as much of a problem, as traffic
>> is more uniformly distributed among regions from the same client.
>>
>> The next thing I did was to enable tcp_nodelay on both client and
>> server. That got us down even more, to 13ms average.
>>
>> However, it is still about two times slower than when I run all processes on
>> the same machine (I get around 6-7ms average TTLBs for the same type
>> of scan).
>>
>> Ping time for about the same packet size between the hosts involved hovers
>> around 1ms. Where the other 5ms of average time is getting lost is
>> still a mystery. But oh well, I guess it is as good as it gets.
>> In real life hbase applications traffic would be much more uniformly
>> distributed among regions and this would be much less of an issue
>> perhaps.
>>
>> I also suspect that using udp for short scans and gets might reduce
>> latency a bit as well.
>>
>> On Wed, Apr 20, 2011 at 3:05 PM, Dmitriy Lyubimov <dl...@gmail.com>
>> wrote:
>> > So i can't seem to be able to immediately find the explanation for those
>> metrics
>> >
>> > - rpcQueueTime -- do I assume it correctly it's the time a request
>> > sits waiting int the incoming rpc queue before being picked up by
>> > handler ?
>> >
>> > -rpcProcessingTime -- do i assume it correctly it's time of request
>> > being processed by region server's handler?
>> >
>> > So inner time to last byte should be approximately sum of those, right?
>> >
>> > Thanks.
>> > -Dmitriy
>> >
>> > On Wed, Apr 20, 2011 at 1:17 PM, Dmitriy Lyubimov <dl...@gmail.com>
>> wrote:
>> >> Yes that's what i said. there's metric for fs latency but we are not
>> >> hitting it so it's not useful.
>> >>
>> >> Question is which one might be useful to measure inner ttlb, and i
>> >> don't see it there.
>> >>
>> >> On Wed, Apr 20, 2011 at 1:14 PM, Ted Dunning <td...@maprtech.com>
>> wrote:
>> >>> FS latency shouldn't matter with your 99.9% cache hit rate as reported.
>> >>>
>> >>> On Wed, Apr 20, 2011 at 12:55 PM, Dmitriy Lyubimov <dlieu.7@gmail.com
>> >wrote:
>> >>>
>> >>>> Yes -- I already looked thru 'regionserver' metrics some time ago in
>> >>>> hbase book. And i am not sure there's a 'inner ttlb' metric.
>> >>>>
>> >>>> There are fs latency metrics there but nothing for the respons times.
>> >>>> fs latency is essentially hdfs latency AFAICT and that would not be
>> >>>> relevant to what i am asking for (for as long as we are hitting LRU
>> >>>> block cache anyway). we are not hitting fs.
>> >>>>
>> >>>> Unless there are more metrics than listed in the Hbase Book?
>> >>>>
>> >>>>
>> >>>> On Wed, Apr 20, 2011 at 12:46 PM, Stack <st...@duboce.net> wrote:
>> >>>> > Enable rpc logging.  Will show in your ganglia.  See metrics article
>> >>>> > on hbase home page.
>> >>>> >
>> >>>> > On Wed, Apr 20, 2011 at 12:44 PM, Dmitriy Lyubimov <
>> dlieu.7@gmail.com>
>> >>>> wrote:
>> >>>> >> Is there any way to log 'inner' TTLB times the region server incurs
>> for
>> >>>> reads?
>> >>>> >>
>> >>>> >>
>> >>>> >> On Wed, Apr 20, 2011 at 12:43 PM, Dmitriy Lyubimov <
>> dlieu.7@gmail.com>
>> >>>> wrote:
>> >>>> >>> i just enabled debug logging for o.a.h.hbase logger in that
>> particular
>> >>>> >>> region server... so far not much except for LRUBlock cache
>> spitting
>> >>>> >>> metrics ..
>> >>>> >>>
>> >>>> >>> 2011-04-20 12:28:48,375 DEBUG
>> >>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats:
>> total=8.26
>> >>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=55732209,
>> >>>> >>> hits=55732083, hitRatio=99.99%%, cachingAccesses=55732195,
>> >>>> >>> cachingHits=55732083, cachingHitsRatio=99.99%%, evictions=0,
>> >>>> >>> evicted=0, evictedPerRun=NaN
>> >>>> >>> 2011-04-20 12:33:48,375 DEBUG
>> >>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats:
>> total=8.26
>> >>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=56703200,
>> >>>> >>> hits=56703074, hitRatio=99.99%%, cachingAccesses=56703186,
>> >>>> >>> cachingHits=56703074, cachingHitsRatio=99.99%%, evictions=0,
>> >>>> >>> evicted=0, evictedPerRun=NaN
>> >>>> >>> 2011-04-20 12:38:48,375 DEBUG
>> >>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats:
>> total=8.26
>> >>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=57708231,
>> >>>> >>> hits=57708105, hitRatio=99.99%%, cachingAccesses=57708217,
>> >>>> >>> cachingHits=57708105, cachingHitsRatio=99.99%%, evictions=0,
>> >>>> >>> evicted=0, evictedPerRun=NaN
>> >>>> >>>
>> >>>> >>>
>> >>>> >>> On Wed, Apr 20, 2011 at 12:35 PM, Stack <st...@duboce.net> wrote:
>> >>>> >>>> If one region only, then its located on a single regionserver.
>>  Tail
>> >>>> >>>> that regionservers logs.  It might tell us something.
>> >>>> >>>> St.Ack
>> >>>> >>>>
>> >>>> >>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net>
>> wrote:
>> >>>> >>>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net>
>> wrote:
>> >>>> >>>>>> On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov <
>> >>>> dlyubimov@apache.org> wrote:
>> >>>> >>>>>>> Right now i am shooting scans returning between 3 and 40 rows
>> and
>> >>>> >>>>>>> regardless of data size, approximately 500-400 QPS. The data
>> tables
>> >>>> >>>>>>> are almost empty and in-memory, so they surely should fit in
>> those
>> >>>> 40%
>> >>>> >>>>>>> heap dedicated to them.
>> >>>> >>>>>>>
>> >>>> >>>>>>
>> >>>> >>>>>> How many clients are going against the cluster?  If you use
>> less, do
>> >>>> >>>>>> your numbers improve?
>> >>>> >>>>>>
>> >>>> >>>>>
>> >>>> >>>>> And all these clients are going against a single 40 row table?
>> >>>> >>>>> St.Ack
>> >>>> >>>>>
>> >>>> >>>>
>> >>>> >>>
>> >>>> >>
>> >>>> >
>> >>>>
>> >>>
>> >>
>> >
>>
>

Re: 0.90 latency performance, cdh3b4

Posted by Ted Dunning <td...@maprtech.com>.
This actually sounds like there is a problem with concurrency either on the
client or the server side.  TCP is plenty fast for this and having a
dedicated TCP connection over which multiple requests can be multiplexed is
probably much better than UDP because you would have to implement your own
windowing and loss recovery anyway.   Having a long-lived TCP channel lets you
benefit from the decades of research in how to make that work right.

Hadoop rpc allows multiple outstanding requests at once so that isn't
inherently the problem either.  I feel like I have a memory of null requests
taking < 1 ms with Hadoop RPC, but I can't place where that memory might
have come from.

Also, I can push > 20,000 transactions per second through 20 threads in YCSB
and average latencies on those threads are often < 5 ms and sometimes near
1ms.

My first suspicion would be a concurrency limit somewhere that is
artificially throttling things down.  Why it would be sooo extreme, I cannot
imagine.

On Thu, Apr 21, 2011 at 10:49 PM, Dmitriy Lyubimov <dl...@gmail.com>wrote:

> So of course this test is stupid because in reality nobody would scan
> a table with 40 rows. But since all the traffic goes to a single region
> server, with relatively low stress we can get an idea of how the
> rest of the cluster would behave under proportionally higher load.
>
> Anyway. For a million requests shot at a region server at various
> speeds between 300 and 500 QPS the picture is not pretty. The RPC metrics
> are actually good -- no more than 1ms average per next() and 0 per
> get(). So the region server is lightning fast.
>
> What doesn't seem so fast is the RPC path. As I reported before, I was getting
> 25ms TTLB under these circumstances. In this case all the traffic to the
> node goes through the same client (but in reality of course the node's
> portion per client would be much less). All that traffic is using a
> single regionserver RPC queue, as HConnection will not open more
> than one socket to the same region server. And TCP doesn't seem to perform
> very well for some reason in this scenario.
>
> So, it seems to help to actually open multiple HBase connections and
> round-robin scans across them. That way, even though we waste more
> ZooKeeper connections, we also have more than one RPC channel open for
> the high-traffic region as well. A little coding and it brings us down
> from 25ms to 18ms average at 500 QPS per region with 3 pooled HBase
> connections. Perhaps normally it is not as much of a problem, as traffic
> is more uniformly distributed among regions from the same client.
>
> The next thing I did was to enable tcp_nodelay on both client and
> server. That got us down even more, to 13ms average.
>
> However, it is still about two times slower than when I run all processes on
> the same machine (I get around 6-7ms average TTLBs for the same type
> of scan).
>
> Ping time for about the same packet size between the hosts involved hovers
> around 1ms. Where the other 5ms of average time is getting lost is
> still a mystery. But oh well, I guess it is as good as it gets.
> In real life hbase applications traffic would be much more uniformly
> distributed among regions and this would be much less of an issue
> perhaps.
>
> I also suspect that using udp for short scans and gets might reduce
> latency a bit as well.
>
> On Wed, Apr 20, 2011 at 3:05 PM, Dmitriy Lyubimov <dl...@gmail.com>
> wrote:
> > So i can't seem to be able to immediately find the explanation for those
> metrics
> >
> > - rpcQueueTime -- do I assume it correctly it's the time a request
> > sits waiting int the incoming rpc queue before being picked up by
> > handler ?
> >
> > -rpcProcessingTime -- do i assume it correctly it's time of request
> > being processed by region server's handler?
> >
> > So inner time to last byte should be approximately sum of those, right?
> >
> > Thanks.
> > -Dmitriy
> >
> > On Wed, Apr 20, 2011 at 1:17 PM, Dmitriy Lyubimov <dl...@gmail.com>
> wrote:
> >> Yes that's what i said. there's metric for fs latency but we are not
> >> hitting it so it's not useful.
> >>
> >> Question is which one might be useful to measure inner ttlb, and i
> >> don't see it there.
> >>
> >> On Wed, Apr 20, 2011 at 1:14 PM, Ted Dunning <td...@maprtech.com>
> wrote:
> >>> FS latency shouldn't matter with your 99.9% cache hit rate as reported.
> >>>
> >>> On Wed, Apr 20, 2011 at 12:55 PM, Dmitriy Lyubimov <dlieu.7@gmail.com
> >wrote:
> >>>
> >>>> Yes -- I already looked thru 'regionserver' metrics some time ago in
> >>>> hbase book. And i am not sure there's a 'inner ttlb' metric.
> >>>>
> >>>> There are fs latency metrics there but nothing for the respons times.
> >>>> fs latency is essentially hdfs latency AFAICT and that would not be
> >>>> relevant to what i am asking for (for as long as we are hitting LRU
> >>>> block cache anyway). we are not hitting fs.
> >>>>
> >>>> Unless there are more metrics than listed in the Hbase Book?
> >>>>
> >>>>
> >>>> On Wed, Apr 20, 2011 at 12:46 PM, Stack <st...@duboce.net> wrote:
> >>>> > Enable rpc logging.  Will show in your ganglia.  See metrics article
> >>>> > on hbase home page.
> >>>> >
> >>>> > On Wed, Apr 20, 2011 at 12:44 PM, Dmitriy Lyubimov <
> dlieu.7@gmail.com>
> >>>> wrote:
> >>>> >> Is there any way to log 'inner' TTLB times the region server incurs
> for
> >>>> reads?
> >>>> >>
> >>>> >>
> >>>> >> On Wed, Apr 20, 2011 at 12:43 PM, Dmitriy Lyubimov <
> dlieu.7@gmail.com>
> >>>> wrote:
> >>>> >>> i just enabled debug logging for o.a.h.hbase logger in that
> particular
> >>>> >>> region server... so far not much except for LRUBlock cache
> spitting
> >>>> >>> metrics ..
> >>>> >>>
> >>>> >>> 2011-04-20 12:28:48,375 DEBUG
> >>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats:
> total=8.26
> >>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=55732209,
> >>>> >>> hits=55732083, hitRatio=99.99%%, cachingAccesses=55732195,
> >>>> >>> cachingHits=55732083, cachingHitsRatio=99.99%%, evictions=0,
> >>>> >>> evicted=0, evictedPerRun=NaN
> >>>> >>> 2011-04-20 12:33:48,375 DEBUG
> >>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats:
> total=8.26
> >>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=56703200,
> >>>> >>> hits=56703074, hitRatio=99.99%%, cachingAccesses=56703186,
> >>>> >>> cachingHits=56703074, cachingHitsRatio=99.99%%, evictions=0,
> >>>> >>> evicted=0, evictedPerRun=NaN
> >>>> >>> 2011-04-20 12:38:48,375 DEBUG
> >>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats:
> total=8.26
> >>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=57708231,
> >>>> >>> hits=57708105, hitRatio=99.99%%, cachingAccesses=57708217,
> >>>> >>> cachingHits=57708105, cachingHitsRatio=99.99%%, evictions=0,
> >>>> >>> evicted=0, evictedPerRun=NaN
> >>>> >>>
> >>>> >>>
> >>>> >>> On Wed, Apr 20, 2011 at 12:35 PM, Stack <st...@duboce.net> wrote:
> >>>> >>>> If one region only, then its located on a single regionserver.
>  Tail
> >>>> >>>> that regionservers logs.  It might tell us something.
> >>>> >>>> St.Ack
> >>>> >>>>
> >>>> >>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net>
> wrote:
> >>>> >>>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net>
> wrote:
> >>>> >>>>>> On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov <
> >>>> dlyubimov@apache.org> wrote:
> >>>> >>>>>>> Right now i am shooting scans returning between 3 and 40 rows
> and
> >>>> >>>>>>> regardless of data size, approximately 500-400 QPS. The data
> tables
> >>>> >>>>>>> are almost empty and in-memory, so they surely should fit in
> those
> >>>> 40%
> >>>> >>>>>>> heap dedicated to them.
> >>>> >>>>>>>
> >>>> >>>>>>
> >>>> >>>>>> How many clients are going against the cluster?  If you use
> less, do
> >>>> >>>>>> your numbers improve?
> >>>> >>>>>>
> >>>> >>>>>
> >>>> >>>>> And all these clients are going against a single 40 row table?
> >>>> >>>>> St.Ack
> >>>> >>>>>
> >>>> >>>>
> >>>> >>>
> >>>> >>
> >>>> >
> >>>>
> >>>
> >>
> >
>

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Exactly. That's why I said 'for short scans and gets', and perhaps a
combo. As soon as it exceeds a frame, we'd rather not mess with
reassembly. But I agree it is most likely not worth it. The most likely
reason for my latencies is not this.

On Thu, Apr 21, 2011 at 11:22 PM, Ted Dunning <td...@maprtech.com> wrote:
> Yeah... but with UDP you have to do packet reassembly yourself.
>
> And do source quench and all kinds of things.
>
> Been there.  Done that.  Don't recommend it unless it is your day job.
>
> We built the Veoh peer to peer system on UDP.  It had compelling advantages
> for us as we moved a terabit of data per second 5 years ago, but it was
> distinctly non-trivial to get right.  The benefits we had included:
>
> - we could make our flows very aggressive, but less aggressive than TCP.
> That made them feel like smoke relative to web-surfing.  (not a benefit
> here)
>
> - we could handle thousands of connections if necessary (not a benefit here)
>
> - we could penetrate firewalls more easily by state spoofing (not a benefit
> here)
>
> - our protocol did magical window reassembly from multiple sources (not a
> benefit here)
>
> But getting this to work was weeks of work and months of testing with
> thousands of different clients.  I wouldn't want to repeat that without
> serious reasons.
>
>
> On Thu, Apr 21, 2011 at 11:12 PM, Dmitriy Lyubimov <dl...@gmail.com>wrote:
>
>> > What do you see here D?
>>
>> I am not sure. I am not very good at understanding network frames. but
>> tcp kind of spends a lot of resources to ensure the flow. While udp
>> wouldn't bother with all that nonsense.
>>
>

Re: 0.90 latency performance, cdh3b4

Posted by Ted Dunning <td...@maprtech.com>.
Yeah... but with UDP you have to do packet reassembly yourself.

And do source quench and all kinds of things.

Been there.  Done that.  Don't recommend it unless it is your day job.

We built the Veoh peer to peer system on UDP.  It had compelling advantages
for us as we moved a terabit of data per second 5 years ago, but it was
distinctly non-trivial to get right.  The benefits we had included:

- we could make our flows very aggressive, but less aggressive than TCP.
That made them feel like smoke relative to web-surfing.  (not a benefit
here)

- we could handle thousands of connections if necessary (not a benefit here)

- we could penetrate firewalls more easily by state spoofing (not a benefit
here)

- our protocol did magical window reassembly from multiple sources (not a
benefit here)

But getting this to work was weeks of work and months of testing with
thousands of different clients.  I wouldn't want to repeat that without
serious reasons.


On Thu, Apr 21, 2011 at 11:12 PM, Dmitriy Lyubimov <dl...@gmail.com>wrote:

> > What do you see here D?
>
> I am not sure. I am not very good at understanding network frames. but
> tcp kind of spends a lot of resources to ensure the flow. While udp
> wouldn't bother with all that nonsense.
>

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Yes, this is for 500 QPS of scans returning back approx. 15k worth of data in total.

>
> You saw "HBASE-2939  Allow Client-Side Connection Pooling"?  Would that help?
Interesting, let me take a look. I was kind of thinking maybe there's
some sense in allowing more than one pooled TCP connection from the same
client to the same region server, perhaps even detecting skewed distributions
with some sort of exponentially decaying meters.

>
>
>> And tcp doesn't seem to perform very
>> well for some reason in this scenario.
>>
>
> What do you see here D?

I am not sure. I am not very good at understanding network frames, but
TCP kind of spends a lot of resources to ensure the flow, while UDP
wouldn't bother with all that nonsense.

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
In this case I pool them as well, which doesn't seem to make any
difference compared to when I just reuse them. (I am not writing in
this test, but outside of the test I do, so I do pool them, using techniques
similar to those in HTablePool, CAS-based queues, etc.)
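
For reference, a minimal sketch of reusing HTable instances through the stock HTablePool that ships with 0.90 (the getTable/putTable API); the table name is hypothetical:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.HTablePool;
import org.apache.hadoop.hbase.client.Result;

public class PooledGetExample {
  private static final Configuration CONF = HBaseConfiguration.create();
  // Keep up to 10 cached HTable instances per table name.
  private static final HTablePool POOL = new HTablePool(CONF, 10);

  public static Result read(byte[] rowKey) throws Exception {
    HTableInterface table = POOL.getTable("test_table");  // hypothetical table name
    try {
      return table.get(new Get(rowKey));
    } finally {
      POOL.putTable(table);                               // hand the instance back to the pool
    }
  }
}

Note that this pools HTable instances only; as discussed further down in the thread, the RPC socket to a given region server is still shared (the static ClientCache in HBaseRPC), which is what HBASE-2939 is about.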


On Thu, Apr 21, 2011 at 11:09 PM, Ted Dunning <td...@maprtech.com> wrote:
> Dmitriy,
>
> Did I hear you say that you are instantiating a new Htable for each request?
>  Or was that somebody else?
>
> On Thu, Apr 21, 2011 at 11:04 PM, Stack <st...@duboce.net> wrote:
>
>> On Thu, Apr 21, 2011 at 10:49 PM, Dmitriy Lyubimov <dl...@gmail.com>
>> wrote:
>> > Anyway. For a million requests shot at a region server at various
>> > speeds between 300 and 500 qps the picture is not pretty. RPC metrics
>> > are actually good -- no more than 1ms average per next() and 0 per
>> > get(). So region server is lightning fast.
>> >
>>
>> This is 3-500 queries per second of 40 rows each?
>>
>> > What doesn't seem so fast is RPC.
>>
>> OK.
>>
>> > As i reported before, i was getting
>> > 25ms TTLB under the circumstances. In this case all the traffic to the
>> > node goes thru same client (but in reality of course the node's
>> > portion per client should be much less). All that traffic is using
>> > single regionserver node rpc queue as HConnection would not open more
>> > than one socket to same region.
>>
>> You saw "HBASE-2939  Allow Client-Side Connection Pooling"?  Would that
>> help?
>>
>>
>> > And tcp doesn't seem to perform very
>> > well for some reason in this scenario.
>> >
>>
>> What do you see here D?
>>
>>
>> > The next thing i did was to enable tcp_nodelay on both client and
>> > server. That got us down even more to 13ms average.
>> >
>>
>> Thats a big difference.
>>
>>
>> > However, it is still about two times slower if i run all processes at
>> > the same machine (i get around 6-7ms average TTLBs for the same type
>> > of scan).
>> >
>> > Ping time for about same packet size between hosts involved seems to
>> > revolve around 1ms. Where another 5ms average time are getting lost is
>> > still a mystery. But oh well i guess it is as good as it gets.
>> > In real life hbase applications traffic would be much more uniformly
>> > distributed among regions and this would be much less of an issue
>> > perhaps.
>> >
>> > I also suspect that using udp for short scans and gets might reduce
>> > latency a bit as well.
>> >
>>
>> Thank you Dmitriy for digging in.  Good stuff.
>> St.Ack
>>
>

Re: 0.90 latency performance, cdh3b4

Posted by Ted Dunning <td...@maprtech.com>.
Dmitriy,

Did I hear you say that you are instantiating a new Htable for each request?
 Or was that somebody else?

On Thu, Apr 21, 2011 at 11:04 PM, Stack <st...@duboce.net> wrote:

> On Thu, Apr 21, 2011 at 10:49 PM, Dmitriy Lyubimov <dl...@gmail.com>
> wrote:
> > Anyway. For a million requests shot at a region server at various
> > speeds between 300 and 500 qps the picture is not pretty. RPC metrics
> > are actually good -- no more than 1ms average per next() and 0 per
> > get(). So region server is lightning fast.
> >
>
> This is 3-500 queries per second of 40 rows each?
>
> > What doesn't seem so fast is RPC.
>
> OK.
>
> > As i reported before, i was getting
> > 25ms TTLB under the circumstances. In this case all the traffic to the
> > node goes thru same client (but in reality of course the node's
> > portion per client should be much less). All that traffic is using
> > single regionserver node rpc queue as HConnection would not open more
> > than one socket to same region.
>
> You saw "HBASE-2939  Allow Client-Side Connection Pooling"?  Would that
> help?
>
>
> > And tcp doesn't seem to perform very
> > well for some reason in this scenario.
> >
>
> What do you see here D?
>
>
> > The next thing i did was to enable tcp_nodelay on both client and
> > server. That got us down even more to 13ms average.
> >
>
> Thats a big difference.
>
>
> > However, it is still about two times slower if i run all processes at
> > the same machine (i get around 6-7ms average TTLBs for the same type
> > of scan).
> >
> > Ping time for about same packet size between hosts involved seems to
> > revolve around 1ms. Where another 5ms average time are getting lost is
> > still a mystery. But oh well i guess it is as good as it gets.
> > In real life hbase applications traffic would be much more uniformly
> > distributed among regions and this would be much less of an issue
> > perhaps.
> >
> > I also suspect that using udp for short scans and gets might reduce
> > latency a bit as well.
> >
>
> Thank you Dmitriy for digging in.  Good stuff.
> St.Ack
>

Re: 0.90 latency performance, cdh3b4

Posted by "Bakhru, Raj" <ra...@keposcapital.com>.
W

----- Original Message -----
From: Dmitriy Lyubimov [mailto:dlieu.7@gmail.com]
Sent: Friday, April 22, 2011 02:50 AM
To: user@hbase.apache.org <us...@hbase.apache.org>
Subject: Re: 0.90 latency performance, cdh3b4

>
> You saw "HBASE-2939  Allow Client-Side Connection Pooling"?  Would that help?

Ok just read thru the issue. That's exactly what i thought upon
reading the code in HBaseClient class. Although in my cluster it did
not seem to have more than about 20% effect and it was more or less
evaporated after 3 connections. (1 to 3 has noticeable jump but there
was no much difference between 3 and 10. But then again i suspect my
problem is network but not the threading -- although with a mix of
longer and shorter messages it may become an apparent nuisance
indeed).



>
>
>> And tcp doesn't seem to perform very
>> well for some reason in this scenario.
>>
>
> What do you see here D?
>
>
>> The next thing i did was to enable tcp_nodelay on both client and
>> server. That got us down even more to 13ms average.
>>
>
> Thats a big difference.
>
>
>> However, it is still about two times slower if i run all processes at
>> the same machine (i get around 6-7ms average TTLBs for the same type
>> of scan).
>>
>> Ping time for about same packet size between hosts involved seems to
>> revolve around 1ms. Where another 5ms average time are getting lost is
>> still a mystery. But oh well i guess it is as good as it gets.
>> In real life hbase applications traffic would be much more uniformly
>> distributed among regions and this would be much less of an issue
>> perhaps.
>>
>> I also suspect that using udp for short scans and gets might reduce
>> latency a bit as well.
>>
>
> Thank you Dmitriy for digging in.  Good stuff.
> St.Ack
>



Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
>
> You saw "HBASE-2939  Allow Client-Side Connection Pooling"?  Would that help?

OK, I just read through the issue. That's exactly what I thought upon
reading the code in the HBaseClient class. Although in my cluster it did
not seem to have more than about a 20% effect, and it more or less
evaporated after 3 connections (1 to 3 shows a noticeable jump, but there
was not much difference between 3 and 10). But then again, I suspect my
problem is the network and not the threading -- although with a mix of
longer and shorter messages it may become an apparent nuisance
indeed.



>
>
>> And tcp doesn't seem to perform very
>> well for some reason in this scenario.
>>
>
> What do you see here D?
>
>
>> The next thing i did was to enable tcp_nodelay on both client and
>> server. That got us down even more to 13ms average.
>>
>
> Thats a big difference.
>
>
>> However, it is still about two times slower if i run all processes at
>> the same machine (i get around 6-7ms average TTLBs for the same type
>> of scan).
>>
>> Ping time for about same packet size between hosts involved seems to
>> revolve around 1ms. Where another 5ms average time are getting lost is
>> still a mystery. But oh well i guess it is as good as it gets.
>> In real life hbase applications traffic would be much more uniformly
>> distributed among regions and this would be much less of an issue
>> perhaps.
>>
>> I also suspect that using udp for short scans and gets might reduce
>> latency a bit as well.
>>
>
> Thank you Dmitriy for digging in.  Good stuff.
> St.Ack
>

Re: 0.90 latency performance, cdh3b4

Posted by Stack <st...@duboce.net>.
On Thu, Apr 21, 2011 at 10:49 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> Anyway. For a million requests shot at a region server at various
> speeds between 300 and 500 qps the picture is not pretty. RPC metrics
> are actually good -- no more than 1ms average per next() and 0 per
> get(). So region server is lightning fast.
>

This is 3-500 queries per second of 40 rows each?

> What doesn't seem so fast is RPC.

OK.

> As i reported before, i was getting
> 25ms TTLB under the circumstances. In this case all the traffic to the
> node goes thru same client (but in reality of course the node's
> portion per client should be much less). All that traffic is using
> single regionserver node rpc queue as HConnection would not open more
> than one socket to same region.

You saw "HBASE-2939  Allow Client-Side Connection Pooling"?  Would that help?


> And tcp doesn't seem to perform very
> well for some reason in this scenario.
>

What do you see here D?


> The next thing i did was to enable tcp_nodelay on both client and
> server. That got us down even more to 13ms average.
>

Thats a big difference.


> However, it is still about two times slower if i run all processes at
> the same machine (i get around 6-7ms average TTLBs for the same type
> of scan).
>
> Ping time for about same packet size between hosts involved seems to
> revolve around 1ms. Where another 5ms average time are getting lost is
> still a mystery. But oh well i guess it is as good as it gets.
> In real life hbase applications traffic would be much more uniformly
> distributed among regions and this would be much less of an issue
> perhaps.
>
> I also suspect that using udp for short scans and gets might reduce
> latency a bit as well.
>

Thank you Dmitriy for digging in.  Good stuff.
St.Ack

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Thank you, sir.

On Fri, Apr 22, 2011 at 12:31 PM, tsuna <ts...@gmail.com> wrote:
> On Fri, Apr 22, 2011 at 12:15 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>> is it possible to configure this client to open more than one socket
>> connection from same client to same region server?
>> In other words, is HBASE-2939 already non-issue there?
>
> No asynchbase doesn't have HBASE-2939, but as I said, I haven't seen a
> case (yet) where something like that is needed.  In my experience
> people see performance gains when pooling connections mostly when the
> RPC protocol or implementation is inefficient.  Try asynchbase, it's
> possible that it'll perform better, it's a completely different
> implementation.  It certainly did for me.
>
> --
> Benoit "tsuna" Sigoure
> Software Engineer @ www.StumbleUpon.com
>

Re: 0.90 latency performance, cdh3b4

Posted by tsuna <ts...@gmail.com>.
On Fri, Apr 22, 2011 at 12:15 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> is it possible to configure this client to open more than one socket
> connection from same client to same region server?
> In other words, is HBASE-2939 already non-issue there?

No, asynchbase doesn't have HBASE-2939, but as I said, I haven't seen a
case (yet) where something like that is needed.  In my experience
people see performance gains when pooling connections mostly when the
RPC protocol or implementation is inefficient.  Try asynchbase, it's
possible that it'll perform better, it's a completely different
implementation.  It certainly did for me.

-- 
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Benoit,

Thank you.

is it possible to configure this client to open more than one socket
connection from same client to same region server?
In other words, is HBASE-2939 already non-issue there?

> asynchbase implements the HBase RPC protocol in a different way, it's
> written from scratch.  It uses Netty and is fully asynchronous and
> non-blocking.  That's where the efficiency comes from.  At StumbleUpon
> I've used it to push 200,000 edits/s to just 3 RegionServers.  I never
> got even close to this with HTable.
>
> --
> Benoit "tsuna" Sigoure
> Software Engineer @ www.StumbleUpon.com
>

Re: 0.90 latency performance, cdh3b4

Posted by tsuna <ts...@gmail.com>.
On Thu, Apr 21, 2011 at 11:25 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> I certainly would. Even more, i already read the code there  just a
> bit although not enough to understand where the efficiency comes from.
> Do you actually implement another version of RPC on non-blocking
> sockets there?

asynchbase implements the HBase RPC protocol in a different way, it's
written from scratch.  It uses Netty and is fully asynchronous and
non-blocking.  That's where the efficiency comes from.  At StumbleUpon
I've used it to push 200,000 edits/s to just 3 RegionServers.  I never
got even close to this with HTable.

> Unfortunately i can't do much more than just 'try' any time soon as
> the codebase is rather tightly coupled with current client. Migrating
> code and tests would not be a one day effort. However, i certainly can
> try with one test.

Right, asynchbase has a different API than HTable, so it's not a
drop-in replacement.  I assumed that since you were talking about a
short 40-row scan you could write a little Java program with a small
main() that reproduces the issue, and then it would be much easier to
try with asynchbase since you'd need to change about 5-10 lines of
code only.

> But if only i could solve the mysterious tcp lag in the datacenter, i
> suspect my needs would be quite well covered with the standard client
> as well.

I doubt you have a "mysterious TCP lag".  Please provide a trace with
tcpdump if you still believe you do.

On Thu, Apr 21, 2011 at 11:42 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> Exactly. that's why i said 'for short scans and gets' and perhaps a
> combo. As soon as it exceeds a frame, we'd rather not to mess with
> reassembly. But I agree it is most likely not worth it. Most likely
> reason for my latencies is not this.

It's not that simple.  Even if you get just one row, that row might be
bigger than 1460 bytes, you don't know.  The client cannot predict how
much data even a simple "get" might return.  You then potentially need
to handle packet re-ordering.  And either way you would also still
need to discover and handle packet loss.  I'm not even talking about
checksumming.

Modern TCP implementations are very efficient.  You really do not want
to use UDP to talk to a database, unless you spent a very significant
amount of engineering time building a reliable protocol on top of it,
as Ted mentioned.  In which case, in 99% of the cases, you're better
off with TCP anyway, because your reliable protocol is almost
certainly not going to be as reliable and efficient as, say, Linux's
TCP implementation.

-- 
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com
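
For reference, a minimal asynchbase scan sketch, assuming the asynchbase HBaseClient/Scanner API of that era; the ZooKeeper quorum, table name, column family, and row keys are hypothetical:

import java.util.ArrayList;

import org.hbase.async.HBaseClient;
import org.hbase.async.KeyValue;
import org.hbase.async.Scanner;

public class AsyncScanExample {
  public static void main(String[] args) throws Exception {
    final HBaseClient client = new HBaseClient("zkhost1,zkhost2,zkhost3");  // hypothetical quorum

    final Scanner scanner = client.newScanner("test_table");  // hypothetical table name
    scanner.setFamily("f");                                    // hypothetical column family
    scanner.setStartKey("row-000");
    scanner.setStopKey("row-040");

    // Blocking on the Deferred here for brevity; normally you would chain a callback instead.
    ArrayList<ArrayList<KeyValue>> rows;
    while ((rows = scanner.nextRows().joinUninterruptibly()) != null) {
      for (ArrayList<KeyValue> row : rows) {
        // consume the row's KeyValues
      }
    }

    scanner.close().joinUninterruptibly();
    client.shutdown().joinUninterruptibly();
  }
}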

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
> I doubt that TCP doesn't perform well.  If you really believe so, can
> you provide a packet capture collected with:
> sudo tcpdump -nvi eth0 -s0 -w /tmp/pcap port 60020

Thanks, I will certainly try. However, the same class of machine, the same data,
and the same test run locally vs. remotely on the same subnet is de facto a 100%
difference. I am just as puzzled. I did not expect this.

>
> Would you be open to trying asynchbase in your test case /
> application?  I haven't seen a case (yet) where you actually *need* to
> open multiple connections per RegionServer.  I expect that if the
> problem is an inefficiency in the HBase client, asynchbase might do
> better, and if it does not, its DEBUG logging level might shed some
> light on where the problem comes from.
>
> https://github.com/stumbleupon/asynchbase
>

I certainly would. Even more, I have already read the code there, just a
bit, although not enough to understand where the efficiency comes from.
Do you actually implement another version of the RPC on non-blocking
sockets there?

Unfortunately, I can't do much more than just 'try' any time soon, as
the codebase is rather tightly coupled to the current client. Migrating
code and tests would not be a one-day effort. However, I certainly can
try with one test.
But if only I could solve the mysterious TCP lag in the datacenter, I
suspect my needs would be quite well covered by the standard client
as well.

Thank you very much for your help.

> --
> Benoit "tsuna" Sigoure
> Software Engineer @ www.StumbleUpon.com
>

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Got it. So that's why:

HBaseRPC:

protected final static ClientCache CLIENTS = new ClientCache();

The ClientCache is static regardless of HConnection instances, and the
connection id is pretty much the server address.
So I guess no external hack is possible to overcome that, then.



On Fri, Apr 22, 2011 at 12:03 PM, Jean-Daniel Cryans
<jd...@apache.org> wrote:
> It's all multiplexed.
>
> J-D
>
> On Fri, Apr 22, 2011 at 11:52 AM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>>>
>>> I doubt that TCP doesn't perform well.  If you really believe so, can
>>> you provide a packet capture collected with:
>>> sudo tcpdump -nvi eth0 -s0 -w /tmp/pcap port 60020
>>>
>

Re: 0.90 latency performance, cdh3b4

Posted by Jean-Daniel Cryans <jd...@apache.org>.
It's all multiplexed.

J-D

On Fri, Apr 22, 2011 at 11:52 AM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>>
>> I doubt that TCP doesn't perform well.  If you really believe so, can
>> you provide a packet capture collected with:
>> sudo tcpdump -nvi eth0 -s0 -w /tmp/pcap port 60020
>>

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
>
> I doubt that TCP doesn't perform well.  If you really believe so, can
> you provide a packet capture collected with:
> sudo tcpdump -nvi eth0 -s0 -w /tmp/pcap port 60020
>

Hm. What I discovered there is that I assumed my hack at RS connection
pooling was working, but it doesn't seem to be.
Even though I instantiate 3 HConnection instances and see 3 ZooKeeper
connections, I only see one RS connection.

There must be some trick in the HConnection implementation, which I missed,
that prevents opening multiple RS RPC queues even when they come through
different HConnection instances.

Re: 0.90 latency performance, cdh3b4

Posted by tsuna <ts...@gmail.com>.
On Thu, Apr 21, 2011 at 10:49 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> What doesn't seem so fast is RPC. As i reported before, i was getting
> 25ms TTLB under the circumstances. In this case all the traffic to the
> node goes thru same client (but in reality of course the node's
> portion per client should be much less). All that traffic is using
> single regionserver node rpc queue as HConnection would not open more
> than one socket to same region. And tcp doesn't seem to perform very
> well for some reason in this scenario.

I doubt that TCP doesn't perform well.  If you really believe so, can
you provide a packet capture collected with:
sudo tcpdump -nvi eth0 -s0 -w /tmp/pcap port 60020

> So, it seems to help to actually open multiple hbase connections and
> round-robin them between scans. that way even though we waste more
> zookeeper connections, we also have more than one rpc channel open for
> the high-traffic region as well. A little coding and it brings us down
> from 25ms to 18ms average at 500QPS per region and 3 pooled hbase
> connections  Perhaps normally it is not as much a problem as traffic
> is more uniformly distributed among regions from the same client.

Would you be open to trying asynchbase in your test case /
application?  I haven't seen a case (yet) where you actually *need* to
open multiple connections per RegionServer.  I expect that if the
problem is an inefficiency in the HBase client, asynchbase might do
better, and if it does not, its DEBUG logging level might shed some
light on where the problem comes from.

https://github.com/stumbleupon/asynchbase

-- 
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
So of course this test is stupid because in reality nobody would scan
a table with 40 rows. But since all the traffic goes to a single region
server, with relatively low stress we can get an idea of how the
rest of the cluster would behave under proportionally higher load.

Anyway. For a million requests shot at a region server at various
speeds between 300 and 500 QPS the picture is not pretty. The RPC metrics
are actually good -- no more than 1ms average per next() and 0 per
get(). So the region server is lightning fast.

What doesn't seem so fast is the RPC path. As I reported before, I was getting
25ms TTLB under these circumstances. In this case all the traffic to the
node goes through the same client (but in reality of course the node's
portion per client would be much less). All that traffic is using a
single regionserver RPC queue, as HConnection will not open more
than one socket to the same region server. And TCP doesn't seem to perform
very well for some reason in this scenario.

So, it seems to help to actually open multiple HBase connections and
round-robin scans across them. That way, even though we waste more
ZooKeeper connections, we also have more than one RPC channel open for
the high-traffic region as well. A little coding and it brings us down
from 25ms to 18ms average at 500 QPS per region with 3 pooled HBase
connections. Perhaps normally it is not as much of a problem, as traffic
is more uniformly distributed among regions from the same client.

The next thing I did was to enable tcp_nodelay on both client and
server. That got us down even more, to 13ms average.

However, it is still about two times slower than when I run all processes on
the same machine (I get around 6-7ms average TTLBs for the same type
of scan).

Ping time for about the same packet size between the hosts involved hovers
around 1ms. Where the other 5ms of average time is getting lost is
still a mystery. But oh well, I guess it is as good as it gets.
In real life hbase applications traffic would be much more uniformly
distributed among regions and this would be much less of an issue
perhaps.

I also suspect that using udp for short scans and gets might reduce
latency a bit as well.
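
For reference, the client side of the tcp_nodelay change above can be made on the client Configuration before tables are created; the property name below is an assumption for the 0.90/CDH3 era (verify the exact key against the HBaseClient source), and the region servers have a corresponding tcpnodelay switch in their RPC server configuration:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;

public class TcpNoDelayClient {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Assumed property name for the 0.90-era client; verify against HBaseClient source.
    // Disables Nagle's algorithm on the client's RPC sockets.
    conf.setBoolean("hbase.ipc.client.tcpnodelay", true);

    HTable table = new HTable(conf, "test_table");  // hypothetical table name
    // ... issue gets/scans as usual; RPC sockets created under this conf use TCP_NODELAY.
    table.close();
  }
}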

On Wed, Apr 20, 2011 at 3:05 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> So i can't seem to be able to immediately find the explanation for those metrics
>
> - rpcQueueTime -- do I assume it correctly it's the time a request
> sits waiting int the incoming rpc queue before being picked up by
> handler ?
>
> -rpcProcessingTime -- do i assume it correctly it's time of request
> being processed by region server's handler?
>
> So inner time to last byte should be approximately sum of those, right?
>
> Thanks.
> -Dmitriy
>
> On Wed, Apr 20, 2011 at 1:17 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>> Yes that's what i said. there's metric for fs latency but we are not
>> hitting it so it's not useful.
>>
>> Question is which one might be useful to measure inner ttlb, and i
>> don't see it there.
>>
>> On Wed, Apr 20, 2011 at 1:14 PM, Ted Dunning <td...@maprtech.com> wrote:
>>> FS latency shouldn't matter with your 99.9% cache hit rate as reported.
>>>
>>> On Wed, Apr 20, 2011 at 12:55 PM, Dmitriy Lyubimov <dl...@gmail.com>wrote:
>>>
>>>> Yes -- I already looked thru 'regionserver' metrics some time ago in
>>>> hbase book. And i am not sure there's a 'inner ttlb' metric.
>>>>
>>>> There are fs latency metrics there but nothing for the respons times.
>>>> fs latency is essentially hdfs latency AFAICT and that would not be
>>>> relevant to what i am asking for (for as long as we are hitting LRU
>>>> block cache anyway). we are not hitting fs.
>>>>
>>>> Unless there are more metrics than listed in the Hbase Book?
>>>>
>>>>
>>>> On Wed, Apr 20, 2011 at 12:46 PM, Stack <st...@duboce.net> wrote:
>>>> > Enable rpc logging.  Will show in your ganglia.  See metrics article
>>>> > on hbase home page.
>>>> >
>>>> > On Wed, Apr 20, 2011 at 12:44 PM, Dmitriy Lyubimov <dl...@gmail.com>
>>>> wrote:
>>>> >> Is there any way to log 'inner' TTLB times the region server incurs for
>>>> reads?
>>>> >>
>>>> >>
>>>> >> On Wed, Apr 20, 2011 at 12:43 PM, Dmitriy Lyubimov <dl...@gmail.com>
>>>> wrote:
>>>> >>> i just enabled debug logging for o.a.h.hbase logger in that particular
>>>> >>> region server... so far not much except for LRUBlock cache spitting
>>>> >>> metrics ..
>>>> >>>
>>>> >>> 2011-04-20 12:28:48,375 DEBUG
>>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=55732209,
>>>> >>> hits=55732083, hitRatio=99.99%%, cachingAccesses=55732195,
>>>> >>> cachingHits=55732083, cachingHitsRatio=99.99%%, evictions=0,
>>>> >>> evicted=0, evictedPerRun=NaN
>>>> >>> 2011-04-20 12:33:48,375 DEBUG
>>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=56703200,
>>>> >>> hits=56703074, hitRatio=99.99%%, cachingAccesses=56703186,
>>>> >>> cachingHits=56703074, cachingHitsRatio=99.99%%, evictions=0,
>>>> >>> evicted=0, evictedPerRun=NaN
>>>> >>> 2011-04-20 12:38:48,375 DEBUG
>>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=57708231,
>>>> >>> hits=57708105, hitRatio=99.99%%, cachingAccesses=57708217,
>>>> >>> cachingHits=57708105, cachingHitsRatio=99.99%%, evictions=0,
>>>> >>> evicted=0, evictedPerRun=NaN
>>>> >>>
>>>> >>>
>>>> >>> On Wed, Apr 20, 2011 at 12:35 PM, Stack <st...@duboce.net> wrote:
>>>> >>>> If one region only, then its located on a single regionserver.  Tail
>>>> >>>> that regionservers logs.  It might tell us something.
>>>> >>>> St.Ack
>>>> >>>>
>>>> >>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
>>>> >>>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
>>>> >>>>>> On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov <
>>>> dlyubimov@apache.org> wrote:
>>>> >>>>>>> Right now i am shooting scans returning between 3 and 40 rows and
>>>> >>>>>>> regardless of data size, approximately 500-400 QPS. The data tables
>>>> >>>>>>> are almost empty and in-memory, so they surely should fit in those
>>>> 40%
>>>> >>>>>>> heap dedicated to them.
>>>> >>>>>>>
>>>> >>>>>>
>>>> >>>>>> How many clients are going against the cluster?  If you use less, do
>>>> >>>>>> your numbers improve?
>>>> >>>>>>
>>>> >>>>>
>>>> >>>>> And all these clients are going against a single 40 row table?
>>>> >>>>> St.Ack
>>>> >>>>>
>>>> >>>>
>>>> >>>
>>>> >>
>>>> >
>>>>
>>>
>>
>

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
So I can't seem to immediately find an explanation for these metrics:

- rpcQueueTime -- am I correct in assuming it's the time a request
sits waiting in the incoming RPC queue before being picked up by a
handler?

- rpcProcessingTime -- am I correct in assuming it's the time a request
spends being processed by the region server's handler?

So the inner time to last byte should be approximately the sum of those, right?

Thanks.
-Dmitriy
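
As a rough decomposition (an assumption based on the metric names above, not something stated elsewhere in this thread):

inner TTLB (server side)  ~  rpcQueueTime + rpcProcessingTime
client-observed TTLB      ~  inner TTLB + network transfer + client-side deserialization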

On Wed, Apr 20, 2011 at 1:17 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> Yes that's what i said. there's metric for fs latency but we are not
> hitting it so it's not useful.
>
> Question is which one might be useful to measure inner ttlb, and i
> don't see it there.
>
> On Wed, Apr 20, 2011 at 1:14 PM, Ted Dunning <td...@maprtech.com> wrote:
>> FS latency shouldn't matter with your 99.9% cache hit rate as reported.
>>
>> On Wed, Apr 20, 2011 at 12:55 PM, Dmitriy Lyubimov <dl...@gmail.com>wrote:
>>
>>> Yes -- I already looked thru 'regionserver' metrics some time ago in
>>> hbase book. And i am not sure there's a 'inner ttlb' metric.
>>>
>>> There are fs latency metrics there but nothing for the respons times.
>>> fs latency is essentially hdfs latency AFAICT and that would not be
>>> relevant to what i am asking for (for as long as we are hitting LRU
>>> block cache anyway). we are not hitting fs.
>>>
>>> Unless there are more metrics than listed in the Hbase Book?
>>>
>>>
>>> On Wed, Apr 20, 2011 at 12:46 PM, Stack <st...@duboce.net> wrote:
>>> > Enable rpc logging.  Will show in your ganglia.  See metrics article
>>> > on hbase home page.
>>> >
>>> > On Wed, Apr 20, 2011 at 12:44 PM, Dmitriy Lyubimov <dl...@gmail.com>
>>> wrote:
>>> >> Is there any way to log 'inner' TTLB times the region server incurs for
>>> reads?
>>> >>
>>> >>
>>> >> On Wed, Apr 20, 2011 at 12:43 PM, Dmitriy Lyubimov <dl...@gmail.com>
>>> wrote:
>>> >>> i just enabled debug logging for o.a.h.hbase logger in that particular
>>> >>> region server... so far not much except for LRUBlock cache spitting
>>> >>> metrics ..
>>> >>>
>>> >>> 2011-04-20 12:28:48,375 DEBUG
>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=55732209,
>>> >>> hits=55732083, hitRatio=99.99%%, cachingAccesses=55732195,
>>> >>> cachingHits=55732083, cachingHitsRatio=99.99%%, evictions=0,
>>> >>> evicted=0, evictedPerRun=NaN
>>> >>> 2011-04-20 12:33:48,375 DEBUG
>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=56703200,
>>> >>> hits=56703074, hitRatio=99.99%%, cachingAccesses=56703186,
>>> >>> cachingHits=56703074, cachingHitsRatio=99.99%%, evictions=0,
>>> >>> evicted=0, evictedPerRun=NaN
>>> >>> 2011-04-20 12:38:48,375 DEBUG
>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=57708231,
>>> >>> hits=57708105, hitRatio=99.99%%, cachingAccesses=57708217,
>>> >>> cachingHits=57708105, cachingHitsRatio=99.99%%, evictions=0,
>>> >>> evicted=0, evictedPerRun=NaN
>>> >>>
>>> >>>
>>> >>> On Wed, Apr 20, 2011 at 12:35 PM, Stack <st...@duboce.net> wrote:
>>> >>>> If one region only, then its located on a single regionserver.  Tail
>>> >>>> that regionservers logs.  It might tell us something.
>>> >>>> St.Ack
>>> >>>>
>>> >>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
>>> >>>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
>>> >>>>>> On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov <
>>> dlyubimov@apache.org> wrote:
>>> >>>>>>> Right now i am shooting scans returning between 3 and 40 rows and
>>> >>>>>>> regardless of data size, approximately 500-400 QPS. The data tables
>>> >>>>>>> are almost empty and in-memory, so they surely should fit in those
>>> 40%
>>> >>>>>>> heap dedicated to them.
>>> >>>>>>>
>>> >>>>>>
>>> >>>>>> How many clients are going against the cluster?  If you use less, do
>>> >>>>>> your numbers improve?
>>> >>>>>>
>>> >>>>>
>>> >>>>> And all these clients are going against a single 40 row table?
>>> >>>>> St.Ack
>>> >>>>>
>>> >>>>
>>> >>>
>>> >>
>>> >
>>>
>>
>

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Yes, that's what I said: there's a metric for FS latency, but we are not
hitting the FS, so it's not useful.

The question is which metric might be useful for measuring inner TTLB, and I
don't see one there.

On Wed, Apr 20, 2011 at 1:14 PM, Ted Dunning <td...@maprtech.com> wrote:
> FS latency shouldn't matter with your 99.9% cache hit rate as reported.
>
> On Wed, Apr 20, 2011 at 12:55 PM, Dmitriy Lyubimov <dl...@gmail.com>wrote:
>
>> Yes -- I already looked thru 'regionserver' metrics some time ago in
>> hbase book. And i am not sure there's a 'inner ttlb' metric.
>>
>> There are fs latency metrics there but nothing for the respons times.
>> fs latency is essentially hdfs latency AFAICT and that would not be
>> relevant to what i am asking for (for as long as we are hitting LRU
>> block cache anyway). we are not hitting fs.
>>
>> Unless there are more metrics than listed in the Hbase Book?
>>
>>
>> On Wed, Apr 20, 2011 at 12:46 PM, Stack <st...@duboce.net> wrote:
>> > Enable rpc logging.  Will show in your ganglia.  See metrics article
>> > on hbase home page.
>> >
>> > On Wed, Apr 20, 2011 at 12:44 PM, Dmitriy Lyubimov <dl...@gmail.com>
>> wrote:
>> >> Is there any way to log 'inner' TTLB times the region server incurs for
>> reads?
>> >>
>> >>
>> >> On Wed, Apr 20, 2011 at 12:43 PM, Dmitriy Lyubimov <dl...@gmail.com>
>> wrote:
>> >>> i just enabled debug logging for o.a.h.hbase logger in that particular
>> >>> region server... so far not much except for LRUBlock cache spitting
>> >>> metrics ..
>> >>>
>> >>> 2011-04-20 12:28:48,375 DEBUG
>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=55732209,
>> >>> hits=55732083, hitRatio=99.99%%, cachingAccesses=55732195,
>> >>> cachingHits=55732083, cachingHitsRatio=99.99%%, evictions=0,
>> >>> evicted=0, evictedPerRun=NaN
>> >>> 2011-04-20 12:33:48,375 DEBUG
>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=56703200,
>> >>> hits=56703074, hitRatio=99.99%%, cachingAccesses=56703186,
>> >>> cachingHits=56703074, cachingHitsRatio=99.99%%, evictions=0,
>> >>> evicted=0, evictedPerRun=NaN
>> >>> 2011-04-20 12:38:48,375 DEBUG
>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=57708231,
>> >>> hits=57708105, hitRatio=99.99%%, cachingAccesses=57708217,
>> >>> cachingHits=57708105, cachingHitsRatio=99.99%%, evictions=0,
>> >>> evicted=0, evictedPerRun=NaN
>> >>>
>> >>>
>> >>> On Wed, Apr 20, 2011 at 12:35 PM, Stack <st...@duboce.net> wrote:
>> >>>> If one region only, then its located on a single regionserver.  Tail
>> >>>> that regionservers logs.  It might tell us something.
>> >>>> St.Ack
>> >>>>
>> >>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
>> >>>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
>> >>>>>> On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov <
>> dlyubimov@apache.org> wrote:
>> >>>>>>> Right now i am shooting scans returning between 3 and 40 rows and
>> >>>>>>> regardless of data size, approximately 500-400 QPS. The data tables
>> >>>>>>> are almost empty and in-memory, so they surely should fit in those
>> 40%
>> >>>>>>> heap dedicated to them.
>> >>>>>>>
>> >>>>>>
>> >>>>>> How many clients are going against the cluster?  If you use less, do
>> >>>>>> your numbers improve?
>> >>>>>>
>> >>>>>
>> >>>>> And all these clients are going against a single 40 row table?
>> >>>>> St.Ack
>> >>>>>
>> >>>>
>> >>>
>> >>
>> >
>>
>

Re: 0.90 latency performance, cdh3b4

Posted by Ted Dunning <td...@maprtech.com>.
Yes.  In the sense that it measures time until the operation is complete
according to the client.

And assuming that TTLB = time to last bit.

YCSB is, however, a frail vessel.  I have been unable to stress even
moderate-sized clusters with it.  It is fine as a starting point.
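
To make that concrete, the per-request timing I mean is roughly the
following (a rough sketch against the 0.90 client API; the table name,
key range and caching value below are made up):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class ScanTtlb {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "YOUR_TABLE");   // made-up table name
    Scan scan = new Scan(Bytes.toBytes("row-000"),   // made-up key range
                         Bytes.toBytes("row-999"));
    scan.setCaching(100);                            // same caching as in the test

    long start = System.nanoTime();
    ResultScanner scanner = table.getScanner(scan);
    int rows = 0;
    for (Result r = scanner.next(); r != null; r = scanner.next()) {
      rows++;                                        // consume every row ("last bit")
    }
    scanner.close();
    long ttlbMs = (System.nanoTime() - start) / 1000000L;
    System.out.println(rows + " rows in " + ttlbMs + " ms");
    table.close();
  }
}

The clock only stops once the last Result has been consumed, so scanner
setup against each region, the RPCs and deserialization all land inside it.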

On Wed, Apr 20, 2011 at 1:38 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> btw, Ted, your version of YCSB in github should show TTLBs, right?
>
> On Wed, Apr 20, 2011 at 1:14 PM, Ted Dunning <td...@maprtech.com> wrote:
>> FS latency shouldn't matter with your 99.9% cache hit rate as reported.
>>
>> On Wed, Apr 20, 2011 at 12:55 PM, Dmitriy Lyubimov <dl...@gmail.com>wrote:
>>
>>> Yes -- I already looked thru 'regionserver' metrics some time ago in
>>> hbase book. And i am not sure there's a 'inner ttlb' metric.
>>>
>>> There are fs latency metrics there but nothing for the respons times.
>>> fs latency is essentially hdfs latency AFAICT and that would not be
>>> relevant to what i am asking for (for as long as we are hitting LRU
>>> block cache anyway). we are not hitting fs.
>>>
>>> Unless there are more metrics than listed in the Hbase Book?
>>>
>>>
>>> On Wed, Apr 20, 2011 at 12:46 PM, Stack <st...@duboce.net> wrote:
>>> > Enable rpc logging.  Will show in your ganglia.  See metrics article
>>> > on hbase home page.
>>> >
>>> > On Wed, Apr 20, 2011 at 12:44 PM, Dmitriy Lyubimov <dl...@gmail.com>
>>> wrote:
>>> >> Is there any way to log 'inner' TTLB times the region server incurs for
>>> reads?
>>> >>
>>> >>
>>> >> On Wed, Apr 20, 2011 at 12:43 PM, Dmitriy Lyubimov <dl...@gmail.com>
>>> wrote:
>>> >>> i just enabled debug logging for o.a.h.hbase logger in that particular
>>> >>> region server... so far not much except for LRUBlock cache spitting
>>> >>> metrics ..
>>> >>>
>>> >>> 2011-04-20 12:28:48,375 DEBUG
>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=55732209,
>>> >>> hits=55732083, hitRatio=99.99%%, cachingAccesses=55732195,
>>> >>> cachingHits=55732083, cachingHitsRatio=99.99%%, evictions=0,
>>> >>> evicted=0, evictedPerRun=NaN
>>> >>> 2011-04-20 12:33:48,375 DEBUG
>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=56703200,
>>> >>> hits=56703074, hitRatio=99.99%%, cachingAccesses=56703186,
>>> >>> cachingHits=56703074, cachingHitsRatio=99.99%%, evictions=0,
>>> >>> evicted=0, evictedPerRun=NaN
>>> >>> 2011-04-20 12:38:48,375 DEBUG
>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=57708231,
>>> >>> hits=57708105, hitRatio=99.99%%, cachingAccesses=57708217,
>>> >>> cachingHits=57708105, cachingHitsRatio=99.99%%, evictions=0,
>>> >>> evicted=0, evictedPerRun=NaN
>>> >>>
>>> >>>
>>> >>> On Wed, Apr 20, 2011 at 12:35 PM, Stack <st...@duboce.net> wrote:
>>> >>>> If one region only, then its located on a single regionserver.  Tail
>>> >>>> that regionservers logs.  It might tell us something.
>>> >>>> St.Ack
>>> >>>>
>>> >>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
>>> >>>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
>>> >>>>>> On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov <
>>> dlyubimov@apache.org> wrote:
>>> >>>>>>> Right now i am shooting scans returning between 3 and 40 rows and
>>> >>>>>>> regardless of data size, approximately 500-400 QPS. The data tables
>>> >>>>>>> are almost empty and in-memory, so they surely should fit in those
>>> 40%
>>> >>>>>>> heap dedicated to them.
>>> >>>>>>>
>>> >>>>>>
>>> >>>>>> How many clients are going against the cluster?  If you use less, do
>>> >>>>>> your numbers improve?
>>> >>>>>>
>>> >>>>>
>>> >>>>> And all these clients are going against a single 40 row table?
>>> >>>>> St.Ack
>>> >>>>>
>>> >>>>
>>> >>>
>>> >>
>>> >
>>>
>>
>

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
By the way, Ted, your version of YCSB on GitHub should show TTLBs, right?

On Wed, Apr 20, 2011 at 1:14 PM, Ted Dunning <td...@maprtech.com> wrote:
> FS latency shouldn't matter with your 99.9% cache hit rate as reported.
>
> On Wed, Apr 20, 2011 at 12:55 PM, Dmitriy Lyubimov <dl...@gmail.com>wrote:
>
>> Yes -- I already looked thru 'regionserver' metrics some time ago in
>> hbase book. And i am not sure there's a 'inner ttlb' metric.
>>
>> There are fs latency metrics there but nothing for the respons times.
>> fs latency is essentially hdfs latency AFAICT and that would not be
>> relevant to what i am asking for (for as long as we are hitting LRU
>> block cache anyway). we are not hitting fs.
>>
>> Unless there are more metrics than listed in the Hbase Book?
>>
>>
>> On Wed, Apr 20, 2011 at 12:46 PM, Stack <st...@duboce.net> wrote:
>> > Enable rpc logging.  Will show in your ganglia.  See metrics article
>> > on hbase home page.
>> >
>> > On Wed, Apr 20, 2011 at 12:44 PM, Dmitriy Lyubimov <dl...@gmail.com>
>> wrote:
>> >> Is there any way to log 'inner' TTLB times the region server incurs for
>> reads?
>> >>
>> >>
>> >> On Wed, Apr 20, 2011 at 12:43 PM, Dmitriy Lyubimov <dl...@gmail.com>
>> wrote:
>> >>> i just enabled debug logging for o.a.h.hbase logger in that particular
>> >>> region server... so far not much except for LRUBlock cache spitting
>> >>> metrics ..
>> >>>
>> >>> 2011-04-20 12:28:48,375 DEBUG
>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=55732209,
>> >>> hits=55732083, hitRatio=99.99%%, cachingAccesses=55732195,
>> >>> cachingHits=55732083, cachingHitsRatio=99.99%%, evictions=0,
>> >>> evicted=0, evictedPerRun=NaN
>> >>> 2011-04-20 12:33:48,375 DEBUG
>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=56703200,
>> >>> hits=56703074, hitRatio=99.99%%, cachingAccesses=56703186,
>> >>> cachingHits=56703074, cachingHitsRatio=99.99%%, evictions=0,
>> >>> evicted=0, evictedPerRun=NaN
>> >>> 2011-04-20 12:38:48,375 DEBUG
>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=57708231,
>> >>> hits=57708105, hitRatio=99.99%%, cachingAccesses=57708217,
>> >>> cachingHits=57708105, cachingHitsRatio=99.99%%, evictions=0,
>> >>> evicted=0, evictedPerRun=NaN
>> >>>
>> >>>
>> >>> On Wed, Apr 20, 2011 at 12:35 PM, Stack <st...@duboce.net> wrote:
>> >>>> If one region only, then its located on a single regionserver.  Tail
>> >>>> that regionservers logs.  It might tell us something.
>> >>>> St.Ack
>> >>>>
>> >>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
>> >>>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
>> >>>>>> On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov <
>> dlyubimov@apache.org> wrote:
>> >>>>>>> Right now i am shooting scans returning between 3 and 40 rows and
>> >>>>>>> regardless of data size, approximately 500-400 QPS. The data tables
>> >>>>>>> are almost empty and in-memory, so they surely should fit in those
>> 40%
>> >>>>>>> heap dedicated to them.
>> >>>>>>>
>> >>>>>>
>> >>>>>> How many clients are going against the cluster?  If you use less, do
>> >>>>>> your numbers improve?
>> >>>>>>
>> >>>>>
>> >>>>> And all these clients are going against a single 40 row table?
>> >>>>> St.Ack
>> >>>>>
>> >>>>
>> >>>
>> >>
>> >
>>
>

Re: 0.90 latency performance, cdh3b4

Posted by Ted Dunning <td...@maprtech.com>.
FS latency shouldn't matter with your 99.9% cache hit rate as reported.

On Wed, Apr 20, 2011 at 12:55 PM, Dmitriy Lyubimov <dl...@gmail.com>wrote:

> Yes -- I already looked thru 'regionserver' metrics some time ago in
> hbase book. And i am not sure there's a 'inner ttlb' metric.
>
> There are fs latency metrics there but nothing for the respons times.
> fs latency is essentially hdfs latency AFAICT and that would not be
> relevant to what i am asking for (for as long as we are hitting LRU
> block cache anyway). we are not hitting fs.
>
> Unless there are more metrics than listed in the Hbase Book?
>
>
> On Wed, Apr 20, 2011 at 12:46 PM, Stack <st...@duboce.net> wrote:
> > Enable rpc logging.  Will show in your ganglia.  See metrics article
> > on hbase home page.
> >
> > On Wed, Apr 20, 2011 at 12:44 PM, Dmitriy Lyubimov <dl...@gmail.com>
> wrote:
> >> Is there any way to log 'inner' TTLB times the region server incurs for
> reads?
> >>
> >>
> >> On Wed, Apr 20, 2011 at 12:43 PM, Dmitriy Lyubimov <dl...@gmail.com>
> wrote:
> >>> i just enabled debug logging for o.a.h.hbase logger in that particular
> >>> region server... so far not much except for LRUBlock cache spitting
> >>> metrics ..
> >>>
> >>> 2011-04-20 12:28:48,375 DEBUG
> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=55732209,
> >>> hits=55732083, hitRatio=99.99%%, cachingAccesses=55732195,
> >>> cachingHits=55732083, cachingHitsRatio=99.99%%, evictions=0,
> >>> evicted=0, evictedPerRun=NaN
> >>> 2011-04-20 12:33:48,375 DEBUG
> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=56703200,
> >>> hits=56703074, hitRatio=99.99%%, cachingAccesses=56703186,
> >>> cachingHits=56703074, cachingHitsRatio=99.99%%, evictions=0,
> >>> evicted=0, evictedPerRun=NaN
> >>> 2011-04-20 12:38:48,375 DEBUG
> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=57708231,
> >>> hits=57708105, hitRatio=99.99%%, cachingAccesses=57708217,
> >>> cachingHits=57708105, cachingHitsRatio=99.99%%, evictions=0,
> >>> evicted=0, evictedPerRun=NaN
> >>>
> >>>
> >>> On Wed, Apr 20, 2011 at 12:35 PM, Stack <st...@duboce.net> wrote:
> >>>> If one region only, then its located on a single regionserver.  Tail
> >>>> that regionservers logs.  It might tell us something.
> >>>> St.Ack
> >>>>
> >>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
> >>>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
> >>>>>> On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov <
> dlyubimov@apache.org> wrote:
> >>>>>>> Right now i am shooting scans returning between 3 and 40 rows and
> >>>>>>> regardless of data size, approximately 500-400 QPS. The data tables
> >>>>>>> are almost empty and in-memory, so they surely should fit in those
> 40%
> >>>>>>> heap dedicated to them.
> >>>>>>>
> >>>>>>
> >>>>>> How many clients are going against the cluster?  If you use less, do
> >>>>>> your numbers improve?
> >>>>>>
> >>>>>
> >>>>> And all these clients are going against a single 40 row table?
> >>>>> St.Ack
> >>>>>
> >>>>
> >>>
> >>
> >
>

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Yes -- I already looked through the 'regionserver' metrics some time ago in
the HBase book, and I am not sure there's an 'inner TTLB' metric.

There are fs latency metrics there but nothing for the response times.
The fs latency is essentially HDFS latency AFAICT, and that would not be
relevant to what I am asking for (as long as we are hitting the LRU
block cache anyway); we are not hitting the fs.

Unless there are more metrics than those listed in the HBase Book?


On Wed, Apr 20, 2011 at 12:46 PM, Stack <st...@duboce.net> wrote:
> Enable rpc logging.  Will show in your ganglia.  See metrics article
> on hbase home page.
>
> On Wed, Apr 20, 2011 at 12:44 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>> Is there any way to log 'inner' TTLB times the region server incurs for reads?
>>
>>
>> On Wed, Apr 20, 2011 at 12:43 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>>> i just enabled debug logging for o.a.h.hbase logger in that particular
>>> region server... so far not much except for LRUBlock cache spitting
>>> metrics ..
>>>
>>> 2011-04-20 12:28:48,375 DEBUG
>>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=55732209,
>>> hits=55732083, hitRatio=99.99%%, cachingAccesses=55732195,
>>> cachingHits=55732083, cachingHitsRatio=99.99%%, evictions=0,
>>> evicted=0, evictedPerRun=NaN
>>> 2011-04-20 12:33:48,375 DEBUG
>>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=56703200,
>>> hits=56703074, hitRatio=99.99%%, cachingAccesses=56703186,
>>> cachingHits=56703074, cachingHitsRatio=99.99%%, evictions=0,
>>> evicted=0, evictedPerRun=NaN
>>> 2011-04-20 12:38:48,375 DEBUG
>>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=57708231,
>>> hits=57708105, hitRatio=99.99%%, cachingAccesses=57708217,
>>> cachingHits=57708105, cachingHitsRatio=99.99%%, evictions=0,
>>> evicted=0, evictedPerRun=NaN
>>>
>>>
>>> On Wed, Apr 20, 2011 at 12:35 PM, Stack <st...@duboce.net> wrote:
>>>> If one region only, then its located on a single regionserver.  Tail
>>>> that regionservers logs.  It might tell us something.
>>>> St.Ack
>>>>
>>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
>>>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
>>>>>> On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov <dl...@apache.org> wrote:
>>>>>>> Right now i am shooting scans returning between 3 and 40 rows and
>>>>>>> regardless of data size, approximately 500-400 QPS. The data tables
>>>>>>> are almost empty and in-memory, so they surely should fit in those 40%
>>>>>>> heap dedicated to them.
>>>>>>>
>>>>>>
>>>>>> How many clients are going against the cluster?  If you use less, do
>>>>>> your numbers improve?
>>>>>>
>>>>>
>>>>> And all these clients are going against a single 40 row table?
>>>>> St.Ack
>>>>>
>>>>
>>>
>>
>

Re: 0.90 latency performance, cdh3b4

Posted by Stack <st...@duboce.net>.
Enable RPC logging.  It will show up in your Ganglia.  See the metrics
article on the HBase home page.
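
Concretely, that means the Ganglia contexts in conf/hadoop-metrics.properties
on the regionservers, something along these lines (a sketch only; check the
metrics article for the exact context names your version ships, and
substitute your own gmetad host for GMETAD_HOST):

  rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext
  rpc.period=10
  rpc.servers=GMETAD_HOST:8649
  hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext
  hbase.period=10
  hbase.servers=GMETAD_HOST:8649

Restart the regionservers after editing it and the rpc and regionserver
metrics should start showing up in Ganglia.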

On Wed, Apr 20, 2011 at 12:44 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> Is there any way to log 'inner' TTLB times the region server incurs for reads?
>
>
> On Wed, Apr 20, 2011 at 12:43 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>> i just enabled debug logging for o.a.h.hbase logger in that particular
>> region server... so far not much except for LRUBlock cache spitting
>> metrics ..
>>
>> 2011-04-20 12:28:48,375 DEBUG
>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=55732209,
>> hits=55732083, hitRatio=99.99%%, cachingAccesses=55732195,
>> cachingHits=55732083, cachingHitsRatio=99.99%%, evictions=0,
>> evicted=0, evictedPerRun=NaN
>> 2011-04-20 12:33:48,375 DEBUG
>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=56703200,
>> hits=56703074, hitRatio=99.99%%, cachingAccesses=56703186,
>> cachingHits=56703074, cachingHitsRatio=99.99%%, evictions=0,
>> evicted=0, evictedPerRun=NaN
>> 2011-04-20 12:38:48,375 DEBUG
>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=57708231,
>> hits=57708105, hitRatio=99.99%%, cachingAccesses=57708217,
>> cachingHits=57708105, cachingHitsRatio=99.99%%, evictions=0,
>> evicted=0, evictedPerRun=NaN
>>
>>
>> On Wed, Apr 20, 2011 at 12:35 PM, Stack <st...@duboce.net> wrote:
>>> If one region only, then its located on a single regionserver.  Tail
>>> that regionservers logs.  It might tell us something.
>>> St.Ack
>>>
>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
>>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
>>>>> On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov <dl...@apache.org> wrote:
>>>>>> Right now i am shooting scans returning between 3 and 40 rows and
>>>>>> regardless of data size, approximately 500-400 QPS. The data tables
>>>>>> are almost empty and in-memory, so they surely should fit in those 40%
>>>>>> heap dedicated to them.
>>>>>>
>>>>>
>>>>> How many clients are going against the cluster?  If you use less, do
>>>>> your numbers improve?
>>>>>
>>>>
>>>> And all these clients are going against a single 40 row table?
>>>> St.Ack
>>>>
>>>
>>
>

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Is there any way to log 'inner' TTLB times the region server incurs for reads?


On Wed, Apr 20, 2011 at 12:43 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> i just enabled debug logging for o.a.h.hbase logger in that particular
> region server... so far not much except for LRUBlock cache spitting
> metrics ..
>
> 2011-04-20 12:28:48,375 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=55732209,
> hits=55732083, hitRatio=99.99%%, cachingAccesses=55732195,
> cachingHits=55732083, cachingHitsRatio=99.99%%, evictions=0,
> evicted=0, evictedPerRun=NaN
> 2011-04-20 12:33:48,375 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=56703200,
> hits=56703074, hitRatio=99.99%%, cachingAccesses=56703186,
> cachingHits=56703074, cachingHitsRatio=99.99%%, evictions=0,
> evicted=0, evictedPerRun=NaN
> 2011-04-20 12:38:48,375 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=57708231,
> hits=57708105, hitRatio=99.99%%, cachingAccesses=57708217,
> cachingHits=57708105, cachingHitsRatio=99.99%%, evictions=0,
> evicted=0, evictedPerRun=NaN
>
>
> On Wed, Apr 20, 2011 at 12:35 PM, Stack <st...@duboce.net> wrote:
>> If one region only, then its located on a single regionserver.  Tail
>> that regionservers logs.  It might tell us something.
>> St.Ack
>>
>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
>>>> On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov <dl...@apache.org> wrote:
>>>>> Right now i am shooting scans returning between 3 and 40 rows and
>>>>> regardless of data size, approximately 500-400 QPS. The data tables
>>>>> are almost empty and in-memory, so they surely should fit in those 40%
>>>>> heap dedicated to them.
>>>>>
>>>>
>>>> How many clients are going against the cluster?  If you use less, do
>>>> your numbers improve?
>>>>
>>>
>>> And all these clients are going against a single 40 row table?
>>> St.Ack
>>>
>>
>

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
I just enabled debug logging for the o.a.h.hbase logger on that particular
region server... so far not much, except for the LruBlockCache spitting out
its metrics...

2011-04-20 12:28:48,375 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=55732209,
hits=55732083, hitRatio=99.99%%, cachingAccesses=55732195,
cachingHits=55732083, cachingHitsRatio=99.99%%, evictions=0,
evicted=0, evictedPerRun=NaN
2011-04-20 12:33:48,375 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=56703200,
hits=56703074, hitRatio=99.99%%, cachingAccesses=56703186,
cachingHits=56703074, cachingHitsRatio=99.99%%, evictions=0,
evicted=0, evictedPerRun=NaN
2011-04-20 12:38:48,375 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=57708231,
hits=57708105, hitRatio=99.99%%, cachingAccesses=57708217,
cachingHits=57708105, cachingHitsRatio=99.99%%, evictions=0,
evicted=0, evictedPerRun=NaN


On Wed, Apr 20, 2011 at 12:35 PM, Stack <st...@duboce.net> wrote:
> If one region only, then its located on a single regionserver.  Tail
> that regionservers logs.  It might tell us something.
> St.Ack
>
> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
>>> On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov <dl...@apache.org> wrote:
>>>> Right now i am shooting scans returning between 3 and 40 rows and
>>>> regardless of data size, approximately 500-400 QPS. The data tables
>>>> are almost empty and in-memory, so they surely should fit in those 40%
>>>> heap dedicated to them.
>>>>
>>>
>>> How many clients are going against the cluster?  If you use less, do
>>> your numbers improve?
>>>
>>
>> And all these clients are going against a single 40 row table?
>> St.Ack
>>
>

Re: 0.90 latency performance, cdh3b4

Posted by Stack <st...@duboce.net>.
If it is one region only, then it's located on a single regionserver.  Tail
that regionserver's logs.  It might tell us something.
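
For example (the log file name pattern depends on the user hbase runs as
and on the host name, hence the wildcard):

  tail -f $HBASE_HOME/logs/hbase-*-regionserver-YOURHOST.log
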
St.Ack

On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
>> On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov <dl...@apache.org> wrote:
>>> Right now i am shooting scans returning between 3 and 40 rows and
>>> regardless of data size, approximately 500-400 QPS. The data tables
>>> are almost empty and in-memory, so they surely should fit in those 40%
>>> heap dedicated to them.
>>>
>>
>> How many clients are going against the cluster?  If you use less, do
>> your numbers improve?
>>
>
> And all these clients are going against a single 40 row table?
> St.Ack
>

Re: 0.90 latency performance, cdh3b4

Posted by Stack <st...@duboce.net>.
On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net> wrote:
> On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov <dl...@apache.org> wrote:
>> Right now i am shooting scans returning between 3 and 40 rows and
>> regardless of data size, approximately 500-400 QPS. The data tables
>> are almost empty and in-memory, so they surely should fit in those 40%
>> heap dedicated to them.
>>
>
> How many clients are going against the cluster?  If you use less, do
> your numbers improve?
>

And all these clients are going against a single 40 row table?
St.Ack

Re: 0.90 latency performance, cdh3b4

Posted by Stack <st...@duboce.net>.
On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov <dl...@apache.org> wrote:
> Right now i am shooting scans returning between 3 and 40 rows and
> regardless of data size, approximately 500-400 QPS. The data tables
> are almost empty and in-memory, so they surely should fit in those 40%
> heap dedicated to them.
>

How many clients are going against the cluster?  If you use fewer, do
your numbers improve?

St.Ack

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Also, we had another cluster running previous CDH versions with
pre-0.89 HBase, and the latencies weren't nearly as bad.

On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> PS so what should latency be for reads in 0.90, assuming moderate thruput?
>
> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>> for this test, there's just no more than 40 rows in every given table.
>> This is just a laugh check.
>>
>> so i think it's safe to assume it all goes to same region server.
>>
>> But latency would not depend on which server call is going to, would
>> it? Only throughput would, assuming we are not overloading.
>>
>> And we clearly are not as my single-node local version runs quite ok
>> response times with the same throughput.
>>
>> It's something with either client connections or network latency or
>> ... i don't know what it is. I did not set up the cluster but i gotta
>> troubleshoot it now :)
>>
>>
>>
>> On Tue, Apr 19, 2011 at 5:23 PM, Ted Dunning <td...@maprtech.com> wrote:
>>> How many regions?  How are they distributed?
>>>
>>> Typically it is good to fill the table some what and then drive some
>>> splits and balance operations via the shell.  One more split to make
>>> the regions be local and you should be good to go.  Make sure you have
>>> enough keys in the table to support these splits, of course.
>>>
>>> Under load, you can look at the hbase home page to see how
>>> transactions are spread around your cluster.  Without splits and local
>>> region files, you aren't going to see what you want in terms of
>>> performance.
>>>
>>
>

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
8th percentile, sorry: 8% of requests do land at 3ms or less.

On Wed, Apr 20, 2011 at 12:06 PM, Ted Dunning <td...@maprtech.com> wrote:
> What is meant by 8% quartile?  75th %-ile?  98%-ile?  Should quartile have
> been quantile?
>
> On Wed, Apr 20, 2011 at 12:00 PM, Dmitriy Lyubimov <dl...@gmail.com>wrote:
>
>> Ok actually we do have 1 region for these exact tables... so back to
>> square one.
>>
>> FWIW i do get 8% quartile under 3ms TTLB. So it is algorithmically
>> sound it seems. question is why outliers spread is so much longer than
>> in tests on one machine. must be network. What else.
>>
>>
>> On Wed, Apr 20, 2011 at 10:06 AM, Dmitriy Lyubimov <dl...@gmail.com>
>> wrote:
>> > Got it. This must be the reason. Cause it is a laugh check, and i do
>> > see 6 regions for 40 rows so it can span them, although i can't
>> > confirm it for sure. It may be due to how table was set up or due to
>> > some time running them and rotating some data there. The uniformly
>> > distributed hashes are used for the keys so that it is totally
>> > plausible 40 rows will get into 6 different regions.
>> >
>> > Ok i'll take it for working theory for now.
>> >
>> > Is there a way to set max # of regions per table? I guess the method
>> > in the manual is to set max region size. Which means i probably need
>> > to re-create the table with one region to get back to 1 region? or
>> > maybe there's a way to get it back to one region without recreating
>> > it, such as major compaction?
>> >
>> > thanks.
>> > -d
>> >
>> > On Wed, Apr 20, 2011 at 9:55 AM, Stack <st...@duboce.net> wrote:
>> >> On Wed, Apr 20, 2011 at 9:49 AM, Dmitriy Lyubimov <dl...@gmail.com>
>> wrote:
>> >>> Ok. Let me ask a question.
>> >>>
>> >>> When scan is performed and it obviously covers several regions, are
>> >>> scan performance calls done in sinchronous succession or they are done
>> >>> in parallel?
>> >>>
>> >>
>> >> The former.
>> >>
>> >>
>> >>> Assuming scan is returning 40 results but for some weird reason it
>> >>> goes to 6 regions and caching is set to 100 (so it can take all of
>> >>> them) are individual region request latencies summed or it would be
>> >>> max(region request latency)?
>> >>>
>> >>
>> >> Summed.
>> >>
>> >> The 40 rows are not contiguous in the same region?  If not, the cost
>> >> of client setting up new scanner against next region will be inline w/
>> >> your read timing (at least an rpc per region).
>> >>
>> >> St.Ack
>> >>
>> >>> Thank you very much.
>> >>> -D
>> >>>
>> >>> On Tue, Apr 19, 2011 at 6:28 PM, Ted Dunning <td...@maprtech.com>
>> wrote:
>> >>>> For a tiny test like this, everything should be in memory and latency
>> >>>> should be very low.
>> >>>>
>> >>>> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dl...@gmail.com>
>> wrote:
>> >>>>> PS so what should latency be for reads in 0.90, assuming moderate
>> thruput?
>> >>>>>
>> >>>>> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dl...@gmail.com>
>> wrote:
>> >>>>>> for this test, there's just no more than 40 rows in every given
>> table.
>> >>>>>> This is just a laugh check.
>> >>>>>>
>> >>>>>> so i think it's safe to assume it all goes to same region server.
>> >>>>>>
>> >>>>>> But latency would not depend on which server call is going to, would
>> >>>>>> it? Only throughput would, assuming we are not overloading.
>> >>>>>>
>> >>>>>> And we clearly are not as my single-node local version runs quite ok
>> >>>>>> response times with the same throughput.
>> >>>>>>
>> >>>>>> It's something with either client connections or network latency or
>> >>>>>> ... i don't know what it is. I did not set up the cluster but i
>> gotta
>> >>>>>> troubleshoot it now :)
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> On Tue, Apr 19, 2011 at 5:23 PM, Ted Dunning <td...@maprtech.com>
>> wrote:
>> >>>>>>> How many regions?  How are they distributed?
>> >>>>>>>
>> >>>>>>> Typically it is good to fill the table some what and then drive
>> some
>> >>>>>>> splits and balance operations via the shell.  One more split to
>> make
>> >>>>>>> the regions be local and you should be good to go.  Make sure you
>> have
>> >>>>>>> enough keys in the table to support these splits, of course.
>> >>>>>>>
>> >>>>>>> Under load, you can look at the hbase home page to see how
>> >>>>>>> transactions are spread around your cluster.  Without splits and
>> local
>> >>>>>>> region files, you aren't going to see what you want in terms of
>> >>>>>>> performance.
>> >>>>>>>
>> >>>>>>
>> >>>>>
>> >>>>
>> >>>
>> >>
>> >
>>
>

Re: 0.90 latency performance, cdh3b4

Posted by Ted Dunning <td...@maprtech.com>.
What is meant by 8% quartile?  75th %-ile?  98%-ile?  Should quartile have
been quantile?

On Wed, Apr 20, 2011 at 12:00 PM, Dmitriy Lyubimov <dl...@gmail.com>wrote:

> Ok actually we do have 1 region for these exact tables... so back to
> square one.
>
> FWIW i do get 8% quartile under 3ms TTLB. So it is algorithmically
> sound it seems. question is why outliers spread is so much longer than
> in tests on one machine. must be network. What else.
>
>
> On Wed, Apr 20, 2011 at 10:06 AM, Dmitriy Lyubimov <dl...@gmail.com>
> wrote:
> > Got it. This must be the reason. Cause it is a laugh check, and i do
> > see 6 regions for 40 rows so it can span them, although i can't
> > confirm it for sure. It may be due to how table was set up or due to
> > some time running them and rotating some data there. The uniformly
> > distributed hashes are used for the keys so that it is totally
> > plausible 40 rows will get into 6 different regions.
> >
> > Ok i'll take it for working theory for now.
> >
> > Is there a way to set max # of regions per table? I guess the method
> > in the manual is to set max region size. Which means i probably need
> > to re-create the table with one region to get back to 1 region? or
> > maybe there's a way to get it back to one region without recreating
> > it, such as major compaction?
> >
> > thanks.
> > -d
> >
> > On Wed, Apr 20, 2011 at 9:55 AM, Stack <st...@duboce.net> wrote:
> >> On Wed, Apr 20, 2011 at 9:49 AM, Dmitriy Lyubimov <dl...@gmail.com>
> wrote:
> >>> Ok. Let me ask a question.
> >>>
> >>> When scan is performed and it obviously covers several regions, are
> >>> scan performance calls done in sinchronous succession or they are done
> >>> in parallel?
> >>>
> >>
> >> The former.
> >>
> >>
> >>> Assuming scan is returning 40 results but for some weird reason it
> >>> goes to 6 regions and caching is set to 100 (so it can take all of
> >>> them) are individual region request latencies summed or it would be
> >>> max(region request latency)?
> >>>
> >>
> >> Summed.
> >>
> >> The 40 rows are not contiguous in the same region?  If not, the cost
> >> of client setting up new scanner against next region will be inline w/
> >> your read timing (at least an rpc per region).
> >>
> >> St.Ack
> >>
> >>> Thank you very much.
> >>> -D
> >>>
> >>> On Tue, Apr 19, 2011 at 6:28 PM, Ted Dunning <td...@maprtech.com>
> wrote:
> >>>> For a tiny test like this, everything should be in memory and latency
> >>>> should be very low.
> >>>>
> >>>> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dl...@gmail.com>
> wrote:
> >>>>> PS so what should latency be for reads in 0.90, assuming moderate
> thruput?
> >>>>>
> >>>>> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dl...@gmail.com>
> wrote:
> >>>>>> for this test, there's just no more than 40 rows in every given
> table.
> >>>>>> This is just a laugh check.
> >>>>>>
> >>>>>> so i think it's safe to assume it all goes to same region server.
> >>>>>>
> >>>>>> But latency would not depend on which server call is going to, would
> >>>>>> it? Only throughput would, assuming we are not overloading.
> >>>>>>
> >>>>>> And we clearly are not as my single-node local version runs quite ok
> >>>>>> response times with the same throughput.
> >>>>>>
> >>>>>> It's something with either client connections or network latency or
> >>>>>> ... i don't know what it is. I did not set up the cluster but i
> gotta
> >>>>>> troubleshoot it now :)
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Apr 19, 2011 at 5:23 PM, Ted Dunning <td...@maprtech.com>
> wrote:
> >>>>>>> How many regions?  How are they distributed?
> >>>>>>>
> >>>>>>> Typically it is good to fill the table some what and then drive
> some
> >>>>>>> splits and balance operations via the shell.  One more split to
> make
> >>>>>>> the regions be local and you should be good to go.  Make sure you
> have
> >>>>>>> enough keys in the table to support these splits, of course.
> >>>>>>>
> >>>>>>> Under load, you can look at the hbase home page to see how
> >>>>>>> transactions are spread around your cluster.  Without splits and
> local
> >>>>>>> region files, you aren't going to see what you want in terms of
> >>>>>>> performance.
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
>

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
OK, actually we do have 1 region for these exact tables... so back to
square one.

FWIW I do get the 8th percentile under 3ms TTLB, so it seems algorithmically
sound. The question is why the outlier spread is so much longer than
in tests on one machine. Must be the network. What else?


On Wed, Apr 20, 2011 at 10:06 AM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> Got it. This must be the reason. Cause it is a laugh check, and i do
> see 6 regions for 40 rows so it can span them, although i can't
> confirm it for sure. It may be due to how table was set up or due to
> some time running them and rotating some data there. The uniformly
> distributed hashes are used for the keys so that it is totally
> plausible 40 rows will get into 6 different regions.
>
> Ok i'll take it for working theory for now.
>
> Is there a way to set max # of regions per table? I guess the method
> in the manual is to set max region size. Which means i probably need
> to re-create the table with one region to get back to 1 region? or
> maybe there's a way to get it back to one region without recreating
> it, such as major compaction?
>
> thanks.
> -d
>
> On Wed, Apr 20, 2011 at 9:55 AM, Stack <st...@duboce.net> wrote:
>> On Wed, Apr 20, 2011 at 9:49 AM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>>> Ok. Let me ask a question.
>>>
>>> When scan is performed and it obviously covers several regions, are
>>> scan performance calls done in sinchronous succession or they are done
>>> in parallel?
>>>
>>
>> The former.
>>
>>
>>> Assuming scan is returning 40 results but for some weird reason it
>>> goes to 6 regions and caching is set to 100 (so it can take all of
>>> them) are individual region request latencies summed or it would be
>>> max(region request latency)?
>>>
>>
>> Summed.
>>
>> The 40 rows are not contiguous in the same region?  If not, the cost
>> of client setting up new scanner against next region will be inline w/
>> your read timing (at least an rpc per region).
>>
>> St.Ack
>>
>>> Thank you very much.
>>> -D
>>>
>>> On Tue, Apr 19, 2011 at 6:28 PM, Ted Dunning <td...@maprtech.com> wrote:
>>>> For a tiny test like this, everything should be in memory and latency
>>>> should be very low.
>>>>
>>>> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>>>>> PS so what should latency be for reads in 0.90, assuming moderate thruput?
>>>>>
>>>>> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>>>>>> for this test, there's just no more than 40 rows in every given table.
>>>>>> This is just a laugh check.
>>>>>>
>>>>>> so i think it's safe to assume it all goes to same region server.
>>>>>>
>>>>>> But latency would not depend on which server call is going to, would
>>>>>> it? Only throughput would, assuming we are not overloading.
>>>>>>
>>>>>> And we clearly are not as my single-node local version runs quite ok
>>>>>> response times with the same throughput.
>>>>>>
>>>>>> It's something with either client connections or network latency or
>>>>>> ... i don't know what it is. I did not set up the cluster but i gotta
>>>>>> troubleshoot it now :)
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Apr 19, 2011 at 5:23 PM, Ted Dunning <td...@maprtech.com> wrote:
>>>>>>> How many regions?  How are they distributed?
>>>>>>>
>>>>>>> Typically it is good to fill the table some what and then drive some
>>>>>>> splits and balance operations via the shell.  One more split to make
>>>>>>> the regions be local and you should be good to go.  Make sure you have
>>>>>>> enough keys in the table to support these splits, of course.
>>>>>>>
>>>>>>> Under load, you can look at the hbase home page to see how
>>>>>>> transactions are spread around your cluster.  Without splits and local
>>>>>>> region files, you aren't going to see what you want in terms of
>>>>>>> performance.
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: 0.90 latency performance, cdh3b4

Posted by Stack <st...@duboce.net>.
On Wed, Apr 20, 2011 at 10:06 AM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> Got it. This must be the reason. Cause it is a laugh check, and i do
> see 6 regions for 40 rows so it can span them, although i can't
> confirm it for sure.

Run a scan from the shell:


  hbase> scan 'YOUR_TABLE'

... and you should see the setup of the scanner as it crosses regions.
(You may have to exit the shell, enable logging for the client in
log4j.properties, and then retry in the shell -- but I believe it is on
by default.)
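
If it is not, the switch lives in the client's conf/log4j.properties and
would be along these lines:

  log4j.logger.org.apache.hadoop.hbase.client=DEBUG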



> Is there a way to set max # of regions per table? I guess the method
> in the manual is to set max region size. Which means i probably need
> to re-create the table with one region to get back to 1 region? or
> maybe there's a way to get it back to one region without recreating
> it, such as major compaction?
>

If it only has 40 rows in it, just disable and drop it and then recreate it.
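
Something like this from the shell (substitute your own table and column
family names):

  hbase> disable 'YOUR_TABLE'
  hbase> drop 'YOUR_TABLE'
  hbase> create 'YOUR_TABLE', 'YOUR_FAMILY'

A table created without split keys starts out as a single region.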

There is http://hbase.apache.org/book.html#hbase.regionserver.regionSplitLimit
for setting a maximum count of regions.

St.Ack

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Got it. This must be the reason, because it is a laugh check, and I do
see 6 regions for 40 rows, so the scan can span them, although I can't
confirm it for sure. It may be due to how the table was set up, or due to
running it for some time and rotating some data through it. Uniformly
distributed hashes are used for the keys, so it is totally plausible
that the 40 rows land in 6 different regions.

OK, I'll take that as the working theory for now.

Is there a way to set the max # of regions per table? I guess the method
in the manual is to set the max region size, which means I probably need
to re-create the table to get back to 1 region? Or maybe there's a way to
get it back to one region without recreating it, such as a major compaction?

thanks.
-d

On Wed, Apr 20, 2011 at 9:55 AM, Stack <st...@duboce.net> wrote:
> On Wed, Apr 20, 2011 at 9:49 AM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>> Ok. Let me ask a question.
>>
>> When scan is performed and it obviously covers several regions, are
>> scan performance calls done in sinchronous succession or they are done
>> in parallel?
>>
>
> The former.
>
>
>> Assuming scan is returning 40 results but for some weird reason it
>> goes to 6 regions and caching is set to 100 (so it can take all of
>> them) are individual region request latencies summed or it would be
>> max(region request latency)?
>>
>
> Summed.
>
> The 40 rows are not contiguous in the same region?  If not, the cost
> of client setting up new scanner against next region will be inline w/
> your read timing (at least an rpc per region).
>
> St.Ack
>
>> Thank you very much.
>> -D
>>
>> On Tue, Apr 19, 2011 at 6:28 PM, Ted Dunning <td...@maprtech.com> wrote:
>>> For a tiny test like this, everything should be in memory and latency
>>> should be very low.
>>>
>>> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>>>> PS so what should latency be for reads in 0.90, assuming moderate thruput?
>>>>
>>>> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>>>>> for this test, there's just no more than 40 rows in every given table.
>>>>> This is just a laugh check.
>>>>>
>>>>> so i think it's safe to assume it all goes to same region server.
>>>>>
>>>>> But latency would not depend on which server call is going to, would
>>>>> it? Only throughput would, assuming we are not overloading.
>>>>>
>>>>> And we clearly are not as my single-node local version runs quite ok
>>>>> response times with the same throughput.
>>>>>
>>>>> It's something with either client connections or network latency or
>>>>> ... i don't know what it is. I did not set up the cluster but i gotta
>>>>> troubleshoot it now :)
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Apr 19, 2011 at 5:23 PM, Ted Dunning <td...@maprtech.com> wrote:
>>>>>> How many regions?  How are they distributed?
>>>>>>
>>>>>> Typically it is good to fill the table some what and then drive some
>>>>>> splits and balance operations via the shell.  One more split to make
>>>>>> the regions be local and you should be good to go.  Make sure you have
>>>>>> enough keys in the table to support these splits, of course.
>>>>>>
>>>>>> Under load, you can look at the hbase home page to see how
>>>>>> transactions are spread around your cluster.  Without splits and local
>>>>>> region files, you aren't going to see what you want in terms of
>>>>>> performance.
>>>>>>
>>>>>
>>>>
>>>
>>
>

RE: Restarting a Region Server

Posted by Peter Haidinyak <ph...@local.com>.
Thanks

-Pete

-----Original Message-----
From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
Sent: Wednesday, April 20, 2011 10:23 AM
To: user@hbase.apache.org
Subject: Re: Restarting a Region Server

On the host that is carrying the regionserver do:

  ./bin/hbase-daemon.sh stop regionserver

Then start it again.

Or, since 0.90.2, see bin/graceful_stop.sh.  It will let you do a
gradual decommission optionally restarting the regionserver after the
regions have been gently offloaded and then optionally again after the
server comes back, moving its old region burden back on to the
regionserver to preserve locality.  See
http://hbase.apache.org/book.html#decommission for more.

St.Ack



On Wed, Apr 20, 2011 at 10:03 AM, Peter Haidinyak <ph...@local.com> wrote:
> Hi,
>   I just bounced a region server. How do I start just this one region server and make sure it rejoins the cluster?
>
> Thanks
>
> -Pete
>

Re: Restarting a Region Server

Posted by Stack <st...@duboce.net>.
On the host that is carrying the regionserver do:

  ./bin/hbase-daemon.sh stop regionserver

Then start it again.

Or, since 0.90.2, see bin/graceful_stop.sh.  It lets you do a gradual
decommission, optionally restarting the regionserver after its regions
have been gently offloaded, and then, once the server comes back,
optionally moving its old region burden back onto it to preserve
locality.  See http://hbase.apache.org/book.html#decommission for more.
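
In its simplest form that is something like the below, run from the HBase
install directory (flag names as I recall them for 0.90.2; check the
script's usage message on your install):

  ./bin/graceful_stop.sh --restart --reload REGIONSERVER_HOSTNAME

--restart brings the regionserver back up after its regions have been
moved off; --reload moves its old regions back onto it once it is up.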

St.Ack



On Wed, Apr 20, 2011 at 10:03 AM, Peter Haidinyak <ph...@local.com> wrote:
> Hi,
>   I just bounced a region server. How do I start just this one region server and make sure it rejoins the cluster?
>
> Thanks
>
> -Pete
>

Restarting a Region Server

Posted by Peter Haidinyak <ph...@local.com>.
Hi,
   I just bounced a region server. How do I start just this one region server and make sure it rejoins the cluster?

Thanks

-Pete

Re: 0.90 latency performance, cdh3b4

Posted by Stack <st...@duboce.net>.
On Wed, Apr 20, 2011 at 9:49 AM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> Ok. Let me ask a question.
>
> When scan is performed and it obviously covers several regions, are
> scan performance calls done in sinchronous succession or they are done
> in parallel?
>

The former.


> Assuming scan is returning 40 results but for some weird reason it
> goes to 6 regions and caching is set to 100 (so it can take all of
> them) are individual region request latencies summed or it would be
> max(region request latency)?
>

Summed.

The 40 rows are not contiguous in the same region?  If not, the cost of
the client setting up a new scanner against each next region will be inline
with your read timing (at least one RPC per region).
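
A quick way to confirm how the rows are actually spread, from the client
side (a sketch against the 0.90 client API; the table name is made up and
the row keys to check are passed as arguments):

import java.util.NavigableMap;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.HRegionLocation;
import org.apache.hadoop.hbase.HServerAddress;
import org.apache.hadoop.hbase.client.HTable;

public class WhereAreMyRows {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "YOUR_TABLE");   // made-up table name
    // How many regions does the table have right now?
    NavigableMap<HRegionInfo, HServerAddress> regions = table.getRegionsInfo();
    System.out.println("regions: " + regions.size());
    // Which region and regionserver does each key land on?
    for (String key : args) {
      HRegionLocation loc = table.getRegionLocation(key);
      System.out.println(key + " -> "
          + loc.getRegionInfo().getRegionNameAsString()
          + " on " + loc.getServerAddress().getHostname());
    }
    table.close();
  }
}

If the keys come back from more than one region, every extra region is at
least one more scanner-open RPC inside your measured scan time.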

St.Ack

> Thank you very much.
> -D
>
> On Tue, Apr 19, 2011 at 6:28 PM, Ted Dunning <td...@maprtech.com> wrote:
>> For a tiny test like this, everything should be in memory and latency
>> should be very low.
>>
>> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>>> PS so what should latency be for reads in 0.90, assuming moderate thruput?
>>>
>>> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>>>> for this test, there's just no more than 40 rows in every given table.
>>>> This is just a laugh check.
>>>>
>>>> so i think it's safe to assume it all goes to same region server.
>>>>
>>>> But latency would not depend on which server call is going to, would
>>>> it? Only throughput would, assuming we are not overloading.
>>>>
>>>> And we clearly are not as my single-node local version runs quite ok
>>>> response times with the same throughput.
>>>>
>>>> It's something with either client connections or network latency or
>>>> ... i don't know what it is. I did not set up the cluster but i gotta
>>>> troubleshoot it now :)
>>>>
>>>>
>>>>
>>>> On Tue, Apr 19, 2011 at 5:23 PM, Ted Dunning <td...@maprtech.com> wrote:
>>>>> How many regions?  How are they distributed?
>>>>>
>>>>> Typically it is good to fill the table some what and then drive some
>>>>> splits and balance operations via the shell.  One more split to make
>>>>> the regions be local and you should be good to go.  Make sure you have
>>>>> enough keys in the table to support these splits, of course.
>>>>>
>>>>> Under load, you can look at the hbase home page to see how
>>>>> transactions are spread around your cluster.  Without splits and local
>>>>> region files, you aren't going to see what you want in terms of
>>>>> performance.
>>>>>
>>>>
>>>
>>
>

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
OK, let me ask a question.

When a scan is performed and it covers several regions, are the
per-region scan calls done in synchronous succession, or are they done
in parallel?

Assuming a scan returns 40 results but for some weird reason it
goes to 6 regions, and caching is set to 100 (so it can take all of
them), are the individual region request latencies summed, or would it be
max(region request latency)?

Thank you very much.
-D

On Tue, Apr 19, 2011 at 6:28 PM, Ted Dunning <td...@maprtech.com> wrote:
> For a tiny test like this, everything should be in memory and latency
> should be very low.
>
> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>> PS so what should latency be for reads in 0.90, assuming moderate thruput?
>>
>> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>>> for this test, there's just no more than 40 rows in every given table.
>>> This is just a laugh check.
>>>
>>> so i think it's safe to assume it all goes to same region server.
>>>
>>> But latency would not depend on which server call is going to, would
>>> it? Only throughput would, assuming we are not overloading.
>>>
>>> And we clearly are not as my single-node local version runs quite ok
>>> response times with the same throughput.
>>>
>>> It's something with either client connections or network latency or
>>> ... i don't know what it is. I did not set up the cluster but i gotta
>>> troubleshoot it now :)
>>>
>>>
>>>
>>> On Tue, Apr 19, 2011 at 5:23 PM, Ted Dunning <td...@maprtech.com> wrote:
>>>> How many regions?  How are they distributed?
>>>>
>>>> Typically it is good to fill the table some what and then drive some
>>>> splits and balance operations via the shell.  One more split to make
>>>> the regions be local and you should be good to go.  Make sure you have
>>>> enough keys in the table to support these splits, of course.
>>>>
>>>> Under load, you can look at the hbase home page to see how
>>>> transactions are spread around your cluster.  Without splits and local
>>>> region files, you aren't going to see what you want in terms of
>>>> performance.
>>>>
>>>
>>
>

Re: 0.90 latency performance, cdh3b4

Posted by "M.Deniz OKTAR" <de...@gmail.com>.
I am having similar results but haven't done enough testing yet.


Sent from my BlackBerry® wireless device

-----Original Message-----
From: Dmitriy Lyubimov <dl...@gmail.com>
Date: Wed, 20 Apr 2011 08:09:29 
To: <us...@hbase.apache.org>
Reply-To: user@hbase.apache.org
Subject: Re: 0.90 latency performance, cdh3b4

Yep. In all benchmarks response times for tiny data start at about 1-2ms but
not in our new setup. Which is why I am at loss where to start looking.
Seems like a network congestion but it can't be. Its a barebone setup and
admins tell me they have tested it for performance.

apologies for brevity.

Sent from my android.
-Dmitriy
On Apr 19, 2011 6:29 PM, "Ted Dunning" <td...@maprtech.com> wrote:
> For a tiny test like this, everything should be in memory and latency
> should be very low.
>
> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dl...@gmail.com>
wrote:
>> PS so what should latency be for reads in 0.90, assuming moderate
thruput?
>>
>> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dl...@gmail.com>
wrote:
>>> for this test, there's just no more than 40 rows in every given table.
>>> This is just a laugh check.
>>>
>>> so i think it's safe to assume it all goes to same region server.
>>>
>>> But latency would not depend on which server call is going to, would
>>> it? Only throughput would, assuming we are not overloading.
>>>
>>> And we clearly are not as my single-node local version runs quite ok
>>> response times with the same throughput.
>>>
>>> It's something with either client connections or network latency or
>>> ... i don't know what it is. I did not set up the cluster but i gotta
>>> troubleshoot it now :)
>>>
>>>
>>>
>>> On Tue, Apr 19, 2011 at 5:23 PM, Ted Dunning <td...@maprtech.com>
wrote:
>>>> How many regions?  How are they distributed?
>>>>
>>>> Typically it is good to fill the table some what and then drive some
>>>> splits and balance operations via the shell.  One more split to make
>>>> the regions be local and you should be good to go.  Make sure you have
>>>> enough keys in the table to support these splits, of course.
>>>>
>>>> Under load, you can look at the hbase home page to see how
>>>> transactions are spread around your cluster.  Without splits and local
>>>> region files, you aren't going to see what you want in terms of
>>>> performance.
>>>>
>>>
>>


Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Yep. In all benchmarks, response times for tiny data start at about 1-2ms, but
not in our new setup, which is why I am at a loss where to start looking.
It seems like network congestion, but it can't be: it's a bare-bones setup, and
the admins tell me they have tested it for performance.

apologies for brevity.

Sent from my android.
-Dmitriy
On Apr 19, 2011 6:29 PM, "Ted Dunning" <td...@maprtech.com> wrote:
> For a tiny test like this, everything should be in memory and latency
> should be very low.
>
> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dl...@gmail.com>
wrote:
>> PS so what should latency be for reads in 0.90, assuming moderate
thruput?
>>
>> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dl...@gmail.com>
wrote:
>>> for this test, there's just no more than 40 rows in every given table.
>>> This is just a laugh check.
>>>
>>> so i think it's safe to assume it all goes to same region server.
>>>
>>> But latency would not depend on which server call is going to, would
>>> it? Only throughput would, assuming we are not overloading.
>>>
>>> And we clearly are not as my single-node local version runs quite ok
>>> response times with the same throughput.
>>>
>>> It's something with either client connections or network latency or
>>> ... i don't know what it is. I did not set up the cluster but i gotta
>>> troubleshoot it now :)
>>>
>>>
>>>
>>> On Tue, Apr 19, 2011 at 5:23 PM, Ted Dunning <td...@maprtech.com>
wrote:
>>>> How many regions?  How are they distributed?
>>>>
>>>> Typically it is good to fill the table some what and then drive some
>>>> splits and balance operations via the shell.  One more split to make
>>>> the regions be local and you should be good to go.  Make sure you have
>>>> enough keys in the table to support these splits, of course.
>>>>
>>>> Under load, you can look at the hbase home page to see how
>>>> transactions are spread around your cluster.  Without splits and local
>>>> region files, you aren't going to see what you want in terms of
>>>> performance.
>>>>
>>>
>>

Re: 0.90 latency performance, cdh3b4

Posted by Ted Dunning <td...@maprtech.com>.
For a tiny test like this, everything should be in memory and latency
should be very low.

On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> PS so what should latency be for reads in 0.90, assuming moderate thruput?
>
> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>> for this test, there's just no more than 40 rows in every given table.
>> This is just a laugh check.
>>
>> so i think it's safe to assume it all goes to same region server.
>>
>> But latency would not depend on which server call is going to, would
>> it? Only throughput would, assuming we are not overloading.
>>
>> And we clearly are not as my single-node local version runs quite ok
>> response times with the same throughput.
>>
>> It's something with either client connections or network latency or
>> ... i don't know what it is. I did not set up the cluster but i gotta
>> troubleshoot it now :)
>>
>>
>>
>> On Tue, Apr 19, 2011 at 5:23 PM, Ted Dunning <td...@maprtech.com> wrote:
>>> How many regions?  How are they distributed?
>>>
>>> Typically it is good to fill the table some what and then drive some
>>> splits and balance operations via the shell.  One more split to make
>>> the regions be local and you should be good to go.  Make sure you have
>>> enough keys in the table to support these splits, of course.
>>>
>>> Under load, you can look at the hbase home page to see how
>>> transactions are spread around your cluster.  Without splits and local
>>> region files, you aren't going to see what you want in terms of
>>> performance.
>>>
>>
>

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
PS: so what should the latency be for reads in 0.90, assuming moderate throughput?

On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> for this test, there's just no more than 40 rows in every given table.
> This is just a laugh check.
>
> so i think it's safe to assume it all goes to same region server.
>
> But latency would not depend on which server call is going to, would
> it? Only throughput would, assuming we are not overloading.
>
> And we clearly are not as my single-node local version runs quite ok
> response times with the same throughput.
>
> It's something with either client connections or network latency or
> ... i don't know what it is. I did not set up the cluster but i gotta
> troubleshoot it now :)
>
>
>
> On Tue, Apr 19, 2011 at 5:23 PM, Ted Dunning <td...@maprtech.com> wrote:
>> How many regions?  How are they distributed?
>>
>> Typically it is good to fill the table some what and then drive some
>> splits and balance operations via the shell.  One more split to make
>> the regions be local and you should be good to go.  Make sure you have
>> enough keys in the table to support these splits, of course.
>>
>> Under load, you can look at the hbase home page to see how
>> transactions are spread around your cluster.  Without splits and local
>> region files, you aren't going to see what you want in terms of
>> performance.
>>
>

Re: 0.90 latency performance, cdh3b4

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
For this test there are no more than 40 rows in any given table.
This is just a quick sanity check.

So I think it's safe to assume it all goes to the same region server.

But latency would not depend on which server the call is going to,
would it? Only throughput would, assuming we are not overloading.

And we clearly are not, since my single-node local setup shows quite
OK response times at the same throughput.

It's something with either the client connections or network latency
or ... I don't know what it is. I did not set up the cluster, but I
have to troubleshoot it now :)



On Tue, Apr 19, 2011 at 5:23 PM, Ted Dunning <td...@maprtech.com> wrote:
> How many regions?  How are they distributed?
>
> Typically it is good to fill the table some what and then drive some
> splits and balance operations via the shell.  One more split to make
> the regions be local and you should be good to go.  Make sure you have
> enough keys in the table to support these splits, of course.
>
> Under load, you can look at the hbase home page to see how
> transactions are spread around your cluster.  Without splits and local
> region files, you aren't going to see what you want in terms of
> performance.
>

Re: 0.90 latency performance, cdh3b4

Posted by Ted Dunning <td...@maprtech.com>.
How many regions?  How are they distributed?

Typically it is good to fill the table somewhat and then drive some
splits and balance operations via the shell.  One more split to make
the regions local and you should be good to go.  Make sure you have
enough keys in the table to support these splits, of course.

Under load, you can look at the HBase master web UI to see how
transactions are spread around your cluster.  Without splits and local
region files, you aren't going to see what you want in terms of
performance.
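
(Roughly, from the shell that is split 'tablename' followed by
balance_switch true; a sketch of the same thing through the Java admin
API, assuming the 0.90 HBaseAdmin methods and a placeholder table name,
would be:)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class SplitTable {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);

    // Ask the master to split the table; with enough distinct keys the
    // resulting daughter regions can then be spread over the cluster.
    admin.split("t1");

    // Make sure the balancer is enabled so the new regions get reassigned
    // across region servers (same idea as balance_switch true in the shell).
    admin.balanceSwitch(true);
  }
}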

On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov <dl...@apache.org> wrote:
> Hi,
>
> I would like to see how i can attack hbase performance.
>
> Right now i am shooting scans returning between 3 and 40 rows and
> regardless of data size, approximately 500-400 QPS. The data tables
> are almost empty and in-memory, so they surely should fit in those 40%
> heap dedicated to them.
>
> My local 1-node test shows read times between 1 and 2 ms. Great.
>
> As soon as i go to our 10-node cluster, the response times drop to
> 25ms per scan, regardless of # of records. I set scan block cache size
> to 100 (rows?), otherwise i was getting outrages numbers reaching as
> far out as 300-400ms.
>
> It's my understanding the timing should be actually still much closer
> to my local tests than to 25ms.
>
> So... how do i attack this ? increase regionserver handler count? What
> the latency should i be able to reach for extremely small data records
> (<=200bytes)?
>
> (CDH3b4). HBase debug logging switched off.
>
> Thanks in advance.
> -Dmitriy
>

Re: 0.90 latency performance, cdh3b4

Posted by Stack <st...@duboce.net>.
On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov <dl...@apache.org> wrote:
> Right now i am shooting scans returning between 3 and 40 rows and
> regardless of data size, approximately 500-400 QPS. The data tables
> are almost empty and in-memory, so they surely should fit in those 40%
> heap dedicated to them.
>

The LRU block cache in the regionserver logs its hit rate.  You might check it.


> As soon as i go to our 10-node cluster, the response times drop to
> 25ms per scan, regardless of # of records. I set scan block cache size
> to 100 (rows?), otherwise i was getting outrages numbers reaching as
> far out as 300-400ms.
>

Otherwise you are doing an RPC round trip per row fetched.
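
(For example, a minimal client sketch with scanner caching turned up;
the table name and row keys here are placeholders:)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class CachedScan {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "t1");   // placeholder table name

    Scan scan = new Scan(Bytes.toBytes("row000"), Bytes.toBytes("row100"));
    // Ship up to 100 rows back per RPC instead of one row per round trip.
    scan.setCaching(100);

    ResultScanner scanner = table.getScanner(scan);
    try {
      for (Result r : scanner) {
        // process the row here
      }
    } finally {
      scanner.close();
      table.close();
    }
  }
}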


> It's my understanding the timing should be actually still much closer
> to my local tests than to 25ms.
>

Yes.  I would think so.


> So... how do i attack this ? increase regionserver handler count? What
> the latency should i be able to reach for extremely small data records
> (<=200bytes)?
>

40 rows of < 200 bytes each coming out of cache should be fast...
around the times you are seeing for local.
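
(If it helps, one way to sanity check that from the client side is a
small timing loop like the sketch below, again with a placeholder table
name, run first against the cluster and then against the local instance:)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;

public class ScanLatency {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "t1");   // placeholder table name
    int iterations = 1000;
    long totalNanos = 0;
    for (int i = 0; i < iterations; i++) {
      long start = System.nanoTime();
      Scan scan = new Scan();
      scan.setCaching(100);
      ResultScanner scanner = table.getScanner(scan);
      for (Result r : scanner) {
        // just drain the scanner; rows are tiny in this test
      }
      scanner.close();
      totalNanos += System.nanoTime() - start;
    }
    System.out.println("avg scan latency: "
        + (totalNanos / iterations / 1000000.0) + " ms");
    table.close();
  }
}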

St.Ack