Posted to user@hbase.apache.org by Vaibhav Puranik <vp...@gmail.com> on 2010/11/16 02:17:19 UTC

Correlating traffic with regions

Hi all,

We are running 0.20.6 in production.

On one of our nodes, CPU usage (across all 8 CPUs) is hovering near 60%, but
the node hosts many tables and many regions.

Is there an easy way to find out which of these regions or tables are
getting most of the traffic?

Regards,
Vaibhav Puranik
GumGum

Re: Correlating traffic with regions

Posted by Ted Yu <yu...@gmail.com>.
Right.
Stargate cluster status is a centralized view. I use it to monitor the health
of our cluster by selectively querying rows on each region server.

On Thu, Nov 18, 2010 at 4:19 PM, Vaibhav Puranik <vp...@gmail.com> wrote:

> Ted,
>
> I looked at /usr/bin/curl http://$server:8080/status/cluster.
>
> But there is no traffic data there. All the data this interface returns is
> already available through HBase web interface.
>
> Regards,
> Vaibhav

Re: Correlating traffic with regions

Posted by Vaibhav Puranik <vp...@gmail.com>.
Ted,

I looked at /usr/bin/curl http://$server:8080/status/cluster.

But there is no traffic data there. All the data this interface returns is
already available through the HBase web interface.

Regards,
Vaibhav

On Thu, Nov 18, 2010 at 10:05 AM, Ted Yu <yu...@gmail.com> wrote:

> You can query Stargate.
> E.g.
> /usr/bin/curl http://$server:8080/status/cluster
>
> You can see region information in the output.

Re: Correlating traffic with regions

Posted by Ted Yu <yu...@gmail.com>.
You can query Stargate.
E.g.
/usr/bin/curl http://$server:8080/status/cluster

You can see region information in the output.

On Thu, Nov 18, 2010 at 9:11 AM, Vaibhav Puranik <vp...@gmail.com> wrote:

> Meanwhile, I was able to roughly estimate which table is getting traffic by
> executing the following commands:
>
> 1) Store ngrep output in a file (for few seconds)
> ngrep -W byline port 60020 > temp.out
>
> 2) Find out all the tables that region server has from HBase user
> interface.
> For each table execute the following commands:
> grep 'TableName,' temp.out | wc -l
>
> This was enough for us as even which table was getting hit would be very
> useful information for us. I am guessing there should be a way to grep
> region name too.
>
> Regards,
> Vaibhav,
> GumGum

Re: Correlating traffic with regions

Posted by Vaibhav Puranik <vp...@gmail.com>.
Meanwhile, I was able to roughly estimate which table is getting traffic by
executing the following commands:

1) Store ngrep output in a file (for a few seconds):
ngrep -W byline port 60020 > temp.out

2) Find out all the tables that the region server hosts from the HBase web
interface. For each table, execute the following command:
grep 'TableName,' temp.out | wc -l

This was enough for us, since even knowing which table was getting hit is
very useful information. I am guessing there should be a way to grep for the
region name too.
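
A minimal sketch of the two steps above as a shell helper, in case it is
useful. The function name count_table_hits and the table names in the usage
comment are made up for illustration; temp.out is the ngrep capture from
step 1. It assumes, as above, that a line containing 'TableName,' marks the
start of a region name, so the counts are only a rough estimate.

```shell
# Rough per-table hit count from an ngrep capture (a sketch, not an HBase tool).
# Assumption: region names show up in the capture as "<table>,<start key>,<id>",
# so every line containing "<table>," is counted as one hit for that table.

# Usage: count_table_hits CAPTURE_FILE TABLE...   (hypothetical helper name)
count_table_hits() {
    local capture=$1
    shift
    local table hits
    for table in "$@"; do
        # -F: match the table name as a fixed string, not a regex
        # -c: count matching lines (same number as `grep ... | wc -l`)
        # grep exits non-zero when nothing matches, hence the `|| true`
        hits=$(grep -c -F -- "${table}," "$capture" || true)
        printf '%s\t%s\n' "$hits" "$table"
    done | sort -rn   # busiest table first
}

# Example (table names are placeholders):
#   ngrep -W byline port 60020 > temp.out    # stop with Ctrl-C after a few seconds
#   count_table_hits temp.out users events
```

The output is one "count<TAB>table" line per table, so the table taking most
of the traffic during the capture window ends up on the first line.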

Regards,
Vaibhav,
GumGum



On Wed, Nov 17, 2010 at 9:22 AM, Jean-Daniel Cryans <jd...@apache.org>wrote:

> AFAIK most monitoring systems don't like dynamically-named metrics,
> for example in ganglia you would end up with an ever growing number of
> metrics for req/regions (one for each region that the region server
> ever had). At the very least it should be included in the region
> server report so that the master can take action and plan accordingly,
> the new master has better facilities for that.
>
> J-D

Re: Correlating traffic with regions

Posted by Jean-Daniel Cryans <jd...@apache.org>.
AFAIK most monitoring systems don't like dynamically named metrics. For
example, in Ganglia you would end up with an ever-growing number of
metrics for req/regions (one for each region that the region server
ever had). At the very least it should be included in the region
server report so that the master can take action and plan accordingly;
the new master has better facilities for that.

J-D

On Wed, Nov 17, 2010 at 8:15 AM, Lars George <la...@gmail.com> wrote:
> JD,
>
> Should we create a metric for it so that it dynamically counts per
> region its usage? That can then be exposed via Ganglia context or JMX.
> Just wondering.
>
> Lars

Re: Correlating traffic with regions

Posted by Himanshu Vashishtha <hv...@cs.ualberta.ca>.
I was wondering whether the coprocessor framework could be used for such
housekeeping jobs: how loaded a region is, how many scan/put/get operations
it serves, etc. There are pre and post hooks for almost all operations at
the region level (in this case, client-side operations via the
RegionObserver interface). Or maybe that's just me being 'microscopically'
focused on that framework :-)

Himanshu

On Wed, Nov 17, 2010 at 9:15 AM, Lars George <la...@gmail.com> wrote:

> JD,
>
> Should we create a metric for it so that it dynamically counts per
> region its usage? That can then be exposed via Ganglia context or JMX.
> Just wondering.
>
> Lars

Re: Correlating traffic with regions

Posted by Lars George <la...@gmail.com>.
JD,

Should we create a metric that dynamically counts usage per region?
It could then be exposed via the Ganglia context or JMX.
Just wondering.

Lars

On Wed, Nov 17, 2010 at 5:04 PM, Vaibhav Puranik <vp...@gmail.com> wrote:
> hi,
>
> Thanks for the suggestions JD & Michael.
> The region servers serving ROOT & META regions are fine.
>
> I will try analysing tcpdump output.
>
> Regards,
> Vaibhav
> GumGum

Re: Correlating traffic with regions

Posted by Vaibhav Puranik <vp...@gmail.com>.
hi,

Thanks for the suggestions JD & Michael.
The region servers serving ROOT & META regions are fine.

I will try analysing tcpdump output.

Regards,
Vaibhav
GumGum



On Tue, Nov 16, 2010 at 7:15 AM, Michael Segel <mi...@hotmail.com>wrote:

>
> Beyond this... which region is serving your ROOT and meta data?
>
> That node will probably get a higher load.
> Also, how many disks do you have and how many nodes?
> You could see higher CPU loads if you're I/O bound.

RE: Correlating traffic with regions

Posted by Michael Segel <mi...@hotmail.com>.
Beyond this... which region server is serving your ROOT and META regions?

That node will probably get a higher load.
Also, how many disks do you have and how many nodes?
You could see higher CPU loads if you're I/O bound.

> Date: Mon, 15 Nov 2010 18:24:31 -0800
> Subject: Re: Correlating traffic with regions
> From: jdcryans@apache.org
> To: user@hbase.apache.org
> 
> Yeah this is one area where HBase could do a much better job...
> because there's not really a way to do it within the database. One
> thing you can do is to tcpdump a few seconds of traffic on that node
> and decipher which tables (shown in the region name) are being used.
> 
> J-D

Re: Correlating traffic with regions

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Yeah this is one area where HBase could do a much better job...
because there's not really a way to do it within the database. One
thing you can do is to tcpdump a few seconds of traffic on that node
and decipher which tables (shown in the region name) are being used.
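
One possible way to do the deciphering, sketched as a shell helper. The
assumptions here are mine and not confirmed anywhere in this thread: the
helper name rank_tables_in_capture is hypothetical, region names are taken
to look like <table>,<start key>,<numeric id>, and table names are assumed
to contain only letters, digits, '_' and '-'. tcpdump -A prints
non-printable bytes as '.', so the matching is approximate and the result
is only a rough ranking.

```shell
# Rank tables by how often region-name-like strings appear in a text capture.
# Assumptions: region names look like "<table>,<start key>,<numeric id>" and
# table names use only letters, digits, '_' and '-'.

# Usage: rank_tables_in_capture CAPTURE_TEXT_FILE   (hypothetical helper name)
rank_tables_in_capture() {
    # 1. extract tokens shaped like "table,startkey,1289000000"
    # 2. keep only the table part (text before the first comma)
    # 3. count occurrences per table and list the busiest first
    grep -a -o -E '[A-Za-z0-9_-]+,[^,[:space:]]*,[0-9]+' "$1" \
        | cut -d, -f1 \
        | sort \
        | uniq -c \
        | sort -rn \
        | awk '{ print $1 "\t" $2 }'
}

# Example capture (needs root; interface, duration and port are placeholders,
# 60020 being the region server port used elsewhere in this thread):
#   timeout 10 tcpdump -i any -s 0 -A port 60020 > dump.txt
#   rank_tables_in_capture dump.txt
```

Unlike grepping for one table at a time, this does not need the list of
tables up front, at the cost of occasionally counting unrelated
comma-separated text as a region name.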

J-D

On Mon, Nov 15, 2010 at 5:17 PM, Vaibhav Puranik <vp...@gmail.com> wrote:
> Hi all,
>
> We are running 0.20.6 in production.
>
> On one of our nodes, we are seeing CPU (all 8 CPUS) hovering near 60%. But
> the node has many tables and many regions on it.
>
> Is there an easy way to find out which of these regions or tables are
> getting most of the traffic?
>
> Regards,
> Vaibhav Puranik
> GumGum
>