Posted to user@hbase.apache.org by Mukesh Jha <me...@gmail.com> on 2015/11/27 19:11:26 UTC

High get/scan rates on HBase table even if no readers are on

I'm working with Cloudera HBase v0.98; my HBase table has ~5k regions.

From the Cloudera UI charts I see a lot of get & scan operations active on
my table even after I shut down all the reader applications.

I suspect that this is impacting my scan performance.

Is there a way to identify the hosts issuing these get/scan operations?
I tried netstat and similar Linux commands without much luck.

Re: High get/scan rates on HBase table even if no readers are on

Posted by Samir Ahmic <ah...@gmail.com>.
Hi Mukesh,
Did you try changing the logging levels in
$HBASE_CONF_DIR/log4j.properties? You can enable these lines to get debug
info in your logs:

# Enable this to get detailed connection error/retry logging.
#
log4j.logger.org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation=TRACE
# Uncomment this line to enable tracing on _every_ RPC call (this can be a
# lot of output)
#log4j.logger.org.apache.hadoop.ipc.HBaseServer.trace=DEBUG

Regards
Samir

On Mon, Nov 30, 2015 at 10:14 AM, Mukesh Jha <me...@gmail.com>
wrote:

> Any clue guys?
>
> Because of this I am getting a lot of slow scans.
>
> From HBase Regionserver logs
>
> hbase5.usdc2.cloud.com 2015-11-30 09:10:53,592 WARN org.apache.hadoop.ipc.RpcServer: (responseTooSlow): {"processingtimems":10630,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.193.150.127:37070","starttimems":1448874642962,"queuetimems":1,"class":"HRegionServer","responsesize":12,"method":"Scan"}
>
>
> On Fri, Nov 27, 2015 at 11:41 PM, Mukesh Jha <me...@gmail.com>
> wrote:
>
> > I'm working with cloudera hbase v0.98, my HBase table has ~5k regions.
> >
> > From the cloudera UI charts i see a lot of get & scan operations active
> on
> > my table even after i shut down all the reader applications.
> >
> > I'm suspecting that this is impacting my scan performance.
> >
> > So I'd like to know if there is a way by which i can identify the hosts
> > calling  these get/scan operations? I tried netstat and similar linux
> > commands without much luck.
> >
>
>
>
> --
>
>
> Thanks & Regards,
>
> *Mukesh Jha <me...@gmail.com>*
>

Re: High get/scan rates on HBase table even if no readers are on

Posted by Mukesh Jha <me...@gmail.com>.
Any clue guys?

Because of this I am getting a lot of slow scans.

From the HBase RegionServer logs:

hbase5.usdc2.cloud.com 2015-11-30 09:10:53,592 WARN org.apache.hadoop.ipc.RpcServer: (responseTooSlow): {"processingtimems":10630,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.193.150.127:37070","starttimems":1448874642962,"queuetimems":1,"class":"HRegionServer","responsesize":12,"method":"Scan"}
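
The "client" field in a responseTooSlow entry already names the calling
host, so one rough way to see who is issuing the slow scans is to tally that
field across the RegionServer log. The following is a minimal sketch, not an
HBase tool: the function name slow_call_clients is made up for illustration,
and it assumes each entry sits on a single line in the format shown above.
It only sees calls slow enough to be logged, not every get/scan.

```python
import json
import re
from collections import Counter

# Captures the JSON payload that follows the "(responseTooSlow):" tag.
SLOW_RE = re.compile(r"\(responseTooSlow\):\s*(\{.*\})")


def slow_call_clients(lines):
    """Count responseTooSlow entries per (client IP, RPC method).

    Hypothetical helper: assumes one complete log entry per line.
    """
    counts = Counter()
    for line in lines:
        match = SLOW_RE.search(line)
        if not match:
            continue
        try:
            record = json.loads(match.group(1))
        except ValueError:
            continue  # skip truncated or wrapped entries
        # "client" is "ip:port"; keep only the IP part.
        client_ip = record.get("client", "").strip().rsplit(":", 1)[0]
        counts[(client_ip, record.get("method"))] += 1
    return counts


# The entry quoted in this message, joined onto one line.
sample = (
    "hbase5.usdc2.cloud.com 2015-11-30 09:10:53,592 WARN "
    "org.apache.hadoop.ipc.RpcServer: (responseTooSlow): "
    '{"processingtimems":10630,"call":"Scan(org.apache.hadoop.hbase.'
    'protobuf.generated.ClientProtos$ScanRequest)",'
    '"client":"10.193.150.127:37070","starttimems":1448874642962,'
    '"queuetimems":1,"class":"HRegionServer","responsesize":12,'
    '"method":"Scan"}'
)

for (client_ip, method), n in slow_call_clients([sample]).most_common():
    print(client_ip, method, n)
```

To run it against a real log, pass an open file object instead of the sample
list; the log file path depends on the installation.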


On Fri, Nov 27, 2015 at 11:41 PM, Mukesh Jha <me...@gmail.com>
wrote:

> I'm working with cloudera hbase v0.98, my HBase table has ~5k regions.
>
> From the cloudera UI charts i see a lot of get & scan operations active on
> my table even after i shut down all the reader applications.
>
> I'm suspecting that this is impacting my scan performance.
>
> So I'd like to know if there is a way by which i can identify the hosts
> calling  these get/scan operations? I tried netstat and similar linux
> commands without much luck.
>



-- 


Thanks & Regards,

*Mukesh Jha <me...@gmail.com>*

Re: High get/scan rates on HBase table even if no readers are on

Posted by Junegunn Choi <ju...@gmail.com>.
We had a similar issue a while ago, and it turned out to be HBase Canary on
each region server scanning the hbase:meta region every few seconds. You can
try disabling "HBase Canary" on the Cloudera Manager configuration page
and restarting the servers.

But if you don't wish to restart for some reason, manually killing
Canary processes (org.apache.hadoop.hbase.tool.Canary) will do.
Note that we had to kill their parent processes first to prevent
respawning.

To see which regions are currently being accessed, you might want
to check out tools like hbase-region-inspector.

https://github.com/kakao/hbase-region-inspector

- junegunn

Re: High get/scan rates on HBase table even if no readers are on

Posted by Mukesh Jha <me...@gmail.com>.
On Mon, Nov 30, 2015 at 10:35 PM, Stack <st...@duboce.net> wrote:

> On Fri, Nov 27, 2015 at 10:11 AM, Mukesh Jha <me...@gmail.com>
> wrote:
>
> > I'm working with cloudera hbase v0.98, my HBase table has ~5k regions.
> >
> >
> How many servers do you have carrying the 5k regions?
>
I've 50 nodes hosting these regions.

>
>
> > From the cloudera UI charts i see a lot of get & scan operations active
> on
> > my table even after i shut down all the reader applications.
> >
> >
> Then, there must be an application still running?

I think Cloudera's total_get_rates are cumulative in nature
(total_read_requests_rate_across_regionservers, but the graph shows a rate in
ops/sec, so I'm still confused here) and hence are showing up in the graph.
When I check the per-table get/scan rates, they come down to 0 once I bring
down all the applications.

SELECT total_scan_next_rate_across_hregions // shows a rate of ~5k operations/sec
SELECT scan_next_rate // shows ~500 operations/sec

>
>
> > I'm suspecting that this is impacting my scan performance.
> >
> > So I'd like to know if there is a way by which i can identify the hosts
> > calling  these get/scan operations? I tried netstat and similar linux
> > commands without much luck.
> >
>
> You can do as Samir suggests. You could also do it on one server only
> temporarily via the RegionServer UI. Look along the top of the webpage for
> Log Level.
>
I'm planning to do that, but it would need a region server restart. Is there
any other way I can trace the calls?

>
> St.Ack
>



-- 


Thanks & Regards,

*Mukesh Jha <me...@gmail.com>*

Re: High get/scan rates on HBase table even if no readers are on

Posted by Stack <st...@duboce.net>.
On Fri, Nov 27, 2015 at 10:11 AM, Mukesh Jha <me...@gmail.com>
wrote:

> I'm working with cloudera hbase v0.98, my HBase table has ~5k regions.
>
>
How many servers do you have carrying the 5k regions?


> From the cloudera UI charts i see a lot of get & scan operations active on
> my table even after i shut down all the reader applications.
>
>
Then, there must be an application still running?


> I'm suspecting that this is impacting my scan performance.
>
> So I'd like to know if there is a way by which i can identify the hosts
> calling  these get/scan operations? I tried netstat and similar linux
> commands without much luck.
>

You can do as Samir suggests. You could also do it on one server only
temporarily via the RegionServer UI. Look along the top of the webpage for
Log Level.

St.Ack
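
For reference, the Log Level page mentioned above is, as far as I know, a
thin front end over the /logLevel servlet that Hadoop's HTTP server exposes,
so the same change can be scripted per server without a restart. The sketch
below only builds the URL; the helper name log_level_url is made up, the
info port 60030 is the usual 0.98 default and may differ on a managed
cluster, and the logger name is simply the one that appears in the
responseTooSlow line earlier in the thread. Treat the servlet path and
parameters as assumptions to verify against your own RegionServer UI.

```python
from urllib.parse import urlencode


def log_level_url(host, info_port, logger, level=None):
    """Build a URL for the RegionServer's log-level servlet.

    Hypothetical helper. With level=None the URL only queries the current
    level; with a level it asks the server to change it until the next
    restart. Path and parameter names are assumed from Hadoop's servlet.
    """
    query = {"log": logger}
    if level is not None:
        query["level"] = level
    return "http://%s:%d/logLevel?%s" % (host, info_port, urlencode(query))


# Host and logger taken from the log line in this thread; port is assumed.
print(log_level_url("hbase5.usdc2.cloud.com", 60030,
                    "org.apache.hadoop.ipc.RpcServer", "DEBUG"))
```

Opening the resulting URL in a browser (or fetching it with any HTTP client)
on a single server keeps the extra logging limited to that host; set the
level back afterwards, since per-call logging can be very verbose.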