Posted to user@hbase.apache.org by "Buckley,Ron" <bu...@oclc.org> on 2013/04/23 15:57:07 UTC

HBase Region Server Spinning on JMX requests

This is with HBase 0.94.4 & CDH 4.1.1

This morning one of our region servers (we have 44) stopped responding to
the '/jmx' request. (It's working for regular activity.)  Additionally,
the region server is now using all the CPU on the host, running all 8
cores at 100%.

I've got several jstacks; they all look like this:
http://pastebin.com/dGTmTEN7

If I do a wget of the /jmx URL, it starts responding, but never
completes, always stopping at the same point:
http://pastebin.com/qhNvxrQK
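
For anyone running a collector against /jmx, a client-side read timeout at least keeps the poller from hanging along with the server. The following is a minimal sketch, assuming a plain HTTP GET; the class name JmxPollSketch and both method names are hypothetical, and the "stalled" server is simulated with a local socket that accepts connections but never replies. It does nothing to fix the spinning region server itself.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.InetAddress;
import java.net.Proxy;
import java.net.ServerSocket;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of HBase or Hadoop.
final class JmxPollSketch {

    /** Returns the response body, or null if the endpoint fails or stalls. */
    static String fetchOrNull(String url, int timeoutMillis) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) URI.create(url).toURL().openConnection(Proxy.NO_PROXY);
            conn.setConnectTimeout(timeoutMillis);
            // Bounds each blocking read, not the total transfer time.
            conn.setReadTimeout(timeoutMillis);
            try (InputStream in = conn.getInputStream()) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                return new String(out.toByteArray(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            // Covers SocketTimeoutException as well as connection errors.
            return null;
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    /** Simulates a stuck endpoint: the OS accepts the connection, nobody answers. */
    static boolean stalledEndpointTimesOut() {
        try (ServerSocket silent = new ServerSocket(0, 1, InetAddress.getByName("127.0.0.1"))) {
            String url = "http://127.0.0.1:" + silent.getLocalPort() + "/jmx";
            return fetchOrNull(url, 300) == null;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("stalled endpoint timed out: " + stalledEndpointTimesOut());
    }
}
```

Because the read timeout applies per read, a response that stops partway, as described above, would trip it once no further bytes arrive within the limit.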

Has anyone ever seen this before? If so, is there a way out of it
(other than bouncing the region server)?

BTW: There's nothing relevant in the region server log and the garbage
collector log is normal.


----------------------------------------------------------------------
Ron Buckley



Re: HBase Region Server Spinning on JMX requests

Posted by Elliott Clark <ec...@apache.org>.
I've seen this before.  There were some efforts to make sure that
modifications to metrics weren't reentrant.  That helped but didn't
completely make this go away.

HBase has moved to the metrics2 system in trunk, which should make 0.96
immune to this (we don't use MetricsDynamicMBeanBase at all).  But I
would love a real solution for 0.94.  Just making everything
synchronized scares me a whole lot for perf.
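
One middle ground between unsynchronized maps and synchronizing everything is a concurrent map holding atomic counters. The sketch below only illustrates that idea; it is not the 0.94 fix and not how metrics2 works. The class and method names (MetricRegistrySketch, increment, runConcurrently) are hypothetical, and it uses Java 8 syntax that was not available when 0.94 shipped.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical registry for illustration; not part of HBase or Hadoop metrics.
final class MetricRegistrySketch {
    // Updates to different keys do not contend on one global lock.
    private final ConcurrentMap<String, AtomicLong> counters = new ConcurrentHashMap<>();

    void increment(String name) {
        // Atomically creates the counter on first use, then bumps it.
        counters.computeIfAbsent(name, k -> new AtomicLong()).incrementAndGet();
    }

    long get(String name) {
        AtomicLong counter = counters.get(name);
        return counter == null ? 0L : counter.get();
    }

    /** Hammers one counter from several threads and returns the final value. */
    static long runConcurrently(int threads, int incrementsPerThread) {
        MetricRegistrySketch registry = new MetricRegistrySketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    registry.increment("requests");
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return registry.get("requests");
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently(8, 10_000)); // prints 80000
    }
}
```

Whether this would be cheap enough on the region server hot path is exactly the perf question raised above; the sketch does not answer it.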

On Wed, Apr 24, 2013 at 9:31 AM, Andrew Purtell <ap...@apache.org> wrote:
> This is probably unprotected concurrent access to a HashMap in Hadoop
> metrics. See comments on https://issues.apache.org/jira/browse/HBASE-8416

Re: HBase Region Server Spinning on JMX requests

Posted by Andrew Purtell <ap...@apache.org>.
This is probably unprotected concurrent access to a HashMap in Hadoop
metrics. See comments on https://issues.apache.org/jira/browse/HBASE-8416
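
As background, java.util.HashMap gives no guarantees under unsynchronized concurrent modification, and on the JVMs of this era a racing resize could leave a bucket chain cyclic so that a later lookup never returns, which matches a thread pinned at 100% CPU. The bluntest guard is a synchronized wrapper, sketched below under the assumption of a simple name-to-value map; SynchronizedMetricsSketch and its methods are hypothetical names and not the Hadoop metrics code. Note that iteration over a Collections.synchronizedMap still needs an explicit lock on the map.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical holder for illustration; not the Hadoop metrics implementation.
final class SynchronizedMetricsSketch {
    // Every single call (put, get, size) locks the wrapper.
    private final Map<String, Long> values =
            Collections.synchronizedMap(new HashMap<String, Long>());

    void set(String name, long value) {
        values.put(name, value);
    }

    /** Copies the names under the map's lock, as a JMX-style reader would need. */
    List<String> snapshotNames() {
        synchronized (values) { // required when iterating a synchronizedMap
            return new ArrayList<>(values.keySet());
        }
    }

    /** Writers add distinct keys while also reading; returns the final key count. */
    static int writeAndSnapshot(int writers, int keysPerWriter) {
        SynchronizedMetricsSketch metrics = new SynchronizedMetricsSketch();
        Thread[] threads = new Thread[writers];
        for (int w = 0; w < writers; w++) {
            final int id = w;
            threads[w] = new Thread(() -> {
                for (int k = 0; k < keysPerWriter; k++) {
                    metrics.set("metric-" + id + "-" + k, k);
                    metrics.snapshotNames(); // exercise the reader path concurrently
                }
            });
            threads[w].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for writers", e);
            }
        }
        return metrics.snapshotNames().size();
    }

    public static void main(String[] args) {
        System.out.println(writeAndSnapshot(4, 200)); // prints 800
    }
}
```

The cost is that every update and every read serializes on one lock, which is the trade-off to weigh before applying this to a busy server.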


On Wed, Apr 24, 2013 at 4:37 AM, Buckley,Ron <bu...@oclc.org> wrote:

> I created https://issues.apache.org/jira/browse/HBASE-8416
>
> We're not using OpenTSDB, but we do have something similar grabbing the
> jmx data on a regular basis.
>
> Eventually, we moved all the regions off of that region server.  We left
> it spinning overnight and are going to try to look at it this morning.


-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

RE: HBase Region Server Spinning on JMX requests

Posted by "Buckley,Ron" <bu...@oclc.org>.
I created https://issues.apache.org/jira/browse/HBASE-8416

We're not using OpenTSDB, but we do have something similar grabbing the
jmx data on a regular basis.

Eventually, we moved all the regions off of that region server.  We left
it spinning overnight and are going to try to look at it this morning.


-----Original Message-----
From: Kevin O'dell [mailto:kevin.odell@cloudera.com] 
Sent: Tuesday, April 23, 2013 11:04 PM
To: user@hbase.apache.org; lars hofhansl
Subject: Re: HBase Region Server Spinning on JMX requests

Hi Ron,

  Are you using OpenTSDB?  I have seen:

https://issues.apache.org/jira/browse/HBASE-6602 (which should be
addressed in your build).  One possibility is that the Tcollector is
leaving lots of connections open and causing the spin.  Unfortunately,
we have not been able to nail it down further.  We are thinking
Metrics2 in trunk might inadvertently take care of this issue.


Re: HBase Region Server Spinning on JMX requests

Posted by Kevin O'dell <ke...@cloudera.com>.
Hi Ron,

  Are you using OpenTSDB?  I have seen:

https://issues.apache.org/jira/browse/HBASE-6602 (which should be
addressed in your build).  One possibility is that the Tcollector is
leaving lots of connections open and causing the spin.  Unfortunately,
we have not been able to nail it down further.  We are thinking
Metrics2 in trunk might inadvertently take care of this issue.

On Tue, Apr 23, 2013 at 6:57 PM, lars hofhansl <la...@apache.org> wrote:
> Hmm... That's not good. Would you mind filing a ticket here: https://issues.apache.org/jira/browse/HBASE ?
>
> -- Lars



-- 
Kevin O'Dell
Systems Engineer, Cloudera

Re: HBase Region Server Spinning on JMX requests

Posted by lars hofhansl <la...@apache.org>.
Hmm... That's not good. Would you mind filing a ticket here: https://issues.apache.org/jira/browse/HBASE ?

-- Lars

