Posted to user@hbase.apache.org by Yossi Ittach <yo...@gmail.com> on 2008/10/22 17:01:59 UTC

HBase (0.18) RegionServer : "unable to report to master for X ms - aborting server"

Hi all

I'm using HBase 0.18 and Hadoop 0.18.1 and I'm running some benchmarks.

Every now and then, a RegionServer prints this error message: "unable to
report to master for X ms - aborting server" and aborts.

However, the master and the RegionServer can reach each other, and there's no
apparent problem (the machine that runs the RegionServer runs a few more apps,
and they're perfectly responsive). Using telnet to the Master from the
RegionServer machine works perfectly.

I've seen this issue:
https://issues.apache.org/jira/browse/HBASE-412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566399#action_12566399

which seems to be exactly my problem, but the fix in 505.patch is for
HBase 0.2.0 or 0.1.1.

Any ideas?


Vale et me ama
Yossi

RE: HBase (0.18) RegionServer : "unable to report to master for X ms - aborting server"

Posted by "Jim Kellerman (POWERSET)" <Ji...@microsoft.com>.
Billy is right. HBASE-412 is caused by HBASE-616.
HBASE-505 is somewhat relevant; however, if the region server
cannot communicate with the master, it cannot pass on the
"open in progress" messages.

This is typically seen on machines where there are too many
threads running and, because thread scheduling is not "fair",
some threads get starved for CPU.
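
To illustrate the starvation scenario above, here is a minimal Java sketch
of a report watchdog. It is not HBase source code: the class name, the method
names, the 120-second lease timeout and the wake-up times are all assumptions
made for the example. It only shows how a reporting thread that does not get
scheduled for long enough can trip the timeout even though the network
itself is fine.

```java
// Hypothetical sketch, NOT HBase source code. All names and values are illustrative.
final class ReportWatchdog {
    private final long leaseTimeoutMillis; // assumed timeout, chosen for the example
    private long lastReportMillis;

    ReportWatchdog(long leaseTimeoutMillis, long startMillis) {
        this.leaseTimeoutMillis = leaseTimeoutMillis;
        this.lastReportMillis = startMillis;
    }

    /** Record a successful report to the master. */
    void reported(long nowMillis) {
        lastReportMillis = nowMillis;
    }

    /** Milliseconds elapsed since the last successful report. */
    long silentFor(long nowMillis) {
        return nowMillis - lastReportMillis;
    }

    /** True once the gap since the last report reaches the lease timeout. */
    boolean shouldAbort(long nowMillis) {
        return silentFor(nowMillis) >= leaseTimeoutMillis;
    }

    public static void main(String[] args) {
        ReportWatchdog watchdog = new ReportWatchdog(120_000L, 0L);
        // Simulated times at which the reporting thread actually got CPU time.
        // The jump from 9 s to 140 s stands in for a starved thread.
        long[] wakeups = {3_000L, 6_000L, 9_000L, 140_000L};
        for (long now : wakeups) {
            if (watchdog.shouldAbort(now)) {
                System.out.println("unable to report to master for "
                        + watchdog.silentFor(now) + " ms - aborting server");
                return;
            }
            watchdog.reported(now);
        }
    }
}
```

In this sketch the abort depends only on when the reporting thread last ran,
which is why a telnet check from the same machine can succeed while the
server still aborts.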

Can you tell us a bit about your hardware and what other
processes are running on the machine that has the region
server on it?

By the way, the patch for 505 is in 0.1.x, 0.2.x, and 0.18.x
(which would have been called 0.3.x, but we changed release
numbering to coincide with Hadoop's). The patch is also in
trunk.

---
Jim Kellerman, Powerset (Live Search, Microsoft Corporation)


> -----Original Message-----
> From: news [mailto:news@ger.gmane.org] On Behalf Of Billy Pearson
> Sent: Wednesday, October 22, 2008 8:13 AM
> To: hbase-user@hadoop.apache.org
> Subject: Re: HBase (0.18) RegionServer : "unable to report to master for X
> ms - aborting server"
>
> I think your problem has an open issue here
> https://issues.apache.org/jira/browse/HBASE-616
>
> Billy


Re: HBase (0.18) RegionServer : "unable to report to master for X ms - aborting server"

Posted by Billy Pearson <sa...@pearsonwholesale.com>.
I think your problem has an open issue here
https://issues.apache.org/jira/browse/HBASE-616

Billy

