Posted to user@hbase.apache.org by Barry Haddow <bh...@inf.ed.ac.uk> on 2008/09/29 16:18:19 UTC
Region servers shut down with UnknownScannerException
Hi
I recently set up a small hbase cluster (v 0.18) running on top of hadoop
v.0.18.1. However I'm observing that the region servers spontaneously shut
themselves down, usually with an UnknownScannerException. For instance, this
weekend, I discovered that all four had shut down, with messages like the
following in the logs:
2008-09-29 05:50:17,203 INFO org.apache.hadoop.dfs.DFSClient: Exception in
createBlockOutputStream java.io.IOException: Bad connect ack with
firstBadLink 129.215.197.39:50010
2008-09-29 05:50:17,203 INFO org.apache.hadoop.dfs.DFSClient: Abandoning block
blk_-5829206400135277905_3045
2008-09-29 07:29:16,552 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_CALL_SERVER_STARTUP
2008-09-29 07:46:35,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler
4 on 60020, call next(-1347145425990165691) from 129.215.197.39:6999: error:
org.apache.hadoop.hbase.UnknownScannerException: Name: -1347145425990165691
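[Editorial sketch, not part of the original message.] The excerpt above is wrapped by the mail client, and real server logs also contain multi-line records such as stack traces. A minimal Python sketch of one way to rejoin such lines and flag the two messages seen here follows. The function names and the pattern list are hypothetical, chosen for illustration, and the timestamp format is assumed to match the excerpt.

```python
import re

# log4j-style timestamp at the start of a record, e.g. "2008-09-29 05:50:17,203 "
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")

# Substrings taken from the excerpt above (an assumption; extend as needed).
PATTERNS = ("Bad connect ack", "UnknownScannerException")


def join_records(lines):
    """Merge continuation lines into one string per timestamped record."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if TIMESTAMP.match(line) or not records:
            # A timestamp (or the very first line) starts a new record.
            records.append(line)
        else:
            # Anything else continues the previous record.
            records[-1] += " " + line
    return records


def flag_records(lines, patterns=PATTERNS):
    """Return (pattern, record) pairs for every record containing a pattern."""
    hits = []
    for record in join_records(lines):
        for pattern in patterns:
            if pattern in record:
                hits.append((pattern, record))
    return hits
```

Passing the lines of a region server log to flag_records() returns each matching record in file order, so the timestamps show which message came first.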
The underlying hdfs seems fine - fsck reports the hbase directory as healthy.
After a restart hbase seems fine too, but surely the regionservers should
stay up once they're started.
Any suggestions?
regards
Barry
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
Re: Region servers shut down with UnknownScannerException
Posted by Barry Haddow <bh...@inf.ed.ac.uk>.
Thanks for the suggestions - responses inline.
On Monday 29 September 2008 18:15:53 you wrote:
> Barry:
>
> From the below, looks like an issue in HDFS. If regionserver is
> having issues talking to HDFS, it shuts itself down.
>
> Tell us more. Are there other, heavy-duty processes running on the same
> servers hosting datanodes and regionservers?
Yes, there are heavy-duty processes running on the same servers. This is
unavoidable as we need the cluster for other tasks.
>
> Enable DEBUG on your cluster and make sure you've set your ulimit file
> descriptors up from default. See the FAQ in wiki for how to do both.
Which FAQ are you referring to? I've set both hadoop and hbase to debug, and
restarted. The fd limit is 8192. What should I be looking for, and in which
logs?
Can I tune hbase so it is more tolerant of hdfs issues?
regards
Barry
Re: Region servers shut down with UnknownScannerException
Posted by stack <st...@duboce.net>.
Barry:
From the below, looks like an issue in HDFS. If regionserver is
having issues talking to HDFS, it shuts itself down.
Tell us more. Are there other, heavy-duty processes running on the same
servers hosting datanodes and regionservers?
Enable DEBUG on your cluster and make sure you've set your ulimit file
descriptors up from default. See the FAQ in wiki for how to do both.
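[Editorial sketch, not part of the original message.] One way to confirm the file descriptor limit a process actually gets is to read it with Python's standard resource module (Unix only). This is a minimal sketch: the helper names are hypothetical, and the threshold of 1024 is an assumption based on a common Linux default, not a value from this thread. The result reflects the user and shell the script runs under, so run it as the user that starts the datanodes and regionservers.

```python
import resource


def fd_limits():
    """Return the (soft, hard) open-file-descriptor limits of this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def fd_limit_looks_low(soft, threshold=1024):
    """True if the soft limit is at or below the threshold.

    An unlimited soft limit (RLIM_INFINITY) is never treated as low.
    The default threshold is an assumed common default, not an HBase value.
    """
    if soft == resource.RLIM_INFINITY:
        return False
    return soft <= threshold


if __name__ == "__main__":
    soft, hard = fd_limits()
    print("soft limit:", soft, "hard limit:", hard)
    if fd_limit_looks_low(soft):
        print("soft limit looks like an unraised default")
```

A limit of 8192, for example, would not be flagged by this check.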
Thanks,
St.Ack
Barry Haddow wrote:
> Hi
>
> I recently set up a small hbase cluster (v 0.18) running on top of hadoop
> v.0.18.1. However I'm observing that the region servers spontaneously shut
> themselves down, usually with an UnknownScannerException. For instance, this
> weekend, I discovered that all four had shut down, with messages like the
> following in the logs:
>
> 2008-09-29 05:50:17,203 INFO org.apache.hadoop.dfs.DFSClient: Exception in
> createBlockOutputStream java.io.IOException: Bad connect ack with
> firstBadLink 129.215.197.39:50010
> 2008-09-29 05:50:17,203 INFO org.apache.hadoop.dfs.DFSClient: Abandoning block
> blk_-5829206400135277905_3045
> 2008-09-29 07:29:16,552 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_CALL_SERVER_STARTUP
> 2008-09-29 07:46:35,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 4 on 60020, call next(-1347145425990165691) from 129.215.197.39:6999: error:
> org.apache.hadoop.hbase.UnknownScannerException: Name: -1347145425990165691
>
>
> The underlying hdfs seems fine - fsck reports the hbase directory as healthy.
> After a restart hbase seems fine too, but surely the regionservers should
> stay up once they're started,
>
> Any suggestions?
>
> regards
> Barry
>
>