Posted to user@hbase.apache.org by Nichole Treadway <kn...@gmail.com> on 2011/03/11 17:49:47 UTC

HMaster fails to start up, Failed construction of Master exception

Last night I was putting pretty heavy load on my HBase cluster. One of the
region servers shut down unexpectedly, and I restarted the regionserver, but
HBase still wasn't assigning regions to it. I attempted to move regions
using the HBase shell but regions were still not being assigned to it. In
the past when this has happened, I've just restarted HBase and it's been
fine. I attempted to do this, but now HBase is failing to start up at all.

In my HMaster logs, here's the message I'm getting.

2011-03-11 11:30:51,014 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to myip1/myip1:2181, initiating session

2011-03-11 11:31:04,004 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect

2011-03-11 11:31:04,107 ERROR org.apache.hadoop.hbase.master.HMasterCommandLine: Failed to start master
java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster
        at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1064)
        at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142)
        at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:102)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
        at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1078)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
        at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:902)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:133)
        at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:218)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1059)
        ... 5 more


-------------------


Errors I'm seeing in the Zookeeper logs:


2011-03-11 11:30:47,479 WARN org.apache.zookeeper.server.quorum.Learner: Unexpected exception, tries=0, connecting to /myip:2888
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
        at java.net.Socket.connect(Socket.java:529)
        at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:212)
        at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:65)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:644)

2011-03-11 11:32:37,091 WARN org.apache.zookeeper.server.quorum.QuorumCnxManager: Interrupted while waiting for message on queue
java.lang.InterruptedException
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1961)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2038)
        at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:342)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:601)

2011-03-11 11:32:18,671 ERROR org.apache.zookeeper.server.quorum.QuorumCnxManager: Failed to send last message. Shutting down thread.
java.nio.channels.AsynchronousCloseException
        at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:185)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.send(QuorumCnxManager.java:579)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:588)
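
Both the master's ConnectionLoss and the Learner's "Connection refused" point at the ZooKeeper quorum, so one quick check is to ask each quorum member whether it is up. What follows is only a minimal sketch: it assumes `nc` is installed and that ZooKeeper's four-letter `ruok` command is answered on client port 2181. The function name zk_ruok is made up for illustration, and any host other than myip1 would be a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: ask one ZooKeeper server whether it is running.
# A running server answers the "ruok" command with "imok"; a server that
# is down or unreachable gives no answer at all.
zk_ruok() {
  host="$1"
  port="${2:-2181}"   # 2181 is the client port seen in the master log
  reply=$(printf 'ruok' | nc -w 2 "$host" "$port" 2>/dev/null)
  if [ "$reply" = "imok" ]; then
    echo "$host:$port ok"
  else
    echo "$host:$port no reply"
  fi
}

# Example call (not run here): zk_ruok myip1 2181
```

An "ok" only says that the server process is answering; it does not prove the server has joined the quorum. The `stat` four-letter command gives more detail on that.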

Re: HMaster fails to start up, Failed construction of Master exception

Posted by Jean-Daniel Cryans <jd...@apache.org>.
I think the issue that your region server first suffered from may be
related; you might want to investigate that.

J-D

On Fri, Mar 11, 2011 at 11:38 AM, Nichole Treadway <kn...@gmail.com> wrote:
> Alright, I think I've got it working now. I increased the HBASE_HEAPSIZE
> value in hbase-env.sh and the HMaster finally started up and it looks like
> it's working as normal now.
> I'm not really sure what caused this problem in the first place though since
> I've never encountered this problem before.
> My cells aren't fat but my table is very large, ~400 columns, two column
> families.
> Thank you for your help.
> On Fri, Mar 11, 2011 at 2:28 PM, Nichole Treadway <kn...@gmail.com>
> wrote:
>>
>> Sorry for not including that information in my original email.
>> Cluster Info:
>> I'm running the hadoop-0.20-append branch and HBase 0.90.1, and java 1.6.
>> All machines are 64-bit running Red Hat 5.5.
>>
>> I have a small cluster of 4 nodes all acting as datanodes and
>> regionservers. Replication in my cluster is set to 3.
>> As an update, I removed all regionservers except my master from the
>> regionservers list and from the zookeeper quorum list in hbase-site.xml. I
>> started up HBase again and was no longer seeing the "Failed Construction of
>> Master" errors I mentioned in my previous email. HMaster started up more
>> normally this time and began reading HLog files. It then printed a message
>> about not being able to contact some of my regionservers and quit again.
>> I added all the regionservers back again to the regionservers list and the
>> zookeeper quorum list. Now the master starts up, spends several minutes
>> printing messages about HLog files, and then fails again with the following
>> error:
>> 2011-03-11 14:17:58,197 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
>> java.lang.OutOfMemoryError: Java heap space
>>         at org.apache.hadoop.hbase.KeyValue.readFields(KeyValue.java:1970)
>>         at org.apache.hadoop.hbase.KeyValue.readFields(KeyValue.java:1977)
>>         at org.apache.hadoop.hbase.regionserver.wal.WALEdit.readFields(WALEdit.java:118)
>>         at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1758)
>>         at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1886)
>>         at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:198)
>>         at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:172)
>>         at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.parseHLog(HLogSplitter.java:429)
>>         at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLog(HLogSplitter.java:262)
>>         at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLog(HLogSplitter.java:188)
>>         at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:196)
>>         at org.apache.hadoop.hbase.master.MasterFileSystem.splitLogAfterStartup(MasterFileSystem.java:180)
>>         at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:379)
>>         at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:278)
>> On Fri, Mar 11, 2011 at 1:20 PM, Jean-Daniel Cryans <jd...@apache.org>
>> wrote:
>>>
>>> Please include relevant basic information when asking that sort of
>>> question, such as hbase/hadoop version, hardware, OS, java version,
>>> cluster setup, etc.
>>>
>>> The exceptions seem to indicate that it's having a hard time getting
>>> data from zookeeper? Have you checked the zookeeper log(s)?
>>>
>>> Maybe that's a red herring tho, but without any context those lines of
>>> log could mean anything.
>>>
>>> J-D

Re: HMaster fails to start up, Failed construction of Master exception

Posted by Nichole Treadway <kn...@gmail.com>.
Alright, I think I've got it working now. I increased the HBASE_HEAPSIZE
value in hbase-env.sh and the HMaster finally started up and it looks like
it's working as normal now.

I'm not really sure what caused this problem in the first place though since
I've never encountered this problem before.

My cells aren't fat but my table is very large, ~400 columns, two column
families.

Thank you for your help.
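
For reference, a minimal sketch of the change described above, assuming the stock conf/hbase-env.sh layout. HBASE_HEAPSIZE is given in MB; the 4000 below is an illustrative value only, since the thread does not state which value was actually used on this cluster.

```shell
# conf/hbase-env.sh (fragment)
# Maximum heap, in MB, for the HBase daemons started from this install.
# 4000 is an assumed example value, not the one used in this thread.
export HBASE_HEAPSIZE=4000
```

The setting is read when a daemon starts, so the master has to be restarted for it to take effect.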

On Fri, Mar 11, 2011 at 2:28 PM, Nichole Treadway <kn...@gmail.com> wrote:

> Sorry for not including that information in my original email.
>
> Cluster Info:
> I'm running the hadoop-0.20-append branch and HBase 0.90.1, and java 1.6.
> All machines are 64-bit running Red Hat 5.5.
>
> I have a small cluster of 4 nodes all acting as datanodes and
> regionservers. Replication in my cluster is set to 3.
>
> As an update, I removed all regionservers except my master from the
> regionservers list and from the zookeeper quorum list in hbase-site.xml. I
> started up HBase again and was no longer seeing the "Failed Construction of
> Master" errors I mentioned in my previous email. HMaster started up more
> normally this time and began reading HLog files. It then printed a message
> about not being able to contact some of my regionservers and quit again.
>
> I added all the regionservers back again to the regionservers list and the
> zookeeper quorum list. Now the master starts up, spends several minutes
> printing messages about HLog files, and then fails again with the following
> error:
>
> 2011-03-11 14:17:58,197 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
> java.lang.OutOfMemoryError: Java heap space
>         at org.apache.hadoop.hbase.KeyValue.readFields(KeyValue.java:1970)
>         at org.apache.hadoop.hbase.KeyValue.readFields(KeyValue.java:1977)
>         at org.apache.hadoop.hbase.regionserver.wal.WALEdit.readFields(WALEdit.java:118)
>         at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1758)
>         at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1886)
>         at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:198)
>         at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:172)
>         at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.parseHLog(HLogSplitter.java:429)
>         at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLog(HLogSplitter.java:262)
>         at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLog(HLogSplitter.java:188)
>         at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:196)
>         at org.apache.hadoop.hbase.master.MasterFileSystem.splitLogAfterStartup(MasterFileSystem.java:180)
>         at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:379)
>         at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:278)
>
> On Fri, Mar 11, 2011 at 1:20 PM, Jean-Daniel Cryans <jd...@apache.org> wrote:
>
>> Please include relevant basic information when asking that sort of
>> question, such as hbase/hadoop version, hardware, OS, java version,
>> cluster setup, etc.
>>
>> The exceptions seem to indicate that it's having a hard time getting
>> data from zookeeper? Have you checked the zookeeper log(s)?
>>
>> Maybe that's a red herring tho, but without any context those lines of
>> log could mean anything.
>>
>> J-D

Re: HMaster fails to start up, Failed construction of Master exception

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Do you have very fat cells? There was a thread with the same messages
yesterday: http://search-hadoop.com/m/sgKrn1YtTKc1

This was opened as a result: https://issues.apache.org/jira/browse/HBASE-3619

How much heap did you give to the master?

J-D

On Fri, Mar 11, 2011 at 11:28 AM, Nichole Treadway <kn...@gmail.com> wrote:
> Sorry for not including that information in my original email.
>
> Cluster Info:
> I'm running the hadoop-0.20-append branch and HBase 0.90.1, and java 1.6.
> All machines are 64-bit running Red Hat 5.5.
>
> I have a small cluster of 4 nodes all acting as datanodes and regionservers.
> Replication in my cluster is set to 3.
>
> As an update, I removed all regionservers except my master from the
> regionservers list and from the zookeeper quorum list in hbase-site.xml. I
> started up HBase again and was no longer seeing the "Failed Construction of
> Master" errors I mentioned in my previous email. HMaster started up more
> normally this time and began reading HLog files. It then printed a message
> about not being able to contact some of my regionservers and quit again.
>
> I added all the regionservers back again to the regionservers list and the
> zookeeper quorum list. Now the master starts up, spends several minutes
> printing messages about HLog files, and then fails again with the following
> error:
>
> 2011-03-11 14:17:58,197 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
> java.lang.OutOfMemoryError: Java heap space
>         at org.apache.hadoop.hbase.KeyValue.readFields(KeyValue.java:1970)
>         at org.apache.hadoop.hbase.KeyValue.readFields(KeyValue.java:1977)
>         at org.apache.hadoop.hbase.regionserver.wal.WALEdit.readFields(WALEdit.java:118)
>         at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1758)
>         at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1886)
>         at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:198)
>         at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:172)
>         at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.parseHLog(HLogSplitter.java:429)
>         at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLog(HLogSplitter.java:262)
>         at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLog(HLogSplitter.java:188)
>         at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:196)
>         at org.apache.hadoop.hbase.master.MasterFileSystem.splitLogAfterStartup(MasterFileSystem.java:180)
>         at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:379)
>         at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:278)
>
> On Fri, Mar 11, 2011 at 1:20 PM, Jean-Daniel Cryans <jd...@apache.org> wrote:
>
>> Please include relevant basic information when asking that sort of
>> question, such as hbase/hadoop version, hardware, OS, java version,
>> cluster setup, etc.
>>
>> The exceptions seem to indicate that it's having a hard time getting
>> data from zookeeper? Have you checked the zookeeper log(s)?
>>
>> Maybe that's a red herring tho, but without any context those lines of
>> log could mean anything.
>>
>> J-D

Re: HMaster fails to start up, Failed construction of Master exception

Posted by Nichole Treadway <kn...@gmail.com>.
Sorry for not including that information in my original email.

Cluster Info:
I'm running the hadoop-0.20-append branch, HBase 0.90.1, and Java 1.6.
All machines are 64-bit and run Red Hat 5.5.

I have a small cluster of 4 nodes all acting as datanodes and regionservers.
Replication in my cluster is set to 3.

As an update, I removed all regionservers except my master from the
regionservers list and from the ZooKeeper quorum list in hbase-site.xml. I
started HBase again and no longer saw the "Failed construction of Master"
errors I mentioned in my previous email. HMaster started up more normally
this time and began reading HLog files. It then printed a message about not
being able to contact some of my regionservers and quit again.
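
For anyone retracing the quorum-list edit described above, here is a minimal sketch (not part of the original report) that reads the hbase.zookeeper.quorum property from an hbase-site.xml and prints how many ensemble members must be up for ZooKeeper to form a majority. The hostnames node1..node4 and the SAMPLE config are placeholders, and the helper names read_quorum and majority are made up for illustration.

```python
import xml.etree.ElementTree as ET


def read_quorum(hbase_site_xml):
    """Return the hosts listed in hbase.zookeeper.quorum (empty list if unset)."""
    root = ET.fromstring(hbase_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "hbase.zookeeper.quorum":
            value = prop.findtext("value", "")
            return [h.strip() for h in value.split(",") if h.strip()]
    return []


def majority(ensemble_size):
    """Smallest number of servers that forms a strict majority of the ensemble."""
    return ensemble_size // 2 + 1


# Hypothetical config: node1..node4 are placeholder hostnames, not the
# machines from this thread.
SAMPLE = """<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>node1,node2,node3,node4</value>
  </property>
</configuration>"""

hosts = read_quorum(SAMPLE)
print("quorum hosts:", hosts)
print("servers that must be up:", majority(len(hosts)))
```

With a single-host quorum list the majority is one server, which may be why the master got past the ZooKeeper connection step in that configuration; that reading is a guess, not something the logs confirm.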

I then added all the regionservers back to the regionservers list and the
ZooKeeper quorum list. Now the master starts up, spends several minutes
printing messages about HLog files, and then fails again with the following
error:

2011-03-11 14:17:58,197 FATAL org.apache.hadoop.hbase.master.HMaster:
Unhandled exception. Starting shutdown.
java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.hbase.KeyValue.readFields(KeyValue.java:1970)
        at org.apache.hadoop.hbase.KeyValue.readFields(KeyValue.java:1977)
        at org.apache.hadoop.hbase.regionserver.wal.WALEdit.readFields(WALEdit.java:118)
        at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1758)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1886)
        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:198)
        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:172)
        at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.parseHLog(HLogSplitter.java:429)
        at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLog(HLogSplitter.java:262)
        at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLog(HLogSplitter.java:188)
        at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:196)
        at org.apache.hadoop.hbase.master.MasterFileSystem.splitLogAfterStartup(MasterFileSystem.java:180)
        at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:379)
        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:278)

On Fri, Mar 11, 2011 at 1:20 PM, Jean-Daniel Cryans <jd...@apache.org> wrote:

> Please include relevant basic information when asking that sort of
> question, such as hbase/hadoop version, hardware, OS, java version,
> cluster setup, etc.
>
> The exceptions seem to indicate that it's having a hard time getting
> data from ZooKeeper. Have you checked the ZooKeeper log(s)?
>
> Maybe that's a red herring though; without any context those lines of
> log could mean anything.
>
> J-D
>

Re: HMaster fails to start up, Failed construction of Master exception

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Please include relevant basic information when asking that sort of
question, such as hbase/hadoop version, hardware, OS, java version,
cluster setup, etc.

The exceptions seem to indicate that it's having a hard time getting
data from ZooKeeper. Have you checked the ZooKeeper log(s)?

Maybe that's a red herring though; without any context those lines of
log could mean anything.

J-D
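
Alongside reading the ZooKeeper logs, one quick way to see whether each quorum member is answering on its client port is ZooKeeper's "ruok" four-letter command, which a healthy server answers with "imok". The sketch below is an illustration, not something from this thread: the function name zk_ruok and the hostnames in the usage comment are placeholders, and port 2181 is taken from the master log above. On newer ZooKeeper releases the four-letter commands may need to be enabled explicitly (the 4lw.commands.whitelist setting).

```python
import socket


def zk_ruok(host, port=2181, timeout=5.0):
    """Send 'ruok' to a ZooKeeper server; return True only if it replies 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = b""
            # The server sends a 4-byte answer and then closes the connection.
            while len(reply) < 4:
                data = sock.recv(16)
                if not data:
                    break
                reply += data
            return reply == b"imok"
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


# Usage (placeholder hostnames, replace with the real quorum members):
#   for host in ["node1", "node2", "node3"]:
#       print(host, "ok" if zk_ruok(host) else "NOT responding")
```

A server can answer "imok" while the ensemble as a whole still lacks a leader, so this only rules out the simplest failure (a peer that is down or unreachable); the "Connection refused" on port 2888 in the ZooKeeper log still needs to be checked against the peer that was expected to lead.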
