Posted to user@hbase.apache.org by Kristoffer Sjögren <st...@gmail.com> on 2012/08/09 11:05:43 UTC

Zookeeper: KeeperErrorCode NoNode for /hbase/backup-masters

Hi all

I have a problem starting HBase in a fully distributed three-machine setup (2
datanodes/regionservers + 1 master/namenode). For some reason ZooKeeper on
the master complains about not finding /hbase/backup-masters in
hbase-user-zookeeper-host.out.

java.io.IOException: Failed to process transaction type: 1 error: KeeperErrorCode = NoNode for /hbase/backup-masters
    at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:151)
    at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
    at org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:259)
    at org.apache.zookeeper.server.ZooKeeperServer.startdata(ZooKeeperServer.java:386)
    at org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:138)
    at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:112)
    at org.apache.hadoop.hbase.zookeeper.HQuorumPeer.runZKServer(HQuorumPeer.java:78)
    at org.apache.hadoop.hbase.zookeeper.HQuorumPeer.main(HQuorumPeer.java:63)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/backup-masters
    at org.apache.zookeeper.server.persistence.FileTxnSnapLog.processTransaction(FileTxnSnapLog.java:209)
    at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:149)

hbase-user-zookeeper-host.log further complains that a parent znode is
missing while the transaction log is being restored.

2012-08-09 08:46:30,924 ERROR org.apache.zookeeper.server.persistence.FileTxnSnapLog: Parent /hbase/backup-masters missing for /hbase/backup-masters/vhp11.aphelion.se,60000,1343122915296

This prevents the master and regionservers from connecting to ZooKeeper on
the master's port 2181:

2012-08-09 08:46:48,807 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server vhp11.aphelion.se/192.168.1.250:2181
2012-08-09 08:46:48,808 WARN org.apache.zookeeper.client.ZooKeeperSaslClient: SecurityException: java.lang.SecurityException: Unable to locate a login configuration occurred when trying to find JAAS configuration.
2012-08-09 08:46:48,808 INFO org.apache.zookeeper.client.ZooKeeperSaslClient: Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
2012-08-09 08:46:48,808 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:286)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1035)
2012-08-09 08:46:48,910 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
2012-08-09 08:46:48,911 ERROR org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 3 retries
2012-08-09 08:46:48,911 ERROR org.apache.hadoop.hbase.master.HMasterCommandLine: Failed to start master
java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster
    at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1623)
    at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:144)
    at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
    at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1637)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1049)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:189)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:892)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.createBaseZNodes(ZooKeeperWatcher.java:161)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:154)
    at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:274)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1618)
    ... 5 more

Not really sure how to proceed. Thankful for any help or pointers.

Configuration and log files attached.

Cheers,
-Kristoffer

Re: Zookeeper: KeeperErrorCode NoNode for /hbase/backup-masters

Posted by Jean-Daniel Cryans <jd...@apache.org>.
I'm not familiar with ZooKeeper snapshot recovery errors; in fact, I
don't think I've ever seen one. Looking over your hbase-site.xml, though,
I see that you didn't change where ZK stores its data, which means it
goes to /tmp. It wouldn't be a stretch to say that some files there
are gone and your ZK data is now in an inconsistent state.

The name of the configuration property is kinda buried in the docs; it's here:
http://hbase.apache.org/book.html#zookeeper

Look for hbase.zookeeper.property.dataDir
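
As a rough illustration of what to look for, the sketch below embeds a hypothetical hbase-site.xml fragment that sets this property and checks whether the value is unset or still under /tmp. The path /var/lib/zookeeper and the names SAMPLE_SITE_XML, zk_data_dir and is_risky are assumptions made up for the example; they do not come from the attached configuration.

```python
import xml.etree.ElementTree as ET

# Hypothetical hbase-site.xml fragment. /var/lib/zookeeper is only an
# example of a directory outside /tmp, not a path taken from this thread.
SAMPLE_SITE_XML = """
<configuration>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/var/lib/zookeeper</value>
  </property>
</configuration>
"""

DATA_DIR_KEY = "hbase.zookeeper.property.dataDir"


def zk_data_dir(site_xml):
    """Return the configured ZooKeeper data directory, or None if unset."""
    root = ET.fromstring(site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == DATA_DIR_KEY:
            value = prop.findtext("value", "").strip()
            return value or None
    return None


def is_risky(data_dir):
    """True when the property is unset (default applies) or points under /tmp."""
    return data_dir is None or data_dir == "/tmp" or data_dir.startswith("/tmp/")
```

Calling is_risky(zk_data_dir(text)) on the contents of a real hbase-site.xml would flag the situation described above, where the property is missing and the data ends up under /tmp.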

Once that's changed, ZK should start normally from the new ZK data folder.
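
One way to confirm that ZK is accepting connections again is a plain TCP connect to the client port, which is the step that failed with "Connection refused" in the logs. This is a minimal sketch, not an HBase or ZooKeeper tool; the function name zk_port_open and the 2-second timeout are assumptions chosen for the example.

```python
import socket


def zk_port_open(host, port=2181, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing is listening or the host cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For the setup in this thread, the call would be zk_port_open("vhp11.aphelion.se"). A True result only shows that something is listening on the port, not that the ZK data is healthy.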

J-D
