Posted to user@hbase.apache.org by bijieshan <bi...@huawei.com> on 2011/04/14 09:51:58 UTC
Zookeeper exception leading to the shutdown of HBase
Hi,
I found this problem while the HBase cluster was running. Here is the log information:
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2011-03-21 13:26:39,697 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server t1/157.5.111.11:2181
2011-03-21 13:26:39,698 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to t1/157.5.111.11:2181, initiating session
2011-03-21 13:26:53,035 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 13336ms for sessionid 0x22e8e6ee15f0046, closing socket connection and attempting reconnect
2011-03-21 13:26:53,135 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22e8e6ee15f0046 Unable to get data of znode /hbase/unassigned/59ba25120921011b7d9ed4025d30c105
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/unassigned/59ba25120921011b7d9ed4025d30c105
at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:932)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
at org.apache.hadoop.hbase.zookeeper.ZKAssign.getData(ZKAssign.java:739)
at org.apache.hadoop.hbase.master.AssignmentManager.nodeDataChanged(AssignmentManager.java:525)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:268)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:501)
2011-03-21 13:26:53,137 ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x22e8e6ee15f0046 Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/unassigned/59ba25120921011b7d9ed4025d30c105
at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:932)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
at org.apache.hadoop.hbase.zookeeper.ZKAssign.getData(ZKAssign.java:739)
at org.apache.hadoop.hbase.master.AssignmentManager.nodeDataChanged(AssignmentManager.java:525)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:268)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:501)
2011-03-21 13:26:53,138 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected ZK exception reading unassigned node data
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/unassigned/59ba25120921011b7d9ed4025d30c105
at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:932)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
at org.apache.hadoop.hbase.zookeeper.ZKAssign.getData(ZKAssign.java:739)
at org.apache.hadoop.hbase.master.AssignmentManager.nodeDataChanged(AssignmentManager.java:525)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:268)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:501)
2011-03-21 13:26:53,138 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
When I restarted the cluster, the problem still existed (due to the abnormal ZooKeeper process):
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2011-03-21 14:43:27,565 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=t2:2181,t1:2181,t0:2181 sessionTimeout=180000 watcher=master:60000
2011-03-21 14:43:27,573 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server t1/157.5.111.11:2181
2011-03-21 14:43:27,582 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to t1/157.5.111.11:2181, initiating session
2011-03-21 14:44:27,586 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 60003ms for sessionid 0x0, closing socket connection and attempting reconnect
2011-03-21 14:44:27,699 ERROR org.apache.hadoop.hbase.master.HMasterCommandLine: Failed to start master
java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster
at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1071)
at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142)
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:102)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1085)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:648)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:902)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:133)
at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:219)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1066)
... 5 more
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
This problem looks most similar to the one described in this issue:
https://issues.apache.org/jira/browse/HBASE-3062
That bug has been fixed in HBase 0.90.1.
Please help analyze the problem. Thank you.
Looking forward to your response.
Regards,
Jieshan
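Since the restart log points at an unhealthy ZooKeeper process, one quick check is to probe each server in the connectString with ZooKeeper's "ruok" four-letter command, which a healthy server answers with "imok". The following is only a minimal sketch: the function names zk_ruok and check_ensemble are hypothetical helpers, and it assumes four-letter commands are enabled on the servers (newer ZooKeeper releases may require whitelisting them).

```python
import socket


def zk_ruok(host, port=2181, timeout=5.0):
    """Send 'ruok' to one ZooKeeper server; return True if it replies 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            chunks = []
            # The server closes the connection after replying, so read to EOF.
            while True:
                data = sock.recv(64)
                if not data:
                    break
                chunks.append(data)
            return b"".join(chunks) == b"imok"
    except OSError:
        # Covers refused connections, DNS failures and timeouts.
        return False


def check_ensemble(connect_string, probe=zk_ruok):
    """Probe every host:port entry of a ZooKeeper connect string."""
    results = {}
    for entry in connect_string.split(","):
        entry = entry.strip()
        host, _, port = entry.partition(":")
        results[entry] = probe(host, int(port) if port else 2181)
    return results
```

Calling check_ensemble("t2:2181,t1:2181,t0:2181") from the master host would return a dict mapping each server to True or False. A False for t1 would be consistent with the timeouts in the logs above, though it would not by itself say whether the cause is the ZooKeeper process or the network.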
Re: Zookeeper exception leading to the shutdown of HBase
Posted by Jean-Daniel Cryans <jd...@apache.org>.
In the first case there is clearly a pause of about 13 seconds, and in
the second case the log reports a 60-second lapse during which the
master's ZooKeeper client wasn't able to talk to the ZooKeeper server.
As far as I can tell there's something weird going on in your
environment (network issues maybe?).
J-D
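The pauses mentioned in this reply can be pulled out of a master log mechanically. The sketch below assumes the log lines have the same format as the ones quoted in this thread; find_session_timeouts is a hypothetical helper name, not part of HBase or ZooKeeper.

```python
import re

# Matches the ZooKeeper client "session timed out" message as it appears
# in the logs quoted in this thread (format assumed from those lines).
TIMEOUT_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) INFO "
    r"org\.apache\.zookeeper\.ClientCnxn: Client session timed out, "
    r"have not heard from server in (?P<ms>\d+)ms "
    r"for sessionid (?P<sid>0x[0-9a-fA-F]+)"
)


def find_session_timeouts(lines):
    """Return (timestamp, silence_ms, session_id) for each timeout line."""
    hits = []
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            hits.append((m.group("ts"), int(m.group("ms")), m.group("sid")))
    return hits


sample = [
    "2011-03-21 13:26:53,035 INFO org.apache.zookeeper.ClientCnxn: Client "
    "session timed out, have not heard from server in 13336ms for sessionid "
    "0x22e8e6ee15f0046, closing socket connection and attempting reconnect",
    "2011-03-21 13:26:53,138 INFO org.apache.hadoop.hbase.master.HMaster: Aborting",
    "2011-03-21 14:44:27,586 INFO org.apache.zookeeper.ClientCnxn: Client "
    "session timed out, have not heard from server in 60003ms for sessionid "
    "0x0, closing socket connection and attempting reconnect",
]

for ts, ms, sid in find_session_timeouts(sample):
    print(f"{ts}: no server response for {ms / 1000:.1f}s (session {sid})")
```

Running the same function over the full master log (for example with open(path) as the lines argument) and comparing the timestamps against the ZooKeeper server logs and any GC or network monitoring for the same period may help narrow down where the silence comes from.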