You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hbase.apache.org by Dhaval Shah <pr...@yahoo.co.in> on 2012/12/29 14:15:06 UTC

Re:: Hbase master failing to start after reaching 95% disk and expanding cluster

Seems like your zookeeper quorum is in a bad state (possibly because it ran out of disk in the past). Can you try stopping hbase, removing the zookeeper znode for hbase and then starting hbase.


------------------------------
On Sat 29 Dec, 2012 5:14 PM IST Marco Gallotta wrote:

>Hi there 
>
>I've been running an hbase cluster for several months, and it recently experienced problems as the nodes reached 95% disk capacity. I added an extra node, and now the master keeps crashing with the errors below. I also increased the disk capacity on each individual node after this, and the errors are the same. I tried removing the new node, and that doesn't help.
>
>There are similar errors in the regionserver and zookeeper logs, but the all seem to echo from the master logs.
>
>Anything I can look at to help diagnose what the problem here is?
>
>hbase-root-master-analytics.log:
>Sat Dec 29 03:14:22 PST 2012 Starting master on analytics
>core file size          (blocks, -c) 0
>data seg size           (kbytes, -d) unlimited
>scheduling priority             (-e) 0
>file size               (blocks, -f) unlimited
>pending signals                 (-i) 59480
>max locked memory       (kbytes, -l) 64
>max memory size         (kbytes, -m) unlimited
>open files                      (-n) 1024
>pipe size            (512 bytes, -p) 8
>POSIX message queues     (bytes, -q) 819200
>real-time priority              (-r) 0
>stack size              (kbytes, -s) 8192
>cpu time               (seconds, -t) unlimited
>max user processes              (-u) 59480
>virtual memory          (kbytes, -v) unlimited
>file locks                      (-x) unlimited
>2012-12-29 03:14:24,601 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
>2012-12-29 03:14:24,614 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
>2012-12-29 03:14:24,622 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
>2012-12-29 03:14:24,631 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
>2012-12-29 03:14:24,636 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
>2012-12-29 03:14:24,643 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
>2012-12-29 03:14:24,651 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
>2012-12-29 03:14:24,665 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
>2012-12-29 03:14:24,675 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
>2012-12-29 03:14:24,698 INFO org.apache.hadoop.ipc.HBaseServer: Starting IPC Server listener on 60000
>2012-12-29 03:14:25,322 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
>2012-12-29 03:14:28,735 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
>
>2012-12-29 03:14:32,797 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
>2012-12-29 03:14:41,427 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
>2012-12-29 03:14:41,427 ERROR org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 3 retries
>2012-12-29 03:14:41,428 ERROR org.apache.hadoop.hbase.master.HMasterCommandLine: Failed to start master
>java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster
>
>        at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1740)
>        at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:146)
>        at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103)
>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>        at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
>        at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1754)
>Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
>        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
>        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1049)
>        at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:176)
>        at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:896)
>        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.createBaseZNodes(ZooKeeperWatcher.java:161)
>        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:154)
>        at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:281)
>        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>        at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
>        at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1735)
>        ... 5 more
>
>
>-- 
>Marco Gallotta | Mountain View, California
>Software Engineer, Infrastructure | Loki Studios
>fb.me/marco.gallotta | twitter.com/marcog
>marco@gallotta.co.za | +1 (650) 417-3313
>
>Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
>