You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hbase.apache.org by Ron van der Vegt <ro...@openindex.io> on 2014/08/25 13:51:15 UTC
Could not get distributed Hbase stable, stopping every ~24 hours.
Hi all!
I have setup one master and 5 regionservers to collect log data. But every ~24 hours, at random times, the regionservers generating a fatal error and all stopping one by one. Eventually the master will stop. I also see some weird characters before the server names in the logs. Seems like some encoding issue.
I have read in the documentation, that if the garbage collection is taking to long, you will also get the session expired message. But I have logged the GC on the master, and it seems oke. Could someone help me figure out why this is happening?
Furthermore, I am currently monitoring the memory usage of the master with JMX. I notice that the heap size is slowly growing. Could there be a memory leakage?
xmx is set to 1gb.
Setup:
hbase 0.94.20
hadoop 1.2.1
debian wheezy
Thanks in advice,
Ron
Logs of master:
===============
2014-08-23 07:00:20,104 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/unassigned/70236052
2014-08-23 07:00:20,406 ERROR org.apache.hadoop.hbase.master.HMaster: Region server vps2060.directvps.nl,60020,1408691165501 reported a fatal error:
ABORTING region server vps2060.directvps.nl,60020,1408691165501: regionserver:60020-0x347fc15265a00eb-0x347fc15265a00eb-0x347fc15265a00eb regionserver:60020-0x347fc15265a00eb-0x347fc15265a00eb-0x347fc15265a00eb received expired from ZooKeeper, aborting
Cause:
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:384)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
2014-08-23 07:00:20,911 ERROR org.apache.hadoop.hbase.master.HMaster: Region server vps2057.directvps.nl,60020,1408691165499 reported a fatal error:
ABORTING region server vps2057.directvps.nl,60020,1408691165499: regionserver:60020-0x347fc15265a00ea-0x347fc15265a00ea-0x347fc15265a00ea-0x347fc15265a00ea regionserver:60020-0x347fc15265a00ea-0x347fc15265a00ea-0x347fc15265a00ea-0x347fc15265a00ea received expired from ZooKeeper, aborting
Cause:
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:384)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
2014-08-23 07:00:21,001 ERROR org.apache.hadoop.hbase.master.HMaster: Region server vps2059.directvps.nl,60020,1408691165851 reported a fatal error:
ABORTING region server vps2059.directvps.nl,60020,1408691165851: regionserver:60020-0x147fc1616d200bb-0x147fc1616d200bb-0x147fc1616d200bb regionserver:60020-0x147fc1616d200bb-0x147fc1616d200bb-0x147fc1616d200bb received expired from ZooKeeper, aborting
Cause:
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:384)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
2014-08-23 07:00:21,056 ERROR org.apache.hadoop.hbase.master.HMaster: Region server vps2058.directvps.nl,60020,1408691165675 reported a fatal error:
ABORTING region server vps2058.directvps.nl,60020,1408691165675: regionserver:60020-0x347fc15265a00ec-0x347fc15265a00ec-0x347fc15265a00ec-0x347fc15265a00ec regionserver:60020-0x347fc15265a00ec-0x347fc15265a00ec-0x347fc15265a00ec-0x347fc15265a00ec received expired from ZooKeeper, aborting
Cause:
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:384)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
2014-08-23 07:00:22,140 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/unassigned/70236052
2014-08-23 07:00:26,141 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/unassigned/70236052
2014-08-23 07:00:34,114 ERROR org.apache.hadoop.hbase.master.HMaster: Region server vps2056.directvps.nl,60020,1408691165439 reported a fatal error:
ABORTING region server vps2056.directvps.nl,60020,1408691165439: Unexpected exception handling nodeDeleted event
Cause:
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/master
at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:172)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:420)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.nodeDeleted(ZooKeeperNodeTracker.java:182)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:318)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
2014-08-23 07:00:34,118 ERROR org.apache.hadoop.hbase.master.HMaster: Region server vps2056.directvps.nl,60020,1408691165439 reported a fatal error:
ABORTING region server vps2056.directvps.nl,60020,1408691165439: regionserver:60020-0x247fc16c80500d2-0x247fc16c80500d2-0x247fc16c80500d2-0x247fc16c80500d2-0x247fc16c80500d2 regionserver:60020-0x247fc16c80500d2-0x247fc16c80500d2-0x247fc16c80500d2-0x247fc16c80500d2-0x247fc16c80500d2 received expired from ZooKeeper, aborting
Cause:
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:384)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
2014-08-23 07:00:34,141 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/unassigned/70236052
2014-08-23 07:00:34,142 ERROR org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper getData failed after 3 retries
2014-08-23 07:00:34,152 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x147fc1616d200ba-0x147fc1616d200ba-0x147fc1616d200ba-0x147fc1616d200ba-0x347fcb5a0130000-0x247ffe833880001-0x247ffe833880001-0x247ffe833880001 Unable to get data of znode /hbase/unassigned/70236052
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/unassigned/70236052
at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:290)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:709)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:685)
at org.apache.hadoop.hbase.zookeeper.ZKAssign.getData(ZKAssign.java:852)
at org.apache.hadoop.hbase.master.AssignmentManager.isCarryingRegion(AssignmentManager.java:3274)
at org.apache.hadoop.hbase.master.AssignmentManager.isCarryingRoot(AssignmentManager.java:3255)
at org.apache.hadoop.hbase.master.ServerManager.expireServer(ServerManager.java:382)
at org.apache.hadoop.hbase.zookeeper.RegionServerTracker.nodeDeleted(RegionServerTracker.java:122)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:318)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
2014-08-23 07:00:34,152 ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x147fc1616d200ba-0x147fc1616d200ba-0x147fc1616d200ba-0x147fc1616d200ba-0x347fcb5a0130000-0x247ffe833880001-0x247ffe833880001-0x247ffe833880001 Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/unassigned/70236052
at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:290)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:709)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:685)
at org.apache.hadoop.hbase.zookeeper.ZKAssign.getData(ZKAssign.java:852)
at org.apache.hadoop.hbase.master.AssignmentManager.isCarryingRegion(AssignmentManager.java:3274)
at org.apache.hadoop.hbase.master.AssignmentManager.isCarryingRoot(AssignmentManager.java:3255)
at org.apache.hadoop.hbase.master.ServerManager.expireServer(ServerManager.java:382)
at org.apache.hadoop.hbase.zookeeper.RegionServerTracker.nodeDeleted(RegionServerTracker.java:122)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:318)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
2014-08-23 07:00:34,163 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: []
2014-08-23 07:00:34,215 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node /hbase/backup-masters/vps2008.directvps.nl,60000,1408691163492 already deleted, and this is not a retry
2014-08-23 07:05:34,165 WARN org.apache.hadoop.hbase.master.SplitLogManager: Interrupted while waiting for log splits to be completed
2014-08-23 07:05:34,179 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected ZK exception reading unassigned node for region=70236052
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/unassigned/70236052
at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:290)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:709)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:685)
at org.apache.hadoop.hbase.zookeeper.ZKAssign.getData(ZKAssign.java:852)
at org.apache.hadoop.hbase.master.AssignmentManager.isCarryingRegion(AssignmentManager.java:3274)
at org.apache.hadoop.hbase.master.AssignmentManager.isCarryingRoot(AssignmentManager.java:3255)
at org.apache.hadoop.hbase.master.ServerManager.expireServer(ServerManager.java:382)
at org.apache.hadoop.hbase.zookeeper.RegionServerTracker.nodeDeleted(RegionServerTracker.java:122)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:318)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
2014-08-23 07:05:34,179 WARN org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs in [hdfs://namenode.openindex.io:8020/hbase/.logs/vps2058.directvps.nl,60020,1408691165675-splitting] installed = 2 but only 0 done
2014-08-23 07:05:34,184 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 60000
2014-08-23 07:05:34,185 WARN org.apache.hadoop.hbase.master.CatalogJanitor: Failed scan of catalog table
java.io.IOException: Giving up after tries=1
at org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:210)
at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:188)
at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:82)
at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:67)
at org.apache.hadoop.hbase.master.CatalogJanitor.getSplitParents(CatalogJanitor.java:126)
at org.apache.hadoop.hbase.master.CatalogJanitor.scan(CatalogJanitor.java:137)
at org.apache.hadoop.hbase.master.CatalogJanitor.chore(CatalogJanitor.java:93)
at org.apache.hadoop.hbase.Chore.run(Chore.java:67)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.InterruptedException: sleep interrupted
at java.lang.Thread.sleep(Native Method)
at org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:207)
... 8 more
2014-08-23 07:05:34,185 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60000: exiting
2014-08-23 07:05:34,185 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 6 on 60000: exiting
2014-08-23 07:05:34,186 INFO org.apache.hadoop.ipc.HBaseServer: REPL IPC Server handler 0 on 60000: exiting
2014-08-23 07:05:34,186 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60000: exiting
2014-08-23 07:05:34,186 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60000: exiting
2014-08-23 07:05:34,186 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 4 on 60000: exiting
2014-08-23 07:05:34,185 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60000: exiting
2014-08-23 07:05:34,185 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 8 on 60000: exiting
2014-08-23 07:05:34,213 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60000: exiting
2014-08-23 07:05:34,213 INFO org.apache.hadoop.ipc.HBaseServer: REPL IPC Server handler 2 on 60000: exiting
2014-08-23 07:05:34,213 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server listener on 60000
2014-08-23 07:05:34,213 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server Responder
2014-08-23 07:05:34,214 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server Responder
2014-08-23 07:05:34,212 INFO org.apache.hadoop.ipc.HBaseServer: REPL IPC Server handler 1 on 60000: exiting
2014-08-23 07:05:34,186 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 60000: exiting
2014-08-23 07:05:34,185 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 1 on 60000: exiting
2014-08-23 07:05:34,256 INFO org.mortbay.log: Stopped SelectChannelConnector@0.0.0.0:60010
2014-08-23 07:05:34,259 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: []
2014-08-23 07:05:34,260 FATAL org.apache.hadoop.hbase.master.HMaster: master:60000-0x147fc1616d200ba-0x147fc1616d200ba-0x147fc1616d200ba-0x147fc1616d200ba-0x347fcb5a0130000-0x247ffe833880001-0x247ffe833880001-0x247ffe833880001-0x347fcb5a0130001 master:60000-0x147fc1616d200ba-0x147fc1616d200ba-0x147fc1616d200ba-0x147fc1616d200ba-0x347fcb5a0130000-0x247ffe833880001-0x247ffe833880001-0x247ffe833880001-0x347fcb5a0130001 received expired from ZooKeeper, aborting
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:384)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
2014-08-23 07:05:34,414 ERROR org.apache.hadoop.hbase.master.HMasterCommandLine: Failed to start master
java.lang.RuntimeException: HMaster Aborted
at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:160)
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:104)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2129)
GC log:
=======
0.185: Application time: 0.1304320 seconds
0.185: [GC0.185: [ParNew: 4288K->511K(4800K), 0.0120520 secs] 4288K->1008K(15424K), 0.0121600 secs] [Times: user=0.01 sys=0.01, real=0.01 secs]
0.197: Total time for which application threads were stopped: 0.0126240 seconds
Heap
par new generation total 4800K, used 3580K [0x00000000b7200000, 0x00000000b7730000, 0x00000000c1860000)
eden space 4288K, 71% used [0x00000000b7200000, 0x00000000b74ff328, 0x00000000b7630000)
from space 512K, 99% used [0x00000000b76b0000, 0x00000000b772fff8, 0x00000000b7730000)
to space 512K, 0% used [0x00000000b7630000, 0x00000000b7630000, 0x00000000b76b0000)
concurrent mark-sweep generation total 10624K, used 496K [0x00000000c1860000, 0x00000000c22c0000, 0x00000000f5a00000)
concurrent-mark-sweep perm gen total 21248K, used 6688K [0x00000000f5a00000, 0x00000000f6ec0000, 0x0000000100000000)
0.370: Application time: 0.1728650 seconds
Re: Could not get distributed Hbase stable, stopping every ~24 hours.
Posted by Ted Yu <yu...@gmail.com>.
Have you checked zookeeper logs ?
What zookeeper release are you using ?
bq. Could there be a memory leakage?
You can use jmap to capture heap memory details:
http://docs.oracle.com/javase/6/docs/technotes/tools/share/jmap.html
Cheers
On Mon, Aug 25, 2014 at 4:51 AM, Ron van der Vegt <
ron.van.der.vegt@openindex.io> wrote:
> Hi all!
>
> I have setup one master and 5 regionservers to collect log data. But every
> ~24 hours, at random times, the regionservers generating a fatal error and
> all stopping one by one. Eventually the master will stop. I also see some
> weird characters before the server names in the logs. Seems like some
> encoding issue.
>
> I have read in the documentation, that if the garbage collection is taking
> to long, you will also get the session expired message. But I have logged
> the GC on the master, and it seems oke. Could someone help me figure out
> why this is happening?
>
> Furthermore, I am currently monitoring the memory usage of the master with
> JMX. I notice that the heap size is slowly growing. Could there be a memory
> leakage?
> xmx is set to 1gb.
>
> Setup:
> hbase 0.94.20
> hadoop 1.2.1
> debian wheezy
>
> Thanks in advice,
>
> Ron
>
> Logs of master:
> ===============
>
> 2014-08-23 07:00:20,104 WARN
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient
> ZooKeeper exception:
> org.apache.zookeeper.KeeperException$ConnectionLossException:
> KeeperErrorCode = ConnectionLoss for /hbase/unassigned/70236052
> 2014-08-23 07:00:20,406 ERROR org.apache.hadoop.hbase.master.HMaster:
> Region server vps2060.directvps.nl,60020,1408691165501 reported a fatal
> error:
> ABORTING region server vps2060.directvps.nl,60020,1408691165501:
> regionserver:60020-0x347fc15265a00eb-0x347fc15265a00eb-0x347fc15265a00eb
> regionserver:60020-0x347fc15265a00eb-0x347fc15265a00eb-0x347fc15265a00eb
> received expired from ZooKeeper, aborting
> Cause:
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired
> at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:384)
> at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
> at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
>
> 2014-08-23 07:00:20,911 ERROR org.apache.hadoop.hbase.master.HMaster:
> Region server vps2057.directvps.nl,60020,1408691165499 reported a fatal
> error:
> ABORTING region server vps2057.directvps.nl,60020,1408691165499:
> regionserver:60020-0x347fc15265a00ea-0x347fc15265a00ea-0x347fc15265a00ea-0x347fc15265a00ea
> regionserver:60020-0x347fc15265a00ea-0x347fc15265a00ea-0x347fc15265a00ea-0x347fc15265a00ea
> received expired from ZooKeeper, aborting
> Cause:
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired
> at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:384)
> at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
> at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
>
> 2014-08-23 07:00:21,001 ERROR org.apache.hadoop.hbase.master.HMaster:
> Region server vps2059.directvps.nl,60020,1408691165851 reported a fatal
> error:
> ABORTING region server vps2059.directvps.nl,60020,1408691165851:
> regionserver:60020-0x147fc1616d200bb-0x147fc1616d200bb-0x147fc1616d200bb
> regionserver:60020-0x147fc1616d200bb-0x147fc1616d200bb-0x147fc1616d200bb
> received expired from ZooKeeper, aborting
> Cause:
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired
> at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:384)
> at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
> at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
>
> 2014-08-23 07:00:21,056 ERROR org.apache.hadoop.hbase.master.HMaster:
> Region server vps2058.directvps.nl,60020,1408691165675 reported a fatal
> error:
> ABORTING region server vps2058.directvps.nl,60020,1408691165675:
> regionserver:60020-0x347fc15265a00ec-0x347fc15265a00ec-0x347fc15265a00ec-0x347fc15265a00ec
> regionserver:60020-0x347fc15265a00ec-0x347fc15265a00ec-0x347fc15265a00ec-0x347fc15265a00ec
> received expired from ZooKeeper, aborting
> Cause:
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired
> at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:384)
> at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
> at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
>
> 2014-08-23 07:00:22,140 WARN
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient
> ZooKeeper exception:
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired for /hbase/unassigned/70236052
> 2014-08-23 07:00:26,141 WARN
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient
> ZooKeeper exception:
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired for /hbase/unassigned/70236052
> 2014-08-23 07:00:34,114 ERROR org.apache.hadoop.hbase.master.HMaster:
> Region server vps2056.directvps.nl,60020,1408691165439 reported a fatal
> error:
> ABORTING region server vps2056.directvps.nl,60020,1408691165439:
> Unexpected exception handling nodeDeleted event
> Cause:
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired for /hbase/master
> at
> org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
> at
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:172)
> at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:420)
> at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.nodeDeleted(ZooKeeperNodeTracker.java:182)
> at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:318)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
> at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
>
> 2014-08-23 07:00:34,118 ERROR org.apache.hadoop.hbase.master.HMaster:
> Region server vps2056.directvps.nl,60020,1408691165439 reported a fatal
> error:
> ABORTING region server vps2056.directvps.nl,60020,1408691165439:
> regionserver:60020-0x247fc16c80500d2-0x247fc16c80500d2-0x247fc16c80500d2-0x247fc16c80500d2-0x247fc16c80500d2
> regionserver:60020-0x247fc16c80500d2-0x247fc16c80500d2-0x247fc16c80500d2-0x247fc16c80500d2-0x247fc16c80500d2
> received expired from ZooKeeper, aborting
> Cause:
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired
> at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:384)
> at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
> at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
>
> 2014-08-23 07:00:34,141 WARN
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient
> ZooKeeper exception:
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired for /hbase/unassigned/70236052
> 2014-08-23 07:00:34,142 ERROR
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper getData
> failed after 3 retries
> 2014-08-23 07:00:34,152 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil:
> master:60000-0x147fc1616d200ba-0x147fc1616d200ba-0x147fc1616d200ba-0x147fc1616d200ba-0x347fcb5a0130000-0x247ffe833880001-0x247ffe833880001-0x247ffe833880001
> Unable to get data of znode /hbase/unassigned/70236052
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired for /hbase/unassigned/70236052
> at
> org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
> at
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:290)
> at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:709)
> at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:685)
> at
> org.apache.hadoop.hbase.zookeeper.ZKAssign.getData(ZKAssign.java:852)
> at
> org.apache.hadoop.hbase.master.AssignmentManager.isCarryingRegion(AssignmentManager.java:3274)
> at
> org.apache.hadoop.hbase.master.AssignmentManager.isCarryingRoot(AssignmentManager.java:3255)
> at
> org.apache.hadoop.hbase.master.ServerManager.expireServer(ServerManager.java:382)
> at
> org.apache.hadoop.hbase.zookeeper.RegionServerTracker.nodeDeleted(RegionServerTracker.java:122)
> at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:318)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
> at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> 2014-08-23 07:00:34,152 ERROR
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher:
> master:60000-0x147fc1616d200ba-0x147fc1616d200ba-0x147fc1616d200ba-0x147fc1616d200ba-0x347fcb5a0130000-0x247ffe833880001-0x247ffe833880001-0x247ffe833880001
> Received unexpected KeeperException, re-throwing exception
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired for /hbase/unassigned/70236052
> at
> org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
> at
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:290)
> at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:709)
> at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:685)
> at
> org.apache.hadoop.hbase.zookeeper.ZKAssign.getData(ZKAssign.java:852)
> at
> org.apache.hadoop.hbase.master.AssignmentManager.isCarryingRegion(AssignmentManager.java:3274)
> at
> org.apache.hadoop.hbase.master.AssignmentManager.isCarryingRoot(AssignmentManager.java:3255)
> at
> org.apache.hadoop.hbase.master.ServerManager.expireServer(ServerManager.java:382)
> at
> org.apache.hadoop.hbase.zookeeper.RegionServerTracker.nodeDeleted(RegionServerTracker.java:122)
> at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:318)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
> at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> 2014-08-23 07:00:34,163 FATAL org.apache.hadoop.hbase.master.HMaster:
> Master server abort: loaded coprocessors are: []
> 2014-08-23 07:00:34,215 WARN
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node
> /hbase/backup-masters/vps2008.directvps.nl,60000,1408691163492 already
> deleted, and this is not a retry
> 2014-08-23 07:05:34,165 WARN
> org.apache.hadoop.hbase.master.SplitLogManager: Interrupted while waiting
> for log splits to be completed
> 2014-08-23 07:05:34,179 FATAL org.apache.hadoop.hbase.master.HMaster:
> Unexpected ZK exception reading unassigned node for region=70236052
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired for /hbase/unassigned/70236052
> at
> org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
> at
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:290)
> at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:709)
> at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:685)
> at
> org.apache.hadoop.hbase.zookeeper.ZKAssign.getData(ZKAssign.java:852)
> at
> org.apache.hadoop.hbase.master.AssignmentManager.isCarryingRegion(AssignmentManager.java:3274)
> at
> org.apache.hadoop.hbase.master.AssignmentManager.isCarryingRoot(AssignmentManager.java:3255)
> at
> org.apache.hadoop.hbase.master.ServerManager.expireServer(ServerManager.java:382)
> at
> org.apache.hadoop.hbase.zookeeper.RegionServerTracker.nodeDeleted(RegionServerTracker.java:122)
> at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:318)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
> at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> 2014-08-23 07:05:34,179 WARN
> org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs
> in [hdfs://
> namenode.openindex.io:8020/hbase/.logs/vps2058.directvps.nl,60020,1408691165675-splitting]
> installed = 2 but only 0 done
> 2014-08-23 07:05:34,184 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
> server on 60000
> 2014-08-23 07:05:34,185 WARN
> org.apache.hadoop.hbase.master.CatalogJanitor: Failed scan of catalog table
> java.io.IOException: Giving up after tries=1
> at
> org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:210)
> at
> org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:188)
> at
> org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:82)
> at
> org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:67)
> at
> org.apache.hadoop.hbase.master.CatalogJanitor.getSplitParents(CatalogJanitor.java:126)
> at
> org.apache.hadoop.hbase.master.CatalogJanitor.scan(CatalogJanitor.java:137)
> at
> org.apache.hadoop.hbase.master.CatalogJanitor.chore(CatalogJanitor.java:93)
> at org.apache.hadoop.hbase.Chore.run(Chore.java:67)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.InterruptedException: sleep interrupted
> at java.lang.Thread.sleep(Native Method)
> at
> org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:207)
> ... 8 more
> 2014-08-23 07:05:34,185 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 5 on 60000: exiting
> 2014-08-23 07:05:34,185 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 6 on 60000: exiting
> 2014-08-23 07:05:34,186 INFO org.apache.hadoop.ipc.HBaseServer: REPL IPC
> Server handler 0 on 60000: exiting
> 2014-08-23 07:05:34,186 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 2 on 60000: exiting
> 2014-08-23 07:05:34,186 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 3 on 60000: exiting
> 2014-08-23 07:05:34,186 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 4 on 60000: exiting
> 2014-08-23 07:05:34,185 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 7 on 60000: exiting
> 2014-08-23 07:05:34,185 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 8 on 60000: exiting
> 2014-08-23 07:05:34,213 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 9 on 60000: exiting
> 2014-08-23 07:05:34,213 INFO org.apache.hadoop.ipc.HBaseServer: REPL IPC
> Server handler 2 on 60000: exiting
> 2014-08-23 07:05:34,213 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
> IPC Server listener on 60000
> 2014-08-23 07:05:34,213 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
> IPC Server Responder
> 2014-08-23 07:05:34,214 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
> IPC Server Responder
> 2014-08-23 07:05:34,212 INFO org.apache.hadoop.ipc.HBaseServer: REPL IPC
> Server handler 1 on 60000: exiting
> 2014-08-23 07:05:34,186 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 0 on 60000: exiting
> 2014-08-23 07:05:34,185 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 1 on 60000: exiting
> 2014-08-23 07:05:34,256 INFO org.mortbay.log: Stopped
> SelectChannelConnector@0.0.0.0:60010
> 2014-08-23 07:05:34,259 FATAL org.apache.hadoop.hbase.master.HMaster:
> Master server abort: loaded coprocessors are: []
> 2014-08-23 07:05:34,260 FATAL org.apache.hadoop.hbase.master.HMaster:
> master:60000-0x147fc1616d200ba-0x147fc1616d200ba-0x147fc1616d200ba-0x147fc1616d200ba-0x347fcb5a0130000-0x247ffe833880001-0x247ffe833880001-0x247ffe833880001-0x347fcb5a0130001
> master:60000-0x147fc1616d200ba-0x147fc1616d200ba-0x147fc1616d200ba-0x147fc1616d200ba-0x347fcb5a0130000-0x247ffe833880001-0x247ffe833880001-0x247ffe833880001-0x347fcb5a0130001
> received expired from ZooKeeper, aborting
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired
> at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:384)
> at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
> at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> 2014-08-23 07:05:34,414 ERROR
> org.apache.hadoop.hbase.master.HMasterCommandLine: Failed to start master
> java.lang.RuntimeException: HMaster Aborted
> at
> org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:160)
> at
> org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:104)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at
> org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
> at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2129)
>
> GC log:
> =======
>
> 0.185: Application time: 0.1304320 seconds
> 0.185: [GC0.185: [ParNew: 4288K->511K(4800K), 0.0120520 secs]
> 4288K->1008K(15424K), 0.0121600 secs] [Times: user=0.01 sys=0.01, real=0.01
> secs]
> 0.197: Total time for which application threads were stopped: 0.0126240
> seconds
> Heap
> par new generation total 4800K, used 3580K [0x00000000b7200000,
> 0x00000000b7730000, 0x00000000c1860000)
> eden space 4288K, 71% used [0x00000000b7200000, 0x00000000b74ff328,
> 0x00000000b7630000)
> from space 512K, 99% used [0x00000000b76b0000, 0x00000000b772fff8,
> 0x00000000b7730000)
> to space 512K, 0% used [0x00000000b7630000, 0x00000000b7630000,
> 0x00000000b76b0000)
> concurrent mark-sweep generation total 10624K, used 496K
> [0x00000000c1860000, 0x00000000c22c0000, 0x00000000f5a00000)
> concurrent-mark-sweep perm gen total 21248K, used 6688K
> [0x00000000f5a00000, 0x00000000f6ec0000, 0x0000000100000000)
> 0.370: Application time: 0.1728650 seconds
>
>