Posted to user@phoenix.apache.org by Konstantinos Kougios <ko...@googlemail.com> on 2015/09/30 14:47:08 UTC
aggregate query makes all region servers crash
I have 3 region servers, 8 GB of memory each, and I am running this query via
sqlline.py:
select count(*),word from words group by word limit 10;
So far 3 region servers have died: the 1st with no error in the log, and the
2nd with the following (a race condition with another region server, perhaps?
I had been restarting the 1st crashed server):
2015-09-30 13:26:45,429 INFO [RS_OPEN_REGION-d1:16020-1] coordination.ZkOpenRegionCoordination: Opening of region {ENCODED => e211961cd190cf57f8c5a691bd3f265f, NAME => 'PERFORMANCE_1000,EUSalesforce\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00,1442843941238.e211961cd190cf57f8c5a691bd3f265f.', STARTKEY => 'EUSalesforce\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', ENDKEY => 'NAApple\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'} failed, transitioning from OFFLINE to FAILED_OPEN in ZK, expecting version 2
2015-09-30 13:26:47,786 INFO [regionserver/d1.lan/192.168.0.29:16020.logRoller] regionserver.LogRoller: LogRoller exiting.
2015-09-30 13:26:47,787 INFO [regionserver/d1.lan/192.168.0.29:16020] regionserver.CompactSplitThread: Waiting for Split Thread to finish...
2015-09-30 13:26:47,787 INFO [regionserver/d1.lan/192.168.0.29:16020] regionserver.CompactSplitThread: Waiting for Merge Thread to finish...
2015-09-30 13:26:47,787 INFO [regionserver/d1.lan/192.168.0.29:16020] regionserver.CompactSplitThread: Waiting for Large Compaction Thread to finish...
2015-09-30 13:26:47,787 INFO [regionserver/d1.lan/192.168.0.29:16020] regionserver.CompactSplitThread: Waiting for Small Compaction Thread to finish...
2015-09-30 13:26:48,282 INFO [regionserver/d1.lan/192.168.0.29:16020] client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x1501e1145c90002
2015-09-30 13:26:48,299 INFO [regionserver/d1.lan/192.168.0.29:16020] zookeeper.ZooKeeper: Session: 0x1501e1145c90002 closed
2015-09-30 13:26:48,299 INFO [regionserver/d1.lan/192.168.0.29:16020-EventThread] zookeeper.ClientCnxn: EventThread shut down
2015-09-30 13:26:48,300 INFO [regionserver/d1.lan/192.168.0.29:16020] ipc.RpcServer: Stopping server on 16020
2015-09-30 13:26:48,300 INFO [RpcServer.listener,port=16020] ipc.RpcServer: RpcServer.listener,port=16020: stopping
2015-09-30 13:26:48,301 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopped
2015-09-30 13:26:48,335 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopping
2015-09-30 13:26:48,387 INFO [regionserver/d1.lan/192.168.0.29:16020] zookeeper.ZooKeeper: Session: 0x1501e1145c90000 closed
2015-09-30 13:26:48,387 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
2015-09-30 13:26:48,387 INFO [regionserver/d1.lan/192.168.0.29:16020] regionserver.HRegionServer: stopping server d1.lan,16020,1443613463226; zookeeper connection closed.
2015-09-30 13:26:48,387 INFO [regionserver/d1.lan/192.168.0.29:16020] regionserver.HRegionServer: regionserver/d1.lan/192.168.0.29:16020 exiting
2015-09-30 13:26:48,388 ERROR [main] regionserver.HRegionServerCommandLine: Region server exiting
java.lang.RuntimeException: HRegionServer Aborted
    at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:68)
    at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:87)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2651)
2015-09-30 13:26:48,390 INFO [Thread-6] regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@31dadd46
2015-09-30 13:26:48,390 INFO [Thread-6] regionserver.ShutdownHook: Starting fs shutdown hook thread.
2015-09-30 13:26:48,391 INFO [Thread-6] regionserver.ShutdownHook: Shutdown hook finished.
I have been keeping an eye on the region servers via JMX, and they did not
seem to be under any memory pressure.
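(For reference, this kind of heap monitoring can also be scripted: HBase region servers expose Hadoop's JMXJsonServlet on the info port, 16030 by default in the 1.x line. A minimal sketch, assuming those defaults — the host name here is a placeholder, not taken from this cluster:)

```python
import json
from urllib.request import urlopen


def parse_heap(jmx_json):
    """Extract (used, max) heap bytes from a JMXJsonServlet response body."""
    mem = json.loads(jmx_json)["beans"][0]["HeapMemoryUsage"]
    return mem["used"], mem["max"]


def heap_usage(host, port=16030):
    # The servlet answers on /jmx; the qry parameter narrows the response
    # to the standard JVM memory MBean.
    url = "http://%s:%d/jmx?qry=java.lang:type=Memory" % (host, port)
    with urlopen(url) as resp:
        return parse_heap(resp.read().decode("utf-8"))


if __name__ == "__main__":
    used, cap = heap_usage("d1.lan")  # placeholder host
    print("heap: %d / %d bytes" % (used, cap))
```
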
sqlline exceptions:
15/09/30 12:38:56 ERROR zookeeper.ZooKeeperWatcher: hconnection-0x358c99f5-0x501df0e3cf000f, quorum=nn.lan:2181, baseZNode=/hbase Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/meta-region-server
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:360)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getData(ZKUtil.java:745)
    at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.getMetaRegionState(MetaTableLocator.java:482)
    at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.getMetaRegionLocation(MetaTableLocator.java:168)
    at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:600)
    at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:580)
    at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:559)
    at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getMetaRegionLocation(ZooKeeperRegistry.java:61)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateMeta(ConnectionManager.java:1185)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1152)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:300)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:153)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:61)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
    at org.apache.hadoop.hbase.client.StatsTrackingRpcRetryingCaller.callWithoutRetries(StatsTrackingRpcRetryingCaller.java:56)
    at org.apache.hadoop.hbase.client.ClientSmallReversedScanner.loadCache(ClientSmallReversedScanner.java:211)
    at org.apache.hadoop.hbase.client.ClientSmallReversedScanner.next(ClientSmallReversedScanner.java:185)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1249)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1155)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:300)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:153)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:61)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
    at org.apache.hadoop.hbase.client.StatsTrackingRpcRetryingCaller.callWithoutRetries(StatsTrackingRpcRetryingCaller.java:56)
    at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
    at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:295)
    at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:160)
    at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:155)
    at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:809)
    at org.apache.phoenix.iterate.TableResultIterator.getDelegate(TableResultIterator.java:67)
    at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:88)
    at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:79)
    at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:105)
    at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:100)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask.run(JobManager.java:183)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)