You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hbase.apache.org by "Ted Yu (JIRA)" <ji...@apache.org> on 2013/11/30 03:05:35 UTC

[jira] [Created] (HBASE-10059) TestSplitLogWorker#testMultipleTasks fails occasionally

Ted Yu created HBASE-10059:
------------------------------

             Summary: TestSplitLogWorker#testMultipleTasks fails occasionally
                 Key: HBASE-10059
                 URL: https://issues.apache.org/jira/browse/HBASE-10059
             Project: HBase
          Issue Type: Test
            Reporter: Ted Yu


>From https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/857/testReport/junit/org.apache.hadoop.hbase.regionserver/TestSplitLogWorker/testMultipleTasks/ :
{code}
2013-11-30 01:13:23,022 INFO  [pool-1-thread-1] hbase.ResourceChecker(147): before: regionserver.TestSplitLogWorker#testMultipleTasks Thread=16, OpenFileDescriptor=157, MaxFileDescriptor=40000, SystemLoadAverage=338, ProcessCount=144, AvailableMemoryMB=1474, ConnectionCount=0
2013-11-30 01:13:23,026 INFO  [pool-1-thread-1] zookeeper.MiniZooKeeperCluster(200): Started MiniZK Cluster and connect 1 ZK server on client port: 53800
2013-11-30 01:13:23,029 INFO  [pool-1-thread-1] zookeeper.RecoverableZooKeeper(120): Process identifier=split-log-worker-tests connecting to ZooKeeper ensemble=localhost:53800
2013-11-30 01:13:23,249 DEBUG [pool-1-thread-1-EventThread] zookeeper.ZooKeeperWatcher(310): split-log-worker-tests, quorum=localhost:53800, baseZNode=/hbase Received ZooKeeper Event, type=None, state=SyncConnected, path=null
2013-11-30 01:13:23,251 DEBUG [pool-1-thread-1-EventThread] zookeeper.ZooKeeperWatcher(387): split-log-worker-tests-0x142a69133500000 connected
2013-11-30 01:13:23,261 DEBUG [pool-1-thread-1] regionserver.TestSplitLogWorker(105): /hbase created
2013-11-30 01:13:23,270 DEBUG [pool-1-thread-1] regionserver.TestSplitLogWorker(108): /hbase/splitWAL created
2013-11-30 01:13:23,278 DEBUG [pool-1-thread-1] executor.ExecutorService(99): Starting executor service name=RS_LOG_REPLAY_OPS-TestSplitLogWorker, corePoolSize=10, maxPoolSize=10
2013-11-30 01:13:23,278 INFO  [pool-1-thread-1] regionserver.TestSplitLogWorker(246): testMultipleTasks
2013-11-30 01:13:23,280 INFO  [SplitLogWorker-tmt_svr,1,1] regionserver.SplitLogWorker(175): SplitLogWorker tmt_svr,1,1 starting
2013-11-30 01:13:23,380 INFO  [pool-1-thread-1] hbase.Waiter(174): Waiting up to [1,500] milli-secs(wait.for.ratio=[1])
2013-11-30 01:13:23,394 DEBUG [pool-1-thread-1-EventThread] zookeeper.ZooKeeperWatcher(310): split-log-worker-tests-0x142a69133500000, quorum=localhost:53800, baseZNode=/hbase Received ZooKeeper Event, type=NodeChildrenChanged, state=SyncConnected, path=/hbase/splitWAL
2013-11-30 01:13:23,394 DEBUG [pool-1-thread-1-EventThread] regionserver.SplitLogWorker(595): tasks arrived or departed
2013-11-30 01:13:23,394 INFO  [pool-1-thread-1] hbase.Waiter(174): Waiting up to [1,500] milli-secs(wait.for.ratio=[1])
2013-11-30 01:13:23,402 INFO  [SplitLogWorker-tmt_svr,1,1] regionserver.SplitLogWorker(363): worker tmt_svr,1,1 acquired task /hbase/splitWAL/tmt_task
2013-11-30 01:13:23,410 DEBUG [pool-1-thread-1-EventThread] zookeeper.ZooKeeperWatcher(310): split-log-worker-tests-0x142a69133500000, quorum=localhost:53800, baseZNode=/hbase Received ZooKeeper Event, type=NodeChildrenChanged, state=SyncConnected, path=/hbase/splitWAL
2013-11-30 01:13:23,410 DEBUG [pool-1-thread-1-EventThread] regionserver.SplitLogWorker(595): tasks arrived or departed
2013-11-30 01:13:23,418 DEBUG [pool-1-thread-1-EventThread] zookeeper.ZooKeeperWatcher(310): split-log-worker-tests-0x142a69133500000, quorum=localhost:53800, baseZNode=/hbase Received ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, path=/hbase/splitWAL/tmt_task
2013-11-30 01:13:23,419 INFO  [pool-1-thread-1] hbase.Waiter(174): Waiting up to [1,500] milli-secs(wait.for.ratio=[1])
2013-11-30 01:13:23,420 INFO  [pool-1-thread-1-EventThread] regionserver.SplitLogWorker(522): task /hbase/splitWAL/tmt_task preempted from tmt_svr,1,1, current task state and owner=OWNED another-worker,1,1
2013-11-30 01:13:23,420 INFO  [pool-1-thread-1-EventThread] regionserver.SplitLogWorker(608): Sending interrupt to stop the worker thread
2013-11-30 01:13:23,420 WARN  [SplitLogWorker-tmt_svr,1,1] regionserver.SplitLogWorker(374): Interrupted while yielding for other region servers
java.lang.InterruptedException: sleep interrupted
	at java.lang.Thread.sleep(Native Method)
	at org.apache.hadoop.hbase.regionserver.SplitLogWorker.grabTask(SplitLogWorker.java:372)
	at org.apache.hadoop.hbase.regionserver.SplitLogWorker.taskLoop(SplitLogWorker.java:251)
	at org.apache.hadoop.hbase.regionserver.SplitLogWorker.run(SplitLogWorker.java:209)
	at java.lang.Thread.run(Thread.java:662)
2013-11-30 01:13:23,427 INFO  [SplitLogWorker-tmt_svr,1,1] regionserver.SplitLogWorker(363): worker tmt_svr,1,1 acquired task /hbase/splitWAL/tmt_task_2
2013-11-30 01:13:24,331 DEBUG [SplitLogWorker-tmt_svr,1,1] regionserver.SplitLogWorker(253): Current region server tmt_svr,1,1 has 2 tasks in progress and can't take more.
2013-11-30 01:13:24,921 INFO  [pool-1-thread-1] regionserver.SplitLogWorker(608): Sending interrupt to stop the worker thread
2013-11-30 01:13:24,921 INFO  [SplitLogWorker-tmt_svr,1,1] regionserver.SplitLogWorker(299): SplitLogWorker interrupted while waiting for task, exiting: java.lang.InterruptedException
2013-11-30 01:13:24,921 INFO  [SplitLogWorker-tmt_svr,1,1] regionserver.SplitLogWorker(216): SplitLogWorker tmt_svr,1,1 exiting
2013-11-30 01:13:24,922 WARN  [RS_LOG_REPLAY_OPS-TestSplitLogWorker-0] regionserver.SplitLogWorker(426): Interrupted while trying to assert ownership of /hbase/splitWAL/tmt_task java.lang.InterruptedException
	at java.lang.Object.wait(Native Method)
	at java.lang.Object.wait(Object.java:485)
	at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309)
	at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1264)
	at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:407)
	at org.apache.hadoop.hbase.regionserver.SplitLogWorker.attemptToOwnTask(SplitLogWorker.java:406)
	at org.apache.hadoop.hbase.regionserver.SplitLogWorker$2.progress(SplitLogWorker.java:474)
	at org.apache.hadoop.hbase.regionserver.TestSplitLogWorker$2.exec(TestSplitLogWorker.java:135)
	at org.apache.hadoop.hbase.regionserver.handler.HLogSplitterHandler.process(HLogSplitterHandler.java:79)
	at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
	at java.lang.Thread.run(Thread.java:662)

2013-11-30 01:13:24,922 WARN  [RS_LOG_REPLAY_OPS-TestSplitLogWorker-1] regionserver.SplitLogWorker(426): Interrupted while trying to assert ownership of /hbase/splitWAL/tmt_task_2 java.lang.InterruptedException
	at java.lang.Object.wait(Native Method)
	at java.lang.Object.wait(Object.java:485)
	at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309)
	at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1264)
	at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:407)
	at org.apache.hadoop.hbase.regionserver.SplitLogWorker.attemptToOwnTask(SplitLogWorker.java:406)
	at org.apache.hadoop.hbase.regionserver.SplitLogWorker$2.progress(SplitLogWorker.java:474)
	at org.apache.hadoop.hbase.regionserver.TestSplitLogWorker$2.exec(TestSplitLogWorker.java:135)
	at org.apache.hadoop.hbase.regionserver.handler.HLogSplitterHandler.process(HLogSplitterHandler.java:79)
	at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
	at java.lang.Thread.run(Thread.java:662)

2013-11-30 01:13:24,922 WARN  [RS_LOG_REPLAY_OPS-TestSplitLogWorker-0] regionserver.SplitLogWorker$2(477): Failed to heartbeat the task/hbase/splitWAL/tmt_task
2013-11-30 01:13:24,923 WARN  [RS_LOG_REPLAY_OPS-TestSplitLogWorker-1] regionserver.SplitLogWorker$2(477): Failed to heartbeat the task/hbase/splitWAL/tmt_task_2
2013-11-30 01:13:24,923 WARN  [RS_LOG_REPLAY_OPS-TestSplitLogWorker-0] handler.HLogSplitterHandler(87): task execution prempted tmt_task
2013-11-30 01:13:24,923 WARN  [RS_LOG_REPLAY_OPS-TestSplitLogWorker-1] handler.HLogSplitterHandler(87): task execution prempted tmt_task_2
2013-11-30 01:13:24,923 INFO  [RS_LOG_REPLAY_OPS-TestSplitLogWorker-0] handler.HLogSplitterHandler(107): worker tmt_svr,1,1 done with task /hbase/splitWAL/tmt_task in 1520ms
2013-11-30 01:13:24,924 INFO  [RS_LOG_REPLAY_OPS-TestSplitLogWorker-1] handler.HLogSplitterHandler(107): worker tmt_svr,1,1 done with task /hbase/splitWAL/tmt_task_2 in 1497ms
2013-11-30 01:13:24,951 ERROR [SyncThread:0] server.NIOServerCnxn(180): Unexpected Exception: 
java.nio.channels.CancelledKeyException
	at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
	at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
	at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:153)
	at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1076)
	at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:404)
	at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:167)
	at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:101)
2013-11-30 01:13:24,952 INFO  [pool-1-thread-1] zookeeper.MiniZooKeeperCluster(249): Shutdown MiniZK cluster with all ZK servers
2013-11-30 01:13:24,966 INFO  [pool-1-thread-1] hbase.ResourceChecker(171): after: regionserver.TestSplitLogWorker#testMultipleTasks Thread=18 (was 16)
Potentially hanging thread: pool-1-thread-1-SendThread(localhost:53800)
	java.lang.Thread.sleep(Native Method)
	org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(ClientCnxnSocketNIO.java:219)
	org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1157)
	org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1097)
 - Thread LEAK? -, OpenFileDescriptor=163 (was 157) - OpenFileDescriptor LEAK? -, MaxFileDescriptor=40000 (was 40000), SystemLoadAverage=327 (was 338), 
{code}
'Interrupted while trying to assert ownership' doesn't show up in a green run.



--
This message was sent by Atlassian JIRA
(v6.1#6144)