You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "Jeffrey Zhong (JIRA)" <ji...@apache.org> on 2013/12/07 01:52:35 UTC

[jira] [Updated] (HBASE-10059) TestSplitLogWorker#testMultipleTasks fails occasionally

     [ https://issues.apache.org/jira/browse/HBASE-10059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeffrey Zhong updated HBASE-10059:
----------------------------------

    Attachment: hbase-10059.patch

Increased the timeout from 1.5 secs to 5 secs.

> TestSplitLogWorker#testMultipleTasks fails occasionally
> -------------------------------------------------------
>
>                 Key: HBASE-10059
>                 URL: https://issues.apache.org/jira/browse/HBASE-10059
>             Project: HBase
>          Issue Type: Test
>            Reporter: Ted Yu
>            Assignee: Jeffrey Zhong
>         Attachments: hbase-10059.patch
>
>
> From https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/857/testReport/junit/org.apache.hadoop.hbase.regionserver/TestSplitLogWorker/testMultipleTasks/ :
> {code}
> 2013-11-30 01:13:23,022 INFO  [pool-1-thread-1] hbase.ResourceChecker(147): before: regionserver.TestSplitLogWorker#testMultipleTasks Thread=16, OpenFileDescriptor=157, MaxFileDescriptor=40000, SystemLoadAverage=338, ProcessCount=144, AvailableMemoryMB=1474, ConnectionCount=0
> 2013-11-30 01:13:23,026 INFO  [pool-1-thread-1] zookeeper.MiniZooKeeperCluster(200): Started MiniZK Cluster and connect 1 ZK server on client port: 53800
> 2013-11-30 01:13:23,029 INFO  [pool-1-thread-1] zookeeper.RecoverableZooKeeper(120): Process identifier=split-log-worker-tests connecting to ZooKeeper ensemble=localhost:53800
> 2013-11-30 01:13:23,249 DEBUG [pool-1-thread-1-EventThread] zookeeper.ZooKeeperWatcher(310): split-log-worker-tests, quorum=localhost:53800, baseZNode=/hbase Received ZooKeeper Event, type=None, state=SyncConnected, path=null
> 2013-11-30 01:13:23,251 DEBUG [pool-1-thread-1-EventThread] zookeeper.ZooKeeperWatcher(387): split-log-worker-tests-0x142a69133500000 connected
> 2013-11-30 01:13:23,261 DEBUG [pool-1-thread-1] regionserver.TestSplitLogWorker(105): /hbase created
> 2013-11-30 01:13:23,270 DEBUG [pool-1-thread-1] regionserver.TestSplitLogWorker(108): /hbase/splitWAL created
> 2013-11-30 01:13:23,278 DEBUG [pool-1-thread-1] executor.ExecutorService(99): Starting executor service name=RS_LOG_REPLAY_OPS-TestSplitLogWorker, corePoolSize=10, maxPoolSize=10
> 2013-11-30 01:13:23,278 INFO  [pool-1-thread-1] regionserver.TestSplitLogWorker(246): testMultipleTasks
> 2013-11-30 01:13:23,280 INFO  [SplitLogWorker-tmt_svr,1,1] regionserver.SplitLogWorker(175): SplitLogWorker tmt_svr,1,1 starting
> 2013-11-30 01:13:23,380 INFO  [pool-1-thread-1] hbase.Waiter(174): Waiting up to [1,500] milli-secs(wait.for.ratio=[1])
> 2013-11-30 01:13:23,394 DEBUG [pool-1-thread-1-EventThread] zookeeper.ZooKeeperWatcher(310): split-log-worker-tests-0x142a69133500000, quorum=localhost:53800, baseZNode=/hbase Received ZooKeeper Event, type=NodeChildrenChanged, state=SyncConnected, path=/hbase/splitWAL
> 2013-11-30 01:13:23,394 DEBUG [pool-1-thread-1-EventThread] regionserver.SplitLogWorker(595): tasks arrived or departed
> 2013-11-30 01:13:23,394 INFO  [pool-1-thread-1] hbase.Waiter(174): Waiting up to [1,500] milli-secs(wait.for.ratio=[1])
> 2013-11-30 01:13:23,402 INFO  [SplitLogWorker-tmt_svr,1,1] regionserver.SplitLogWorker(363): worker tmt_svr,1,1 acquired task /hbase/splitWAL/tmt_task
> 2013-11-30 01:13:23,410 DEBUG [pool-1-thread-1-EventThread] zookeeper.ZooKeeperWatcher(310): split-log-worker-tests-0x142a69133500000, quorum=localhost:53800, baseZNode=/hbase Received ZooKeeper Event, type=NodeChildrenChanged, state=SyncConnected, path=/hbase/splitWAL
> 2013-11-30 01:13:23,410 DEBUG [pool-1-thread-1-EventThread] regionserver.SplitLogWorker(595): tasks arrived or departed
> 2013-11-30 01:13:23,418 DEBUG [pool-1-thread-1-EventThread] zookeeper.ZooKeeperWatcher(310): split-log-worker-tests-0x142a69133500000, quorum=localhost:53800, baseZNode=/hbase Received ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, path=/hbase/splitWAL/tmt_task
> 2013-11-30 01:13:23,419 INFO  [pool-1-thread-1] hbase.Waiter(174): Waiting up to [1,500] milli-secs(wait.for.ratio=[1])
> 2013-11-30 01:13:23,420 INFO  [pool-1-thread-1-EventThread] regionserver.SplitLogWorker(522): task /hbase/splitWAL/tmt_task preempted from tmt_svr,1,1, current task state and owner=OWNED another-worker,1,1
> 2013-11-30 01:13:23,420 INFO  [pool-1-thread-1-EventThread] regionserver.SplitLogWorker(608): Sending interrupt to stop the worker thread
> 2013-11-30 01:13:23,420 WARN  [SplitLogWorker-tmt_svr,1,1] regionserver.SplitLogWorker(374): Interrupted while yielding for other region servers
> java.lang.InterruptedException: sleep interrupted
> 	at java.lang.Thread.sleep(Native Method)
> 	at org.apache.hadoop.hbase.regionserver.SplitLogWorker.grabTask(SplitLogWorker.java:372)
> 	at org.apache.hadoop.hbase.regionserver.SplitLogWorker.taskLoop(SplitLogWorker.java:251)
> 	at org.apache.hadoop.hbase.regionserver.SplitLogWorker.run(SplitLogWorker.java:209)
> 	at java.lang.Thread.run(Thread.java:662)
> 2013-11-30 01:13:23,427 INFO  [SplitLogWorker-tmt_svr,1,1] regionserver.SplitLogWorker(363): worker tmt_svr,1,1 acquired task /hbase/splitWAL/tmt_task_2
> 2013-11-30 01:13:24,331 DEBUG [SplitLogWorker-tmt_svr,1,1] regionserver.SplitLogWorker(253): Current region server tmt_svr,1,1 has 2 tasks in progress and can't take more.
> 2013-11-30 01:13:24,921 INFO  [pool-1-thread-1] regionserver.SplitLogWorker(608): Sending interrupt to stop the worker thread
> 2013-11-30 01:13:24,921 INFO  [SplitLogWorker-tmt_svr,1,1] regionserver.SplitLogWorker(299): SplitLogWorker interrupted while waiting for task, exiting: java.lang.InterruptedException
> 2013-11-30 01:13:24,921 INFO  [SplitLogWorker-tmt_svr,1,1] regionserver.SplitLogWorker(216): SplitLogWorker tmt_svr,1,1 exiting
> 2013-11-30 01:13:24,922 WARN  [RS_LOG_REPLAY_OPS-TestSplitLogWorker-0] regionserver.SplitLogWorker(426): Interrupted while trying to assert ownership of /hbase/splitWAL/tmt_task java.lang.InterruptedException
> 	at java.lang.Object.wait(Native Method)
> 	at java.lang.Object.wait(Object.java:485)
> 	at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309)
> 	at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1264)
> 	at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:407)
> 	at org.apache.hadoop.hbase.regionserver.SplitLogWorker.attemptToOwnTask(SplitLogWorker.java:406)
> 	at org.apache.hadoop.hbase.regionserver.SplitLogWorker$2.progress(SplitLogWorker.java:474)
> 	at org.apache.hadoop.hbase.regionserver.TestSplitLogWorker$2.exec(TestSplitLogWorker.java:135)
> 	at org.apache.hadoop.hbase.regionserver.handler.HLogSplitterHandler.process(HLogSplitterHandler.java:79)
> 	at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> 	at java.lang.Thread.run(Thread.java:662)
> 2013-11-30 01:13:24,922 WARN  [RS_LOG_REPLAY_OPS-TestSplitLogWorker-1] regionserver.SplitLogWorker(426): Interrupted while trying to assert ownership of /hbase/splitWAL/tmt_task_2 java.lang.InterruptedException
> 	at java.lang.Object.wait(Native Method)
> 	at java.lang.Object.wait(Object.java:485)
> 	at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309)
> 	at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1264)
> 	at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:407)
> 	at org.apache.hadoop.hbase.regionserver.SplitLogWorker.attemptToOwnTask(SplitLogWorker.java:406)
> 	at org.apache.hadoop.hbase.regionserver.SplitLogWorker$2.progress(SplitLogWorker.java:474)
> 	at org.apache.hadoop.hbase.regionserver.TestSplitLogWorker$2.exec(TestSplitLogWorker.java:135)
> 	at org.apache.hadoop.hbase.regionserver.handler.HLogSplitterHandler.process(HLogSplitterHandler.java:79)
> 	at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> 	at java.lang.Thread.run(Thread.java:662)
> 2013-11-30 01:13:24,922 WARN  [RS_LOG_REPLAY_OPS-TestSplitLogWorker-0] regionserver.SplitLogWorker$2(477): Failed to heartbeat the task/hbase/splitWAL/tmt_task
> 2013-11-30 01:13:24,923 WARN  [RS_LOG_REPLAY_OPS-TestSplitLogWorker-1] regionserver.SplitLogWorker$2(477): Failed to heartbeat the task/hbase/splitWAL/tmt_task_2
> 2013-11-30 01:13:24,923 WARN  [RS_LOG_REPLAY_OPS-TestSplitLogWorker-0] handler.HLogSplitterHandler(87): task execution prempted tmt_task
> 2013-11-30 01:13:24,923 WARN  [RS_LOG_REPLAY_OPS-TestSplitLogWorker-1] handler.HLogSplitterHandler(87): task execution prempted tmt_task_2
> 2013-11-30 01:13:24,923 INFO  [RS_LOG_REPLAY_OPS-TestSplitLogWorker-0] handler.HLogSplitterHandler(107): worker tmt_svr,1,1 done with task /hbase/splitWAL/tmt_task in 1520ms
> 2013-11-30 01:13:24,924 INFO  [RS_LOG_REPLAY_OPS-TestSplitLogWorker-1] handler.HLogSplitterHandler(107): worker tmt_svr,1,1 done with task /hbase/splitWAL/tmt_task_2 in 1497ms
> 2013-11-30 01:13:24,951 ERROR [SyncThread:0] server.NIOServerCnxn(180): Unexpected Exception: 
> java.nio.channels.CancelledKeyException
> 	at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
> 	at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
> 	at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:153)
> 	at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1076)
> 	at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:404)
> 	at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:167)
> 	at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:101)
> 2013-11-30 01:13:24,952 INFO  [pool-1-thread-1] zookeeper.MiniZooKeeperCluster(249): Shutdown MiniZK cluster with all ZK servers
> 2013-11-30 01:13:24,966 INFO  [pool-1-thread-1] hbase.ResourceChecker(171): after: regionserver.TestSplitLogWorker#testMultipleTasks Thread=18 (was 16)
> Potentially hanging thread: pool-1-thread-1-SendThread(localhost:53800)
> 	java.lang.Thread.sleep(Native Method)
> 	org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(ClientCnxnSocketNIO.java:219)
> 	org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1157)
> 	org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1097)
>  - Thread LEAK? -, OpenFileDescriptor=163 (was 157) - OpenFileDescriptor LEAK? -, MaxFileDescriptor=40000 (was 40000), SystemLoadAverage=327 (was 338), 
> {code}
> 'Interrupted while trying to assert ownership' doesn't show up in a green run.



--
This message was sent by Atlassian JIRA
(v6.1#6144)