Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2015/08/21 19:47:46 UTC
[jira] [Created] (HBASE-14284) In TRUNK, AsyncRpcClient does not timeout; hangs TestDistributedLogReplay, etc.
stack created HBASE-14284:
-----------------------------
Summary: In TRUNK, AsyncRpcClient does not timeout; hangs TestDistributedLogReplay, etc.
Key: HBASE-14284
URL: https://issues.apache.org/jira/browse/HBASE-14284
Project: HBase
Issue Type: Bug
Reporter: stack
Assignee: stack
TestDistributedLogReplay (TDLR) puts up regionservers with *40* priority handlers each, so the test runs with many hundreds of threads. Trying to figure out why 40, I see that with fewer handlers the test can hang: all client calls get stuck and never time out:
{code}
"RS:2;localhost:58498" prio=5 tid=0x00007fd284d4e800 nid=0x416af in Object.wait() [0x000000012952e000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:461)
at io.netty.util.concurrent.DefaultPromise.await0(DefaultPromise.java:355)
- locked <0x00000007dff93ea0> (a org.apache.hadoop.hbase.ipc.AsyncCall)
at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:266)
at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:42)
at org.apache.hadoop.hbase.ipc.AsyncRpcClient.call(AsyncRpcClient.java:231)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:214)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:288)
at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerReport(RegionServerStatusProtos.java:8994)
at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:1148)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:957)
at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.runRegionServer(MiniHBaseCluster.java:156)
at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.access$000(MiniHBaseCluster.java:108)
at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer$1.run(MiniHBaseCluster.java:140)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:356)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
at org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:279)
at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.run(MiniHBaseCluster.java:138)
at java.lang.Thread.run(Thread.java:744)
{code}
We never recover.
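
For illustration only, here is a minimal sketch of the general pattern for bounding a blocking wait on a future so that a caller cannot hang forever. It is not the AsyncRpcClient code and not a proposed patch: it uses the JDK's java.util.concurrent types rather than Netty's promise classes, and the class name BoundedWaitSketch, the helper awaitWithTimeout, and the 100 ms timeout are all hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedWaitSketch {

    /**
     * Hypothetical helper: waits for the reply for at most timeoutMs.
     * On timeout the pending call is cancelled and the TimeoutException
     * is rethrown so the caller can retry or fail instead of hanging.
     */
    static <T> T awaitWithTimeout(Future<T> reply, long timeoutMs)
            throws InterruptedException, ExecutionException, TimeoutException {
        try {
            return reply.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            reply.cancel(true); // give up on the call rather than leave it pending
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for an RPC whose response never arrives.
        CompletableFuture<String> neverAnswered = new CompletableFuture<>();
        try {
            awaitWithTimeout(neverAnswered, 100); // assumed timeout, in milliseconds
            System.out.println("unexpected reply");
        } catch (TimeoutException e) {
            System.out.println("timed out; cancelled=" + neverAnswered.isCancelled());
        }
    }
}
```

With a bound like this, a thread such as the one in the dump above would surface a timeout to its caller rather than stay parked in the wait.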