Posted to issues@hbase.apache.org by "Michael Stack (Jira)" <ji...@apache.org> on 2020/08/18 14:59:00 UTC

[jira] [Commented] (HBASE-24896) 'Stuck' creating RegionInfo instance

    [ https://issues.apache.org/jira/browse/HBASE-24896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17179686#comment-17179686 ] 

Michael Stack commented on HBASE-24896:
---------------------------------------

Attached thread dumps. Oddly, all threads are in the RUNNABLE state; none are BLOCKED. Also odd to me at the moment: we are stuck doing a get on a Map whose key is a String, yet the thread dump shows us inside the static initializer (<clinit>) of an interface (RegionInfo).
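
One pattern that would be consistent with these dumps, though nothing here confirms it as the cause, is a class-initialization cycle: one thread is inside the static initializer of one type and needs a second type, while another thread is inside the second type's initializer and needs the first. The sketch below reproduces that shape in isolation. The names Left, Right, and BOTH_STARTED are hypothetical stand-ins (think RegionInfo and RegionInfoBuilder), and the latch exists only to force the interleaving; it is not taken from HBase.

```java
import java.util.concurrent.CountDownLatch;

final class ClassInitDeadlockSketch {
    // Hypothetical helper: makes sure both initializers have started before
    // either one touches the other class, so the cycle is hit every time.
    static final CountDownLatch BOTH_STARTED = new CountDownLatch(2);

    static void awaitBothStarted() {
        BOTH_STARTED.countDown();
        try {
            BOTH_STARTED.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Stand-in for one side of the cycle: its initializer needs Right.
    static class Left {
        static final int VALUE;
        static {
            awaitBothStarted();
            VALUE = Right.VALUE + 1; // waits for Right to finish initializing
        }
    }

    // Stand-in for the other side: its initializer needs Left.
    static class Right {
        static final int VALUE;
        static {
            awaitBothStarted();
            VALUE = Left.VALUE + 1; // waits for Left to finish initializing
        }
    }

    /** Returns true if both threads are still stuck after waitMillis each. */
    static boolean demonstrate(long waitMillis) {
        Thread left = new Thread(() -> { int ignored = Left.VALUE; }, "init-left");
        Thread right = new Thread(() -> { int ignored = Right.VALUE; }, "init-right");
        // Daemon threads, so the JVM can still exit while they are stuck.
        left.setDaemon(true);
        right.setDaemon(true);
        left.start();
        right.start();
        try {
            left.join(waitMillis);
            right.join(waitMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // Print what the JVM reports for each stuck thread.
        System.out.println(left.getName() + " state: " + left.getState());
        System.out.println(right.getName() + " state: " + right.getState());
        return left.isAlive() && right.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("stuck = " + demonstrate(1000));
    }
}
```

Neither thread in this sketch holds a Java monitor or a java.util.concurrent lock; each is waiting on the JVM's internal class-initialization lock. That may be why a dump of such a process can show the threads as RUNNABLE "in Object.wait()" without flagging a deadlock. Running jstack against the sketch is a cheap way to compare its output with the attached dumps.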

> 'Stuck' creating RegionInfo instance
> ------------------------------------
>
>                 Key: HBASE-24896
>                 URL: https://issues.apache.org/jira/browse/HBASE-24896
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.3.1
>            Reporter: Michael Stack
>            Priority: Major
>         Attachments: hbasedn192-jstack-0.webarchive, hbasedn192-jstack-1.webarchive, hbasedn192-jstack-2.webarchive
>
>
> We ran into the following deadlocked server in testing. The priority handlers seem stuck across multiple thread dumps. Seven of the ten total priority threads have this state:
> {code:java}
> "RpcServer.priority.RWQ.Fifo.read.handler=5,queue=1,port=16020" #82 daemon prio=5 os_prio=0 cpu=0.70ms elapsed=315627.86s allocated=3744B defined_classes=0 tid=0x00007f3da0983040 nid=0x62d9 in Object.wait()  [0x00007f3d9bc8c000]
>    java.lang.Thread.State: RUNNABLE
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3327)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1491)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:3143)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3478)
> 	at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:44858)
> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:393)
> 	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318) {code}
> The anomalous three are as follows:
> h3. #1
> {code:java}
> "RpcServer.priority.RWQ.Fifo.write.handler=0,queue=0,port=16020" #77 daemon prio=5 os_prio=0 cpu=175.98ms elapsed=315627.86s allocated=2153K defined_classes=14 tid=0x00007f3da0ae6ec0 nid=0x62d4 in Object.wait()  [0x00007f3d9c190000]
>    java.lang.Thread.State: RUNNABLE
> 	at org.apache.hadoop.hbase.client.RegionInfo.<clinit>(RegionInfo.java:72)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3327)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1491)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.mutate(RSRpcServices.java:2912)
> 	at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:44856)
> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:393)
> 	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318){code}
> ...which is the creation of the UNDEFINED in RegionInfo here:
> @InterfaceAudience.Public
> public interface RegionInfo extends Comparable<RegionInfo> {
>  RegionInfo UNDEFINED = RegionInfoBuilder.newBuilder(TableName.valueOf("__UNDEFINED__")).build();
>  
> h3. #2
> {code:java}
> "RpcServer.priority.RWQ.Fifo.read.handler=4,queue=1,port=16020" #81 daemon prio=5 os_prio=0 cpu=53.85ms elapsed=315627.86s allocated=81984B defined_classes=3 tid=0x00007f3da0981590 nid=0x62d8 in Object.wait()  [0x00007f3d9bd8c000]
>    java.lang.Thread.State: RUNNABLE
> 	at org.apache.hadoop.hbase.client.RegionInfoBuilder.<clinit>(RegionInfoBuilder.java:49)
> 	at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toRegionInfo(ProtobufUtil.java:3231)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.executeOpenRegionProcedures(RSRpcServices.java:3755)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.lambda$executeProcedures$2(RSRpcServices.java:3827)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices$$Lambda$173/0x00000017c0e40040.accept(Unknown Source)
> 	at java.util.ArrayList.forEach(java.base@11.0.6/ArrayList.java:1540)
> 	at java.util.Collections$UnmodifiableCollection.forEach(java.base@11.0.6/Collections.java:1085)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.executeProcedures(RSRpcServices.java:3827)
> 	at org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:34896)
> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:393)
> 	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318) {code}
> ...which is here, creating the meta RegionInfo (FIRST_META_REGIONINFO):
>  
> public static final RegionInfo FIRST_META_REGIONINFO =
>  new MutableRegionInfo(1L, TableName.META_TABLE_NAME, RegionInfo.DEFAULT_REPLICA_ID);
>  
> h3. #3
> {code:java}
> "RpcServer.priority.RWQ.Fifo.read.handler=8,queue=1,port=16020" #85 daemon prio=5 os_prio=0 cpu=0.50ms elapsed=315627.85s allocated=1960B defined_classes=0 tid=0x00007f3da0d851d0 nid=0x62dc in Object.wait()  [0x00007f3d9b989000]
>    java.lang.Thread.State: RUNNABLE
> 	at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toRegionInfo(ProtobufUtil.java:3231)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.executeOpenRegionProcedures(RSRpcServices.java:3755)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.lambda$executeProcedures$2(RSRpcServices.java:3827)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices$$Lambda$173/0x00000017c0e40040.accept(Unknown Source)
> 	at java.util.ArrayList.forEach(java.base@11.0.6/ArrayList.java:1540)
> 	at java.util.Collections$UnmodifiableCollection.forEach(java.base@11.0.6/Collections.java:1085)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.executeProcedures(RSRpcServices.java:3827)
> 	at org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:34896)
> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:393)
> 	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
>  {code}
> ... which is here in code
> if (tableName.equals(TableName.META_TABLE_NAME) && replicaId == defaultReplicaId) {
>  return RegionInfoBuilder.FIRST_META_REGIONINFO;
>  }
>  
> The thread dump does not seem to recognize the above as a deadlock.
>  
> ...at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3327) is doing the below:
> return this.onlineRegions.get(encodedRegionName);
> ... where onlineRegions is a concurrent Map of String to HRegion.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)