Posted to issues@hbase.apache.org by "Xiaolin Ha (Jira)" <ji...@apache.org> on 2021/07/30 07:45:00 UTC
[jira] [Created] (HBASE-26155) JVM crash when rpc calls close scanner

Xiaolin Ha created HBASE-26155:
----------------------------------

             Summary: JVM crash when rpc calls close scanner
                 Key: HBASE-26155
                 URL: https://issues.apache.org/jira/browse/HBASE-26155
             Project: HBase
          Issue Type: Bug
          Components: Scanners
    Affects Versions: 3.0.0-alpha-1
            Reporter: Xiaolin Ha


Closing scanners has caused RegionServer JVM crashes (core dumps) on our production clusters.

{code:java}
Stack: [0x00007fca4b0cc000,0x00007fca4b1cd000],  sp=0x00007fca4b1cb0d8,  free space=1020k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x7fd314]
J 2810  sun.misc.Unsafe.copyMemory(Ljava/lang/Object;JLjava/lang/Object;JJ)V (0 bytes) @ 0x00007fdae55a9e61 [0x00007fdae55a9d80+0xe1]
j  org.apache.hadoop.hbase.util.UnsafeAccess.unsafeCopy(Ljava/lang/Object;JLjava/lang/Object;JJ)V+36
j  org.apache.hadoop.hbase.util.UnsafeAccess.copy(Ljava/nio/ByteBuffer;I[BII)V+69
j  org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray([BLjava/nio/ByteBuffer;III)V+39
j  org.apache.hadoop.hbase.CellUtil.copyQualifierTo(Lorg/apache/hadoop/hbase/Cell;[BI)I+31
j  org.apache.hadoop.hbase.KeyValueUtil.appendKeyTo(Lorg/apache/hadoop/hbase/Cell;[BI)I+43
J 14724 C2 org.apache.hadoop.hbase.regionserver.StoreScanner.shipped()V (51 bytes) @ 0x00007fdae6a298d0 [0x00007fdae6a29780+0x150]
J 21387 C2 org.apache.hadoop.hbase.regionserver.RSRpcServices$RegionScannerShippedCallBack.run()V (53 bytes) @ 0x00007fdae622bab8 [0x00007fdae622acc0+0xdf8]
J 26353 C2 org.apache.hadoop.hbase.ipc.ServerCall.setResponse(Lorg/apache/hbase/thirdparty/com/google/protobuf/Message;Lorg/apache/hadoop/hbase/CellScanner;Ljava/lang/Throwable;Ljava/lang/String;)V (384 bytes) @ 0x00007fdae7f139d8 [0x00007fdae7f12980+0x1058]
J 26226 C2 org.apache.hadoop.hbase.ipc.CallRunner.run()V (1554 bytes) @ 0x00007fdae959f68c [0x00007fdae959e400+0x128c]
J 19598% C2 org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(Ljava/util/concurrent/BlockingQueue;Ljava/util/concurrent/atomic/AtomicInteger;)V (338 bytes) @ 0x00007fdae81c54d4 [0x00007fdae81c53e0+0xf4]
{code}

There is no guarantee that each RPC call holds a unique scanner, right?
For example, when a client disconnects, the RS may not stop the scanner's next calls until it checks the `rpcCall.disconnectSince()` time. Before that check happens, another scan RPC may use the same scanner, which is held in the RS cache by RegionScannerHolder. The two calls then change the scanner's `previousCell` from different threads...
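
To make the suspected race concrete, here is a minimal sketch of one way to keep two RPC handler threads from touching the same cached scanner at once. It is not HBase code and not a proposed patch: the class `ExclusiveScannerSketch`, its methods, and the `byte[]` stand-in for `previousCell` are all hypothetical names used only for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a cached server-side scanner.
// This is NOT HBase's RegionScannerHolder or StoreScanner.
final class ExclusiveScannerSketch {
    // True while some RPC handler thread is using the scanner.
    private final AtomicBoolean inUse = new AtomicBoolean(false);

    // Stand-in for per-scanner state such as previousCell, which is
    // only safe to modify from one thread at a time.
    private byte[] previousCell;

    /** Returns true if the caller now owns the scanner, false if it is busy. */
    boolean tryAcquire() {
        return inUse.compareAndSet(false, true);
    }

    /** Gives up ownership so that a later call can use the scanner. */
    void release() {
        inUse.set(false);
    }

    /**
     * Simulates one "next" call. If another call already owns the scanner,
     * the state is left untouched and false is returned.
     */
    boolean next(byte[] cell) {
        if (!tryAcquire()) {
            return false;
        }
        try {
            previousCell = cell.clone();
            return true;
        } finally {
            release();
        }
    }
}
```

With a guard like this, a second scan RPC that arrives while the first is still running would be rejected (or could be made to wait) instead of modifying `previousCell` concurrently. Whether rejecting, waiting, or something else is the right behavior for the RS is an open question here.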

--
This message was sent by Atlassian Jira
(v8.3.4#803005)