Posted to issues@hbase.apache.org by "Andrew Kyle Purtell (Jira)" <ji...@apache.org> on 2021/05/19 17:24:00 UTC

[jira] [Comment Edited] (HBASE-24623) SIGSEGV v ~StubRoutines::jbyte_disjoint_arraycopy

    [ https://issues.apache.org/jira/browse/HBASE-24623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17347793#comment-17347793 ] 

Andrew Kyle Purtell edited comment on HBASE-24623 at 5/19/21, 5:23 PM:
-----------------------------------------------------------------------

{quote}There was a jira that [~andrew.purtell@gmail.com] did for turning off the usage of Unsafe.
{quote}
We did this for the client side. The thought was that our application server, running on Java 11 and embedding the HBase client, did not need to use Unsafe there, so Unsafe was an unnecessary risk (for our application). Well, it turns out I was wrong: even on the client side we need Unsafe for performance. As soon as we tried it, we were dinged for a significant performance regression.

Turning off Unsafe on the server would be a nonstarter, in terms of performance loss.
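
For illustration only, here is a minimal sketch of what a copy without Unsafe could look like, using only public java.nio calls. This is not the HBase implementation: the class name SafeBufferCopy is hypothetical, and the parameter names are assumptions inferred from the descriptor ([BLjava/nio/ByteBuffer;III)V in the stack traces. The point is that an out-of-range request on this path surfaces as a Java exception rather than a fault in a VM stub.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not the HBase code: a bounds-checked copy from a
// ByteBuffer into a byte[] that relies only on public java.nio APIs.
final class SafeBufferCopy {
  private SafeBufferCopy() {}

  /**
   * Copies length bytes from in, starting at absolute index sourceOffset,
   * into out starting at destinationOffset. The position of in is unchanged.
   */
  static void copyFromBufferToArray(byte[] out, ByteBuffer in, int sourceOffset,
      int destinationOffset, int length) {
    // Explicit range checks so a bad request fails with an exception.
    if (sourceOffset < 0 || destinationOffset < 0 || length < 0
        || length > in.limit() - sourceOffset
        || length > out.length - destinationOffset) {
      throw new IndexOutOfBoundsException("sourceOffset=" + sourceOffset
          + ", destinationOffset=" + destinationOffset + ", length=" + length);
    }
    // Work on a duplicate so the caller's position and limit are untouched.
    ByteBuffer dup = in.duplicate();
    dup.position(sourceOffset);
    dup.get(out, destinationOffset, length);
  }

  public static void main(String[] args) {
    ByteBuffer in = ByteBuffer.allocateDirect(5);
    in.put(new byte[] {1, 2, 3, 4, 5});
    in.flip();
    byte[] out = new byte[3];
    copyFromBufferToArray(out, in, 1, 0, 3);
    System.out.println(java.util.Arrays.toString(out));
  }
}
```

Whether a path like this is fast enough is exactly the trade-off described above; the sketch says nothing about performance.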


was (Author: apurtell):
{quote}There was a jira that [~andrew.purtell@gmail.com] did for turning off the usage of Unsafe.
{quote}
We did this for the client side. The thought was our application server, running on Java 11, embedding the HBase client did not need to use Unsafe there, so Unsafe was an unnecessary risk. Well, turns out I was wrong, even on the client side we need Unsafe for performance. As soon as we tried it we were dinged for a significant performance regression.

Turning off Unsafe on the server would be a nonstarter, in terms of performance loss.

> SIGSEGV v  ~StubRoutines::jbyte_disjoint_arraycopy
> --------------------------------------------------
>
>                 Key: HBASE-24623
>                 URL: https://issues.apache.org/jira/browse/HBASE-24623
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.3.0
>            Reporter: Michael Stack
>            Priority: Major
>
> In testing, 1% of a decent cluster went down with this seg fault in the vm:
> {code}
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x00007f6659052410, pid=37208, tid=0x00007f3c89453700
> #
> # JRE version: OpenJDK Runtime Environment (8.0_232-b09) (build 1.8.0_232-b09)
> # Java VM: OpenJDK 64-Bit Server VM (25.232-b09 mixed mode linux-amd64 )
> # Problematic frame:
> # v  ~StubRoutines::jbyte_disjoint_arraycopy
> {code}
> Looking in the hs_err log, the crash happens in the same area. Here are a few of the stack traces:
> {code}
> Stack: [0x00007f3c89353000,0x00007f3c89454000],  sp=0x00007f3c89452110,  free space=1020k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> v  ~StubRoutines::jbyte_disjoint_arraycopy
> J 17674 C2 org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray([BLjava/nio/ByteBuffer;III)V (69 bytes) @ 0x00007f665af000d1 [0x00007f665aefffe0+0xf1]
> J 17732 C1 org.apache.hadoop.hbase.CellUtil.copyQualifierTo(Lorg/apache/hadoop/hbase/Cell;[BI)I (59 bytes) @ 0x00007f665bc440dc [0x00007f665bc43b80+0x55c]
> j  org.apache.hadoop.hbase.CellUtil.cloneQualifier(Lorg/apache/hadoop/hbase/Cell;)[B+12
> J 22278 C2 org.apache.hadoop.hbase.ByteBufferKeyValue.getQualifierArray()[B (5 bytes) @ 0x00007f6659bd4784 [0x00007f6659bd4760+0x24]
> j  org.apache.hadoop.hbase.CellUtil.getCellKeyAsString(Lorg/apache/hadoop/hbase/Cell;Ljava/util/function/Function;)Ljava/lang/String;+97
> j  org.apache.hadoop.hbase.CellUtil.getCellKeyAsString(Lorg/apache/hadoop/hbase/Cell;)Ljava/lang/String;+6
> j  org.apache.hadoop.hbase.CellUtil.toString(Lorg/apache/hadoop/hbase/Cell;Z)Ljava/lang/String;+16
> j  org.apache.hadoop.hbase.ByteBufferKeyValue.toString()Ljava/lang/String;+2
> j  org.apache.hadoop.hbase.client.Mutation.add(Lorg/apache/hadoop/hbase/Cell;)Lorg/apache/hadoop/hbase/client/Mutation;+28
> J 22605 C2 org.apache.hadoop.hbase.client.Put.add(Lorg/apache/hadoop/hbase/Cell;)Lorg/apache/hadoop/hbase/client/Put; (8 bytes) @ 0x00007f665a982a04 [0x00007f665a9829e0+0x24]
> J 22112 C2 org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toPut(Lorg/apache/hadoop/hbase/shaded/protobuf/generated/ClientProtos$MutationProto;Lorg/apache/hadoop/hbase/CellScanner;)Lorg/apache/hadoop/hbase/client/Put; (910 bytes) @ 0x00007f665c706700 [0x00007f665c706000+0x700]
> J 24084 C2 org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(Lorg/apache/hadoop/hbase/shaded/protobuf/generated/ClientProtos$RegionActionResult$Builder;Lorg/apache/hadoop/hbase/regionserver/HRegion;Lorg/apache/hadoop/hbase/quotas/OperationQuota;Ljava/util/List;Lorg/apache/hadoop/hbase/CellScanner;Lorg/apache/hadoop/hbase/quotas/ActivePolicyEnforcement;Z)V (646 bytes) @ 0x00007f665cc21100 [0x00007f665cc20c80+0x480]
> J 14696 C2 org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(Lorg/apache/hadoop/hbase/regionserver/HRegion;Lorg/apache/hadoop/hbase/quotas/OperationQuota;Lorg/apache/hadoop/hbase/shaded/protobuf/generated/ClientProtos$RegionAction;Lorg/apache/hadoop/hbase/CellScanner;Lorg/apache/hadoop/hbase/shaded/protobuf/generated/ClientProtos$RegionActionResult$Builder;Ljava/util/List;JLorg/apache/hadoop/hbase/regionserver/RSRpcServices$RegionScannersCloseCallBack;Lorg/apache/hadoop/hbase/ipc/RpcCallContext;Lorg/apache/hadoop/hbase/quotas/ActivePolicyEnforcement;)Ljava/util/List; (901 bytes) @ 0x00007f665b722148 [0x00007f665b7218e0+0x868]
> {code}
> Here's another:
> {code}
> Stack: [0x00007edd015e2000,0x00007edd016e3000],  sp=0x00007edd016e11b0,  free space=1020k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> v  ~StubRoutines::jbyte_disjoint_arraycopy
> J 18255 C2 org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray([BLjava/nio/ByteBuffer;III)V (69 bytes) @ 0x00007f06d2593551 [0x00007f06d2593460+0xf1]
> j  org.apache.hadoop.hbase.PrivateCellUtil.copyTagsTo(Lorg/apache/hadoop/hbase/Cell;[BI)I+31
> j  org.apache.hadoop.hbase.CellUtil.cloneTags(Lorg/apache/hadoop/hbase/Cell;)[B+12
> j  org.apache.hadoop.hbase.ByteBufferKeyValue.getTagsArray()[B+1
> j  org.apache.hadoop.hbase.CellUtil.toString(Lorg/apache/hadoop/hbase/Cell;Z)Ljava/lang/String;+40
> j  org.apache.hadoop.hbase.ByteBufferKeyValue.toString()Ljava/lang/String;+2
> j  org.apache.hadoop.hbase.client.Mutation.add(Lorg/apache/hadoop/hbase/Cell;)Lorg/apache/hadoop/hbase/client/Mutation;+28
> J 24361 C2 org.apache.hadoop.hbase.client.Put.add(Lorg/apache/hadoop/hbase/Cell;)Lorg/apache/hadoop/hbase/client/Put; (8 bytes) @ 0x00007f06d1c04d04 [0x00007f06d1c04ce0+0x24]
> J 24273 C2 org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toPut(Lorg/apache/hadoop/hbase/shaded/protobuf/generated/ClientProtos$MutationProto;Lorg/apache/hadoop/hbase/CellScanner;)Lorg/apache/hadoop/hbase/client/Put; (910 bytes) @ 0x00007f06d4de48b4 [0x00007f06d4de40e0+0x7d4]
> ...
> {code}
> And here...
> {code}
> Stack: [0x00007f63d89ba000,0x00007f63d8abb000],  sp=0x00007f63d8ab9170,  free space=1020k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> v  ~StubRoutines::jbyte_disjoint_arraycopy
> J 22303 C2 org.apache.hadoop.hbase.ByteBufferKeyValue.getQualifierArray()[B (5 bytes) @ 0x00007f8dac8dc067 [0x00007f8dac8dbae0+0x587]
> j  org.apache.hadoop.hbase.CellUtil.getCellKeyAsString(Lorg/apache/hadoop/hbase/Cell;Ljava/util/function/Function;)Ljava/lang/String;+97
> j  org.apache.hadoop.hbase.CellUtil.getCellKeyAsString(Lorg/apache/hadoop/hbase/Cell;)Ljava/lang/String;+6
> j  org.apache.hadoop.hbase.CellUtil.toString(Lorg/apache/hadoop/hbase/Cell;Z)Ljava/lang/String;+16
> j  org.apache.hadoop.hbase.ByteBufferKeyValue.toString()Ljava/lang/String;+2
> j  org.apache.hadoop.hbase.client.Mutation.add(Lorg/apache/hadoop/hbase/Cell;)Lorg/apache/hadoop/hbase/client/Mutation;+28
> j  org.apache.hadoop.hbase.client.Put.add(Lorg/apache/hadoop/hbase/Cell;)Lorg/apache/hadoop/hbase/client/Put;+2
> ....
> {code}
> It's this bit of code in Mutation, processing a large multi request:
> {code}
>   Mutation add(Cell cell) throws IOException {
>     //Checking that the row of the kv is the same as the mutation
>     // TODO: It is fraught with risk if user pass the wrong row.
>     // Throwing the IllegalArgumentException is more suitable I'd say.
>     if (!CellUtil.matchingRows(cell, this.row)) {
>       throw new WrongRowIOException("The row in " + cell.toString() +
>         " doesn't match the original one " +  Bytes.toStringBinary(this.row));
>     }
> ...
> {code}
> It's the call to 'cell.toString()', seemingly each time.
> Oh, and I can't reproduce it, at least not with basic messing around.
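
In the snippet quoted above, cell.toString() appears only inside the mismatch branch, so by ordinary Java evaluation rules it runs only when the row check fails. A minimal sketch of that evaluation order follows. RowCheckDemo and FakeCell are hypothetical stand-ins, not HBase classes, and IllegalArgumentException stands in for WrongRowIOException.

```java
import java.util.Arrays;

// Hypothetical stand-ins for illustration; these are not HBase classes.
final class RowCheckDemo {
  /** Minimal cell: a row key plus a counter recording toString() calls. */
  static final class FakeCell {
    final byte[] row;
    int toStringCalls = 0;

    FakeCell(byte[] row) {
      this.row = row;
    }

    @Override
    public String toString() {
      toStringCalls++;
      return "FakeCell(row=" + Arrays.toString(row) + ")";
    }
  }

  /** Same shape as the quoted check: the message is built only on mismatch. */
  static void add(byte[] mutationRow, FakeCell cell) {
    if (!Arrays.equals(cell.row, mutationRow)) {
      // Stand-in for WrongRowIOException in the quoted code.
      throw new IllegalArgumentException("The row in " + cell.toString()
          + " doesn't match the original one " + Arrays.toString(mutationRow));
    }
  }

  public static void main(String[] args) {
    byte[] row = {1};
    FakeCell matching = new FakeCell(new byte[] {1});
    add(row, matching);
    System.out.println("matching row, toString calls: " + matching.toStringCalls);

    FakeCell mismatched = new FakeCell(new byte[] {2});
    try {
      add(row, mismatched);
    } catch (IllegalArgumentException e) {
      System.out.println("mismatched row, toString calls: " + mismatched.toStringCalls);
    }
  }
}
```

If the real code behaves the same way, the traces through toString() would suggest the row comparison itself failed for those cells, though nothing here establishes why.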


