Posted to issues@hbase.apache.org by "Wellington Chevreuil (JIRA)" <ji...@apache.org> on 2019/06/04 13:18:01 UTC

[jira] [Commented] (HBASE-22539) Potential WAL corruption due to Unsafe.copyMemory usage when DBB are in place

    [ https://issues.apache.org/jira/browse/HBASE-22539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855688#comment-16855688 ] 

Wellington Chevreuil commented on HBASE-22539:
----------------------------------------------

Symptoms:

Upon an RS crash that required log split/replay, RSes splitting the WALs would eventually fail and crash with the following error:
{noformat}
ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event RS_LOG_REPLAY
java.lang.ArrayIndexOutOfBoundsException: 18056
        at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1365)
        at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1358)
        at org.apache.hadoop.hbase.PrivateCellUtil.matchingFamily(PrivateCellUtil.java:735)
        at org.apache.hadoop.hbase.CellUtil.matchingFamily(CellUtil.java:816)
        at org.apache.hadoop.hbase.wal.WALEdit.isMetaEditFamily(WALEdit.java:143)
        at org.apache.hadoop.hbase.wal.WALEdit.isMetaEdit(WALEdit.java:148)
        at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:297)
        at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:195)
        at org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:100)
{noformat}
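To illustrate why corrupt bytes blow up inside *KeyValue.getFamilyLength*: a serialized KeyValue is parsed positionally from its backing array, so a single garbled length field pushes every later offset out of bounds. A rough sketch of the relevant parsing (illustrative code, not the actual KeyValue implementation):
{code:java}
// Simplified KeyValue serialization layout:
//   4B keyLength | 4B valueLength | 2B rowLength | row | 1B familyLength | family | ...
public class KeyValueLayoutSketch {
  static byte familyLength(byte[] kv, int offset) {
    int keyOffset = offset + 4 + 4;                  // skip keyLength + valueLength ints
    short rowLength = (short) (((kv[keyOffset] & 0xff) << 8)
        | (kv[keyOffset + 1] & 0xff));               // big-endian short, as Bytes.toShort
    // With corrupt bytes, rowLength can be garbage, so this index can land far
    // past the end of the array -> ArrayIndexOutOfBoundsException (e.g. the 18056 above).
    return kv[keyOffset + 2 + rowLength];
  }
}
{code}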
 Trying to read the affected WAL file with the WALPrettyPrinter tool also fails:
{noformat}
Cell Codec Class: org.apache.hadoop.hbase.regionserver.wal.WALCellCodec
Exception in thread "main" java.lang.IllegalArgumentException: offset (537087) + length (8) exceed the capacity of the array: 1066
	at org.apache.hadoop.hbase.util.Bytes.explainWrongLengthOrOffset(Bytes.java:830)
	at org.apache.hadoop.hbase.util.Bytes.toLong(Bytes.java:813)
	at org.apache.hadoop.hbase.util.Bytes.toLong(Bytes.java:798)
	at org.apache.hadoop.hbase.KeyValue.getTimestamp(KeyValue.java:1488)
	at org.apache.hadoop.hbase.KeyValue.getTimestamp(KeyValue.java:1479)
	at org.apache.hadoop.hbase.wal.WALPrettyPrinter.toStringMap(WALPrettyPrinter.java:359)
	at org.apache.hadoop.hbase.wal.WALPrettyPrinter.processFile(WALPrettyPrinter.java:305)
	at org.apache.hadoop.hbase.wal.WALPrettyPrinter.run(WALPrettyPrinter.java:444)
{noformat}
After applying the changes committed on [my fork branch HBASE-22539|https://github.com/wchevreuil/hbase/tree/HBASE-22539], the following exception logged on this cluster led to the conclusion that the corruption happens when copying from the ByteBuffer to the array [here|https://github.com/wchevreuil/hbase/blob/HBASE-22539/hbase-common/src/main/java/org/apache/hadoop/hbase/io/ByteBufferWriterOutputStream.java#L92]:
{noformat}
2019-05-30 13:10:26,240 WARN org.apache.hadoop.hbase.KeyValueUtil: Got a KV validation error while writing. Just logging it for now and allowing to continue:
java.lang.IllegalArgumentException: Invalid key length in KeyValue. keyLength=40723827, KeyValueBytesHex=...
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\
...
00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00, offset=0, length=2098
        at org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueBytes(KeyValueUtil.java:555)
        at org.apache.hadoop.hbase.KeyValueUtil.isBufferValid(KeyValueUtil.java:532)
        at org.apache.hadoop.hbase.io.ByteBufferWriterOutputStream.write(ByteBufferWriterOutputStream.java:99)
        at org.apache.hadoop.hbase.util.ByteBufferUtils.copyBufferToStream(ByteBufferUtils.java:451)
        at org.apache.hadoop.hbase.ByteBufferKeyValue.write(ByteBufferKeyValue.java:277)
        at org.apache.hadoop.hbase.KeyValueUtil.oswrite(KeyValueUtil.java:794)
        at org.apache.hadoop.hbase.regionserver.wal.WALCellCodec$EnsureKvEncoder.write(WALCellCodec.java:382)
        at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.append(ProtobufLogWriter.java:54)
        at org.apache.hadoop.hbase.regionserver.wal.FSHLog.doAppend(FSHLog.java:302)
        at org.apache.hadoop.hbase.regionserver.wal.FSHLog.doAppend(FSHLog.java:67)
        at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.append(AbstractFSWAL.java:918)
        at org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.append(FSHLog.java:1082)
        at org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:973)
        at org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:881)
        at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:129)
        at java.lang.Thread.run(Thread.java:748)
2019-05-30 13:10:26,293 WARN org.apache.hadoop.hbase.io.ByteBufferWriterOutputStream: DBB copy via unsafe is corrupt.
         DBB offset: 155
         bytes to copy: 2098
 Will check normal copy next... 
{noformat}
 

Note the check fails [here|https://github.com/wchevreuil/hbase/blob/HBASE-22539/hbase-common/src/main/java/org/apache/hadoop/hbase/io/ByteBufferWriterOutputStream.java#L99], where it validates the *testBuf* byte[], which is an exact copy of the array produced by the *ByteBufferUtils.copyFromBufferToArray(this.buf, b, off + totalCopied, 0, bytesToCopy);* call [here|https://github.com/wchevreuil/hbase/blob/HBASE-22539/hbase-common/src/main/java/org/apache/hadoop/hbase/io/ByteBufferWriterOutputStream.java#L92].

When the check on line #99 fails (printing the first WARN with the IAE and stack trace), the patch copies from the DBB into the test byte[] again, but using the *ByteBuffer.get* method instead of *Unsafe.copyMemory*, and checks the resulting byte[] on line #106; this second validation passes. This leads to the conclusion that the error happens in the *ByteBufferUtils.copyFromBufferToArray(this.buf, b, off + totalCopied, 0, bytesToCopy);* call, which uses Unsafe.copyMemory underneath.
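The gist of that double-copy check, as a minimal sketch (the *KvValidator* interface below is a hypothetical stand-in for the *KeyValueUtil.isBufferValid* call; see the fork branch for the actual code):
{code:java}
import java.nio.ByteBuffer;

public class DoubleCopyCheckSketch {
  /** Hypothetical stand-in for the KeyValueUtil.isBufferValid sanity check. */
  interface KvValidator {
    boolean isValid(byte[] buf, int offset, int length);
  }

  static void checkBothCopies(ByteBuffer src, int srcOffset, byte[] unsafeCopy,
      int length, KvValidator validator) {
    if (validator.isValid(unsafeCopy, 0, length)) {
      return; // the Unsafe-backed copy produced sane bytes
    }
    // Fast copy looks corrupt: take a second copy of the same range through
    // the safe ByteBuffer.get path and validate that one too.
    byte[] safeCopy = new byte[length];
    ByteBuffer dup = src.duplicate(); // independent position, same backing memory
    dup.position(srcOffset);
    dup.get(safeCopy, 0, length);
    if (validator.isValid(safeCopy, 0, length)) {
      // Matches the WARNs above: only the Unsafe copy is corrupt.
      System.err.println("DBB copy via unsafe is corrupt. Will check normal copy next...");
    }
  }
}
{code}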

I can't see anything suspicious, however, in the current implementation of *UnsafeAccess.unsafeCopy* (apart from the minor improvement reported on HBASE-22496, which shouldn't be causing this corruption). A remote possibility is thread contention, as I noticed compactions running at the same time, and those would also rely on *UnsafeAccess.unsafeCopy* calls; but that's a long shot, as it would be rather unusual for *UnsafeAccess.copyMemory* not to be thread safe. Could it be a specific issue in this *UnsafeAccess.copyMemory* method? Maybe folks with a better understanding of that implementation could share some useful insights here.
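For reference, the copy loop in question is roughly shaped like the following (paraphrased, not the exact *UnsafeAccess* source): it chunks the work so that no single *Unsafe.copyMemory* call runs too long for the JVM to reach a safepoint. Nothing in this shape looks unsafe for a stable source buffer, which is why concurrent mutation or reuse of the source DBB seems the more suspicious direction.
{code:java}
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class UnsafeCopySketch {
  private static final Unsafe UNSAFE = loadUnsafe();
  private static final long UNSAFE_COPY_THRESHOLD = 1024L * 1024L; // 1 MB per chunk

  private static Unsafe loadUnsafe() {
    try {
      // Standard reflective access to the sun.misc.Unsafe singleton.
      Field f = Unsafe.class.getDeclaredField("theUnsafe");
      f.setAccessible(true);
      return (Unsafe) f.get(null);
    } catch (ReflectiveOperationException e) {
      throw new RuntimeException(e);
    }
  }

  /** Chunked copy: bounded copyMemory calls so the JVM can reach safepoints. */
  static void unsafeCopy(Object src, long srcAddr, Object dst, long dstAddr, long len) {
    while (len > 0) {
      long size = Math.min(len, UNSAFE_COPY_THRESHOLD);
      UNSAFE.copyMemory(src, srcAddr, dst, dstAddr, size);
      len -= size;
      srcAddr += size;
      dstAddr += size;
    }
  }
}
{code}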

Additional observations:

1) The main workaround for now is to completely disable the usage of DirectByteBuffer, and with it the need for unsafe copies, by switching *hbase.rpc.server.impl* back to SimpleRpcServer (see the config snippet after this list). After applying this change and running for some weeks, no corruption has been observed.

2) RSes on the given cluster have been running on hosts suffering memory contention; swapping has also been observed. Not sure whether that would be relevant for the unsafe copy.

3) Prior to switching the RPC server implementation, we had also suspected AsyncFSWAL, but switching to FSHLog didn't prevent the corruption from recurring. It only went away after the change described in #1.
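For anyone applying the workaround from #1, it is a single hbase-site.xml change (assuming the stock SimpleRpcServer class name below; verify it against your release):
{noformat}
<property>
  <name>hbase.rpc.server.impl</name>
  <value>org.apache.hadoop.hbase.ipc.SimpleRpcServer</value>
</property>
{noformat}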

Special thanks to [~psomogyi], [~busbey], [~elserj] for the continuing assistance with this investigation.

 

> Potential WAL corruption due to Unsafe.copyMemory usage when DBB are in place
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-22539
>                 URL: https://issues.apache.org/jira/browse/HBASE-22539
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.1.1
>            Reporter: Wellington Chevreuil
>            Priority: Blocker
>
> Summary
> We had been chasing a WAL corruption issue reported on one of our customers' deployments running release 2.1.1 (CDH 6.1.0). After providing a custom modified jar with the extra sanity checks implemented by HBASE-21401 applied at some code points, plus additional debugging messages, we believe it is related to DirectByteBuffer usage and the Unsafe copy from off-heap memory to an on-heap array triggered [here|https://github.com/apache/hbase/blob/branch-2.1/hbase-common/src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java#L1157], such as when writing into a non-ByteBufferWriter type, as done [here|https://github.com/apache/hbase/blob/branch-2.1/hbase-common/src/main/java/org/apache/hadoop/hbase/io/ByteBufferWriterOutputStream.java#L84].
> More details on the following comment.
>  


