Posted to issues@hbase.apache.org by "zhuobin zheng (Jira)" <ji...@apache.org> on 2021/11/18 18:57:00 UTC
[jira] [Work started] (HBASE-26467) Wrong Cell Generated by MemStoreLABImpl.forceCopyOfBigCellInto when Cell size bigger than data chunk size
[ https://issues.apache.org/jira/browse/HBASE-26467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on HBASE-26467 started by zhuobin zheng.
---------------------------------------------
> Wrong Cell Generated by MemStoreLABImpl.forceCopyOfBigCellInto when Cell size bigger than data chunk size
> ----------------------------------------------------------------------------------------------------------
>
> Key: HBASE-26467
> URL: https://issues.apache.org/jira/browse/HBASE-26467
> Project: HBase
> Issue Type: Bug
> Reporter: zhuobin zheng
> Assignee: zhuobin zheng
> Priority: Critical
>
> In our company's 2.x cluster, I found that compaction for some regions keeps failing because some cells cannot be constructed successfully. In fact, we cannot even read these cells.
> From the stack trace below, we can see that the bug causes the KeyValue constructor to fail.
> Simple Log and Stack:
> {code:java}
> 2021-11-18 16:50:47,708 ERROR [regionserver/xxxx:60020-longCompactions-4] regionserver.CompactSplit: Compaction failed region=xx_table,3610ff49595a0fc4a824f2a575f37a31,1570874723992.dac703ceb35e8d8703233bebf34ae49f., storeName=c, priority=-319, startTime=1637225447127
> java.lang.IllegalArgumentException: Invalid tag length at position=4659867, tagLength=0,
> at org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueTagBytes(KeyValueUtil.java:685)
> at org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueBytes(KeyValueUtil.java:643)
> at org.apache.hadoop.hbase.KeyValue.<init>(KeyValue.java:345)
> at org.apache.hadoop.hbase.SizeCachedKeyValue.<init>(SizeCachedKeyValue.java:43)
> at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.getCell(HFileReaderImpl.java:981)
> at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:233)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:418)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:322)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:288)
> at org.apache.hadoop.hbase.regionserver.compactions.Compactor.createScanner(Compactor.java:487)
> at org.apache.hadoop.hbase.regionserver.compactions.Compactor$1.createScanner(Compactor.java:248)
> at org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:318)
> at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:65)
> at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:126)
> at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1468)
> at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2266)
> at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:624)
> at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:666)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748) {code}
> From further observation, I found the following characteristics:
> # The cell size is more than 2 MB
> # The bug reproduces only after an in-memory compaction
> # The cell bytes end with \x00\x02\x00\x00
>
> In fact, the root cause is that the method MemStoreLABImpl.forceCopyOfBigCellInto, which is only invoked when a cell is bigger than the data chunk size, constructs the cell with the wrong length, so 4 extra bytes (the chunk header size) are appended to the end of the cell bytes.
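To illustrate the length mix-up described above, here is a minimal, hypothetical Java sketch (not the actual HBase code): the allocation size for the one-off big-cell chunk includes the chunk header, and that inflated size is mistakenly reused as the copied cell's length, so the resulting cell carries 4 trailing garbage bytes. The 4-byte header constant follows the report.

```java
import java.util.Arrays;

public class BigCellCopyBugSketch {
    // Assumed chunk-header size, per the report's "4 bytes (chunk head size)".
    static final int CHUNK_HEADER = 4;

    // Hypothetical stand-in for the buggy copy path: the size used to build
    // the returned cell includes the chunk header, so the cell's byte range
    // extends 4 bytes past the real cell data.
    static byte[] buggyCopy(byte[] cellBytes) {
        int size = cellBytes.length + CHUNK_HEADER;      // bug: header counted in cell length
        byte[] chunk = new byte[size];                   // one-off chunk for the big cell
        System.arraycopy(cellBytes, 0, chunk, 0, cellBytes.length);
        return Arrays.copyOfRange(chunk, 0, size);       // 4 trailing garbage bytes included
    }

    // Corrected copy: the returned cell's length excludes the chunk header.
    static byte[] fixedCopy(byte[] cellBytes) {
        byte[] chunk = new byte[cellBytes.length + CHUNK_HEADER];
        System.arraycopy(cellBytes, 0, chunk, 0, cellBytes.length);
        return Arrays.copyOfRange(chunk, 0, cellBytes.length);
    }

    public static void main(String[] args) {
        byte[] cell = {1, 2, 3, 4, 5};
        System.out.println("buggy length = " + buggyCopy(cell).length); // prints "buggy length = 9"
        System.out.println("fixed length = " + fixedCopy(cell).length); // prints "fixed length = 5"
    }
}
```

The oversized cell parses fine until something (here, a compaction scanner) re-reads the serialized bytes and interprets the trailing header-sized garbage as tag metadata, which matches the "Invalid tag length" failure above.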
--
This message was sent by Atlassian Jira
(v8.20.1#820001)