Posted to issues@hbase.apache.org by "Baiqiang Zhao (Jira)" <ji...@apache.org> on 2019/12/06 07:20:00 UTC

[jira] [Commented] (HBASE-23375) NPE during opening a daughter region in cacheBlock

    [ https://issues.apache.org/jira/browse/HBASE-23375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16989470#comment-16989470 ] 

Baiqiang Zhao commented on HBASE-23375:
---------------------------------------

The trigger may be:

(1) When the top reference file is opened, getFirstKey() is called. On a cache miss, the block is read from the HFile and cached in the BucketCache (BC).

(2) In cacheBlockWithWait(), the BC finds that it already contains the cacheKey, and the existingBlock is in ramCache. This is possible because both daughter regions can load the same block from their parent HFile.

(3) Execution therefore goes to shouldReplaceExistingCacheBlock(). At the same time, the existingBlock is added to writerQueue and removed from ramCache, so shouldReplaceExistingCacheBlock() gets null when it fetches the existingBlock from the BC.

(4) Finally, an NPE is thrown and the RS goes down.

Anything can happen in a multi-threaded environment.
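
The sequence above is a check-then-act race. Below is a minimal sketch of it, not the actual HBase code: the class RamCacheRaceSketch, its String/byte[] map, and all method names are hypothetical stand-ins for ramCache, the writer thread, and shouldReplaceExistingCacheBlock(). The interleaving is forced with a Runnable instead of real threads so that it is reproducible, and the null check is only one possible guard.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch of the race described above; none of these names are HBase APIs.
class RamCacheRaceSketch {
    // Stand-in for BucketCache's ramCache: cache key -> block bytes.
    static final ConcurrentMap<String, byte[]> ramCache = new ConcurrentHashMap<>();

    // Stand-in for the writer thread taking a block out of ramCache.
    static void writerThreadDrains(String key) {
        ramCache.remove(key);
    }

    // Unsafe check-then-act. The Runnable simulates what another thread
    // does in the window between the containsKey check and the get.
    static boolean unsafeShouldReplace(String key, byte[] newBlock, Runnable inWindow) {
        if (!ramCache.containsKey(key)) {
            return true; // nothing cached yet, so the new block can go in
        }
        inWindow.run();
        byte[] existing = ramCache.get(key); // may be null by now
        // Length comparison stands in for the real block comparison.
        return existing.length != newBlock.length; // NPE when existing is null
    }

    // Same logic with a null guard (an assumption about one possible fix):
    // a block that disappeared in the window is treated as not cached.
    static boolean guardedShouldReplace(String key, byte[] newBlock, Runnable inWindow) {
        if (!ramCache.containsKey(key)) {
            return true;
        }
        inWindow.run();
        byte[] existing = ramCache.get(key);
        if (existing == null) {
            return true;
        }
        return existing.length != newBlock.length;
    }

    public static void main(String[] args) {
        byte[] block = new byte[] {1, 2, 3};

        ramCache.put("k", block);
        try {
            unsafeShouldReplace("k", block, () -> writerThreadDrains("k"));
        } catch (NullPointerException e) {
            System.out.println("unsafe path threw NullPointerException");
        }

        ramCache.put("k", block);
        boolean replace = guardedShouldReplace("k", block, () -> writerThreadDrains("k"));
        System.out.println("guarded path returned " + replace);
    }
}
```

In this sketch the unsafe path throws because the entry vanishes between the two lookups, while the guarded path returns true and lets the caller proceed.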

> NPE during opening a daughter region in cacheBlock 
> ---------------------------------------------------
>
>                 Key: HBASE-23375
>                 URL: https://issues.apache.org/jira/browse/HBASE-23375
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.6.0, 1.4.11
>            Reporter: Baiqiang Zhao
>            Priority: Major
>
> The RegionServer log is :
> {code:java}
> 2019-12-04 11:32:37,238 INFO  [regionserver/localhost/0.0.0.0:16020-splits-0] regionserver.SplitRequest: Running rollback/cleanup of failed split of ONLINE:testTable,\x00999999999\x0014aa9,1575406565984.48f462e65b7961420737797c2ccf76c9.; Failed localhost,16020,1574999150042-daughterOpener=aad203e7b1aa26a26b50c84f70397456
> java.io.IOException: Failed localhost,16020,1574999150042-daughterOpener=aad203e7b1aa26a26b50c84f70397456
>         at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.openDaughters(SplitTransactionImpl.java:504)
>         at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsAfterPONR(SplitTransactionImpl.java:598)
>         at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:581)
>         at org.apache.hadoop.hbase.regionserver.SplitRequest.doSplitting(SplitRequest.java:82)
>         at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:153)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: java.io.IOException: java.lang.NullPointerException
>         at org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1041)
>         at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:916)
>         at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:884)
>         at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7098)
>         at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.openDaughterRegion(SplitTransactionImpl.java:732)
>         at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl$DaughterOpener.run(SplitTransactionImpl.java:712)
>        ... 1 more
> Caused by: java.io.IOException: java.lang.NullPointerException
>         at org.apache.hadoop.hbase.regionserver.HStore.openStoreFiles(HStore.java:577)
>         at org.apache.hadoop.hbase.regionserver.HStore.loadStoreFiles(HStore.java:532)
>         at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:281)
>         at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:5469)
>         at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1015)
>         at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1012)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         ... 1 more
> Caused by: java.lang.NullPointerException
>         at org.apache.hadoop.hbase.io.hfile.BlockCacheUtil.compareCacheBlock(BlockCacheUtil.java:185)
>         at org.apache.hadoop.hbase.io.hfile.BlockCacheUtil.validateBlockAddition(BlockCacheUtil.java:204)
>         at org.apache.hadoop.hbase.io.hfile.BlockCacheUtil.shouldReplaceExistingCacheBlock(BlockCacheUtil.java:233)
>         at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.cacheBlockWithWait(BucketCache.java:433)
>         at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.cacheBlock(BucketCache.java:419)
>         at org.apache.hadoop.hbase.io.hfile.CombinedBlockCache.cacheBlock(CombinedBlockCache.java:68)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:462)
>         at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:269)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:651)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:601)
>         at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekTo(HalfStoreFileReader.java:190)
>         at org.apache.hadoop.hbase.io.HalfStoreFileReader.getFirstKey(HalfStoreFileReader.java:365)
>         at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:546)
>         at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:563)
>         at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:553)
>         at org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:707)
>         at org.apache.hadoop.hbase.regionserver.HStore.access$000(HStore.java:122)
>         at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:552)
>         at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:549)
>         ... 6 more
> 2019-12-04 11:32:37,288 WARN  [regionserver/localhost/0.0.0.0:16020-splits-0] regionserver.SplitTransaction: Should use rollback(Server, RegionServerServices, User)
> 2019-12-04 11:32:37,294 FATAL [regionserver/localhost/0.0.0.0:16020-splits-0] regionserver.HRegionServer: ABORTING region server localhost,16020,1574999150042: Abort; we got an error after point-of-no-return{code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)