Posted to issues@hbase.apache.org by "Raman Ch (JIRA)" <ji...@apache.org> on 2017/04/28 14:07:04 UTC

[jira] [Resolved] (HBASE-17901) HBase region server stops because of a failure during memstore flush

     [ https://issues.apache.org/jira/browse/HBASE-17901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raman Ch resolved HBASE-17901.
------------------------------
    Resolution: Not A Problem

> HBase region server stops because of a failure during memstore flush
> --------------------------------------------------------------------
>
>                 Key: HBASE-17901
>                 URL: https://issues.apache.org/jira/browse/HBASE-17901
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 1.2.2
>         Environment: Ubuntu 14.04.5 LTS
> HBase Version	1.2.2, revision=1
> Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
> Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)
>            Reporter: Raman Ch
>
> Once every several days, the region server fails to flush a memstore and stops.
> April, 8:
> {code}
> 2017-04-08 00:10:57,737 WARN  [MemStoreFlusher.1] regionserver.HStore: Failed flushing store file, retrying num=9
> java.io.IOException: ScanWildcardColumnTracker.checkColumn ran into a column actually smaller than the previous column: 
> 	at org.apache.hadoop.hbase.regionserver.ScanWildcardColumnTracker.checkVersions(ScanWildcardColumnTracker.java:117)
> 	at org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:464)
> 	at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:529)
> 	at org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:119)
> 	at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:74)
> 	at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:915)
> 	at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2271)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2375)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2105)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2067)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1958)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:1884)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:510)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:215)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$600(MemStoreFlusher.java:75)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:244)
> 	at java.lang.Thread.run(Thread.java:745)
> 2017-04-08 00:10:57,737 FATAL [MemStoreFlusher.1] regionserver.HRegionServer: ABORTING region server datanode13.webmeup.com,16020,1491573320653: Replay of WAL required. Forcing server shutdown
> org.apache.hadoop.hbase.DroppedSnapshotException: region: di_ordinal_tmp,gov.ok.data/browse?page=2&category=Natural%20Resources&limitTo=datasets&tags=ed,1489764397211.9d7ca11018672c4aace7f30c8f4253f3.
> 	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2428)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2105)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2067)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1958)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:1884)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:510)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:215)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$600(MemStoreFlusher.java:75)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:244)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: ScanWildcardColumnTracker.checkColumn ran into a column actually smaller than the previous column: 
> 	at org.apache.hadoop.hbase.regionserver.ScanWildcardColumnTracker.checkVersions(ScanWildcardColumnTracker.java:117)
> 	at org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:464)
> 	at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:529)
> 	at org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:119)
> 	at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:74)
> 	at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:915)
> 	at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2271)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2375)
> 	... 9 more
> {code}
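[Editor's aside, not part of the original report: the check that fails above enforces a basic HFile invariant — during a flush, the scanner hands cells to the writer in non-decreasing key order, and ScanWildcardColumnTracker aborts if a column qualifier compares smaller (in unsigned lexicographic byte order) than the one before it. A simplified sketch of that comparison, not the actual HBase code:]

```java
import java.io.IOException;

public class ColumnOrderCheck {
    // Unsigned lexicographic comparison of byte arrays, the ordering
    // HBase uses for row keys and column qualifiers (cf. Bytes.compareTo).
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff, y = b[i] & 0xff;
            if (x != y) return x - y;
        }
        return a.length - b.length;
    }

    // Simplified version of the invariant the tracker enforces: each column
    // must not be smaller than the previous one, else the flush fails.
    static void checkColumn(byte[] previous, byte[] current) throws IOException {
        if (previous != null && compareUnsigned(current, previous) < 0) {
            throw new IOException(
                "ran into a column actually smaller than the previous column");
        }
    }

    public static void main(String[] args) throws IOException {
        checkColumn("a".getBytes(), "b".getBytes()); // increasing: accepted
        try {
            checkColumn("q".getBytes(), "p".getBytes()); // out of order
            throw new AssertionError("expected IOException");
        } catch (IOException expected) {
            System.out.println("out-of-order column rejected");
        }
    }
}
```

[Because the memstore itself keeps cells sorted, seeing this during a flush usually means the in-memory data was corrupted after insertion.]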
> After a region server restart, it functioned properly for a couple of days.
> April, 10:
> {code}
> 2017-04-10 22:36:32,147 WARN  [MemStoreFlusher.0] regionserver.HStore: Failed flushing store file, retrying num=9
> java.io.IOException: Non-increasing Bloom keys: de.tina-eicke.blog/category/garten/\x09h after de.uina-eicke.blog/category/fruehling/\x09h
> 	at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.appendGeneralBloomfilter(StoreFile.java:936)
> 	at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:969)
> 	at org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:125)
> 	at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:74)
> 	at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:915)
> 	at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2271)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2375)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2105)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2067)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1958)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:1884)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:510)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$900(MemStoreFlusher.java:75)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> 	at java.lang.Thread.run(Thread.java:745)
> 2017-04-10 22:36:32,147 FATAL [MemStoreFlusher.0] regionserver.HRegionServer: ABORTING region server datanode13.webmeup.com,16020,1491828707088: Replay of WAL required. Forcing server shutdown
> org.apache.hadoop.hbase.DroppedSnapshotException: region: di_ordinal_tmp,de.thschroeer/lmo/lmo.php?action=results&file=archiv/BLW2-2013.l98&endtab=8&st=8&tabtype=2\x09hw,1489764397211.b07eaba657affc2ba29f84b59c672836.
> 	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2428)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2105)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2067)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1958)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:1884)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:510)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$900(MemStoreFlusher.java:75)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Non-increasing Bloom keys: de.tina-eicke.blog/category/garten/\x09h after de.uina-eicke.blog/category/fruehling/\x09h
> 	at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.appendGeneralBloomfilter(StoreFile.java:936)
> 	at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:969)
> 	at org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:125)
> 	at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:74)
> 	at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:915)
> 	at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2271)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2375)
> 	... 9 more 
> {code}
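[Editor's aside, not part of the original report: note that the two keys in the "Non-increasing Bloom keys" error differ in exactly one character — `de.tina-eicke.blog/...` vs `de.uina-eicke.blog/...`, i.e. 't' (0x74) vs 'u' (0x75). Those byte values differ in a single bit, a pattern commonly associated with hardware memory corruption rather than an HBase defect, which is consistent with the "Not A Problem" resolution. The log does not show which of the two keys was the corrupted one. A quick check of the bit distance:]

```java
public class BitFlipCheck {
    public static void main(String[] args) {
        char a = 't'; // from "de.tina-eicke.blog/..."
        char b = 'u'; // from "de.uina-eicke.blog/..."
        int xor = a ^ b;
        // 0x74 ^ 0x75 == 0x01: the two characters differ in exactly one bit
        System.out.printf("0x%02x ^ 0x%02x = 0x%02x, bits flipped = %d%n",
                (int) a, (int) b, xor, Integer.bitCount(xor));
    }
}
```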
> April, 13:
> {code}
> 2017-04-13 14:06:30,189 WARN  [MemStoreFlusher.1] regionserver.HStore: Failed flushing store file, retrying num=9
> java.io.IOException: ScanWildcardColumnTracker.checkColumn ran into a column actually smaller than the previous column: p
>         at org.apache.hadoop.hbase.regionserver.ScanWildcardColumnTracker.checkVersions(ScanWildcardColumnTracker.java:117)
>         at org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:464)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:529)
>         at org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:119)
>         at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:74)
>         at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:915)
>         at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2271)
>         at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2375)
>         at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2105)
>         at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2067)
>         at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1958)
>         at org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:1884)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:510)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$900(MemStoreFlusher.java:75)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
>         at java.lang.Thread.run(Thread.java:745)
> 2017-04-13 14:06:30,190 INFO  [regionserver/datanode13.webmeup.com/88.99.58.169:16020-longCompactions-1491986731568] compress.CodecPool: Got brand-new decompressor [.gz]
> 2017-04-13 14:06:30,190 INFO  [regionserver/datanode13.webmeup.com/88.99.58.169:16020-shortCompactions-1491986746336] compress.CodecPool: Got brand-new decompressor [.gz]
> 2017-04-13 14:06:30,190 FATAL [MemStoreFlusher.1] regionserver.HRegionServer: ABORTING region server datanode13.webmeup.com,16020,1491986730362: Replay of WAL required. Forcing server shutdown
> org.apache.hadoop.hbase.DroppedSnapshotException: region: di_ordinal_tmp,net.teleklik.lepavina/citati_izreke/index.php?page=2\x09h,1491054654848.ddb53de0e924251818ac2cb0c07b072f.
>         at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2428)
>         at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2105)
>         at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2067)
>         at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1958)
>         at org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:1884)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:510)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$900(MemStoreFlusher.java:75)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: ScanWildcardColumnTracker.checkColumn ran into a column actually smaller than the previous column: p
>         at org.apache.hadoop.hbase.regionserver.ScanWildcardColumnTracker.checkVersions(ScanWildcardColumnTracker.java:117)
>         at org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:464)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:529)
>         at org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:119)
>         at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:74)
>         at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:915)
>         at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2271)
>         at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2375)
>         ... 9 more
> 2017-04-13 14:06:30,190 FATAL [MemStoreFlusher.1] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: []
> {code}
> Table description:
> {code}
> 'di_ordinal_tmp', {TABLE_ATTRIBUTES => {DURABILITY => 'ASYNC_WAL', MAX_FILESIZE => '8589934592'}}, {NAME => 'di', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'FAST_DIFF', TTL => '10368000 SECONDS (120 DAYS)', COMPRESSION => 'GZ', MIN_VERSIONS => '0', BLOCKCACHE => 'false', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0', METADATA => {'COMPRESSION_COMPACT' => 'GZ'}}
> {code}
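[Editor's aside, not part of the original report: a quick sanity check on the table attributes above — the TTL of 10368000 seconds is exactly the 120 days the shell annotates it with, and MAX_FILESIZE of 8589934592 bytes is exactly 8 GiB:]

```java
public class TableAttrCheck {
    public static void main(String[] args) {
        long ttlSeconds = 10_368_000L;     // TTL from the table description
        long maxFileSize = 8_589_934_592L; // MAX_FILESIZE from the table description
        System.out.println("TTL in days: " + ttlSeconds / 86_400L);
        System.out.println("MAX_FILESIZE in GiB: " + maxFileSize / (1L << 30));
    }
}
```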
> The table is populated only via Put operations; there has never been any bulk loading into this table.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)