Posted to user@hbase.apache.org by 聪聪 <17...@qq.com> on 2016/05/31 02:03:17 UTC

Memstore blocking

HI ALL:       Recently I met a strange problem: the MemStore of the first region of one table (the only such region) often blocks while flushing. (On both versions, hbase-0.98.6-cdh5.2.0 and 1.0.1.1; I upgraded from 0.98 to 1.0.1.1 hoping to solve the problem, but it still occurs.)
       On the web UI the status shows: ABORTED (since XX sec ago), "Not flushing since already flushing."
       But the flush never succeeds, and disk usage climbs very high. Other regionservers use about 30% of their disk capacity, while the problematic regionserver climbs to 100% unless I kill the process.
       What's more, the regionserver process cannot be shut down normally; every time I have to use kill -9.
       I checked the log; the reason it cannot flush is that one of the MemStoreFlusher threads cannot exit.
       The log looks like this:
       2016-05-29 19:54:11,982 INFO  [MemStoreFlusher.13] regionserver.MemStoreFlusher: MemStoreFlusher.13 exiting
 2016-05-29 19:54:13,016 INFO  [MemStoreFlusher.6] regionserver.MemStoreFlusher: MemStoreFlusher.6 exiting
 2016-05-29 19:54:13,260 INFO  [MemStoreFlusher.16] regionserver.MemStoreFlusher: MemStoreFlusher.16 exiting
 2016-05-29 19:54:16,032 INFO  [MemStoreFlusher.33] regionserver.MemStoreFlusher: MemStoreFlusher.33 exiting
 2016-05-29 19:54:16,341 INFO  [MemStoreFlusher.25] regionserver.MemStoreFlusher: MemStoreFlusher.25 exiting
 2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.31] regionserver.MemStoreFlusher: MemStoreFlusher.31 exiting
 2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.29] regionserver.MemStoreFlusher: MemStoreFlusher.29 exiting
 2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.23] regionserver.MemStoreFlusher: MemStoreFlusher.23 exiting
 2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.32] regionserver.MemStoreFlusher: MemStoreFlusher.32 exiting
 2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.1] regionserver.MemStoreFlusher: MemStoreFlusher.1 exiting
 2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.38] regionserver.MemStoreFlusher: MemStoreFlusher.38 exiting
 2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.10] regionserver.MemStoreFlusher: MemStoreFlusher.10 exiting
 2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.7] regionserver.MemStoreFlusher: MemStoreFlusher.7 exiting
 2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.12] regionserver.MemStoreFlusher: MemStoreFlusher.12 exiting
 2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.21] regionserver.MemStoreFlusher: MemStoreFlusher.21 exiting
 2016-05-29 19:54:16,622 INFO  [MemStoreFlusher.37] regionserver.MemStoreFlusher: MemStoreFlusher.37 exiting
 2016-05-29 19:54:16,622 INFO  [MemStoreFlusher.24] regionserver.MemStoreFlusher: MemStoreFlusher.24 exiting
 2016-05-29 19:54:16,806 INFO  [MemStoreFlusher.39] regionserver.MemStoreFlusher: MemStoreFlusher.39 exiting
 2016-05-29 19:54:16,908 INFO  [MemStoreFlusher.17] regionserver.MemStoreFlusher: MemStoreFlusher.17 exiting
 2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.3] regionserver.MemStoreFlusher: MemStoreFlusher.3 exiting
 2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.27] regionserver.MemStoreFlusher: MemStoreFlusher.27 exiting
 2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.18] regionserver.MemStoreFlusher: MemStoreFlusher.18 exiting
 2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.4] regionserver.MemStoreFlusher: MemStoreFlusher.4 exiting
 2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.19] regionserver.MemStoreFlusher: MemStoreFlusher.19 exiting
 2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.22] regionserver.MemStoreFlusher: MemStoreFlusher.22 exiting
 2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.9] regionserver.MemStoreFlusher: MemStoreFlusher.9 exiting
 2016-05-29 19:54:17,137 INFO  [MemStoreFlusher.36] regionserver.MemStoreFlusher: MemStoreFlusher.36 exiting
 2016-05-29 19:54:17,263 INFO  [MemStoreFlusher.11] regionserver.MemStoreFlusher: MemStoreFlusher.11 exiting
 2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.14] regionserver.MemStoreFlusher: MemStoreFlusher.14 exiting
 2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.2] regionserver.MemStoreFlusher: MemStoreFlusher.2 exiting
 2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.26] regionserver.MemStoreFlusher: MemStoreFlusher.26 exiting
 2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.0] regionserver.MemStoreFlusher: MemStoreFlusher.0 exiting
 2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.30] regionserver.MemStoreFlusher: MemStoreFlusher.30 exiting
 2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.28] regionserver.MemStoreFlusher: MemStoreFlusher.28 exiting 
 
 
 here is the jstack output from when I tried to shut down the regionserver process normally:
 we can see that MemStoreFlusher.34 has a problem; it cannot exit normally.
 (By the way, you can see it "locked" inside Snappy here, but it is not always stuck in Snappy; sometimes it is stuck elsewhere, so I think Snappy itself is fine. I just want to show that the MemStoreFlusher is stuck. A way to check the codec on its own is sketched after the dump below.)
 
 
  "MemStoreFlusher.34" prio=10 tid=0x00007f20b8939000 nid=0x511b runnable [0x00007f08941c3000]
    java.lang.Thread.State: RUNNABLE
 at org.apache.hadoop.io.compress.snappy.SnappyCompressor.compressBytesDirect(Native Method)
 at org.apache.hadoop.io.compress.snappy.SnappyCompressor.compress(SnappyCompressor.java:232)
 - locked <0x00007f0a5f85c430> (a org.apache.hadoop.io.compress.snappy.SnappyCompressor)
 at org.apache.hadoop.io.compress.BlockCompressorStream.compress(BlockCompressorStream.java:149)
 at org.apache.hadoop.io.compress.BlockCompressorStream.finish(BlockCompressorStream.java:142)
 at org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncoding(HFileBlockDefaultEncodingContext.java:238)
 at org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncodingWithBlockType(HFileBlockDefaultEncodingContext.java:149)
 at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlock(HFileBlock.java:783)
 at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.ensureBlockReady(HFileBlock.java:761)
 at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlockAndWriteHeaderAndData(HFileBlock.java:878)
 at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeHeaderAndData(HFileBlock.java:864)
 at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateBlock(HFileBlockIndex.java:948)
 at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateLevel(HFileBlockIndex.java:931)
 at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIndexBlocks(HFileBlockIndex.java:840)
 at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:367)
 at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:974)
 at org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:69)
 at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:70)
 - locked <0x00007f09fdcccda0> (a java.lang.Object)
 at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:829)
 at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:1989)
 at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1779)
 at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1662)
 at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1575)
 at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:461)
 at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:435)
 at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:66)
 at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:253)
 at java.lang.Thread.run(Thread.java:744)
 
 
 "MemStoreFlusher.33" prio=10 tid=0x00007f20b8937000 nid=0x511a waiting on condition [0x00007f08942c5000]
    java.lang.Thread.State: TIMED_WAITING (parking)
 at sun.misc.Unsafe.park(Native Method)
 - parking to wait for  <0x00007f09b8e20f80> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
 at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
 at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
 at java.util.concurrent.DelayQueue.poll(DelayQueue.java:264)
 at java.util.concurrent.DelayQueue.poll(DelayQueue.java:68)
 at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:233)
 at java.lang.Thread.run(Thread.java:744)
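 
 One way to check the codec by itself is HBase's CompressionTest utility (a sketch; it assumes the testCompression(String) overload present in these HBase versions):
 
   import org.apache.hadoop.hbase.util.CompressionTest;
 
   public class SnappyCheck {
     public static void main(String[] args) throws Exception {
       // Round-trips a small payload through the named codec; false means
       // the codec (or its native library) is unusable on this node.
       boolean ok = CompressionTest.testCompression("snappy");
       System.out.println("snappy usable: " + ok);
     }
   }
 
 The same check can be run per node from the command line with: hbase org.apache.hadoop.hbase.util.CompressionTest <some-writable-path> snappy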
 
 
 
 
 
    I do not know why the memstore cannot flush.
    Is this a bug?
    
 
 
  Below is the forwarded message:

 From: quan <32...@qq.com>

 Subject: memstore flush blocked

 Date: May 30, 2016, 10:47:43 AM GMT+8

 To: user@hbase.apache.org

 Cc: 蒲聪 <co...@qunar.com>
 
 

Re: Memstore blocking

Posted by Heng Chen <he...@gmail.com>.
Something wrong in the Snappy library?

Have you tried running without compression?
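
For example, compression on the family could be switched to NONE via the Admin API (a sketch against the 1.0 client; the table name "qtrace" and family "t" are taken from the logs quoted below):

  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.HColumnDescriptor;
  import org.apache.hadoop.hbase.TableName;
  import org.apache.hadoop.hbase.client.Admin;
  import org.apache.hadoop.hbase.client.Connection;
  import org.apache.hadoop.hbase.client.ConnectionFactory;
  import org.apache.hadoop.hbase.io.compress.Compression;
  import org.apache.hadoop.hbase.util.Bytes;

  public class DisableCompression {
    public static void main(String[] args) throws Exception {
      try (Connection conn =
               ConnectionFactory.createConnection(HBaseConfiguration.create());
           Admin admin = conn.getAdmin()) {
        TableName table = TableName.valueOf("qtrace");
        HColumnDescriptor family =
            admin.getTableDescriptor(table).getFamily(Bytes.toBytes("t"));
        family.setCompressionType(Compression.Algorithm.NONE);
        // New store files are written uncompressed; existing files are
        // rewritten as compactions run.
        admin.modifyColumn(table, family);
      }
    }
  }

The HBase shell equivalent is alter 'qtrace', {NAME => 't', COMPRESSION => 'NONE'}.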

2016-06-03 11:13 GMT+08:00 吴国泉wgq <wg...@qunar.com>:

> HI STACK:
>
>    1.   The log is very large, so I picked part of it, but it does not
> seem to provide valuable info. Here the region named
> qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be cannot flush.
>
>       When the flush thread works well, the log looks like this:
>       2016-05-24 12:38:27,071 INFO  [regionserver60020.periodicFlusher]
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
> flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be.
> after a delay of 16681
> 2016-05-24 12:38:37,071 INFO  [regionserver60020.periodicFlusher]
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
> flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be.
> after a delay of 8684
> 2016-05-24 12:38:43,753 INFO  [MemStoreFlusher.2] regionserver.HRegion:
> Started memstore flush for
> qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be., current region
> memstore size 305.3 K
> 2016-05-24 12:38:43,753 WARN  [MemStoreFlusher.2] wal.FSHLog: Couldn't
> find oldest seqNum for the region we are about to flush:
> [dd8f92e3c161a8534b30ab17c28ae8be]
> 2016-05-24 12:38:43,816 INFO  [MemStoreFlusher.2]
> regionserver.DefaultStoreFlusher: Flushed, sequenceid=54259, memsize=305.3
> K, hasBloomFilter=true, into tmp file
> hdfs://cluster-tc-qtrace:8020/hbase/tc_qtrace/data/default/qtrace/dd8f92e3c161a8534b30ab17c28ae8be/.tmp/fddbb05945354d5cbdae4afd24e5bb9d
> 2016-05-24 12:38:43,822 DEBUG [MemStoreFlusher.2]
> regionserver.HRegionFileSystem: Committing store file
> hdfs://cluster-tc-qtrace:8020/hbase/tc_qtrace/data/default/qtrace/dd8f92e3c161a8534b30ab17c28ae8be/.tmp/fddbb05945354d5cbdae4afd24e5bb9d
> as
> hdfs://cluster-tc-qtrace:8020/hbase/tc_qtrace/data/default/qtrace/dd8f92e3c161a8534b30ab17c28ae8be/t/fddbb05945354d5cbdae4afd24e5bb9d
> 2016-05-24 12:38:43,837 INFO  [MemStoreFlusher.2] regionserver.HStore:
> Added
> hdfs://cluster-tc-qtrace:8020/hbase/tc_qtrace/data/default/qtrace/dd8f92e3c161a8534b30ab17c28ae8be/t/fddbb05945354d5cbdae4afd24e5bb9d,
> entries=108, sequenceid=54259, filesize=68.3 K
> 2016-05-24 12:38:43,837 INFO  [MemStoreFlusher.2] regionserver.HRegion:
> Finished memstore flush of ~305.3 K/312664, currentsize=0/0 for region
> qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be. in 84ms,
> sequenceid=54259, compaction requested=true
>
>       And when the flush thread does not work well, the log always
> shows:
> 2016-05-25 14:57:02,588 INFO  [regionserver60020.periodicFlusher]
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
> flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be.
> after a delay of 18068
> 2016-05-25 14:57:12,587 INFO  [regionserver60020.periodicFlusher]
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
> flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be.
> after a delay of 13165
> 2016-05-25 14:57:20,656 DEBUG [MemStoreFlusher.36] regionserver.HRegion:
> NOT flushing memstore for region
> qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be., flushing=true,
> writesEnabled=true
> 2016-05-25 14:57:22,587 INFO  [regionserver60020.periodicFlusher]
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
> flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be.
> after a delay of 5526
> 2016-05-25 14:57:28,113 DEBUG [MemStoreFlusher.34] regionserver.HRegion:
> NOT flushing memstore for region
> qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be., flushing=true,
> writesEnabled=true
> 2016-05-25 14:57:32,587 INFO  [regionserver60020.periodicFlusher]
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
> flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be.
> after a delay of 8178
> 2016-05-25 14:57:40,767 DEBUG [MemStoreFlusher.9] regionserver.HRegion:
> NOT flushing memstore for region
> qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be., flushing=true,
> writesEnabled=true
> 2016-05-25 14:57:42,587 INFO  [regionserver60020.periodicFlusher]
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
> flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be.
> after a delay of 22068
> 2016-05-25 14:57:52,587 INFO  [regionserver60020.periodicFlusher]
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
> flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be.
> after a delay of 5492
> 2016-05-25 14:58:02,587 INFO  [regionserver60020.periodicFlusher]
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
> flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be.
> after a delay of 10472
> 2016-05-25 14:58:04,655 DEBUG [MemStoreFlusher.23] regionserver.HRegion:
> NOT flushing memstore for region
> qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be., flushing=true,
> writesEnabled=true
> 2016-05-25 14:58:12,587 INFO  [regionserver60020.periodicFlusher]
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
> flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be.
> after a delay of 8435
>
>      I checked the log from before the region stopped flushing, but found
> nothing unusual.
>      By the way, this table has a heavy write load, so I turned off
> the WAL.
>
>
> 2. It is that the disk is filling faster than on other machines, not
> what iostat shows.
>
> 3. This is the jstack output for the MemStoreFlusher threads:
> "MemStoreFlusher.14" prio=10 tid=0x00007fe69487d800 nid=0x387d waiting on
> condition [0x00007fce7118c000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x00007fd09e031470> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
> at java.util.concurrent.DelayQueue.poll(DelayQueue.java:264)
> at java.util.concurrent.DelayQueue.poll(DelayQueue.java:68)
> at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:233)
> at java.lang.Thread.run(Thread.java:744)
>
> "MemStoreFlusher.13" prio=10 tid=0x00007fe69487b800 nid=0x387c runnable
> [0x00007fce7128c000]
>    java.lang.Thread.State: RUNNABLE
> at
> org.apache.hadoop.io.compress.snappy.SnappyCompressor.compressBytesDirect(Native
> Method)
> at
> org.apache.hadoop.io.compress.snappy.SnappyCompressor.compress(SnappyCompressor.java:232)
> - locked <0x00007fcfe3ba1190> (a
> org.apache.hadoop.io.compress.snappy.SnappyCompressor)
> at
> org.apache.hadoop.io.compress.BlockCompressorStream.compress(BlockCompressorStream.java:149)
> at
> org.apache.hadoop.io.compress.BlockCompressorStream.finish(BlockCompressorStream.java:142)
> at
> org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncoding(HFileBlockDefaultEncodingContext.java:238)
> at
> org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncodingWithBlockType(HFileBlockDefaultEncodingContext.java:149)
> at
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlock(HFileBlock.java:783)
> at
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.ensureBlockReady(HFileBlock.java:761)
> at
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlockAndWriteHeaderAndData(HFileBlock.java:878)
> at
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeHeaderAndData(HFileBlock.java:864)
> at
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateBlock(HFileBlockIndex.java:948)
> at
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateLevel(HFileBlockIndex.java:931)
> at
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIndexBlocks(HFileBlockIndex.java:840)
> at
> org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:367)
> at
> org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:974)
> at
> org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:69)
> at
> org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:70)
> - locked <0x00007fd8d1778be8> (a java.lang.Object)
> at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:829)
> at
> org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:1989)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1779)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1662)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1575)
> at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:461)
> at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:435)
> at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:66)
> at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:253)
> at java.lang.Thread.run(Thread.java:744)
>
> "MemStoreFlusher.12" prio=10 tid=0x00007fe694879800 nid=0x387b waiting on
> condition [0x00007fce7138e000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x00007fd09e031470> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
> at java.util.concurrent.DelayQueue.poll(DelayQueue.java:264)
> at java.util.concurrent.DelayQueue.poll(DelayQueue.java:68)
> at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:233)
> at java.lang.Thread.run(Thread.java:744)
>
> On June 3, 2016, at 00:08, Stack <st...@duboce.net> wrote:
>
> On Wed, Jun 1, 2016 at 3:03 AM, 吴国泉wgq <wgq.wu@qunar.com> wrote:
>
> hi all:
>
> 1. Is the region always on the same machine, or do you see this phenomenon
> on more than one machine?
>       Not always on the same machine, but always on the machine that
> holds the first region of the table (the only table whose first region
> cannot flush; when I restart the regionserver, the first region moves to
> another machine).
>
> 2. The RS is ABORTED? Because it can't flush? Is that what it says in the
> log? Can we see the log message around the ABORT?
>       Sorry, I did not express it clearly. It is the MemStore of the first
> region that can't flush, not the RS.
>       The RS log is like this:
>       INFO  [regionserver60020.periodicFlusher]
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
> flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be.
> after a delay of 9452
>       DEBUG [MemStoreFlusher.39] regionserver.HRegion: NOT flushing
> memstore for region
> qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be., flushing=true,
> writesEnabled=true
>
>       And the web UI shows:
>       Aborted flushing. (Not flushing since already flushing.) But the
> flusher thread never finishes.
>
>
>
>
> Can you back up in the log for this region? It looks like the first flush
> attempt failed or never completed. Perhaps there is a clue in the log
> message from the first attempt as to why the failure. The above log seems
> to be from a later stage, when we have gotten into the unhealthy state
> where we can't flush the first region memstore.
>
>
>
> 3. 100% disk only? The CPU does not go up too? Can we see a thread dump?
> Do jstack -l PID if you can.
>       Only the disk usage (command: df -h) increases faster than on other
> machines, not the IO usage. The CPU usage is very low.
>
>
>
> Sorry. Your disk is filling faster than on other machines? Or the iostat
> shows the disk is doing more seeks than on other machines?
>
>
>
> 4. Any other special configurations going on in this install? Phoenix or
> use of coprocessors?
>       No, no Phoenix. Only the AccessController coprocessor.
>
> 5. If you thread dump a few times, is it always stuck here?
>       Yes, always stuck here. Here is the jstack log (in this log it is
> MemStoreFlusher.13 that can't flush).
>
>
>       PS: As I see it, the problem is caused by the first region being
> unable to flush. But I do not know why it can't flush, or why only
> the first region of this one table has the problem.
>
>
>
> Please include a pointer to the stack trace. You can't have attachments on
> mail to this list.
>
> Thank you,
> St.Ack
>
>
>
>
>

Re: Memstore blocking

Posted by 吴国泉wgq <wg...@qunar.com>.
HI STACK:

   1.   The log is very large, so I picked part of it, but it does not seem to provide valuable info. Here the region named qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be cannot flush.

      When the flush thread works well, the log looks like this:
      2016-05-24 12:38:27,071 INFO  [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be. after a delay of 16681
2016-05-24 12:38:37,071 INFO  [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be. after a delay of 8684
2016-05-24 12:38:43,753 INFO  [MemStoreFlusher.2] regionserver.HRegion: Started memstore flush for qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be., current region memstore size 305.3 K
2016-05-24 12:38:43,753 WARN  [MemStoreFlusher.2] wal.FSHLog: Couldn't find oldest seqNum for the region we are about to flush: [dd8f92e3c161a8534b30ab17c28ae8be]
2016-05-24 12:38:43,816 INFO  [MemStoreFlusher.2] regionserver.DefaultStoreFlusher: Flushed, sequenceid=54259, memsize=305.3 K, hasBloomFilter=true, into tmp file hdfs://cluster-tc-qtrace:8020/hbase/tc_qtrace/data/default/qtrace/dd8f92e3c161a8534b30ab17c28ae8be/.tmp/fddbb05945354d5cbdae4afd24e5bb9d
2016-05-24 12:38:43,822 DEBUG [MemStoreFlusher.2] regionserver.HRegionFileSystem: Committing store file hdfs://cluster-tc-qtrace:8020/hbase/tc_qtrace/data/default/qtrace/dd8f92e3c161a8534b30ab17c28ae8be/.tmp/fddbb05945354d5cbdae4afd24e5bb9d as hdfs://cluster-tc-qtrace:8020/hbase/tc_qtrace/data/default/qtrace/dd8f92e3c161a8534b30ab17c28ae8be/t/fddbb05945354d5cbdae4afd24e5bb9d
2016-05-24 12:38:43,837 INFO  [MemStoreFlusher.2] regionserver.HStore: Added hdfs://cluster-tc-qtrace:8020/hbase/tc_qtrace/data/default/qtrace/dd8f92e3c161a8534b30ab17c28ae8be/t/fddbb05945354d5cbdae4afd24e5bb9d, entries=108, sequenceid=54259, filesize=68.3 K
2016-05-24 12:38:43,837 INFO  [MemStoreFlusher.2] regionserver.HRegion: Finished memstore flush of ~305.3 K/312664, currentsize=0/0 for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be. in 84ms, sequenceid=54259, compaction requested=true
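
(For reference, the same flush can also be requested by hand; a sketch against the 1.0 Admin API:)

  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.TableName;
  import org.apache.hadoop.hbase.client.Admin;
  import org.apache.hadoop.hbase.client.Connection;
  import org.apache.hadoop.hbase.client.ConnectionFactory;

  public class ManualFlush {
    public static void main(String[] args) throws Exception {
      try (Connection conn =
               ConnectionFactory.createConnection(HBaseConfiguration.create());
           Admin admin = conn.getAdmin()) {
        // Asks the regionservers to flush every memstore of the table; a
        // region whose previous flush never finished will still refuse it.
        admin.flush(TableName.valueOf("qtrace"));
      }
    }
  }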

      And when the flush thread does not work well, the log always shows:
2016-05-25 14:57:02,588 INFO  [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be. after a delay of 18068
2016-05-25 14:57:12,587 INFO  [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be. after a delay of 13165
2016-05-25 14:57:20,656 DEBUG [MemStoreFlusher.36] regionserver.HRegion: NOT flushing memstore for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be., flushing=true, writesEnabled=true
2016-05-25 14:57:22,587 INFO  [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be. after a delay of 5526
2016-05-25 14:57:28,113 DEBUG [MemStoreFlusher.34] regionserver.HRegion: NOT flushing memstore for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be., flushing=true, writesEnabled=true
2016-05-25 14:57:32,587 INFO  [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be. after a delay of 8178
2016-05-25 14:57:40,767 DEBUG [MemStoreFlusher.9] regionserver.HRegion: NOT flushing memstore for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be., flushing=true, writesEnabled=true
2016-05-25 14:57:42,587 INFO  [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be. after a delay of 22068
2016-05-25 14:57:52,587 INFO  [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be. after a delay of 5492
2016-05-25 14:58:02,587 INFO  [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be. after a delay of 10472
2016-05-25 14:58:04,655 DEBUG [MemStoreFlusher.23] regionserver.HRegion: NOT flushing memstore for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be., flushing=true, writesEnabled=true
2016-05-25 14:58:12,587 INFO  [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be. after a delay of 8435
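
(The "NOT flushing ... flushing=true" line comes from a guard at the top of HRegion's flush path: a "flushing" flag is set when a flush starts and only cleared when it finishes, so one wedged flusher thread makes every later attempt give up. A toy model of that guard follows; this is a sketch whose names follow the DEBUG line, not HBase source:)

  public class FlushGuardDemo {
    static class WriteState {
      volatile boolean flushing = false;
      volatile boolean writesEnabled = true;
    }

    private final WriteState writestate = new WriteState();

    boolean tryFlush() {
      synchronized (writestate) {
        if (!writestate.flushing && writestate.writesEnabled) {
          writestate.flushing = true;   // claimed; cleared by finishFlush()
          return true;
        }
        // Matches the shape of the DEBUG line above: the flag set by the
        // never-finished flush is still in place.
        System.out.println("NOT flushing memstore, flushing="
            + writestate.flushing + ", writesEnabled=" + writestate.writesEnabled);
        return false;
      }
    }

    void finishFlush() {
      synchronized (writestate) {
        writestate.flushing = false;
      }
    }

    public static void main(String[] args) {
      FlushGuardDemo region = new FlushGuardDemo();
      region.tryFlush();   // first flush claims the flag and, here, never finishes
      region.tryFlush();   // so every later attempt prints the NOT-flushing line
    }
  }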

     I checked the log from before the region stopped flushing, but found nothing unusual.
     By the way, this table has a heavy write load, so I turned off the WAL.
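
(Turning off the WAL presumably means per-mutation durability, as in this sketch; the qualifier and value are made up, and only family "t" comes from the logs above:)

  import org.apache.hadoop.hbase.client.Durability;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.util.Bytes;

  public class SkipWalPut {
    public static Put noWalPut(byte[] row) {
      Put put = new Put(row);
      put.addColumn(Bytes.toBytes("t"), Bytes.toBytes("q"), Bytes.toBytes("v"));
      // Skip the write-ahead log for this mutation: faster writes, but until
      // a flush succeeds the memstore holds the only copy of these edits.
      put.setDurability(Durability.SKIP_WAL);
      return put;
    }
  }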


2. It is that the disk is filling faster than on other machines, not what iostat shows.

3. This is the jstack output for the MemStoreFlusher threads:
"MemStoreFlusher.14" prio=10 tid=0x00007fe69487d800 nid=0x387d waiting on condition [0x00007fce7118c000]
   java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00007fd09e031470> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
at java.util.concurrent.DelayQueue.poll(DelayQueue.java:264)
at java.util.concurrent.DelayQueue.poll(DelayQueue.java:68)
at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:233)
at java.lang.Thread.run(Thread.java:744)

"MemStoreFlusher.13" prio=10 tid=0x00007fe69487b800 nid=0x387c runnable [0x00007fce7128c000]
   java.lang.Thread.State: RUNNABLE
at org.apache.hadoop.io.compress.snappy.SnappyCompressor.compressBytesDirect(Native Method)
at org.apache.hadoop.io.compress.snappy.SnappyCompressor.compress(SnappyCompressor.java:232)
- locked <0x00007fcfe3ba1190> (a org.apache.hadoop.io.compress.snappy.SnappyCompressor)
at org.apache.hadoop.io.compress.BlockCompressorStream.compress(BlockCompressorStream.java:149)
at org.apache.hadoop.io.compress.BlockCompressorStream.finish(BlockCompressorStream.java:142)
at org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncoding(HFileBlockDefaultEncodingContext.java:238)
at org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncodingWithBlockType(HFileBlockDefaultEncodingContext.java:149)
at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlock(HFileBlock.java:783)
at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.ensureBlockReady(HFileBlock.java:761)
at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlockAndWriteHeaderAndData(HFileBlock.java:878)
at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeHeaderAndData(HFileBlock.java:864)
at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateBlock(HFileBlockIndex.java:948)
at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateLevel(HFileBlockIndex.java:931)
at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIndexBlocks(HFileBlockIndex.java:840)
at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:367)
at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:974)
at org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:69)
at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:70)
- locked <0x00007fd8d1778be8> (a java.lang.Object)
at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:829)
at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:1989)
at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1779)
at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1662)
at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1575)
at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:461)
at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:435)
at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:66)
at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:253)
at java.lang.Thread.run(Thread.java:744)

"MemStoreFlusher.12" prio=10 tid=0x00007fe694879800 nid=0x387b waiting on condition [0x00007fce7138e000]
   java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00007fd09e031470> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
at java.util.concurrent.DelayQueue.poll(DelayQueue.java:264)
at java.util.concurrent.DelayQueue.poll(DelayQueue.java:68)
at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:233)
at java.lang.Thread.run(Thread.java:744)
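
(How to read the dumps: the frames at MemStoreFlusher.java:233 are idle flusher threads parked in a timed poll of the flush queue, a java.util.concurrent.DelayQueue, so the TIMED_WAITING (parking) state itself is normal; only the RUNNABLE thread is doing work. Below is a self-contained toy, not HBase code, that reproduces the same blocking pattern.)

import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Toy model of the flusher idle loop: a thread blocking in
// DelayQueue.poll(timeout) shows the same LockSupport.parkNanos /
// AbstractQueuedSynchronizer frames as the dumps above.
public class FlushQueueDemo {
  static class Entry implements Delayed {
    final long dueNanos = System.nanoTime() + TimeUnit.SECONDS.toNanos(1);
    public long getDelay(TimeUnit unit) {
      return unit.convert(dueNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }
    public int compareTo(Delayed other) {
      return Long.compare(getDelay(TimeUnit.NANOSECONDS),
          other.getDelay(TimeUnit.NANOSECONDS));
    }
  }

  public static void main(String[] args) throws InterruptedException {
    DelayQueue<Entry> queue = new DelayQueue<Entry>();
    queue.put(new Entry());
    // A jstack taken during this poll shows the thread TIMED_WAITING in
    // DelayQueue.poll, just like the idle MemStoreFlusher.N threads.
    Entry entry = queue.poll(10, TimeUnit.SECONDS);
    System.out.println(entry != null ? "entry ready, a flusher would run now"
                                     : "poll timed out");
  }
}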

On Jun 3, 2016, at 12:08 AM, Stack <st...@duboce.net> wrote:

On Wed, Jun 1, 2016 at 3:03 AM, 吴国泉wgq <wg...@qunar.com> wrote:

hi all:

1.Is region always on same machine or do you see this phenomenon on more
than one machine?
      Not always on the same machine, but always on the machine which
holds the first region of the table (the only table whose first region
cannot flush; when I restart the regionserver, the first region moves to
another machine)

2.The RS is ABORTED? Because it can't flush? Is that what it says in the
log? Can we see the log message around the ABORT?
      Sorry, I did not express that clearly. It is the MemStore of the first
region that can't flush, not the RS.
      The RS Log is like this:
      INFO  [regionserver60020.periodicFlusher]
regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be.
after a delay of 9452
      DEBUG [MemStoreFlusher.39] regionserver.HRegion: NOT flushing
memstore for region
qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be., flushing=true,
writesEnabled=true

      And the web UI shows:
      Aborted flushing. (Not flushing since already flushing.) But the
flusher thread never finishes.




Can you back up in the log for this region? It looks like the first flush
attempt failed or never completed. Perhaps there is a clue in the log
message from the first attempt as to why the failure. The above log seems
to be from a later stage, when we have gotten into the unhealthy state
where we can't flush the first region memstore.
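
(My reading of the mechanics behind a permanent flushing=true, simplified and not the actual HRegion code: the per-region flag is set when a flush starts and is only cleared when that flush returns, so a single flush that never returns makes every later request log "NOT flushing" forever. A self-contained model of that guard:)

public class FlushGuardDemo {
  private boolean flushing = false;          // models the per-region flush flag
  private final boolean writesEnabled = true;

  private synchronized boolean tryStartFlush() {
    if (flushing || !writesEnabled) {
      System.out.println("NOT flushing memstore for region demo, flushing="
          + flushing + ", writesEnabled=" + writesEnabled);
      return false;
    }
    flushing = true;
    return true;
  }

  void flush(long workMillis) throws InterruptedException {
    if (!tryStartFlush()) {
      return;                                // refused: some flush is still "running"
    }
    try {
      Thread.sleep(workMillis);              // stand-in for the real flush work
    } finally {
      synchronized (this) { flushing = false; }  // never reached if the work hangs
    }
  }

  public static void main(String[] args) throws Exception {
    final FlushGuardDemo region = new FlushGuardDemo();
    Thread stuck = new Thread(new Runnable() {
      public void run() {
        try { region.flush(Long.MAX_VALUE); } catch (InterruptedException e) { }
      }
    });
    stuck.setDaemon(true);
    stuck.start();                           // first flush starts and never finishes
    Thread.sleep(100);
    region.flush(0);                         // prints "NOT flushing ..." from now on
  }
}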



3.100% disk only? The CPU does not go up too?  Can we see a thread dump?
Do jstack -l  PID if you can
      Only the disk usage (df -h) increases faster than on the other
machines, not the I/O usage. The CPU usage is very low.



Sorry. Your disk is filling faster than on other machines? Or the iostat
shows the disk is doing more seeks than on other machines?



4.Any other special configurations going on on this install? Phoenix or
use of Coprocessors?
      No, no Phoenix. Only the AccessController coprocessor.

5.If you thread dump a few times, it is always stuck here?
      Yes, it is always stuck here. Here is the jstack log (in this dump
it is MemStoreFlusher.13 that can't flush).


      PS: As I see it, the problem is that the first region cannot
flush. But I do not know why it can't flush, or why only the first
region of this one table has the problem.
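
(Aside: a flush can also be requested by hand, which is a way to watch this guard from the client side; a minimal sketch against the 1.0 Admin API. In the wedged state it would be expected to just trigger another "NOT flushing ... flushing=true" line, so it is a probe, not a fix.)

import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class ManualFlush {
  public static void main(String[] args) throws IOException {
    try (Connection conn = ConnectionFactory.createConnection();
         Admin admin = conn.getAdmin()) {
      // Asks the servers to flush every region of the table, including the
      // stuck first region.
      admin.flush(TableName.valueOf("qtrace"));
    }
  }
}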



Please include a pointer to the stack trace. You can't have attachments on
mail to this list.

Thank you,
St.Ack





On Jun 1, 2016, at 3:10 AM, Stack <st...@duboce.net> wrote:

On Mon, May 30, 2016 at 7:03 PM, 聪聪 <17...@qq.com> wrote:

HI ALL:       Recently,I met a strange problem,  the first Region’s
Memstore of one table (the only one) often blocked when flushing.(Both
Version:  hbase-0.98.6-cdh5.2.0  and 1.0.1.1, I updated 0.98 to
1.0.1.1,hope to solve the problem,But failed)



Is region always on same machine or do you see this phenomenon on more than
one machine?



     On the web UI, I can see the status shows:  ABORTED(since XXsec
ago), Not flushing since already flushing.



The RS is ABORTED? Because it can't flush? Is that what it says in the log?
Can we see the log message around the ABORT?



     But it will never flush success, and the usage of the disk will
increase very high.Now other regionservers just use 30% of the disk
capacity, the problematic region server will increase to 100%,unless
killing the process.



100% disk only? The CPU does not go up too?

Can we see a thread dump? Do jstack -l  PID if you can.



     What’s more, the region server process cannot be shutdown
normally,every time I have to use the KILL -9 command.
     I check the log,the reason why cannot flush is one of the
MemstoreFlusher cannot exiting.
     The log is like blow:
     2016-05-29 19:54:11,982 INFO  [MemStoreFlusher.13]
regionserver.MemStoreFlusher: MemStoreFlusher.13 exiting
2016-05-29 19:54:13,016 INFO  [MemStoreFlusher.6]
regionserver.MemStoreFlusher: MemStoreFlusher.6 exiting
2016-05-29 19:54:13,260 INFO  [MemStoreFlusher.16]
regionserver.MemStoreFlusher: MemStoreFlusher.16 exiting
2016-05-29 19:54:16,032 INFO  [MemStoreFlusher.33]
regionserver.MemStoreFlusher: MemStoreFlusher.33 exiting
2016-05-29 19:54:16,341 INFO  [MemStoreFlusher.25]
regionserver.MemStoreFlusher: MemStoreFlusher.25 exiting
2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.31]
regionserver.MemStoreFlusher: MemStoreFlusher.31 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.29]
regionserver.MemStoreFlusher: MemStoreFlusher.29 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.23]
regionserver.MemStoreFlusher: MemStoreFlusher.23 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.32]
regionserver.MemStoreFlusher: MemStoreFlusher.32 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.1]
regionserver.MemStoreFlusher: MemStoreFlusher.1 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.38]
regionserver.MemStoreFlusher: MemStoreFlusher.38 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.10]
regionserver.MemStoreFlusher: MemStoreFlusher.10 exiting
2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.7]
regionserver.MemStoreFlusher: MemStoreFlusher.7 exiting
2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.12]
regionserver.MemStoreFlusher: MemStoreFlusher.12 exiting
2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.21]
regionserver.MemStoreFlusher: MemStoreFlusher.21 exiting
2016-05-29 19:54:16,622 INFO  [MemStoreFlusher.37]
regionserver.MemStoreFlusher: MemStoreFlusher.37 exiting
2016-05-29 19:54:16,622 INFO  [MemStoreFlusher.24]
regionserver.MemStoreFlusher: MemStoreFlusher.24 exiting
2016-05-29 19:54:16,806 INFO  [MemStoreFlusher.39]
regionserver.MemStoreFlusher: MemStoreFlusher.39 exiting
2016-05-29 19:54:16,908 INFO  [MemStoreFlusher.17]
regionserver.MemStoreFlusher: MemStoreFlusher.17 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.3]
regionserver.MemStoreFlusher: MemStoreFlusher.3 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.27]
regionserver.MemStoreFlusher: MemStoreFlusher.27 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.18]
regionserver.MemStoreFlusher: MemStoreFlusher.18 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.4]
regionserver.MemStoreFlusher: MemStoreFlusher.4 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.19]
regionserver.MemStoreFlusher: MemStoreFlusher.19 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.22]
regionserver.MemStoreFlusher: MemStoreFlusher.22 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.9]
regionserver.MemStoreFlusher: MemStoreFlusher.9 exiting
2016-05-29 19:54:17,137 INFO  [MemStoreFlusher.36]
regionserver.MemStoreFlusher: MemStoreFlusher.36 exiting
2016-05-29 19:54:17,263 INFO  [MemStoreFlusher.11]
regionserver.MemStoreFlusher: MemStoreFlusher.11 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.14]
regionserver.MemStoreFlusher: MemStoreFlusher.14 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.2]
regionserver.MemStoreFlusher: MemStoreFlusher.2 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.26]
regionserver.MemStoreFlusher: MemStoreFlusher.26 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.0]
regionserver.MemStoreFlusher: MemStoreFlusher.0 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.30]
regionserver.MemStoreFlusher: MemStoreFlusher.30 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.28]
regionserver.MemStoreFlusher: MemStoreFlusher.28 exiting


here is the jstack log when I try to shutdown the region server process
normally:
we can see the MemstoreFlusher.34 meet some problems, it can not exiting
normally
(By the way, we can see “locked at snappy”, but it not always locked at
snappy,sometime it will locked at other place,so I think snappy is ok, Here
I just want to show the MemStoreFlusher meet some problem).


Any other special configurations going on on this install? Phoenix or use

of Coprocessors?



"MemStoreFlusher.34" prio=10 tid=0x00007f20b8939000 nid=0x511b runnable
[0x00007f08941c3000]
  java.lang.Thread.State: RUNNABLE
at

org.apache.hadoop.io.compress.snappy.SnappyCompressor.compressBytesDirect(Native
Method)
at

org.apache.hadoop.io.compress.snappy.SnappyCompressor.compress(SnappyCompressor.java:232)
- locked <0x00007f0a5f85c430> (a
org.apache.hadoop.io.compress.snappy.SnappyCompressor)
at

org.apache.hadoop.io.compress.BlockCompressorStream.compress(BlockCompressorStream.java:149)
at

org.apache.hadoop.io.compress.BlockCompressorStream.finish(BlockCompressorStream.java:142)
at

org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncoding(HFileBlockDefaultEncodingContext.java:238)
at

org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncodingWithBlockType(HFileBlockDefaultEncodingContext.java:149)
at

org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlock(HFileBlock.java:783)
at

org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.ensureBlockReady(HFileBlock.java:761)
at

org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlockAndWriteHeaderAndData(HFileBlock.java:878)
at

org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeHeaderAndData(HFileBlock.java:864)
at

org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateBlock(HFileBlockIndex.java:948)
at

org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateLevel(HFileBlockIndex.java:931)
at

org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIndexBlocks(HFileBlockIndex.java:840)
at

org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:367)
at

org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:974)
at

org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:69)
at

org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:70)
- locked <0x00007f09fdcccda0> (a java.lang.Object)
at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:829)
at

org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:1989)
at

org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1779)
at

org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1662)
at
org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1575)
at

org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:461)
at

org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:435)
at

org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:66)
at

org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:253)
at java.lang.Thread.run(Thread.java:744)



If you thread dump a few times, it is always stuck here?

Thanks,
St.Ack




"MemStoreFlusher.33" prio=10 tid=0x00007f20b8937000 nid=0x511a waiting on
condition [0x00007f08942c5000]
  java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00007f09b8e20f80> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at

java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
at java.util.concurrent.DelayQueue.poll(DelayQueue.java:264)
at java.util.concurrent.DelayQueue.poll(DelayQueue.java:68)
at

org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:233)
at java.lang.Thread.run(Thread.java:744)





  I do not know why the memstore cannot flush.
  Is this a bug?



 Begin forwarded message:

From: quan <32...@qq.com>

Subject: memstore flush blocked

Date: May 30, 2016, 10:47:43 AM GMT+8

To: user@hbase.apache.org

Cc: 蒲聪 <co...@qunar.com>


 HI ALL:       Recently,I met a strange problem,  the first Region’s
Memstore of one table (the only one) often blocked when flushing.(Both
Version:  hbase-0.98.6-cdh5.2.0  and 1.0.1.1, I updated 0.98 to
1.0.1.1,hope to solve the problem,But failed)
     On the web UI, I can see the status shows:  ABORTED(since XXsec
ago), Not flushing since already flushing.
     But it will never flush success, and the usage of the disk will
increase very high.Now other regionservers just use 30% of the disk
capacity, the problematic region server will increase to 100%,unless
killing the process.
     What’s more, the region server process cannot be shutdown
normally,every time I have to use the KILL -9 command.
     I check the log,the reason why cannot flush is one of the
MemstoreFlusher cannot exiting.
     The log is like blow:
     2016-05-29 19:54:11,982 INFO  [MemStoreFlusher.13]
regionserver.MemStoreFlusher: MemStoreFlusher.13 exiting
2016-05-29 19:54:13,016 INFO  [MemStoreFlusher.6]
regionserver.MemStoreFlusher: MemStoreFlusher.6 exiting
2016-05-29 19:54:13,260 INFO  [MemStoreFlusher.16]
regionserver.MemStoreFlusher: MemStoreFlusher.16 exiting
2016-05-29 19:54:16,032 INFO  [MemStoreFlusher.33]
regionserver.MemStoreFlusher: MemStoreFlusher.33 exiting
2016-05-29 19:54:16,341 INFO  [MemStoreFlusher.25]
regionserver.MemStoreFlusher: MemStoreFlusher.25 exiting
2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.31]
regionserver.MemStoreFlusher: MemStoreFlusher.31 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.29]
regionserver.MemStoreFlusher: MemStoreFlusher.29 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.23]
regionserver.MemStoreFlusher: MemStoreFlusher.23 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.32]
regionserver.MemStoreFlusher: MemStoreFlusher.32 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.1]
regionserver.MemStoreFlusher: MemStoreFlusher.1 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.38]
regionserver.MemStoreFlusher: MemStoreFlusher.38 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.10]
regionserver.MemStoreFlusher: MemStoreFlusher.10 exiting
2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.7]
regionserver.MemStoreFlusher: MemStoreFlusher.7 exiting
2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.12]
regionserver.MemStoreFlusher: MemStoreFlusher.12 exiting
2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.21]
regionserver.MemStoreFlusher: MemStoreFlusher.21 exiting
2016-05-29 19:54:16,622 INFO  [MemStoreFlusher.37]
regionserver.MemStoreFlusher: MemStoreFlusher.37 exiting
2016-05-29 19:54:16,622 INFO  [MemStoreFlusher.24]
regionserver.MemStoreFlusher: MemStoreFlusher.24 exiting
2016-05-29 19:54:16,806 INFO  [MemStoreFlusher.39]
regionserver.MemStoreFlusher: MemStoreFlusher.39 exiting
2016-05-29 19:54:16,908 INFO  [MemStoreFlusher.17]
regionserver.MemStoreFlusher: MemStoreFlusher.17 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.3]
regionserver.MemStoreFlusher: MemStoreFlusher.3 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.27]
regionserver.MemStoreFlusher: MemStoreFlusher.27 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.18]
regionserver.MemStoreFlusher: MemStoreFlusher.18 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.4]
regionserver.MemStoreFlusher: MemStoreFlusher.4 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.19]
regionserver.MemStoreFlusher: MemStoreFlusher.19 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.22]
regionserver.MemStoreFlusher: MemStoreFlusher.22 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.9]
regionserver.MemStoreFlusher: MemStoreFlusher.9 exiting
2016-05-29 19:54:17,137 INFO  [MemStoreFlusher.36]
regionserver.MemStoreFlusher: MemStoreFlusher.36 exiting
2016-05-29 19:54:17,263 INFO  [MemStoreFlusher.11]
regionserver.MemStoreFlusher: MemStoreFlusher.11 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.14]
regionserver.MemStoreFlusher: MemStoreFlusher.14 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.2]
regionserver.MemStoreFlusher: MemStoreFlusher.2 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.26]
regionserver.MemStoreFlusher: MemStoreFlusher.26 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.0]
regionserver.MemStoreFlusher: MemStoreFlusher.0 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.30]
regionserver.MemStoreFlusher: MemStoreFlusher.30 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.28]
regionserver.MemStoreFlusher: MemStoreFlusher.28 exiting


here is the jstack log when I try to shutdown the region server process
normally:
we can see the MemstoreFlusher.34 meet some problems, it can not exiting
normally
(By the way, we can see “locked at snappy”, but it not always locked at
snappy,sometime it will locked at other place,so I think snappy is ok, Here
I just want to show the MemStoreFlusher meet some problem).


"MemStoreFlusher.34" prio=10 tid=0x00007f20b8939000 nid=0x511b runnable
[0x00007f08941c3000]
  java.lang.Thread.State: RUNNABLE
at

org.apache.hadoop.io.compress.snappy.SnappyCompressor.compressBytesDirect(Native
Method)
at

org.apache.hadoop.io.compress.snappy.SnappyCompressor.compress(SnappyCompressor.java:232)
- locked <0x00007f0a5f85c430> (a
org.apache.hadoop.io.compress.snappy.SnappyCompressor)
at

org.apache.hadoop.io.compress.BlockCompressorStream.compress(BlockCompressorStream.java:149)
at

org.apache.hadoop.io.compress.BlockCompressorStream.finish(BlockCompressorStream.java:142)
at

org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncoding(HFileBlockDefaultEncodingContext.java:238)
at

org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncodingWithBlockType(HFileBlockDefaultEncodingContext.java:149)
at

org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlock(HFileBlock.java:783)
at

org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.ensureBlockReady(HFileBlock.java:761)
at

org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlockAndWriteHeaderAndData(HFileBlock.java:878)
at

org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeHeaderAndData(HFileBlock.java:864)
at

org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateBlock(HFileBlockIndex.java:948)
at

org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateLevel(HFileBlockIndex.java:931)
at

org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIndexBlocks(HFileBlockIndex.java:840)
at

org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:367)
at

org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:974)
at

org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:69)
at

org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:70)
- locked <0x00007f09fdcccda0> (a java.lang.Object)
at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:829)
at

org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:1989)
at

org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1779)
at

org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1662)
at
org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1575)
at

org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:461)
at

org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:435)
at

org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:66)
at

org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:253)
at java.lang.Thread.run(Thread.java:744)


"MemStoreFlusher.33" prio=10 tid=0x00007f20b8937000 nid=0x511a waiting on
condition [0x00007f08942c5000]
  java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00007f09b8e20f80> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at

java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
at java.util.concurrent.DelayQueue.poll(DelayQueue.java:264)
at java.util.concurrent.DelayQueue.poll(DelayQueue.java:68)
at

org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:233)
at java.lang.Thread.run(Thread.java:744)





  I do not know why the memstore cannot flush.
  Is this a bug?










Re: Memstore blocking

Posted by Stack <st...@duboce.net>.
On Wed, Jun 1, 2016 at 3:03 AM, 吴国泉wgq <wg...@qunar.com> wrote:

> hi all:
>
> 1.Is region always on same machine or do you see this phenomenon on more
> than one machine?
>        Not always on the same machine, but always on the machine which
> holds the first region of the table (the only table whose first region
> cannot flush; when I restart the regionserver, the first region moves to
> another machine)
>
>  2.The RS is ABORTED? Because it can't flush? Is that what it says in the
> log? Can we see the log message around the ABORT?
>        Sorry, I did not express that clearly. It is the MemStore of the first
> region that can't flush, not the RS.
>        The RS Log is like this:
>        INFO  [regionserver60020.periodicFlusher]
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
> flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be.
> after a delay of 9452
>        DEBUG [MemStoreFlusher.39] regionserver.HRegion: NOT flushing
> memstore for region
> qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be., flushing=true,
> writesEnabled=true
>
>        And the web UI shows:
>        Aborted flushing. (Not flushing since already flushing.) But the
> flusher thread never finishes.
>
>


Can you back up in the log for this region? It looks like the first flush
attempt failed or never completed. Perhaps there is a clue in the log
message from the first attempt as to why the failure. The above log seems
to be from a later stage, when we have gotten into the unhealthy state
where we can't flush the first region memstore.



>  3.100% disk only? The CPU does not go up too?  Can we see a thread dump?
> Do jstack -l  PID if you can
>        Only the disk usage (df -h) increases faster than on the other
> machines, not the I/O usage. The CPU usage is very low.
>
>
>
Sorry. Your disk is filling faster than on other machines? Or the iostat
shows the disk is doing more seeks than on other machines?



>  4.Any other special configurations going on on this install? Phoenix or
> use of Coprocessors?
>        No, no Phoenix. Only the AccessController coprocessor.
>
>  5.If you thread dump a few times, it is always stuck here?
>        Yes, it is always stuck here. Here is the jstack log (in this dump
> it is MemStoreFlusher.13 that can't flush).
>
>
>        PS: As I see it, the problem is that the first region cannot
> flush. But I do not know why it can't flush, or why only the first
> region of this one table has the problem.
>


Please include a pointer to the stack trace. You can't have attachments on
mail to this list.

Thank you,
St.Ack



>
>
> On Jun 1, 2016, at 3:10 AM, Stack <st...@duboce.net> wrote:
>
> On Mon, May 30, 2016 at 7:03 PM, 聪聪 <17...@qq.com> wrote:
>
> HI ALL:       Recently,I met a strange problem,  the first Region’s
> Memstore of one table (the only one) often blocked when flushing.(Both
> Version:  hbase-0.98.6-cdh5.2.0  and 1.0.1.1, I updated 0.98 to
> 1.0.1.1,hope to solve the problem,But failed)
>
>
>
> Is region always on same machine or do you see this phenomenon on more than
> one machine?
>
>
>
>       On the web UI, I can see the status shows:  ABORTED(since XXsec
> ago), Not flushing since already flushing.
>
>
>
> The RS is ABORTED? Because it can't flush? Is that what it says in the log?
> Can we see the log message around the ABORT?
>
>
>
>       But it will never flush success, and the usage of the disk will
> increase very high.Now other regionservers just use 30% of the disk
> capacity, the problematic region server will increase to 100%,unless
> killing the process.
>
>
>
> 100% disk only? The CPU does not go up too?
>
> Can we see a thread dump? Do jstack -l  PID if you can.
>
>
>
>       What’s more, the region server process cannot be shutdown
> normally,every time I have to use the KILL -9 command.
>       I check the log,the reason why cannot flush is one of the
> MemstoreFlusher cannot exiting.
>       The log is like blow:
>       2016-05-29 19:54:11,982 INFO  [MemStoreFlusher.13]
> regionserver.MemStoreFlusher: MemStoreFlusher.13 exiting
> 2016-05-29 19:54:13,016 INFO  [MemStoreFlusher.6]
> regionserver.MemStoreFlusher: MemStoreFlusher.6 exiting
> 2016-05-29 19:54:13,260 INFO  [MemStoreFlusher.16]
> regionserver.MemStoreFlusher: MemStoreFlusher.16 exiting
> 2016-05-29 19:54:16,032 INFO  [MemStoreFlusher.33]
> regionserver.MemStoreFlusher: MemStoreFlusher.33 exiting
> 2016-05-29 19:54:16,341 INFO  [MemStoreFlusher.25]
> regionserver.MemStoreFlusher: MemStoreFlusher.25 exiting
> 2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.31]
> regionserver.MemStoreFlusher: MemStoreFlusher.31 exiting
> 2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.29]
> regionserver.MemStoreFlusher: MemStoreFlusher.29 exiting
> 2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.23]
> regionserver.MemStoreFlusher: MemStoreFlusher.23 exiting
> 2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.32]
> regionserver.MemStoreFlusher: MemStoreFlusher.32 exiting
> 2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.1]
> regionserver.MemStoreFlusher: MemStoreFlusher.1 exiting
> 2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.38]
> regionserver.MemStoreFlusher: MemStoreFlusher.38 exiting
> 2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.10]
> regionserver.MemStoreFlusher: MemStoreFlusher.10 exiting
> 2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.7]
> regionserver.MemStoreFlusher: MemStoreFlusher.7 exiting
> 2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.12]
> regionserver.MemStoreFlusher: MemStoreFlusher.12 exiting
> 2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.21]
> regionserver.MemStoreFlusher: MemStoreFlusher.21 exiting
> 2016-05-29 19:54:16,622 INFO  [MemStoreFlusher.37]
> regionserver.MemStoreFlusher: MemStoreFlusher.37 exiting
> 2016-05-29 19:54:16,622 INFO  [MemStoreFlusher.24]
> regionserver.MemStoreFlusher: MemStoreFlusher.24 exiting
> 2016-05-29 19:54:16,806 INFO  [MemStoreFlusher.39]
> regionserver.MemStoreFlusher: MemStoreFlusher.39 exiting
> 2016-05-29 19:54:16,908 INFO  [MemStoreFlusher.17]
> regionserver.MemStoreFlusher: MemStoreFlusher.17 exiting
> 2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.3]
> regionserver.MemStoreFlusher: MemStoreFlusher.3 exiting
> 2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.27]
> regionserver.MemStoreFlusher: MemStoreFlusher.27 exiting
> 2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.18]
> regionserver.MemStoreFlusher: MemStoreFlusher.18 exiting
> 2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.4]
> regionserver.MemStoreFlusher: MemStoreFlusher.4 exiting
> 2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.19]
> regionserver.MemStoreFlusher: MemStoreFlusher.19 exiting
> 2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.22]
> regionserver.MemStoreFlusher: MemStoreFlusher.22 exiting
> 2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.9]
> regionserver.MemStoreFlusher: MemStoreFlusher.9 exiting
> 2016-05-29 19:54:17,137 INFO  [MemStoreFlusher.36]
> regionserver.MemStoreFlusher: MemStoreFlusher.36 exiting
> 2016-05-29 19:54:17,263 INFO  [MemStoreFlusher.11]
> regionserver.MemStoreFlusher: MemStoreFlusher.11 exiting
> 2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.14]
> regionserver.MemStoreFlusher: MemStoreFlusher.14 exiting
> 2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.2]
> regionserver.MemStoreFlusher: MemStoreFlusher.2 exiting
> 2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.26]
> regionserver.MemStoreFlusher: MemStoreFlusher.26 exiting
> 2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.0]
> regionserver.MemStoreFlusher: MemStoreFlusher.0 exiting
> 2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.30]
> regionserver.MemStoreFlusher: MemStoreFlusher.30 exiting
> 2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.28]
> regionserver.MemStoreFlusher: MemStoreFlusher.28 exiting
>
>
> here is the jstack log when I try to shutdown the region server process
> normally:
> we can see the MemstoreFlusher.34 meet some problems, it can not exiting
> normally
> (By the way, we can see “locked at snappy”, but it not always locked at
> snappy,sometime it will locked at other place,so I think snappy is ok, Here
> I just want to show the MemStoreFlusher meet some problem).
>
>
> Any other special configurations going on on this install? Phoenix or use
>
> of Coprocessors?
>
>
>
>  "MemStoreFlusher.34" prio=10 tid=0x00007f20b8939000 nid=0x511b runnable
> [0x00007f08941c3000]
>    java.lang.Thread.State: RUNNABLE
> at
>
> org.apache.hadoop.io.compress.snappy.SnappyCompressor.compressBytesDirect(Native
> Method)
> at
>
> org.apache.hadoop.io.compress.snappy.SnappyCompressor.compress(SnappyCompressor.java:232)
> - locked <0x00007f0a5f85c430> (a
> org.apache.hadoop.io.compress.snappy.SnappyCompressor)
> at
>
> org.apache.hadoop.io.compress.BlockCompressorStream.compress(BlockCompressorStream.java:149)
> at
>
> org.apache.hadoop.io.compress.BlockCompressorStream.finish(BlockCompressorStream.java:142)
> at
>
> org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncoding(HFileBlockDefaultEncodingContext.java:238)
> at
>
> org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncodingWithBlockType(HFileBlockDefaultEncodingContext.java:149)
> at
>
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlock(HFileBlock.java:783)
> at
>
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.ensureBlockReady(HFileBlock.java:761)
> at
>
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlockAndWriteHeaderAndData(HFileBlock.java:878)
> at
>
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeHeaderAndData(HFileBlock.java:864)
> at
>
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateBlock(HFileBlockIndex.java:948)
> at
>
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateLevel(HFileBlockIndex.java:931)
> at
>
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIndexBlocks(HFileBlockIndex.java:840)
> at
>
> org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:367)
> at
>
> org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:974)
> at
>
> org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:69)
> at
>
> org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:70)
> - locked <0x00007f09fdcccda0> (a java.lang.Object)
> at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:829)
> at
>
> org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:1989)
> at
>
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1779)
> at
>
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1662)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1575)
> at
>
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:461)
> at
>
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:435)
> at
>
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:66)
> at
>
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:253)
> at java.lang.Thread.run(Thread.java:744)
>
>
>
> If you thread dump a few times, it is always stuck here?
>
> Thanks,
> St.Ack
>
>
>
>
> "MemStoreFlusher.33" prio=10 tid=0x00007f20b8937000 nid=0x511a waiting on
> condition [0x00007f08942c5000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x00007f09b8e20f80> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
> at
>
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
> at java.util.concurrent.DelayQueue.poll(DelayQueue.java:264)
> at java.util.concurrent.DelayQueue.poll(DelayQueue.java:68)
> at
>
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:233)
> at java.lang.Thread.run(Thread.java:744)
>
>
>
>
>
>    I do not know why the memstore cannot flush.
>    Is this a bug?
>
>
>
>   Begin forwarded message:
>
>  From: quan <32...@qq.com>
>
>  Subject: memstore flush blocked
>
>  Date: May 30, 2016, 10:47:43 AM GMT+8
>
>  To: user@hbase.apache.org
>
>  Cc: 蒲聪 <co...@qunar.com>
>
>
>   HI ALL:       Recently,I met a strange problem,  the first Region’s
> Memstore of one table (the only one) often blocked when flushing.(Both
> Version:  hbase-0.98.6-cdh5.2.0  and 1.0.1.1, I updated 0.98 to
> 1.0.1.1,hope to solve the problem,But failed)
>       On the web UI, I can see the status shows:  ABORTED(since XXsec
> ago), Not flushing since already flushing.
>       But it will never flush success, and the usage of the disk will
> increase very high.Now other regionservers just use 30% of the disk
> capacity, the problematic region server will increase to 100%,unless
> killing the process.
>       What’s more, the region server process cannot be shutdown
> normally,every time I have to use the KILL -9 command.
>       I check the log,the reason why cannot flush is one of the
> MemstoreFlusher cannot exiting.
>       The log is like blow:
>       2016-05-29 19:54:11,982 INFO  [MemStoreFlusher.13]
> regionserver.MemStoreFlusher: MemStoreFlusher.13 exiting
> 2016-05-29 19:54:13,016 INFO  [MemStoreFlusher.6]
> regionserver.MemStoreFlusher: MemStoreFlusher.6 exiting
> 2016-05-29 19:54:13,260 INFO  [MemStoreFlusher.16]
> regionserver.MemStoreFlusher: MemStoreFlusher.16 exiting
> 2016-05-29 19:54:16,032 INFO  [MemStoreFlusher.33]
> regionserver.MemStoreFlusher: MemStoreFlusher.33 exiting
> 2016-05-29 19:54:16,341 INFO  [MemStoreFlusher.25]
> regionserver.MemStoreFlusher: MemStoreFlusher.25 exiting
> 2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.31]
> regionserver.MemStoreFlusher: MemStoreFlusher.31 exiting
> 2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.29]
> regionserver.MemStoreFlusher: MemStoreFlusher.29 exiting
> 2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.23]
> regionserver.MemStoreFlusher: MemStoreFlusher.23 exiting
> 2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.32]
> regionserver.MemStoreFlusher: MemStoreFlusher.32 exiting
> 2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.1]
> regionserver.MemStoreFlusher: MemStoreFlusher.1 exiting
> 2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.38]
> regionserver.MemStoreFlusher: MemStoreFlusher.38 exiting
> 2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.10]
> regionserver.MemStoreFlusher: MemStoreFlusher.10 exiting
> 2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.7]
> regionserver.MemStoreFlusher: MemStoreFlusher.7 exiting
> 2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.12]
> regionserver.MemStoreFlusher: MemStoreFlusher.12 exiting
> 2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.21]
> regionserver.MemStoreFlusher: MemStoreFlusher.21 exiting
> 2016-05-29 19:54:16,622 INFO  [MemStoreFlusher.37]
> regionserver.MemStoreFlusher: MemStoreFlusher.37 exiting
> 2016-05-29 19:54:16,622 INFO  [MemStoreFlusher.24]
> regionserver.MemStoreFlusher: MemStoreFlusher.24 exiting
> 2016-05-29 19:54:16,806 INFO  [MemStoreFlusher.39]
> regionserver.MemStoreFlusher: MemStoreFlusher.39 exiting
> 2016-05-29 19:54:16,908 INFO  [MemStoreFlusher.17]
> regionserver.MemStoreFlusher: MemStoreFlusher.17 exiting
> 2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.3]
> regionserver.MemStoreFlusher: MemStoreFlusher.3 exiting
> 2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.27]
> regionserver.MemStoreFlusher: MemStoreFlusher.27 exiting
> 2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.18]
> regionserver.MemStoreFlusher: MemStoreFlusher.18 exiting
> 2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.4]
> regionserver.MemStoreFlusher: MemStoreFlusher.4 exiting
> 2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.19]
> regionserver.MemStoreFlusher: MemStoreFlusher.19 exiting
> 2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.22]
> regionserver.MemStoreFlusher: MemStoreFlusher.22 exiting
> 2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.9]
> regionserver.MemStoreFlusher: MemStoreFlusher.9 exiting
> 2016-05-29 19:54:17,137 INFO  [MemStoreFlusher.36]
> regionserver.MemStoreFlusher: MemStoreFlusher.36 exiting
> 2016-05-29 19:54:17,263 INFO  [MemStoreFlusher.11]
> regionserver.MemStoreFlusher: MemStoreFlusher.11 exiting
> 2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.14]
> regionserver.MemStoreFlusher: MemStoreFlusher.14 exiting
> 2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.2]
> regionserver.MemStoreFlusher: MemStoreFlusher.2 exiting
> 2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.26]
> regionserver.MemStoreFlusher: MemStoreFlusher.26 exiting
> 2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.0]
> regionserver.MemStoreFlusher: MemStoreFlusher.0 exiting
> 2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.30]
> regionserver.MemStoreFlusher: MemStoreFlusher.30 exiting
> 2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.28]
> regionserver.MemStoreFlusher: MemStoreFlusher.28 exiting
>
>
> here is the jstack log when I try to shutdown the region server process
> normally:
> we can see the MemstoreFlusher.34 meet some problems, it can not exiting
> normally
> (By the way, we can see “locked at snappy”, but it not always locked at
> snappy,sometime it will locked at other place,so I think snappy is ok, Here
> I just want to show the MemStoreFlusher meet some problem).
>
>
>  "MemStoreFlusher.34" prio=10 tid=0x00007f20b8939000 nid=0x511b runnable
> [0x00007f08941c3000]
>    java.lang.Thread.State: RUNNABLE
> at
>
> org.apache.hadoop.io.compress.snappy.SnappyCompressor.compressBytesDirect(Native
> Method)
> at
>
> org.apache.hadoop.io.compress.snappy.SnappyCompressor.compress(SnappyCompressor.java:232)
> - locked <0x00007f0a5f85c430> (a
> org.apache.hadoop.io.compress.snappy.SnappyCompressor)
> at
>
> org.apache.hadoop.io.compress.BlockCompressorStream.compress(BlockCompressorStream.java:149)
> at
>
> org.apache.hadoop.io.compress.BlockCompressorStream.finish(BlockCompressorStream.java:142)
> at
>
> org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncoding(HFileBlockDefaultEncodingContext.java:238)
> at
>
> org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncodingWithBlockType(HFileBlockDefaultEncodingContext.java:149)
> at
>
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlock(HFileBlock.java:783)
> at
>
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.ensureBlockReady(HFileBlock.java:761)
> at
>
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlockAndWriteHeaderAndData(HFileBlock.java:878)
> at
>
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeHeaderAndData(HFileBlock.java:864)
> at
>
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateBlock(HFileBlockIndex.java:948)
> at
>
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateLevel(HFileBlockIndex.java:931)
> at
>
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIndexBlocks(HFileBlockIndex.java:840)
> at
>
> org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:367)
> at
>
> org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:974)
> at
>
> org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:69)
> at
>
> org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:70)
> - locked <0x00007f09fdcccda0> (a java.lang.Object)
> at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:829)
> at
>
> org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:1989)
> at
>
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1779)
> at
>
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1662)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1575)
> at
>
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:461)
> at
>
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:435)
> at
>
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:66)
> at
>
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:253)
> at java.lang.Thread.run(Thread.java:744)
>
>
> "MemStoreFlusher.33" prio=10 tid=0x00007f20b8937000 nid=0x511a waiting on
> condition [0x00007f08942c5000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x00007f09b8e20f80> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
> at
>
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
> at java.util.concurrent.DelayQueue.poll(DelayQueue.java:264)
> at java.util.concurrent.DelayQueue.poll(DelayQueue.java:68)
> at
>
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:233)
> at java.lang.Thread.run(Thread.java:744)
>
>
>
>
>
>    I do not know why the memstore cannot flush.
>    Is this a bug?
>
>
>
>
>
>

Re: Memstore blocking

Posted by 吴国泉wgq <wg...@qunar.com>.
hi all:

1.Is region always on same machine or do you see this phenomenon on more than one machine?
       Not always on the same machine, but always on the machine which holds the first region of the table (the only table whose first region cannot flush; when I restart the regionserver, the first region moves to another machine)

 2.The RS is ABORTED? Because it can't flush? Is that what it says in the log? Can we see the log message around the ABORT?
       Sorry, I did not express that clearly. It is the MemStore of the first region that can't flush, not the RS.
       The RS Log is like this:
       INFO  [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be. after a delay of 9452
       DEBUG [MemStoreFlusher.39] regionserver.HRegion: NOT flushing memstore for region qtrace,,1458012479440.dd8f92e3c161a8534b30ab17c28ae8be., flushing=true, writesEnabled=true

       And the web UI shows:
       Aborted flushing. (Not flushing since already flushing.) But the flusher thread never finishes.

 3.100% disk only? The CPU does not go up too?  Can we see a thread dump? Do jstack -l  PID if you can
       Only the disk usage (df -h) increases faster than on the other machines, not the I/O usage. The CPU usage is very low.


 4.Any other special configurations going on on this install? Phoenix or use of Coprocessors?
       No, no Phoenix. Only the AccessController coprocessor.

 5.If you thread dump a few times, it is always stuck here?
       Yes, it is always stuck here. Here is the jstack log (in this dump it is MemStoreFlusher.13 that can't flush).


       PS: As I see it, the problem is that the first region cannot flush. But I do not know why it can't flush, or why only the first region of this one table has the problem.


On Jun 1, 2016, at 3:10 AM, Stack <st...@duboce.net> wrote:

On Mon, May 30, 2016 at 7:03 PM, 聪聪 <17...@qq.com> wrote:

HI ALL:       Recently,I met a strange problem,  the first Region’s
Memstore of one table (the only one) often blocked when flushing.(Both
Version:  hbase-0.98.6-cdh5.2.0  and 1.0.1.1, I updated 0.98 to
1.0.1.1,hope to solve the problem,But failed)



Is region always on same machine or do you see this phenomenon on more than
one machine?



      On the web UI, I can see the status shows:  ABORTED(since XXsec
ago), Not flushing since already flushing.



The RS is ABORTED? Because it can't flush? Is that what it says in the log?
Can we see the log message around the ABORT?



      But it will never flush success, and the usage of the disk will
increase very high.Now other regionservers just use 30% of the disk
capacity, the problematic region server will increase to 100%,unless
killing the process.



100% disk only? The CPU does not go up too?

Can we see a thread dump? Do jstack -l  PID if you can.



      What’s more, the region server process cannot be shutdown
normally,every time I have to use the KILL -9 command.
      I check the log,the reason why cannot flush is one of the
MemstoreFlusher cannot exiting.
      The log is like blow:
      2016-05-29 19:54:11,982 INFO  [MemStoreFlusher.13]
regionserver.MemStoreFlusher: MemStoreFlusher.13 exiting
2016-05-29 19:54:13,016 INFO  [MemStoreFlusher.6]
regionserver.MemStoreFlusher: MemStoreFlusher.6 exiting
2016-05-29 19:54:13,260 INFO  [MemStoreFlusher.16]
regionserver.MemStoreFlusher: MemStoreFlusher.16 exiting
2016-05-29 19:54:16,032 INFO  [MemStoreFlusher.33]
regionserver.MemStoreFlusher: MemStoreFlusher.33 exiting
2016-05-29 19:54:16,341 INFO  [MemStoreFlusher.25]
regionserver.MemStoreFlusher: MemStoreFlusher.25 exiting
2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.31]
regionserver.MemStoreFlusher: MemStoreFlusher.31 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.29]
regionserver.MemStoreFlusher: MemStoreFlusher.29 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.23]
regionserver.MemStoreFlusher: MemStoreFlusher.23 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.32]
regionserver.MemStoreFlusher: MemStoreFlusher.32 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.1]
regionserver.MemStoreFlusher: MemStoreFlusher.1 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.38]
regionserver.MemStoreFlusher: MemStoreFlusher.38 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.10]
regionserver.MemStoreFlusher: MemStoreFlusher.10 exiting
2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.7]
regionserver.MemStoreFlusher: MemStoreFlusher.7 exiting
2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.12]
regionserver.MemStoreFlusher: MemStoreFlusher.12 exiting
2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.21]
regionserver.MemStoreFlusher: MemStoreFlusher.21 exiting
2016-05-29 19:54:16,622 INFO  [MemStoreFlusher.37]
regionserver.MemStoreFlusher: MemStoreFlusher.37 exiting
2016-05-29 19:54:16,622 INFO  [MemStoreFlusher.24]
regionserver.MemStoreFlusher: MemStoreFlusher.24 exiting
2016-05-29 19:54:16,806 INFO  [MemStoreFlusher.39]
regionserver.MemStoreFlusher: MemStoreFlusher.39 exiting
2016-05-29 19:54:16,908 INFO  [MemStoreFlusher.17]
regionserver.MemStoreFlusher: MemStoreFlusher.17 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.3]
regionserver.MemStoreFlusher: MemStoreFlusher.3 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.27]
regionserver.MemStoreFlusher: MemStoreFlusher.27 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.18]
regionserver.MemStoreFlusher: MemStoreFlusher.18 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.4]
regionserver.MemStoreFlusher: MemStoreFlusher.4 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.19]
regionserver.MemStoreFlusher: MemStoreFlusher.19 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.22]
regionserver.MemStoreFlusher: MemStoreFlusher.22 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.9]
regionserver.MemStoreFlusher: MemStoreFlusher.9 exiting
2016-05-29 19:54:17,137 INFO  [MemStoreFlusher.36]
regionserver.MemStoreFlusher: MemStoreFlusher.36 exiting
2016-05-29 19:54:17,263 INFO  [MemStoreFlusher.11]
regionserver.MemStoreFlusher: MemStoreFlusher.11 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.14]
regionserver.MemStoreFlusher: MemStoreFlusher.14 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.2]
regionserver.MemStoreFlusher: MemStoreFlusher.2 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.26]
regionserver.MemStoreFlusher: MemStoreFlusher.26 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.0]
regionserver.MemStoreFlusher: MemStoreFlusher.0 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.30]
regionserver.MemStoreFlusher: MemStoreFlusher.30 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.28]
regionserver.MemStoreFlusher: MemStoreFlusher.28 exiting


here is the jstack log when I try to shutdown the region server process
normally:
we can see the MemstoreFlusher.34 meet some problems, it can not exiting
normally
(By the way, we can see “locked at snappy”, but it not always locked at
snappy,sometime it will locked at other place,so I think snappy is ok, Here
I just want to show the MemStoreFlusher meet some problem).


Any other special configurations going on on this install? Phoenix or use
of Coprocessors?



 "MemStoreFlusher.34" prio=10 tid=0x00007f20b8939000 nid=0x511b runnable
[0x00007f08941c3000]
   java.lang.Thread.State: RUNNABLE
at
org.apache.hadoop.io.compress.snappy.SnappyCompressor.compressBytesDirect(Native
Method)
at
org.apache.hadoop.io.compress.snappy.SnappyCompressor.compress(SnappyCompressor.java:232)
- locked <0x00007f0a5f85c430> (a
org.apache.hadoop.io.compress.snappy.SnappyCompressor)
at
org.apache.hadoop.io.compress.BlockCompressorStream.compress(BlockCompressorStream.java:149)
at
org.apache.hadoop.io.compress.BlockCompressorStream.finish(BlockCompressorStream.java:142)
at
org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncoding(HFileBlockDefaultEncodingContext.java:238)
at
org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncodingWithBlockType(HFileBlockDefaultEncodingContext.java:149)
at
org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlock(HFileBlock.java:783)
at
org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.ensureBlockReady(HFileBlock.java:761)
at
org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlockAndWriteHeaderAndData(HFileBlock.java:878)
at
org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeHeaderAndData(HFileBlock.java:864)
at
org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateBlock(HFileBlockIndex.java:948)
at
org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateLevel(HFileBlockIndex.java:931)
at
org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIndexBlocks(HFileBlockIndex.java:840)
at
org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:367)
at
org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:974)
at
org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:69)
at
org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:70)
- locked <0x00007f09fdcccda0> (a java.lang.Object)
at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:829)
at
org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:1989)
at
org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1779)
at
org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1662)
at
org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1575)
at
org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:461)
at
org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:435)
at
org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:66)
at
org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:253)
at java.lang.Thread.run(Thread.java:744)



If you thread dump a few times, it is always stuck here?

Thanks,
St.Ack




"MemStoreFlusher.33" prio=10 tid=0x00007f20b8937000 nid=0x511a waiting on
condition [0x00007f08942c5000]
   java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00007f09b8e20f80> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
at java.util.concurrent.DelayQueue.poll(DelayQueue.java:264)
at java.util.concurrent.DelayQueue.poll(DelayQueue.java:68)
at
org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:233)
at java.lang.Thread.run(Thread.java:744)





   I do not know why the memstore cannot flush.
   Is this a bug?



  Begin forwarded message:

 From: quan <32...@qq.com>

 Subject: memstore flush blocked

 Date: May 30, 2016, 10:47:43 AM GMT+8

 To: user@hbase.apache.org

 Cc: 蒲聪 <co...@qunar.com>


  HI ALL:       Recently,I met a strange problem,  the first Region’s
Memstore of one table (the only one) often blocked when flushing.(Both
Version:  hbase-0.98.6-cdh5.2.0  and 1.0.1.1, I updated 0.98 to
1.0.1.1,hope to solve the problem,But failed)
      On the web UI, I can see the status shows:  ABORTED(since XXsec
ago), Not flushing since already flushing.
      But it will never flush success, and the usage of the disk will
increase very high.Now other regionservers just use 30% of the disk
capacity, the problematic region server will increase to 100%,unless
killing the process.
      What’s more, the region server process cannot be shutdown
normally,every time I have to use the KILL -9 command.
      I check the log,the reason why cannot flush is one of the
MemstoreFlusher cannot exiting.
      The log is like blow:
      2016-05-29 19:54:11,982 INFO  [MemStoreFlusher.13]
regionserver.MemStoreFlusher: MemStoreFlusher.13 exiting
2016-05-29 19:54:13,016 INFO  [MemStoreFlusher.6]
regionserver.MemStoreFlusher: MemStoreFlusher.6 exiting
2016-05-29 19:54:13,260 INFO  [MemStoreFlusher.16]
regionserver.MemStoreFlusher: MemStoreFlusher.16 exiting
2016-05-29 19:54:16,032 INFO  [MemStoreFlusher.33]
regionserver.MemStoreFlusher: MemStoreFlusher.33 exiting
2016-05-29 19:54:16,341 INFO  [MemStoreFlusher.25]
regionserver.MemStoreFlusher: MemStoreFlusher.25 exiting
2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.31]
regionserver.MemStoreFlusher: MemStoreFlusher.31 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.29]
regionserver.MemStoreFlusher: MemStoreFlusher.29 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.23]
regionserver.MemStoreFlusher: MemStoreFlusher.23 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.32]
regionserver.MemStoreFlusher: MemStoreFlusher.32 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.1]
regionserver.MemStoreFlusher: MemStoreFlusher.1 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.38]
regionserver.MemStoreFlusher: MemStoreFlusher.38 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.10]
regionserver.MemStoreFlusher: MemStoreFlusher.10 exiting
2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.7]
regionserver.MemStoreFlusher: MemStoreFlusher.7 exiting
2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.12]
regionserver.MemStoreFlusher: MemStoreFlusher.12 exiting
2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.21]
regionserver.MemStoreFlusher: MemStoreFlusher.21 exiting
2016-05-29 19:54:16,622 INFO  [MemStoreFlusher.37]
regionserver.MemStoreFlusher: MemStoreFlusher.37 exiting
2016-05-29 19:54:16,622 INFO  [MemStoreFlusher.24]
regionserver.MemStoreFlusher: MemStoreFlusher.24 exiting
2016-05-29 19:54:16,806 INFO  [MemStoreFlusher.39]
regionserver.MemStoreFlusher: MemStoreFlusher.39 exiting
2016-05-29 19:54:16,908 INFO  [MemStoreFlusher.17]
regionserver.MemStoreFlusher: MemStoreFlusher.17 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.3]
regionserver.MemStoreFlusher: MemStoreFlusher.3 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.27]
regionserver.MemStoreFlusher: MemStoreFlusher.27 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.18]
regionserver.MemStoreFlusher: MemStoreFlusher.18 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.4]
regionserver.MemStoreFlusher: MemStoreFlusher.4 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.19]
regionserver.MemStoreFlusher: MemStoreFlusher.19 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.22]
regionserver.MemStoreFlusher: MemStoreFlusher.22 exiting
2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.9]
regionserver.MemStoreFlusher: MemStoreFlusher.9 exiting
2016-05-29 19:54:17,137 INFO  [MemStoreFlusher.36]
regionserver.MemStoreFlusher: MemStoreFlusher.36 exiting
2016-05-29 19:54:17,263 INFO  [MemStoreFlusher.11]
regionserver.MemStoreFlusher: MemStoreFlusher.11 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.14]
regionserver.MemStoreFlusher: MemStoreFlusher.14 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.2]
regionserver.MemStoreFlusher: MemStoreFlusher.2 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.26]
regionserver.MemStoreFlusher: MemStoreFlusher.26 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.0]
regionserver.MemStoreFlusher: MemStoreFlusher.0 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.30]
regionserver.MemStoreFlusher: MemStoreFlusher.30 exiting
2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.28]
regionserver.MemStoreFlusher: MemStoreFlusher.28 exiting


here is the jstack log when I try to shutdown the region server process
normally:
we can see the MemstoreFlusher.34 meet some problems, it can not exiting
normally
(By the way, we can see “locked at snappy”, but it not always locked at
snappy,sometime it will locked at other place,so I think snappy is ok, Here
I just want to show the MemStoreFlusher meet some problem).


 "MemStoreFlusher.34" prio=10 tid=0x00007f20b8939000 nid=0x511b runnable
[0x00007f08941c3000]
   java.lang.Thread.State: RUNNABLE
at
org.apache.hadoop.io.compress.snappy.SnappyCompressor.compressBytesDirect(Native
Method)
at
org.apache.hadoop.io.compress.snappy.SnappyCompressor.compress(SnappyCompressor.java:232)
- locked <0x00007f0a5f85c430> (a
org.apache.hadoop.io.compress.snappy.SnappyCompressor)
at
org.apache.hadoop.io.compress.BlockCompressorStream.compress(BlockCompressorStream.java:149)
at
org.apache.hadoop.io.compress.BlockCompressorStream.finish(BlockCompressorStream.java:142)
at
org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncoding(HFileBlockDefaultEncodingContext.java:238)
at
org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncodingWithBlockType(HFileBlockDefaultEncodingContext.java:149)
at
org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlock(HFileBlock.java:783)
at
org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.ensureBlockReady(HFileBlock.java:761)
at
org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlockAndWriteHeaderAndData(HFileBlock.java:878)
at
org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeHeaderAndData(HFileBlock.java:864)
at
org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateBlock(HFileBlockIndex.java:948)
at
org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateLevel(HFileBlockIndex.java:931)
at
org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIndexBlocks(HFileBlockIndex.java:840)
at
org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:367)
at
org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:974)
at
org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:69)
at
org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:70)
- locked <0x00007f09fdcccda0> (a java.lang.Object)
at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:829)
at
org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:1989)
at
org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1779)
at
org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1662)
at
org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1575)
at
org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:461)
at
org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:435)
at
org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:66)
at
org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:253)
at java.lang.Thread.run(Thread.java:744)


"MemStoreFlusher.33" prio=10 tid=0x00007f20b8937000 nid=0x511a waiting on
condition [0x00007f08942c5000]
   java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00007f09b8e20f80> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
at java.util.concurrent.DelayQueue.poll(DelayQueue.java:264)
at java.util.concurrent.DelayQueue.poll(DelayQueue.java:68)
at
org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:233)
at java.lang.Thread.run(Thread.java:744)





   I do not know why the memstore cannot flush.
   Is this a bug?


安全提示:本邮件非QUNAR内部邮件,请注意保护个人及公司信息安全,如有索取帐号密码等可疑情况请向 secteam发送邮件


Re: Memstore blocking

Posted by Stack <st...@duboce.net>.
On Mon, May 30, 2016 at 7:03 PM, 聪聪 <17...@qq.com> wrote:

> HI ALL:       Recently,I met a strange problem,  the first Region’s
> Memstore of one table (the only one) often blocked when flushing.(Both
> Version:  hbase-0.98.6-cdh5.2.0  and 1.0.1.1, I updated 0.98 to
> 1.0.1.1,hope to solve the problem,But failed)
>


Is the region always on the same machine, or do you see this phenomenon on more than one machine?



>        On the web UI, I can see the status shows:  ABORTED(since XXsec
> ago), Not flushing since already flushing.
>


The RS is ABORTED? Because it can't flush? Is that what it says in the log?
Can we see the log message around the ABORT?



>        But it will never flush success, and the usage of the disk will
> increase very high.Now other regionservers just use 30% of the disk
> capacity, the problematic region server will increase to 100%,unless
> killing the process.
>


100% disk only? The CPU does not go up too?

Can we see a thread dump? Do jstack -l  PID if you can.
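
If taking those dumps by hand is awkward, a small helper can collect a series of jstack -l snapshots for later diffing. A sketch (it assumes jstack is on the PATH and is run as the same user as the regionserver process; the class name and output file layout are invented):

import java.io.File;
import java.io.IOException;

public class JstackSampler {
    public static void main(String[] args) throws IOException, InterruptedException {
        String pid = args[0];  // regionserver PID, e.g. from jps
        for (int i = 0; i < 5; i++) {
            File out = new File("jstack-" + pid + "-" + i + ".txt");
            // Equivalent to: jstack -l <PID> > jstack-<PID>-<i>.txt
            new ProcessBuilder("jstack", "-l", pid)
                    .redirectOutput(out)
                    .redirectErrorStream(true)
                    .start()
                    .waitFor();
            Thread.sleep(10000L);  // ~10s between samples
        }
    }
}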



>        What’s more, the region server process cannot be shutdown
> normally,every time I have to use the KILL -9 command.
>        I check the log,the reason why cannot flush is one of the
> MemstoreFlusher cannot exiting.
>        The log is like blow:
>        2016-05-29 19:54:11,982 INFO  [MemStoreFlusher.13]
> regionserver.MemStoreFlusher: MemStoreFlusher.13 exiting
>  2016-05-29 19:54:13,016 INFO  [MemStoreFlusher.6]
> regionserver.MemStoreFlusher: MemStoreFlusher.6 exiting
>  2016-05-29 19:54:13,260 INFO  [MemStoreFlusher.16]
> regionserver.MemStoreFlusher: MemStoreFlusher.16 exiting
>  2016-05-29 19:54:16,032 INFO  [MemStoreFlusher.33]
> regionserver.MemStoreFlusher: MemStoreFlusher.33 exiting
>  2016-05-29 19:54:16,341 INFO  [MemStoreFlusher.25]
> regionserver.MemStoreFlusher: MemStoreFlusher.25 exiting
>  2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.31]
> regionserver.MemStoreFlusher: MemStoreFlusher.31 exiting
>  2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.29]
> regionserver.MemStoreFlusher: MemStoreFlusher.29 exiting
>  2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.23]
> regionserver.MemStoreFlusher: MemStoreFlusher.23 exiting
>  2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.32]
> regionserver.MemStoreFlusher: MemStoreFlusher.32 exiting
>  2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.1]
> regionserver.MemStoreFlusher: MemStoreFlusher.1 exiting
>  2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.38]
> regionserver.MemStoreFlusher: MemStoreFlusher.38 exiting
>  2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.10]
> regionserver.MemStoreFlusher: MemStoreFlusher.10 exiting
>  2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.7]
> regionserver.MemStoreFlusher: MemStoreFlusher.7 exiting
>  2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.12]
> regionserver.MemStoreFlusher: MemStoreFlusher.12 exiting
>  2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.21]
> regionserver.MemStoreFlusher: MemStoreFlusher.21 exiting
>  2016-05-29 19:54:16,622 INFO  [MemStoreFlusher.37]
> regionserver.MemStoreFlusher: MemStoreFlusher.37 exiting
>  2016-05-29 19:54:16,622 INFO  [MemStoreFlusher.24]
> regionserver.MemStoreFlusher: MemStoreFlusher.24 exiting
>  2016-05-29 19:54:16,806 INFO  [MemStoreFlusher.39]
> regionserver.MemStoreFlusher: MemStoreFlusher.39 exiting
>  2016-05-29 19:54:16,908 INFO  [MemStoreFlusher.17]
> regionserver.MemStoreFlusher: MemStoreFlusher.17 exiting
>  2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.3]
> regionserver.MemStoreFlusher: MemStoreFlusher.3 exiting
>  2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.27]
> regionserver.MemStoreFlusher: MemStoreFlusher.27 exiting
>  2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.18]
> regionserver.MemStoreFlusher: MemStoreFlusher.18 exiting
>  2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.4]
> regionserver.MemStoreFlusher: MemStoreFlusher.4 exiting
>  2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.19]
> regionserver.MemStoreFlusher: MemStoreFlusher.19 exiting
>  2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.22]
> regionserver.MemStoreFlusher: MemStoreFlusher.22 exiting
>  2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.9]
> regionserver.MemStoreFlusher: MemStoreFlusher.9 exiting
>  2016-05-29 19:54:17,137 INFO  [MemStoreFlusher.36]
> regionserver.MemStoreFlusher: MemStoreFlusher.36 exiting
>  2016-05-29 19:54:17,263 INFO  [MemStoreFlusher.11]
> regionserver.MemStoreFlusher: MemStoreFlusher.11 exiting
>  2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.14]
> regionserver.MemStoreFlusher: MemStoreFlusher.14 exiting
>  2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.2]
> regionserver.MemStoreFlusher: MemStoreFlusher.2 exiting
>  2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.26]
> regionserver.MemStoreFlusher: MemStoreFlusher.26 exiting
>  2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.0]
> regionserver.MemStoreFlusher: MemStoreFlusher.0 exiting
>  2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.30]
> regionserver.MemStoreFlusher: MemStoreFlusher.30 exiting
>  2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.28]
> regionserver.MemStoreFlusher: MemStoreFlusher.28 exiting
>
>
>  here is the jstack log when I try to shutdown the region server process
> normally:
>  we can see the MemstoreFlusher.34 meet some problems, it can not exiting
> normally
>  (By the way, we can see “locked at snappy”, but it not always locked at
> snappy,sometime it will locked at other place,so I think snappy is ok, Here
> I just want to show the MemStoreFlusher meet some problem).
>
>
Any other special configurations going on on this install? Phoenix or use of Coprocessors?



>   "MemStoreFlusher.34" prio=10 tid=0x00007f20b8939000 nid=0x511b runnable
> [0x00007f08941c3000]
>     java.lang.Thread.State: RUNNABLE
>  at
> org.apache.hadoop.io.compress.snappy.SnappyCompressor.compressBytesDirect(Native
> Method)
>  at
> org.apache.hadoop.io.compress.snappy.SnappyCompressor.compress(SnappyCompressor.java:232)
>  - locked <0x00007f0a5f85c430> (a
> org.apache.hadoop.io.compress.snappy.SnappyCompressor)
>  at
> org.apache.hadoop.io.compress.BlockCompressorStream.compress(BlockCompressorStream.java:149)
>  at
> org.apache.hadoop.io.compress.BlockCompressorStream.finish(BlockCompressorStream.java:142)
>  at
> org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncoding(HFileBlockDefaultEncodingContext.java:238)
>  at
> org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncodingWithBlockType(HFileBlockDefaultEncodingContext.java:149)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlock(HFileBlock.java:783)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.ensureBlockReady(HFileBlock.java:761)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlockAndWriteHeaderAndData(HFileBlock.java:878)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeHeaderAndData(HFileBlock.java:864)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateBlock(HFileBlockIndex.java:948)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateLevel(HFileBlockIndex.java:931)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIndexBlocks(HFileBlockIndex.java:840)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:367)
>  at
> org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:974)
>  at
> org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:69)
>  at
> org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:70)
>  - locked <0x00007f09fdcccda0> (a java.lang.Object)
>  at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:829)
>  at
> org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:1989)
>  at
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1779)
>  at
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1662)
>  at
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1575)
>  at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:461)
>  at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:435)
>  at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:66)
>  at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:253)
>  at java.lang.Thread.run(Thread.java:744)
>
>

If you take a thread dump a few times, is it always stuck here?

Thanks,
St.Ack
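
As an aside on the trace above: the compression frames are the tail of the HFile writer pushing a finished block through Hadoop's codec stream, where CompressionOutputStream.finish() is what ends up in the native compressBytesDirect call. A standalone sketch of that stream API (it assumes hadoop-common plus the native snappy library are available, and only illustrates the code path, not HBase's flush code):

import java.io.ByteArrayOutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionOutputStream;
import org.apache.hadoop.io.compress.SnappyCodec;

public class SnappyStreamSketch {
    public static void main(String[] args) throws Exception {
        SnappyCodec codec = new SnappyCodec();
        codec.setConf(new Configuration());  // the codec reads buffer sizes from the conf
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        CompressionOutputStream out = codec.createOutputStream(sink);
        out.write("some block payload".getBytes("UTF-8"));
        // finish() drives BlockCompressorStream.finish() ->
        // SnappyCompressor.compress() -> compressBytesDirect (native),
        // the same frames MemStoreFlusher.34 is sitting in.
        out.finish();
        out.close();
        System.out.println("compressed to " + sink.size() + " bytes");
    }
}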


