Posted to issues@hbase.apache.org by "Wellington Chevreuil (Jira)" <ji...@apache.org> on 2022/09/13 20:30:00 UTC

[jira] [Created] (HBASE-27370) Avoid decompressing blocks when reading from bucket cache prefetch threads

Wellington Chevreuil created HBASE-27370:
--------------------------------------------

             Summary: Avoid decompressing blocks when reading from bucket cache prefetch threads 
                 Key: HBASE-27370
                 URL: https://issues.apache.org/jira/browse/HBASE-27370
             Project: HBase
          Issue Type: Improvement
            Reporter: Wellington Chevreuil
            Assignee: Wellington Chevreuil


When prefetching blocks into the bucket cache, we observed consistent CPU usage of around 70% with no other workloads ongoing. For large bucket caches (i.e. when using a file-based bucket cache), the prefetch can last for some time, and such high CPU usage may impact client applications using the database.
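
For context, the kind of setup under which this was observed might look roughly like the following. This is a hedged example: the property names are standard HBase cache settings, but the path and size values are purely illustrative.
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class PrefetchCacheConfigExample {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // File-based bucket cache; path and size are example values only.
    conf.set("hbase.bucketcache.ioengine", "file:/mnt/cache/bucketcache.data");
    conf.setInt("hbase.bucketcache.size", 102400);            // MB
    // Prefetch blocks into the cache when files are opened.
    conf.setBoolean("hbase.rs.prefetchblocksonopen", true);
    // Keep DATA blocks compressed in the cache.
    conf.setBoolean("hbase.block.data.cachecompressed", true);
  }
}
{code}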

Further analysis of the prefetch threads' stack traces showed that these threads are very often executing decompression logic:
{noformat}
"hfile-prefetch-1654895061122" #234 daemon prio=5 os_prio=0 tid=0x0000557bb2907000 nid=0x406d runnable [0x00007f294a504000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.hadoop.io.compress.snappy.SnappyDecompressor.decompressBytesDirect(Native Method)
        at org.apache.hadoop.io.compress.snappy.SnappyDecompressor.decompress(SnappyDecompressor.java:235)
        at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:88)
        at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:105)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
        - locked <0x00000002d24c0ae8> (a java.io.BufferedInputStream)
        at org.apache.hadoop.hbase.io.util.BlockIOUtils.readFullyWithHeapBuffer(BlockIOUtils.java:105)
        at org.apache.hadoop.hbase.io.compress.Compression.decompress(Compression.java:465)
        at org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultDecodingContext.prepareDecoding(HFileBlockDefaultDecodingContext.java:90)
        at org.apache.hadoop.hbase.io.hfile.HFileBlock.unpack(HFileBlock.java:650)
        at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1342)
{noformat}

This happens because *HFileReaderImpl.readBlock* always decompresses blocks, even when *hbase.block.data.cachecompressed* is set to true.
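
For illustration, here is a minimal, self-contained sketch of that pattern. The class and method names are simplified stand-ins, not the actual HBase source: the block is always unpacked, even though with *hbase.block.data.cachecompressed* set to true it is the packed block that ends up in the cache, so on the prefetch path the decompression work is wasted.
{code:java}
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the read path, NOT the actual HFileReaderImpl code.
public class BlockReadSketch {
  static class Block {
    final byte[] onDiskBytes;           // as stored in the HFile, possibly compressed
    Block(byte[] bytes) { this.onDiskBytes = bytes; }
    Block unpack() {                    // models HFileBlock.unpack(): decompress/decrypt
      return new Block(decompress(onDiskBytes));
    }
  }

  static byte[] decompress(byte[] in) { return in; }  // placeholder for e.g. Snappy

  final Map<Long, Block> cache = new HashMap<>();
  final boolean cacheCompressed = true; // hbase.block.data.cachecompressed=true

  // Current behaviour: unpack() is ALWAYS called, even though with
  // cacheCompressed=true the packed block is what gets cached, so a
  // prefetch-only read throws the decompression work away.
  Block readBlock(long offset, byte[] fromDisk) {
    Block packed = new Block(fromDisk);
    Block unpacked = packed.unpack();   // CPU cost paid unconditionally
    cache.put(offset, cacheCompressed ? packed : unpacked);
    return unpacked;
  }
}
{code}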

This patch proposes an additional flag to differentiate prefetch reads from normal reads, so that readBlock does not decompress DATA blocks when prefetching with *hbase.block.data.cachecompressed* set to true.
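
Continuing the sketch above, the change could look roughly like the following. The flag name (here *cacheOnly*) and the signature are illustrative assumptions, not necessarily what the final patch uses.
{code:java}
// Illustrative continuation of the sketch above; "cacheOnly" is an assumed
// name for the flag prefetch threads would pass, not confirmed API.
Block readBlock(long offset, byte[] fromDisk, boolean cacheOnly) {
  Block packed = new Block(fromDisk);
  if (cacheOnly && cacheCompressed) {
    // Prefetch path with hbase.block.data.cachecompressed=true: the packed
    // block is what gets cached anyway, so skip decompression entirely.
    cache.put(offset, packed);
    return packed;
  }
  Block unpacked = packed.unpack();
  cache.put(offset, cacheCompressed ? packed : unpacked);
  return unpacked;
}
{code}
With a guard like this, prefetch threads only pay the I/O and cache-insertion costs, and decompression is deferred until the first real read of the block.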



--
This message was sent by Atlassian Jira
(v8.20.10#820010)