Posted to dev@hbase.apache.org by "Tak-Lon (Stephen) Wu (Jira)" <ji...@apache.org> on 2021/09/09 18:50:00 UTC

[jira] [Created] (HBASE-26274) Create an option to reintroduce BlockCache to mapreduce job

Tak-Lon (Stephen) Wu created HBASE-26274:
--------------------------------------------

             Summary: Create an option to reintroduce BlockCache to mapreduce job
                 Key: HBASE-26274
                 URL: https://issues.apache.org/jira/browse/HBASE-26274
             Project: HBase
          Issue Type: Bug
          Components: BlockCache, HFile, mapreduce
    Affects Versions: 2.4.6, 3.0.0-alpha-1
            Reporter: Tak-Lon (Stephen) Wu


HBASE-21498 (see [this commit|https://github.com/apache/hbase/commit/27a0f205c52f83fe7500ee2ffc6cf6582f565a63#diff-8a3e39e6df1afe47811fc17702da598fe0d80496d66e579bea4bd224c6d8da03R218]) changed the behavior so that only a region server can initialize the on-heap BlockCache/LruBlockCache. This should be the right change for HMaster.

However, other downstream dependencies that call getScanner on a file-based region and read HStore/HFile lost the BlockCache for caching INDEX/LEAF_INDEX blocks after this change. This worked before, and it is still a problem in at least HBase 2.4. The performance impact is about a 2x slowdown.

One way to bring back the performance is to allow processes that are neither a region server nor HMaster to use a compact version of the BlockCache, with a smaller memory footprint and fewer HBase-internal configuration settings.

Alternatively, we could find a way to cache, or skip re-reading, the same {{LEAF_INDEX}} block when scanning the DATA blocks of an HFile.
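
To make the first option more concrete, here is a minimal sketch of what a compact, size-bounded LRU cache for index blocks could look like. It is only an illustration under stated assumptions: the class name CompactIndexBlockCache, its methods, and the (HFile name, offset) key are all hypothetical, and nothing here is HBase's actual BlockCache or LruBlockCache API. It uses only java.util.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical sketch of a small LRU cache for INDEX/LEAF_INDEX blocks.
 * Not HBase code; names and key layout are assumptions for illustration.
 */
final class CompactIndexBlockCache {
    private final long maxBytes;
    private long usedBytes = 0;
    private long hits = 0;
    private long misses = 0;

    // accessOrder=true keeps entries ordered from least to most recently used.
    private final LinkedHashMap<String, byte[]> blocks =
        new LinkedHashMap<>(16, 0.75f, true);

    CompactIndexBlockCache(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    // Assumed key: a block is identified by its file name and start offset.
    private static String key(String hfileName, long offset) {
        return hfileName + "@" + offset;
    }

    /** Returns the cached block, or null if the caller must read it from the file. */
    synchronized byte[] get(String hfileName, long offset) {
        byte[] block = blocks.get(key(hfileName, offset));
        if (block == null) {
            misses++;
        } else {
            hits++;
        }
        return block;
    }

    /** Caches a block, evicting least recently used blocks to stay within maxBytes. */
    synchronized void put(String hfileName, long offset, byte[] block) {
        if (block.length > maxBytes) {
            return; // a block larger than the whole cache is never cached
        }
        byte[] previous = blocks.put(key(hfileName, offset), block);
        if (previous != null) {
            usedBytes -= previous.length;
        }
        usedBytes += block.length;

        // Iteration starts at the least recently used entry.
        Iterator<Map.Entry<String, byte[]>> it = blocks.entrySet().iterator();
        while (usedBytes > maxBytes && it.hasNext()) {
            Map.Entry<String, byte[]> eldest = it.next();
            usedBytes -= eldest.getValue().length;
            it.remove();
        }
    }

    synchronized long usedBytes() { return usedBytes; }
    synchronized long hits() { return hits; }
    synchronized long misses() { return misses; }
}
```

A reader would call get() before loading an index block and put() after a miss, so that repeated lookups of the same {{LEAF_INDEX}} block during a scan are served from memory. The single maxBytes setting is meant to reflect the "smaller memory and less configuration" goal; a real implementation would also need to decide how it is wired into the HFile read path.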


