Posted to issues@hbase.apache.org by "Danil Lipovoy (Jira)" <ji...@apache.org> on 2020/05/05 06:31:00 UTC

[jira] [Commented] (HBASE-23887) BlockCache performance improve

    [ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099566#comment-17099566 ] 

Danil Lipovoy commented on HBASE-23887:
---------------------------------------

Added parameters that help to control the eviction process:
hbase.lru.cache.heavy.eviction.count.limit - sets how many times the eviction process has to run before we start skipping data blocks on the way into BlockCache
hbase.lru.cache.heavy.eviction.bytes.size.limit - sets how many bytes have to be evicted on each run before we start skipping data blocks on the way into BlockCache
By default: if eviction runs 10 times (100 seconds) and evicts more than 10 MB each time, then we start to skip 50% of data blocks.
When the heavy eviction process ends, all blocks are put into BlockCache again.
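For clarity, a minimal sketch of how this counting could work (the class and method names below are illustrative only, not the actual patch):

{code:java}
// Illustrative sketch (not the actual patch): counts consecutive "heavy"
// eviction runs and decides whether new data blocks should still be cached.
public class HeavyEvictionGuard {

  // Assumed defaults, matching the description above.
  private final int countLimit = 10;                     // hbase.lru.cache.heavy.eviction.count.limit
  private final long bytesSizeLimit = 10L * 1024 * 1024; // hbase.lru.cache.heavy.eviction.bytes.size.limit

  private int heavyEvictionCount = 0;
  private boolean skipDataBlocks = false;

  // Called after each eviction run with the number of bytes that were freed.
  public void onEvictionFinished(long bytesFreed) {
    if (bytesFreed > bytesSizeLimit) {
      heavyEvictionCount++;
      if (heavyEvictionCount >= countLimit) {
        // Eviction has been heavy long enough: start skipping part of the
        // data blocks (50% by default, per the description above).
        skipDataBlocks = true;
      }
    } else {
      // Eviction pressure is gone: reset and put all blocks into BlockCache again.
      heavyEvictionCount = 0;
      skipDataBlocks = false;
    }
  }

  public boolean shouldSkipDataBlocks() {
    return skipDataBlocks;
  }
}
{code}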


> BlockCache performance improve
> ------------------------------
>
>                 Key: HBASE-23887
>                 URL: https://issues.apache.org/jira/browse/HBASE-23887
>             Project: HBase
>          Issue Type: Improvement
>          Components: BlockCache, Performance
>            Reporter: Danil Lipovoy
>            Priority: Minor
>         Attachments: 1582787018434_rs_metrics.jpg, 1582801838065_rs_metrics_new.png, BC_LongRun.png, cmp.png, evict_BC100_vs_BC23.png, read_requests_100pBC_vs_23pBC.png
>
>
> Hi!
> I am here for the first time, please correct me if something is wrong.
> I want to propose how to improve performance when the data in HFiles is much larger than BlockCache (the usual story in BigData). The idea is to cache only part of the DATA blocks. This is good because LruBlockCache keeps working and saves a huge amount of GC. See the picture in the attachment with the test below. Requests per second are higher, GC is lower.
>  
> The key point of the code:
> Added the parameter *hbase.lru.cache.data.block.percent*, which is 100 by default.
>  
> But if we set it to 0-99, then the following logic applies:
>  
>  
> {code:java}
> public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
>   // Skip a deterministic share of DATA blocks, chosen by the block offset,
>   // so that only about cacheDataBlockPercent % of them end up in the cache.
>   if (cacheDataBlockPercent != 100 && buf.getBlockType().isData()) {
>     if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent) {
>       return;
>     }
>   }
>   ...
>   // the same code as usual
> }
> {code}
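> For illustration, a small standalone sketch (not part of the patch) of why the offset-based check keeps roughly the configured share of blocks in the cache:
>
> {code:java}
> import java.util.Random;
>
> // Standalone demo: block offsets taken modulo 100 are spread out fairly evenly,
> // so (offset % 100 < cacheDataBlockPercent) passes for about that share of blocks.
> public class OffsetSamplingDemo {
>   public static void main(String[] args) {
>     int cacheDataBlockPercent = 50; // hypothetical value for this demo
>     Random rnd = new Random(42);
>     long offset = 0;
>     int cached = 0;
>     int total = 100_000;
>     for (int i = 0; i < total; i++) {
>       // DATA block sizes vary a little around 64 KB, as real HFile blocks do.
>       offset += 64 * 1024 + rnd.nextInt(512);
>       if (offset % 100 < cacheDataBlockPercent) {
>         cached++;
>       }
>     }
>     System.out.println("cached " + cached + " of " + total + " blocks");
>   }
> }
> {code}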
>  
>  
> Descriptions of the test:
> 4 nodes E5-2698 v4 @ 2.20GHz, 700 GB Mem.
> 4 RegionServers
> 4 tables by 64 regions by 1.88 GB data in each = 600 GB total (only FAST_DIFF)
> Total BlockCache Size = 48 GB (8% of data in HFiles)
> Random read in 20 threads
>  
> I am going to make a Pull Request, hope it is the right way to make a contribution to this cool product.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)