Posted to issues@hbase.apache.org by "Anoop Sam John (Jira)" <ji...@apache.org> on 2019/12/02 04:59:00 UTC

[jira] [Commented] (HBASE-23066) Allow cache on write during compactions when prefetching is enabled

    [ https://issues.apache.org/jira/browse/HBASE-23066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985822#comment-16985822 ] 

Anoop Sam John commented on HBASE-23066:
----------------------------------------

On the new conf name 'hbase.rs.prefetchcompactedblocksonwrite': it would be better not to name it after, or relate it to, prefetch. A cache-on-write config already exists; it should really have been named something like cache on flush, but leave that aside. At least this conf can clearly say cache on compaction, so it is clear that the caching happens as part of the compaction write. Also, IMHO there is no need to make this depend on whether prefetch is on or not! Make the conf name and doc clear about what it does and what the size expectations are. Anyway, we have another Jira to discuss whether all compacted files should follow this cache on write or not.
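
To make the difference concrete, here is a minimal, self-contained Java sketch of the two decision rules being discussed. It is not code from the patch and does not use HBase classes: the class name, the method names, the plain Map standing in for the configuration, the default of "false", and the key 'hbase.rs.cachecompactedblocksonwrite' are all assumptions chosen for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

public class CompactionCacheSketch {
    // Key under discussion in this comment.
    static final String PREFETCH_STYLE_KEY = "hbase.rs.prefetchcompactedblocksonwrite";
    // Hypothetical key that says "cache on compaction" directly (an assumption, not from the patch).
    static final String CACHE_ON_COMPACTION_KEY = "hbase.rs.cachecompactedblocksonwrite";

    // Reads a boolean flag from the map; a missing key is treated as false (assumed default).
    static boolean flag(Map<String, String> conf, String key) {
        return Boolean.parseBoolean(conf.getOrDefault(key, "false"));
    }

    // Rule as described in the issue: cache compacted blocks on write
    // only when the column family also has prefetch enabled.
    static boolean cacheGatedOnPrefetch(Map<String, String> conf, boolean familyPrefetchEnabled) {
        return flag(conf, PREFETCH_STYLE_KEY) && familyPrefetchEnabled;
    }

    // Rule suggested in this comment: the compaction setting alone decides,
    // regardless of whether prefetch is on.
    static boolean cacheOnCompaction(Map<String, String> conf) {
        return flag(conf, CACHE_ON_COMPACTION_KEY);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(PREFETCH_STYLE_KEY, "true");
        conf.put(CACHE_ON_COMPACTION_KEY, "true");

        // With prefetch disabled on the family, the two rules disagree.
        System.out.println("gated on prefetch: " + cacheGatedOnPrefetch(conf, false));
        System.out.println("independent:       " + cacheOnCompaction(conf));
    }
}
```

With the independent rule, the conf name and its documentation only have to describe one thing: compacted blocks are cached as they are written, so the cache has to be sized to hold them.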

> Allow cache on write during compactions when prefetching is enabled
> -------------------------------------------------------------------
>
>                 Key: HBASE-23066
>                 URL: https://issues.apache.org/jira/browse/HBASE-23066
>             Project: HBase
>          Issue Type: Improvement
>          Components: Compaction, regionserver
>    Affects Versions: 1.4.10
>            Reporter: Jacob LeBlanc
>            Assignee: Jacob LeBlanc
>            Priority: Minor
>             Fix For: 2.3.0, 1.6.0
>
>         Attachments: HBASE-23066.patch, performance_results.png, prefetchCompactedBlocksOnWrite.patch
>
>
> In cases where users care a lot about read performance for tables that are small enough to fit into a cache (or the cache is large enough), prefetchOnOpen can be enabled to make the entire table available in cache after the initial region opening is completed. Any new data can also be guaranteed to be in cache with the cacheBlocksOnWrite setting.
> However, the missing piece is when all blocks are evicted after a compaction. We found very poor performance after compactions for tables under heavy read load and a slower backing filesystem (S3). After a compaction the prefetching threads need to compete with threads servicing read requests and get constantly blocked as a result. 
> This is a proposal to introduce a new cache configuration option that would cache blocks on write during compaction for any column family that has prefetch enabled. This would virtually guarantee that all blocks are kept in cache after the initial prefetch on open is completed, allowing for steady read performance despite a slow backing file system.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)