Posted to issues@hbase.apache.org by "Wellington Chevreuil (Jira)" <ji...@apache.org> on 2022/08/01 12:14:00 UTC

[jira] [Work started] (HBASE-27264) Add options to consider compressed size when delimiting blocks during hfile writes

     [ https://issues.apache.org/jira/browse/HBASE-27264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HBASE-27264 started by Wellington Chevreuil.
----------------------------------------------------
> Add options to consider compressed size when delimiting blocks during hfile writes
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-27264
>                 URL: https://issues.apache.org/jira/browse/HBASE-27264
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>            Priority: Major
>
> In HBASE-27232 we modified the "hbase.writer.unified.encoded.blocksize.ratio" property so that the encoded size can be considered when delimiting hfile blocks during writes.
> Here we propose two additional properties, "hbase.block.size.limit.compressed" and "hbase.block.size.max.compressed", that would allow the compressed size (if compression is in use) to be considered when delimiting blocks during hfile writes. When compression is enabled, certain datasets compress very efficiently, so the default 64KB block size and 10GB max file size can lead to hfiles with a very large number of blocks.
> In this proposal, "hbase.block.size.limit.compressed" is a boolean flag that switches block delimiting to the compressed size, and "hbase.block.size.max.compressed" is an int setting a limit, in bytes, on block size under compressed-size delimiting, so as to avoid very large uncompressed blocks (defaulting to 320KB). A sketch of the resulting boundary check follows after this description.
>  
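Below is a minimal sketch, in Java, of what such a boundary check might look like. The two property names and the 64KB/320KB defaults come from the description above; the class, the method, its parameters, and the exact decision logic are illustrative assumptions based on one plausible reading of the proposal, not the actual HBase writer implementation.

    import org.apache.hadoop.conf.Configuration;

    public final class CompressedBoundarySketch {

      // Property names as proposed in this issue.
      static final String LIMIT_COMPRESSED_KEY = "hbase.block.size.limit.compressed";
      static final String MAX_COMPRESSED_KEY = "hbase.block.size.max.compressed";
      // 320KB default from the proposal.
      static final long DEFAULT_MAX_COMPRESSED = 320 * 1024;

      /**
       * Hypothetical block-boundary check. rawSize is the number of
       * uncompressed bytes written to the current block so far,
       * compressedSize is the compressed size of those bytes (meaningful
       * only when compression is enabled), and blockSize is the configured
       * block size (64KB by default).
       */
      static boolean shouldFinishBlock(long rawSize, long compressedSize,
          long blockSize, Configuration conf) {
        if (!conf.getBoolean(LIMIT_COMPRESSED_KEY, false)) {
          // Current behaviour: delimit blocks on the uncompressed size.
          return rawSize >= blockSize;
        }
        // Proposed behaviour (assumed semantics): let highly compressible
        // blocks grow until their compressed size reaches blockSize, but cap
        // growth so a single block cannot become arbitrarily large once
        // decompressed.
        long cap = conf.getLong(MAX_COMPRESSED_KEY, DEFAULT_MAX_COMPRESSED);
        return compressedSize >= blockSize || rawSize >= cap;
      }
    }

Under this reading, an operator would presumably opt in by setting hbase.block.size.limit.compressed to true (e.g. in hbase-site.xml), optionally tuning hbase.block.size.max.compressed away from its proposed 320KB default.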



--
This message was sent by Atlassian Jira
(v8.20.10#820010)