Posted to issues@hbase.apache.org by "Wellington Chevreuil (Jira)" <ji...@apache.org> on 2022/08/01 12:14:00 UTC

[jira] [Created] (HBASE-27264) Add options to consider compressed size when delimiting blocks during hfile writes

Wellington Chevreuil created HBASE-27264:
--------------------------------------------

             Summary: Add options to consider compressed size when delimiting blocks during hfile writes
                 Key: HBASE-27264
                 URL: https://issues.apache.org/jira/browse/HBASE-27264
             Project: HBase
          Issue Type: New Feature
            Reporter: Wellington Chevreuil
            Assignee: Wellington Chevreuil


In HBASE-27232 we modified the "hbase.writer.unified.encoded.blocksize.ratio" property so that the encoded size can be considered when delimiting hfile blocks during writes.
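
For reference, a minimal sketch of setting that property programmatically (the 1.0 value below is purely illustrative, not a documented default):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class EncodedBlockSizeRatioExample {
      public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        // Illustrative value: a ratio of 1.0 applies the full configured
        // block size to the encoded data size when delimiting blocks.
        conf.setFloat("hbase.writer.unified.encoded.blocksize.ratio", 1.0f);
      }
    }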

Here we propose two additional properties, "hbase.block.size.limit.compressed" and "hbase.block.size.max.compressed", that would allow the compressed size (if compression is in use) to be considered when delimiting blocks during hfile writes. When compression is enabled, certain datasets can compress very efficiently, so the default 64KB block size and 10GB max file size can lead to hfiles with a very large number of blocks.

In this proposal, "hbase.block.size.limit.compressed" is a boolean flag that switches block delimiting to the compressed size, and "hbase.block.size.max.compressed" is an int defining the limit, in bytes, for the compressed block size, in order to avoid very large uncompressed blocks (defaulting to 320KB).
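
Below is a minimal sketch of how these settings might be applied and enforced. The two property names are the ones proposed here, but the shouldFinishBlock method, its parameters, and its exact logic are hypothetical illustrations of the intended behaviour, not the actual HFileBlock writer internals:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class CompressedBlockDelimitSketch {

      public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        // Proposed flag: switch block delimiting to the compressed size.
        conf.setBoolean("hbase.block.size.limit.compressed", true);
        // Proposed cap (default 320KB): bounds block growth so highly
        // compressible data cannot produce huge uncompressed blocks.
        conf.setInt("hbase.block.size.max.compressed", 320 * 1024);
      }

      // Hypothetical boundary check: with the flag on, a block is finished
      // once its compressed size reaches the configured block size, or once
      // its raw size hits the cap, whichever comes first.
      static boolean shouldFinishBlock(long uncompressedSize, long compressedSize,
          boolean limitCompressed, long blockSize, long maxCompressed) {
        if (limitCompressed) {
          return compressedSize >= blockSize || uncompressedSize >= maxCompressed;
        }
        // Current behaviour: delimit on the uncompressed size only.
        return uncompressedSize >= blockSize;
      }
    }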

--
This message was sent by Atlassian Jira
(v8.20.10#820010)