Posted to issues@hbase.apache.org by "Hudson (Jira)" <ji...@apache.org> on 2022/09/24 05:33:00 UTC

[jira] [Commented] (HBASE-27386) Use encoded size for calculating compression ratio in block size predicator

    [ https://issues.apache.org/jira/browse/HBASE-27386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17608947#comment-17608947 ] 

Hudson commented on HBASE-27386:
--------------------------------

Results for branch branch-2
	[build #650 on builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/650/]: (x) *{color:red}-1 overall{color}*
----
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/650/General_20Nightly_20Build_20Report/]


(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/650/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/650/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(x) {color:red}-1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/650/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Use encoded size for calculating compression ratio in block size predicator
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-27386
>                 URL: https://issues.apache.org/jira/browse/HBASE-27386
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 3.0.0-alpha-3
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>            Priority: Major
>             Fix For: 2.6.0, 3.0.0-alpha-4
>
>
> In HBASE-27264 we introduced the notion of block size predicators to define hfile block boundaries when writing a new hfile, and provided the
> PreviousBlockCompressionRatePredicator implementation for calculating block sizes based on a compression ratio. It used the raw data size written to the block so far to calculate the compression ratio, but when encoding is enabled this can lead to a very high compression ratio and therefore to much larger block sizes. We should use the encoded size to calculate the compression ratio instead.
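>
> For illustration, here is a minimal, self-contained Java sketch of the idea (the class and method names below are hypothetical, not the actual HBase API): the predicator tracks the compression ratio observed on the previous block and uses it to decide how many bytes to accept into the current block before closing it. The fix amounts to feeding the encoded size, rather than the raw cell size, into that ratio.
> {code:java}
> /**
>  * Illustrative sketch only; not the HBase implementation.
>  * Names and signatures are hypothetical.
>  */
> public class CompressionRatePredicatorSketch {
>
>   private final int configuredBlockSize;   // e.g. 64 * 1024
>   private double compressionRatio = 1.0;   // ratio observed on the previous block
>
>   public CompressionRatePredicatorSketch(int configuredBlockSize) {
>     this.configuredBlockSize = configuredBlockSize;
>   }
>
>   /** Called after a block is flushed, with the sizes observed for that block. */
>   public void updateLatestBlockSizes(int encodedSize, int onDiskSize) {
>     // Using the ENCODED size here is the point of this issue: with encoding
>     // enabled, the raw cell size is much larger than the encoded size, which
>     // inflates the ratio and therefore the predicted block boundary.
>     this.compressionRatio = (double) encodedSize / onDiskSize;
>   }
>
>   /** Decides whether the block currently being written should be closed. */
>   public boolean shouldFinishBlock(int encodedBytesWrittenSoFar) {
>     // Allow roughly compressionRatio times more encoded bytes than the
>     // configured block size, so that the on-disk (compressed) block lands
>     // near the configured target.
>     return encodedBytesWrittenSoFar >= configuredBlockSize * compressionRatio;
>   }
> }
> {code}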
> Here's an example scenario:
> 1) Sample block size when not using the PreviousBlockCompressionRatePredicator as implemented by HBASE-27264:
> {noformat}
> onDiskSizeWithoutHeader=6613, uncompressedSizeWithoutHeader=32928
> {noformat}
> 2) Sample block size when using PreviousBlockCompressionRatePredicator as implemented by HBASE-27264 (uses raw data size to calculate compression rate):
> {noformat}
> onDiskSizeWithoutHeader=126920, uncompressedSizeWithoutHeader=655393
> {noformat}
> 3) Sample block size when using PreviousBlockCompressionRatePredicator with encoded size for calculating compression rate:
> {noformat}
> onDiskSizeWithoutHeader=54299, uncompressedSizeWithoutHeader=328051
> {noformat}
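>
> For reference, the compression ratios implied by the samples above are roughly the same in all three cases (6613/32928 ≈ 0.20, 126920/655393 ≈ 0.19, 54299/328051 ≈ 0.17); the difference is how large each block is allowed to grow before it is closed, which is what the predicator controls.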



--
This message was sent by Atlassian Jira
(v8.20.10#820010)