Posted to issues@hbase.apache.org by "Moran (Jira)" <ji...@apache.org> on 2024/04/02 06:14:00 UTC

[jira] [Commented] (HBASE-28227) Tables to which Stripe Compaction policy is applied cannot be forced to trigger Major Compaction.

    [ https://issues.apache.org/jira/browse/HBASE-28227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833022#comment-17833022 ] 

Moran commented on HBASE-28227:
-------------------------------

[~Xiaolin Ha] Major compaction is used primarily to clean up outdated versions of data, reduce the file count, and improve data locality. In particular, after nodes are added to the cluster, data locality cannot be restored because major compaction cannot be triggered, which results in increased read latency.

> Tables to which Stripe Compaction policy is applied cannot be forced to trigger Major Compaction.
> -------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-28227
>                 URL: https://issues.apache.org/jira/browse/HBASE-28227
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 2.2.6
>            Reporter: longping_jie
>            Priority: Major
>
>     There is a table to which the Stripe Compaction policy is applied. Each region averages 40G and is divided into 8 stripes of 5G each. The business deleted a large amount of data. Manually triggering major compaction on the entire table or on a single region has no effect: no files are selected for compaction.
>     After reading the source code, the compaction policy applied within each stripe is ExploringCompactionPolicy. The key point of this policy is that it filters the store file list of a single stripe. If any file in the candidate list is too large, that is, it meets the condition fileSize > (totalFileSize - fileSize) * hbase.hstore.compaction.ratio (default value 1.2), then no files are selected to participate in major compaction.
>     A forced compaction mechanism needs to be supported. For scenarios where a large amount of data is deleted, or where bulkload is used, you could explicitly pass in a parameter such as forceMajor when manually triggering major compaction, and then perform forced major compaction per stripe to support data cleanup.
>     
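
The ratio condition quoted in the description can be illustrated with a minimal sketch. This is not the HBase source; the class name StripeRatioCheck, the method filesInRatio, and the file sizes (in MB) are hypothetical and chosen only to show why one dominant file in a 5G stripe causes the whole candidate list to be rejected.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified illustration of the ratio check described above.
public class StripeRatioCheck {

    // Returns false as soon as one file satisfies
    // fileSize > (totalFileSize - fileSize) * ratio,
    // meaning the candidate list is rejected as a whole.
    static boolean filesInRatio(List<Long> fileSizes, double ratio) {
        if (fileSizes.size() < 2) {
            return true; // assumption: a single file has nothing to compare against
        }
        long total = 0;
        for (long size : fileSizes) {
            total += size;
        }
        for (long size : fileSizes) {
            long sumOfOthers = total - size;
            if (size > sumOfOthers * ratio) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        double ratio = 1.2; // default of hbase.hstore.compaction.ratio per the description

        // One large file plus a few small ones in a 5G stripe: rejected.
        List<Long> skewed = Arrays.asList(4608L, 200L, 200L, 112L);
        System.out.println("skewed stripe selectable: " + filesInRatio(skewed, ratio));

        // Evenly sized files: accepted.
        List<Long> even = Arrays.asList(1000L, 1000L, 1000L);
        System.out.println("even stripe selectable: " + filesInRatio(even, ratio));
    }
}
```

In the skewed case the 4608 MB file is compared against 512 MB * 1.2 = 614.4 MB, so the check fails and nothing is selected, which matches the behavior reported for stripes left with one large file after heavy deletes.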



--
This message was sent by Atlassian Jira
(v8.20.10#820010)