Posted to issues@hive.apache.org by "László Bodor (Jira)" <ji...@apache.org> on 2022/11/10 11:33:00 UTC

[jira] [Commented] (HIVE-26674) REBALANCE type compaction

    [ https://issues.apache.org/jira/browse/HIVE-26674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17631604#comment-17631604 ] 

László Bodor commented on HIVE-26674:
-------------------------------------

Please add details to the Jira description about what exactly REBALANCE compaction is, thanks! (The PR description points to this ticket, btw :) )

> REBALANCE type compaction
> -------------------------
>
>                 Key: HIVE-26674
>                 URL: https://issues.apache.org/jira/browse/HIVE-26674
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: László Végh
>            Assignee: László Végh
>            Priority: Major
>              Labels: compaction
>
> A new compaction type is required for implicitly bucketed tables. Over time these tables can become unbalanced: the first few buckets hold the majority of the data, while buckets with higher indexes hold less and less. As a result, query performance on these unbalanced tables degrades over time. To solve this issue, the data needs to be periodically re-balanced among the buckets. The plan is to do this via a new RE-BALANCING compaction. This compaction can be issued either manually by users or automatically by the Initiator. The automatic re-balancing compaction must be triggered based on evaluating a set of thresholds.
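
The threshold idea in the description can be illustrated with a small sketch. The ticket does not say which metrics or thresholds the Initiator would evaluate, so everything below is an assumption made for illustration: the function names, the "largest bucket divided by mean bucket size" metric, and the default ratio of 1.5 are all hypothetical and are not Hive APIs or Hive defaults.

```python
from statistics import mean


def bucket_imbalance(bucket_sizes):
    """Hypothetical metric: largest bucket size divided by the mean bucket size.

    A value of 1.0 means the buckets are perfectly balanced.
    """
    if not bucket_sizes:
        raise ValueError("at least one bucket is required")
    avg = mean(bucket_sizes)
    if avg == 0:
        return 1.0  # an empty table is treated as balanced
    return max(bucket_sizes) / avg


def needs_rebalance(bucket_sizes, max_ratio=1.5):
    """Hypothetical threshold check (max_ratio is an assumed value)."""
    return bucket_imbalance(bucket_sizes) > max_ratio


def rebalance(bucket_sizes):
    """Model the outcome of rebalancing: spread rows as evenly as possible."""
    total = sum(bucket_sizes)
    n = len(bucket_sizes)
    base, extra = divmod(total, n)
    # The first `extra` buckets receive one additional row each.
    return [base + (1 if i < extra else 0) for i in range(n)]


# Row counts skewed toward the low-index buckets, as the description says.
skewed = [700, 200, 80, 20]
if needs_rebalance(skewed):
    skewed = rebalance(skewed)
```

This only models row counts per bucket. How a real compaction would redistribute the data, and which statistics it would read, is left open by the ticket.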



--
This message was sent by Atlassian Jira
(v8.20.10#820010)