Posted to commits@hudi.apache.org by "Raymond Xu (Jira)" <ji...@apache.org> on 2022/06/24 11:48:00 UTC

[jira] [Updated] (HUDI-4313) Support LogFileModTimeBasedCompactionStrategy

     [ https://issues.apache.org/jira/browse/HUDI-4313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-4313:
-----------------------------
    Issue Type: New Feature  (was: Improvement)

> Support LogFileModTimeBasedCompactionStrategy
> ---------------------------------------------
>
>                 Key: HUDI-4313
>                 URL: https://issues.apache.org/jira/browse/HUDI-4313
>             Project: Apache Hudi
>          Issue Type: New Feature
>          Components: compaction, table-service
>            Reporter: sivabalan narayanan
>            Priority: Major
>
> We need to support a new compaction strategy called LogFileModTimeBasedCompactionStrategy.
> With this strategy, file slices are chosen for compaction based on their earliest log file modification time.
> It is similar to LogFileSizeBasedCompactionStrategy, except that instead of comparing the total log file size of each file slice, it compares the earliest log file modification time of each file slice.
> The goal is to compact some part of the whole change set (say, 20%) in one batch.
> The compaction plan for the next batch should include incomplete operations from previous plans.
> Operations should be processed in order of earliest log file modification time.
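
The selection rule described above can be sketched roughly as follows. This is a minimal illustration in Python, not Hudi code: the `FileSlice` class, `plan_compaction`, and the `fraction` and `pending_file_ids` parameters are all hypothetical names. It also assumes that operations carried over from earlier plans count toward the batch budget but are never dropped, which the issue does not specify.

```python
import math
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class FileSlice:
    """Hypothetical stand-in for a file slice; not Hudi's own FileSlice class."""
    file_id: str
    log_mod_times_ms: Tuple[int, ...]  # modification time of each log file

    def earliest_log_mod_time(self) -> int:
        return min(self.log_mod_times_ms)


def _age_key(s: FileSlice) -> Tuple[int, str]:
    # Oldest log file first; file_id breaks ties deterministically.
    return (s.earliest_log_mod_time(), s.file_id)


def plan_compaction(
    slices: Iterable[FileSlice],
    fraction: float = 0.2,
    pending_file_ids: FrozenSet[str] = frozenset(),
) -> List[FileSlice]:
    """Pick roughly `fraction` of the slices that have log files.

    Assumption: slices listed in `pending_file_ids` (incomplete operations
    from previous plans) are always included and count toward the budget.
    """
    # Slices without log files have nothing to compact.
    candidates = sorted((s for s in slices if s.log_mod_times_ms), key=_age_key)
    budget = math.ceil(len(candidates) * fraction)

    carried = [s for s in candidates if s.file_id in pending_file_ids]
    fresh = [s for s in candidates if s.file_id not in pending_file_ids]

    # Fill whatever budget is left with the oldest not-yet-planned slices.
    remaining = max(0, budget - len(carried))
    plan = carried + fresh[:remaining]

    # Process operations in order of earliest log file modification time.
    return sorted(plan, key=_age_key)


slices = [
    FileSlice("a", (300, 100)),
    FileSlice("b", (50,)),
    FileSlice("c", ()),        # no log files, never selected
    FileSlice("d", (400,)),
    FileSlice("e", (200, 250)),
]
print([s.file_id for s in plan_compaction(slices, fraction=0.5)])
print([s.file_id for s in plan_compaction(slices, fraction=0.5, pending_file_ids=frozenset({"d"}))])
```

With four eligible slices and a fraction of 0.5, the budget is two operations per batch. In the second call, slice "d" is carried over from an earlier plan, so it takes one of the two slots even though it is the newest.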



--
This message was sent by Atlassian Jira
(v8.20.7#820007)