Posted to commits@hudi.apache.org by "Zhaojing Yu (Jira)" <ji...@apache.org> on 2022/10/01 12:20:00 UTC

[jira] [Updated] (HUDI-3796) Implement layout to filter out uncommitted log files without reading the log blocks

     [ https://issues.apache.org/jira/browse/HUDI-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhaojing Yu updated HUDI-3796:
------------------------------
    Fix Version/s: 0.13.0
                       (was: 0.12.1)

> Implement layout to filter out uncommitted log files without reading the log blocks
> -----------------------------------------------------------------------------------
>
>                 Key: HUDI-3796
>                 URL: https://issues.apache.org/jira/browse/HUDI-3796
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: writer-core
>            Reporter: Ethan Guo
>            Assignee: sivabalan narayanan
>            Priority: Critical
>             Fix For: 0.13.0
>
>
> Related: HUDI-3637
> At a high level, getLatestFileSlices() fetches the latest file slices for committed base files and filters out any file slice whose base instant time is uncommitted.  The latest file slices may still include uncommitted log files; these are skipped only later, during log reading and merging, i.e., by the logic in "AbstractHoodieLogRecordReader".
> If the log file name carried the log instant time instead of the base instant time, uncommitted log files could be filtered out from the file name alone, without reading the log blocks beforehand.
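
A rough sketch of the idea in the description, not Hudi's implementation: if each log file name carries the instant time of the write that produced it, a reader can drop uncommitted log files by comparing names against the set of committed instants, without opening any log block. The name pattern, the function filter_committed_log_files, and the sample instants below are all hypothetical and simplified for illustration.

```python
import re
from typing import Iterable, List, Set

# Hypothetical, simplified log file name: .<fileId>_<logInstantTime>.log.<version>
# (an assumption for illustration; not the exact Hudi naming scheme).
LOG_FILE_PATTERN = re.compile(
    r"^\.(?P<file_id>[^_]+)_(?P<instant>\d+)\.log\.(?P<version>\d+)$"
)


def filter_committed_log_files(
    log_file_names: Iterable[str], committed_instants: Set[str]
) -> List[str]:
    """Keep only log files whose embedded instant time is committed.

    Only file names are inspected; no log block is read.
    Names that do not match the assumed pattern are dropped.
    """
    kept = []
    for name in log_file_names:
        match = LOG_FILE_PATTERN.match(name)
        if match and match.group("instant") in committed_instants:
            kept.append(name)
    return kept


# Sample data (made up): the third log file belongs to an uncommitted instant.
committed = {"20221001101500", "20221001113000"}
log_files = [
    ".f1_20221001101500.log.1",
    ".f1_20221001113000.log.2",
    ".f1_20221001121000.log.3",
]
committed_logs = filter_committed_log_files(log_files, committed)
```

With the base instant time in the name instead, all three files above would share one instant, and the uncommitted one could only be detected by reading its blocks.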



--
This message was sent by Atlassian Jira
(v8.20.10#820010)