Posted to commits@hudi.apache.org by "Ethan Guo (Jira)" <ji...@apache.org> on 2022/04/05 05:58:00 UTC
[jira] [Created] (HUDI-3796) Implement layout to filter out uncommitted log files without reading the log blocks
Ethan Guo created HUDI-3796:
-------------------------------
Summary: Implement layout to filter out uncommitted log files without reading the log blocks
Key: HUDI-3796
URL: https://issues.apache.org/jira/browse/HUDI-3796
Project: Apache Hudi
Issue Type: Improvement
Components: writer-core
Reporter: Ethan Guo
Fix For: 0.12.0
Related: HUDI-3637
At a high level, getLatestFileSlices() fetches the latest file slices for committed base files and filters out any file slice whose base instant time is uncommitted. The latest file slices may still include uncommitted log files, which are only skipped later during log reading and merging, i.e., by the logic in "AbstractHoodieLogRecordReader".
We can use the log instant time instead of the base instant time in the log file name, so that uncommitted log files can be filtered out from the file name alone, without reading the log blocks beforehand.
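
A minimal sketch of the idea, not Hudi's actual implementation: it assumes a simplified, hypothetical log file name layout of ".<fileId>_<logInstantTime>.log.<version>" and a set of committed instant times taken from the timeline. The class name, method names, and regex below are illustrative assumptions only.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Illustrative sketch only; none of these names come from the Hudi codebase.
class LogFileFilterSketch {

    // Hypothetical, simplified layout: .<fileId>_<logInstantTime>.log.<version>
    private static final Pattern LOG_FILE_NAME =
        Pattern.compile("^\\.(.+)_(\\d+)\\.log\\.(\\d+)$");

    // Extracts the instant time embedded in the file name, if the name matches the layout.
    static Optional<String> instantTimeOf(String logFileName) {
        Matcher m = LOG_FILE_NAME.matcher(logFileName);
        return m.matches() ? Optional.of(m.group(2)) : Optional.empty();
    }

    // Keeps only log files whose embedded instant time is in the committed set.
    // Names that do not match the assumed layout are dropped.
    static List<String> filterCommitted(List<String> logFileNames, Set<String> committedInstants) {
        return logFileNames.stream()
            .filter(name -> instantTimeOf(name).map(committedInstants::contains).orElse(false))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> logFiles = List.of(
            ".file1_20220405055800.log.1",   // written by a committed instant
            ".file1_20220405060000.log.2");  // written by an uncommitted instant
        Set<String> committed = Set.of("20220405055800");
        // Filtering uses file names only; no log block is opened or read.
        System.out.println(filterCommitted(logFiles, committed));
    }
}
```

With the base instant time in the name, both files above would carry the same instant and could not be told apart without opening them; with the log instant time, a set lookup per file name is enough.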
--
This message was sent by Atlassian Jira
(v8.20.1#820001)