Posted to commits@hudi.apache.org by "sivabalan narayanan (Jira)" <ji...@apache.org> on 2023/03/30 02:08:00 UTC

[jira] [Updated] (HUDI-5420) Fix metadata table validator to exclude uncommitted log files in successful deltacommits

     [ https://issues.apache.org/jira/browse/HUDI-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sivabalan narayanan updated HUDI-5420:
--------------------------------------
    Fix Version/s: 0.12.3

> Fix metadata table validator to exclude uncommitted log files in successful deltacommits
> ----------------------------------------------------------------------------------------
>
>                 Key: HUDI-5420
>                 URL: https://issues.apache.org/jira/browse/HUDI-5420
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: writer-core
>            Reporter: Ethan Guo
>            Assignee: Ethan Guo
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.13.0, 0.12.3
>
>
> When a write transaction writes uncommitted log files in a delta commit (e.g., due to Spark task retries), these log files remain on the file system for some time after the delta commit succeeds, unlike uncommitted base files, which are deleted based on the markers.  Neither the delta commit metadata nor the metadata table contains entries for these log files.  Currently, the metadata table validator does not treat this as a valid case; it reports the extra log files as a discrepancy and throws errors.
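
The intended comparison can be illustrated with a minimal sketch. This is not Hudi's validator code: the function name, its parameters, and the file names below are hypothetical, and the three listings (file system, metadata table, committed delta commit metadata) are assumed to be already available as plain collections of log file names.

```python
def find_log_file_discrepancies(fs_log_files, mdt_log_files, committed_log_files):
    """Compare log file listings while tolerating uncommitted log files.

    All three arguments are iterables of log file names (hypothetical inputs):
      fs_log_files        - log files found by listing the file system
      mdt_log_files       - log files listed by the metadata table
      committed_log_files - log files recorded in completed delta commit metadata

    Returns (missing_from_mdt, missing_from_fs), each a sorted list.
    """
    fs = set(fs_log_files)
    mdt = set(mdt_log_files)
    committed = set(committed_log_files)

    # A file on storage but absent from the metadata table counts as a
    # discrepancy only if some completed delta commit actually recorded it.
    # Leftovers from task retries are not committed, so they are skipped.
    missing_from_mdt = sorted((fs - mdt) & committed)

    # In this sketch, a file the metadata table lists but storage lacks
    # is always treated as a discrepancy.
    missing_from_fs = sorted(mdt - fs)

    return missing_from_mdt, missing_from_fs


# Hypothetical listings: "f1.log.2" was left behind by a retried task.
fs_listing = ["f1.log.1", "f1.log.2", "f2.log.1"]
mdt_listing = ["f1.log.1", "f2.log.1"]
committed = ["f1.log.1", "f2.log.1"]

result = find_log_file_discrepancies(fs_listing, mdt_listing, committed)
print(result)
```

With these inputs, the uncommitted file is ignored and neither list reports a mismatch. Had "f1.log.2" appeared in the committed listing, it would be reported as missing from the metadata table.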



--
This message was sent by Atlassian Jira
(v8.20.10#820010)