Posted to commits@hudi.apache.org by "Danny Chen (Jira)" <ji...@apache.org> on 2023/04/20 09:49:00 UTC

[jira] [Closed] (HUDI-6084) Ensure write operations to MDT do not absorb failures

     [ https://issues.apache.org/jira/browse/HUDI-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Danny Chen closed HUDI-6084.
----------------------------
    Fix Version/s: 0.13.1
                   0.14.0
       Resolution: Fixed

Fixed via master branch: ab86512e770531103dabb555f8f06d0a55214e5d

> Ensure write operations to MDT do not absorb failures
> -----------------------------------------------------
>
>                 Key: HUDI-6084
>                 URL: https://issues.apache.org/jira/browse/HUDI-6084
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Prashant Wason
>            Assignee: Prashant Wason
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.13.1, 0.14.0
>
>
> Issue 1:
> When we call compaction on the MDT, we do not check the return value. The compaction operation may have reported errors in the WriteStatus, and ignoring them can leave the MDT with missing data.
> MDT operations should never succeed when errors have occurred.
> Issue 2:
> Once a deltacommit has completed, the WriteStatus has already been used to finalize the write and record the deltacommit action. The code was nevertheless collecting the WriteStatus on the driver side to check for errors that occurred during the write. Since the MDT write config has autoCommit enabled, there is no value in checking for errors at this stage because the deltacommit has already completed. Also, the write status RDD may have been unpersisted, and if a cached value is not available, collecting it will lead to re-writing of the deltacommit.
>  
> Fix:
> MDT uses FailOnFirstErrorWriteStatus, which is designed to throw an exception when the first write error is detected. Hence, we do not need to check for write errors explicitly: if any write error had occurred, the write itself would have thrown an exception instead of completing.
> Also, we do not need to check the WriteStatus after the commit has completed.
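The fail-on-first-error idea described in the fix can be illustrated with a minimal, self-contained sketch. This is not Hudi's code: `CollectingStatus`, `FailFastStatus`, and `write` are hypothetical stand-ins, and only the general behavior (throw on the first reported failure instead of recording it for a later check) is taken from the description above.

```java
import java.util.ArrayList;
import java.util.List;

public class FailFastSketch {

    /** Hypothetical stand-in for a write status that only records failures. */
    static class CollectingStatus {
        protected final List<String> failedKeys = new ArrayList<>();

        void markFailure(String recordKey, Throwable cause) {
            // Errors are remembered; the caller must check hasErrors() later.
            failedKeys.add(recordKey);
        }

        boolean hasErrors() {
            return !failedKeys.isEmpty();
        }
    }

    /** Hypothetical fail-fast variant: the first reported failure aborts the write. */
    static class FailFastStatus extends CollectingStatus {
        @Override
        void markFailure(String recordKey, Throwable cause) {
            throw new IllegalStateException("Write failed for record " + recordKey, cause);
        }
    }

    /** Hypothetical writer: any key starting with "bad" simulates a write error. */
    static CollectingStatus write(List<String> keys, CollectingStatus status) {
        for (String key : keys) {
            if (key.startsWith("bad")) {
                status.markFailure(key, new RuntimeException("simulated I/O error"));
            }
        }
        return status;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("k1", "bad-k2", "k3");

        // With a collecting status the write "succeeds" and errors are easy to miss.
        CollectingStatus collected = write(keys, new CollectingStatus());
        System.out.println("collecting status has errors: " + collected.hasErrors());

        // With a fail-fast status the write never completes, so no later check is needed.
        try {
            write(keys, new FailFastStatus());
            System.out.println("write completed");
        } catch (IllegalStateException e) {
            System.out.println("write aborted: " + e.getMessage());
        }
    }
}
```

With the fail-fast variant, a caller that forgets to inspect the returned status cannot silently commit a partial write, which is the property the fix relies on.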



--
This message was sent by Atlassian Jira
(v8.20.10#820010)