Posted to commits@hudi.apache.org by "sivabalan narayanan (Jira)" <ji...@apache.org> on 2022/04/09 20:24:00 UTC

[jira] [Updated] (HUDI-3840) Warn logs about not able to read replace commit metadata

     [ https://issues.apache.org/jira/browse/HUDI-3840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sivabalan narayanan updated HUDI-3840:
--------------------------------------
    Fix Version/s: 0.12.0

> Warn logs about not able to read replace commit metadata 
> ---------------------------------------------------------
>
>                 Key: HUDI-3840
>                 URL: https://issues.apache.org/jira/browse/HUDI-3840
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: spark
>            Reporter: sivabalan narayanan
>            Priority: Major
>             Fix For: 0.12.0
>
>
> I was trying out the Spark streaming sink w/ Hudi and saw WARN logs as below.
> {code:java}
> 22/04/09 15:54:16 WARN AbstractTableFileSystemView: Could not read commit details from /tmp/hudi_streaming_kafka/COPY_ON_WRITE/.hoodie/20220409154917240.replacecommit
> 22/04/09 15:54:16 WARN AbstractTableFileSystemView: Could not read commit details from /tmp/hudi_streaming_kafka/COPY_ON_WRITE/.hoodie/20220409155011647.replacecommit
> {code}
> But I ran some validations and ensured the data was intact. Further investigation revealed that this happens just after archival, where the replace commits shown above were part of the list of instants that got archived. So maybe an active timeline reload is missed somewhere. Since it's a warn log and does not cause any correctness issue, filing a low priority ticket. 
>  
> Steps to repro:
> Do a Spark streaming write to a Hudi COW table w/ async clustering enabled. Make archival aggressive and you should see these logs at some point; a rough sketch of the setup is below.
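> A minimal sketch of the kind of job that reproduces this, not the exact one I ran: Structured Streaming read from Kafka written to a COW table with async clustering on, and cleaner/archival configs tightened so replacecommit instants get archived quickly. Topic name, field names, and paths are placeholders.
> {code:scala}
> // Rough repro sketch. Assumptions: a Kafka topic "input_topic" with string
> // key/value and an event timestamp column; all names/paths are placeholders.
> import org.apache.spark.sql.SparkSession
> import org.apache.spark.sql.streaming.Trigger
>
> val spark = SparkSession.builder()
>   .appName("hudi-streaming-repro")
>   .getOrCreate()
>
> val input = spark.readStream
>   .format("kafka")
>   .option("kafka.bootstrap.servers", "localhost:9092")
>   .option("subscribe", "input_topic")
>   .load()
>   .selectExpr(
>     "CAST(key AS STRING) AS key",
>     "CAST(value AS STRING) AS value",
>     "timestamp AS ts",
>     "date_format(timestamp, 'yyyyMMdd') AS dt")
>
> input.writeStream
>   .format("hudi")
>   .option("hoodie.table.name", "hudi_streaming_kafka")
>   .option("hoodie.datasource.write.table.type", "COPY_ON_WRITE")
>   .option("hoodie.datasource.write.recordkey.field", "key")
>   .option("hoodie.datasource.write.precombine.field", "ts")
>   .option("hoodie.datasource.write.partitionpath.field", "dt")
>   // async clustering, so replacecommit instants show up on the timeline
>   .option("hoodie.clustering.async.enabled", "true")
>   .option("hoodie.clustering.async.max.commits", "2")
>   // aggressive cleaning/archival so those replacecommits get archived quickly
>   .option("hoodie.cleaner.commits.retained", "2")
>   .option("hoodie.keep.min.commits", "3")
>   .option("hoodie.keep.max.commits", "4")
>   .outputMode("append")
>   .option("checkpointLocation", "/tmp/hudi_streaming_kafka/checkpoint")
>   .trigger(Trigger.ProcessingTime("30 seconds"))
>   .start("/tmp/hudi_streaming_kafka/COPY_ON_WRITE")
>   .awaitTermination()
> {code}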
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)