Posted to commits@hudi.apache.org by "HunterHunter (Jira)" <ji...@apache.org> on 2022/06/22 03:19:00 UTC

[jira] [Closed] (HUDI-4271) Throw NoSuchElementException: FileID xx of partition path xx does not exist. when execute HoodieMergeHandle.getLatestBaseFile but FileID is exist in path.

     [ https://issues.apache.org/jira/browse/HUDI-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

HunterHunter closed HUDI-4271.
------------------------------
    Fix Version/s: 0.12.0
       Resolution: Fixed

> Throw NoSuchElementException: FileID xx of partition path xx does not exist. when execute HoodieMergeHandle.getLatestBaseFile but FileID is exist in path. 
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HUDI-4271
>                 URL: https://issues.apache.org/jira/browse/HUDI-4271
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: flink
>            Reporter: HunterHunter
>            Priority: Major
>             Fix For: 0.12.0
>
>
> {code:java}
> // code placeholder
> {code}
> **While debugging, I found that the commit following a completed clean throws the exception.**
> I found that `HoodieTableFileSystemView.partitionToFileGroupsMap` had lost the file group information for the last commit instant.
> (After the exception is thrown, I call `hoodieTable.getHoodieView().reset()`, and retrying `getLatestBaseFile` then works.)
> Hudi table config:
> {code:java}
>             "'table.type' = 'COPY_ON_WRITE',\n" +
>                 "'hoodie.parquet.small.file.limit' = '20', \n" +
>                 "'write.operation' = 'insert', \n" +
>                 "'write.insert.cluster' = 'true', \n" +
>                 "'hoodie.datasource.write.hive_style_partitioning' = 'true',\n" +
>                 "'write.task.max.size' = '4096', \n" +
>                 "'write.merge.max_memory'= '2048',\n" +
>                 "'write.precombine' = 'true',\n" +
>                 "'write.tasks' = '1',\n" +
>                 "'write.bucket_assign.tasks' = '1',\n" +
>                 "'hive_sync.skip_ro_suffix' = 'true',\n" +
>                 "'write.ignore.failed' = 'true',\n" +
>                 "'clean.async.enabled' = 'true',\n" +
>                 "'clean.retain_commits' = '6' \n" + {code}
> {code:java}
> The determining factors are:
> 'hoodie.parquet.small.file.limit' = '20' -- triggers new file generation
> and
> 'clean.async.enabled' = 'true' -- triggers async clean
> 'clean.retain_commits' = '6'  {code}
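
The workaround mentioned in the report (reset the cached file-system view after the exception, then retry the lookup) can be sketched roughly as follows. This is a minimal, Hudi-independent illustration: `ResetAndRetry` and `lookupWithReset` are hypothetical names that do not exist in Hudi, and the stale view is simulated with a flag. It only masks the symptom; the issue itself is marked as fixed in 0.12.0.

```java
import java.util.NoSuchElementException;
import java.util.function.Supplier;

/**
 * Hypothetical helper, not part of Hudi: retries a lookup once after
 * refreshing a cached view when the first attempt reports a missing element.
 */
final class ResetAndRetry {

    static <T> T lookupWithReset(Supplier<T> lookup, Runnable resetView) {
        try {
            return lookup.get();
        } catch (NoSuchElementException e) {
            // Assumption: the failure comes from a stale cached view,
            // so rebuild the view and retry exactly once.
            resetView.run();
            return lookup.get();
        }
    }

    public static void main(String[] args) {
        // Stand-in for a stale cache: the lookup fails until reset is called.
        boolean[] fresh = {false};
        String result = lookupWithReset(
            () -> {
                if (!fresh[0]) {
                    throw new NoSuchElementException(
                        "FileID xx of partition path xx does not exist.");
                }
                return "base-file";
            },
            () -> fresh[0] = true);
        System.out.println(result);
    }
}
```

In the reporter's setting, the `lookup` argument would presumably wrap the `getLatestBaseFile` call and `resetView` would call `hoodieTable.getHoodieView().reset()`, as quoted in the description above. A second failure after the reset is deliberately not caught, so a genuinely missing file group still surfaces.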


