Posted to commits@hudi.apache.org by "sivabalan narayanan (Jira)" <ji...@apache.org> on 2022/01/30 18:00:00 UTC

[jira] [Commented] (HUDI-3302) Re-evaluate handling of LogBlock appends when Compaction is pending

    [ https://issues.apache.org/jira/browse/HUDI-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17484404#comment-17484404 ] 

sivabalan narayanan commented on HUDI-3302:
-------------------------------------------

[~alexey.kudinkin]: may I know whether there is a correctness issue here, or whether it's about layering and abstractions? If it is a correctness issue, I'd like to target it for 0.11. If not, can you tag it with 0.12 (fix version)?

 

> Re-evaluate handling of LogBlock appends when Compaction is pending
> -------------------------------------------------------------------
>
>                 Key: HUDI-3302
>                 URL: https://issues.apache.org/jira/browse/HUDI-3302
>             Project: Apache Hudi
>          Issue Type: Task
>            Reporter: Alexey Kudinkin
>            Priority: Major
>
> Currently, when (async) Compaction for a particular File Group has been scheduled but not yet completed, if a writer tries to append additional Log Blocks to the same File Group, the following will occur:
>  # FileSystemView (when fetched) will check whether any compaction is pending and, if so, will inject a "phantom" (i.e. non-existent) log-file into the existing FileSlice. This log-file will have the same FileGroup name, but will bear the instant of the scheduled Compaction commit (on the timeline) in its name (as opposed to the instant of the base-file)
>  # Writer will pick up such log-file as the latest
>  # Writer will write into such "phantom" log-file
> REF: https://github.com/apache/hudi/blob/master/hudi-common/src/main/java/org/apache/hudi/common/table/view/AbstractTableFileSystemView.java#L199
>  
> This poses the following problems:
>  * The Reader now has to be aware of such handling and therefore must always include pending compaction instants in its timeline when fetching the FileSystemView, as otherwise it will miss newly added log-files.
>  * This pushes the decision-making point of where writes should be channeled down into FileSystemView, which is clearly alien to its scope of responsibilities.
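
The steps and the first problem described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Hudi's actual code: the class PhantomLogFileSketch, its methods, and the log-file naming scheme are all hypothetical and were chosen purely to make the described behaviour concrete.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Toy model only: every name here is hypothetical and does not mirror Hudi's real classes.
class PhantomLogFileSketch {

    // A file slice: a file group id, the base-file instant, and the log files on storage.
    static final class FileSlice {
        final String fileGroupId;
        final String baseInstant;
        final List<String> logFiles = new ArrayList<>();

        FileSlice(String fileGroupId, String baseInstant) {
            this.fileGroupId = fileGroupId;
            this.baseInstant = baseInstant;
        }
    }

    // Assumed naming scheme for illustration: ".<fileGroupId>_<instant>.log.<version>"
    static String logFileName(String fileGroupId, String instant, int version) {
        return "." + fileGroupId + "_" + instant + ".log." + version;
    }

    // Extracts the instant from a name produced by logFileName.
    static String instantOf(String logFile) {
        return logFile.substring(logFile.lastIndexOf('_') + 1, logFile.indexOf(".log."));
    }

    // Step 1: the view lists the slice's log files and, if a compaction is pending,
    // appends a "phantom" entry named after the compaction instant.
    static List<String> viewLogFiles(FileSlice slice, Optional<String> pendingCompaction) {
        List<String> visible = new ArrayList<>(slice.logFiles);
        pendingCompaction.ifPresent(
            instant -> visible.add(logFileName(slice.fileGroupId, instant, 1)));
        return visible;
    }

    // Steps 2 and 3: the writer treats the last listed log file as its append target.
    static String pickAppendTarget(List<String> visibleLogFiles) {
        if (visibleLogFiles.isEmpty()) {
            throw new IllegalStateException("no log file to append to");
        }
        return visibleLogFiles.get(visibleLogFiles.size() - 1);
    }

    // Reader side: only log files whose instant is on the reader's timeline are kept.
    static List<String> readerLogFiles(List<String> onStorage, Set<String> timeline) {
        List<String> kept = new ArrayList<>();
        for (String logFile : onStorage) {
            if (timeline.contains(instantOf(logFile))) {
                kept.add(logFile);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        FileSlice slice = new FileSlice("fg1", "001");
        slice.logFiles.add(logFileName("fg1", "001", 1));

        // Compaction scheduled at instant 003 but not yet completed.
        String target = pickAppendTarget(viewLogFiles(slice, Optional.of("003")));
        System.out.println("writer appends to: " + target);

        // Once written, the file exists on storage; a reader whose timeline omits
        // the pending compaction instant does not see it.
        slice.logFiles.add(target);
        System.out.println("reader sees: " + readerLogFiles(slice.logFiles, Set.of("001")));
    }
}
```

In this model the choice of where the write lands is made entirely inside viewLogFiles, which is the layering concern raised in the second bullet: the writer merely takes whatever the view reports as latest.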



--
This message was sent by Atlassian Jira
(v8.20.1#820001)