Posted to commits@hudi.apache.org by "sivabalan narayanan (Jira)" <ji...@apache.org> on 2022/10/31 19:57:00 UTC

[jira] [Closed] (HUDI-4893) More than 1 splits are created for a single log file for MOR table

     [ https://issues.apache.org/jira/browse/HUDI-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sivabalan narayanan closed HUDI-4893.
-------------------------------------
    Fix Version/s: 0.12.2
                       (was: 0.13.0)
       Resolution: Fixed

> More than 1 splits are created for a single log file for MOR table
> ------------------------------------------------------------------
>
>                 Key: HUDI-4893
>                 URL: https://issues.apache.org/jira/browse/HUDI-4893
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: reader-core
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Blocker
>             Fix For: 0.12.2
>
>
> While debugging a flaky test, I realized that we are generating more than one split for a single log file. I root-caused it to isSplitable(), which returns true for HoodieRealtimePath.
>  
> [https://github.com/apache/hudi/blob/6dbe2960f2eaf0408dc0ef544991cad0190050a9/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieRealtimePath.java#L91]
>  
> I made a quick fix locally and verified that only one split is generated per log file. 
>  
> {code:java}
> git diff hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieRealtimePath.java
> diff --git a/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieRealtimePath.java b/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieRealtimePath.java
> index bba44d5c66..d09dfdf753 100644
> --- a/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieRealtimePath.java
> +++ b/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieRealtimePath.java
> @@ -89,7 +89,7 @@ public class HoodieRealtimePath extends Path {
>    }
>  
>    public boolean isSplitable() {
> -    return !toString().isEmpty() && !includeBootstrapFilePath();
> +    return !toString().contains(".log") && !includeBootstrapFilePath();
>    }
>  
>    public PathWithBootstrapFileStatus getPathWithBootstrapFileStatus() { {code}
>  
>  
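The effect of the one-line change can be sketched in isolation. The following is a minimal standalone illustration, not Hudi code: it copies the two boolean expressions from the diff into static methods that take the path as a plain string, and it models includeBootstrapFilePath() as a boolean parameter. The class name, method names, and sample file paths are hypothetical and chosen only for illustration.

```java
// Standalone sketch: mirrors the two expressions from the diff, without any Hudi or Hadoop classes.
class SplitableSketch {

    // Expression before the fix: any non-empty path without a bootstrap file counts as splittable.
    static boolean isSplitableBefore(String path, boolean includeBootstrapFilePath) {
        return !path.isEmpty() && !includeBootstrapFilePath;
    }

    // Expression after the fix: a path containing ".log" is never splittable.
    static boolean isSplitableAfter(String path, boolean includeBootstrapFilePath) {
        return !path.contains(".log") && !includeBootstrapFilePath;
    }

    public static void main(String[] args) {
        // Hypothetical file paths, used only to exercise the two predicates.
        String logFile = "/tmp/tbl/2022/10/31/.file1_100.log.1_0-1-0";
        String baseFile = "/tmp/tbl/2022/10/31/file1_0-1-0_100.parquet";

        System.out.println(isSplitableBefore(logFile, false));  // true: log file could be split
        System.out.println(isSplitableAfter(logFile, false));   // false: log file stays in one split
        System.out.println(isSplitableBefore(baseFile, false)); // true
        System.out.println(isSplitableAfter(baseFile, false));  // true: base file behavior unchanged
    }
}
```

Note that String.contains(".log") matches anywhere in the string, so in this sketch a path with ".log" in a directory name would also be treated as non-splittable.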



--
This message was sent by Atlassian Jira
(v8.20.10#820010)