Posted to commits@hudi.apache.org by "Sagar Sumit (Jira)" <ji...@apache.org> on 2023/04/13 17:16:00 UTC
[jira] [Closed] (HUDI-5990) Incremental queries on MOR sometimes miss data
[ https://issues.apache.org/jira/browse/HUDI-5990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sagar Sumit closed HUDI-5990.
-----------------------------
Resolution: Fixed
> Incremental queries on MOR sometimes miss data
> ----------------------------------------------
>
> Key: HUDI-5990
> URL: https://issues.apache.org/jira/browse/HUDI-5990
> Project: Apache Hudi
> Issue Type: Bug
> Components: spark-sql
> Affects Versions: 0.12.2, 0.13.0
> Reporter: ruofan
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.14.0
>
>
> Environment: Hudi 0.12.2, Spark 3.2.0
> Currently, we have the following Hudi timeline and data files:
> {code:java}
> -rw-r--r-- 1 rfyu rfyu 1.5K Mar 26 09:58 20230326095758155.deltacommit
> -rw-r--r-- 1 rfyu rfyu 0 Mar 26 09:57 20230326095758155.deltacommit.inflight
> -rw-r--r-- 1 rfyu rfyu 0 Mar 26 09:57 20230326095758155.deltacommit.requested
> -rw-r--r-- 1 rfyu rfyu 1.6K Mar 26 09:58 20230326095810406.deltacommit
> -rw-r--r-- 1 rfyu rfyu 0 Mar 26 09:58 20230326095810406.deltacommit.inflight
> -rw-r--r-- 1 rfyu rfyu 0 Mar 26 09:58 20230326095810406.deltacommit.requested
> -rw-r--r-- 1 rfyu rfyu 1.7K Mar 26 09:58 20230326095811072.deltacommit
> -rw-r--r-- 1 rfyu rfyu 0 Mar 26 09:58 20230326095811072.deltacommit.inflight
> -rw-r--r-- 1 rfyu rfyu 0 Mar 26 09:58 20230326095811072.deltacommit.requested
> -rw-r--r-- 1 rfyu rfyu 1.7K Mar 26 09:58 20230326095820974.deltacommit
> -rw-r--r-- 1 rfyu rfyu 0 Mar 26 09:58 20230326095820974.deltacommit.inflight
> -rw-r--r-- 1 rfyu rfyu 0 Mar 26 09:58 20230326095820974.deltacommit.requested
> -rw-r--r-- 1 rfyu rfyu 1.8K Mar 26 09:58 20230326095830980.deltacommit
> -rw-r--r-- 1 rfyu rfyu 0 Mar 26 09:58 20230326095830980.deltacommit.inflight
> -rw-r--r-- 1 rfyu rfyu 0 Mar 26 09:58 20230326095830980.deltacommit.requested
> -rw-r--r-- 1 rfyu rfyu 1.8K Mar 26 09:58 20230326095840978.compaction.requested
> -rw-r--r-- 1 rfyu rfyu 1.5K Mar 26 09:58 20230326095841125.deltacommit
> -rw-r--r-- 1 rfyu rfyu 0 Mar 26 09:58 20230326095841125.deltacommit.inflight
> -rw-r--r-- 1 rfyu rfyu 0 Mar 26 09:58 20230326095841125.deltacommit.requested
> -rw-r--r-- 1 rfyu rfyu 1.6K Mar 26 09:59 20230326095850994.deltacommit
> -rw-r--r-- 1 rfyu rfyu 0 Mar 26 09:58 20230326095850994.deltacommit.inflight
> -rw-r--r-- 1 rfyu rfyu 0 Mar 26 09:58 20230326095850994.deltacommit.requested
> -rw-r--r-- 1 rfyu rfyu 1.7K Mar 26 09:59 20230326095900988.deltacommit
> -rw-r--r-- 1 rfyu rfyu 0 Mar 26 09:59 20230326095900988.deltacommit.inflight
> -rw-r--r-- 1 rfyu rfyu 0 Mar 26 09:59 20230326095900988.deltacommit.requested
> -rw-r--r-- 1 rfyu rfyu 1.7K Mar 26 09:59 20230326095910983.deltacommit
> -rw-r--r-- 1 rfyu rfyu 0 Mar 26 09:59 20230326095910983.deltacommit.inflight
> -rw-r--r-- 1 rfyu rfyu 0 Mar 26 09:59 20230326095910983.deltacommit.requested
> -rw-r--r-- 1 rfyu rfyu 0 Mar 26 09:59 20230326095920986.deltacommit.inflight
> -rw-r--r-- 1 rfyu rfyu 0 Mar 26 09:59 20230326095920986.deltacommit.requested
> -rw-r--r-- 1 rfyu rfyu 1.5K Mar 26 09:58 .b9f3a322-b0fe-4f70-8ad8-aa2664be957c_20230326095758155.log.1_0-1-0
> -rw-r--r-- 1 rfyu rfyu 3.0K Mar 26 09:58 .b9f3a322-b0fe-4f70-8ad8-aa2664be957c_20230326095758155.log.2_0-1-0
> -rw-r--r-- 1 rfyu rfyu 3.0K Mar 26 09:58 .b9f3a322-b0fe-4f70-8ad8-aa2664be957c_20230326095758155.log.3_0-1-0
> -rw-r--r-- 1 rfyu rfyu 3.0K Mar 26 09:58 .b9f3a322-b0fe-4f70-8ad8-aa2664be957c_20230326095758155.log.4_0-1-0
> -rw-r--r-- 1 rfyu rfyu 3.0K Mar 26 09:58 .b9f3a322-b0fe-4f70-8ad8-aa2664be957c_20230326095758155.log.5_0-1-0
> -rw-r--r-- 1 rfyu rfyu 3.0K Mar 26 09:58 .b9f3a322-b0fe-4f70-8ad8-aa2664be957c_20230326095840978.log.1_0-1-0
> -rw-r--r-- 1 rfyu rfyu 3.0K Mar 26 09:59 .b9f3a322-b0fe-4f70-8ad8-aa2664be957c_20230326095840978.log.2_0-1-0
> -rw-r--r-- 1 rfyu rfyu 3.0K Mar 26 09:59 .b9f3a322-b0fe-4f70-8ad8-aa2664be957c_20230326095840978.log.3_0-1-0
> -rw-r--r-- 1 rfyu rfyu 3.0K Mar 26 09:59 .b9f3a322-b0fe-4f70-8ad8-aa2664be957c_20230326095840978.log.4_0-1-0
> {code}
> We use Spark to query this Hudi table incrementally. Data may go missing when the incremental range contains an incomplete compaction plan.
> Here is an example of such an incremental query. From begin_instance_time to end_instance_time, 6 commits should be returned, but only 3 are.
> {code:java}
> sql:
> call copy_to_table(table=>'hudi_table',new_table=>'incremental_table',query_type=>'incremental',begin_instance_time=>'20230326095810406',end_instance_time=>'20230326095900988');
> select _hoodie_commit_time,count(*) from incremental_table group by _hoodie_commit_time order by _hoodie_commit_time desc;
> actual result:
> +-------------------+--------+
> |_hoodie_commit_time|count(1)|
> +-------------------+--------+
> |20230326095830980  |10      |
> |20230326095820974  |10      |
> |20230326095811072  |10      |
> +-------------------+--------+
> expected result:
> +-------------------+--------+
> |_hoodie_commit_time|count(1)|
> +-------------------+--------+
> |20230326095900988  |10      |
> |20230326095850994  |10      |
> |20230326095841125  |10      |
> |20230326095830980  |10      |
> |20230326095820974  |10      |
> |20230326095811072  |10      |
> +-------------------+--------+
> {code}
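
To make the expected count easier to check, here is a minimal Python sketch that filters the completed deltacommits from the timeline listing above. It is not Hudi code. It assumes, as the expected result implies, that the begin instant is exclusive and the end instant is inclusive; the helper name instants_in_range is hypothetical. It also splits the expected instants around the pending compaction instant to show which side of it the missing commits fall on.

```python
# Completed deltacommit instants copied from the timeline listing above.
# 20230326095920986 is left out because it only has inflight/requested files.
completed_deltacommits = [
    "20230326095758155",
    "20230326095810406",
    "20230326095811072",
    "20230326095820974",
    "20230326095830980",
    "20230326095841125",
    "20230326095850994",
    "20230326095900988",
    "20230326095910983",
]

# The compaction plan is only requested, so it is still incomplete.
pending_compaction = "20230326095840978"

begin_instant = "20230326095810406"
end_instant = "20230326095900988"


def instants_in_range(instants, begin, end):
    """Hypothetical helper: keep instants t with begin < t <= end.

    Plain string comparison is enough because all instants are
    fixed-width timestamps.
    """
    return [t for t in instants if begin < t <= end]


expected = instants_in_range(completed_deltacommits, begin_instant, end_instant)

# Split the expected instants around the pending compaction instant.
before_compaction = [t for t in expected if t < pending_compaction]
after_compaction = [t for t in expected if t > pending_compaction]

print("expected commits:", len(expected))
print("before pending compaction:", before_compaction)
print("after pending compaction:", after_compaction)
```

Under these assumptions, the instants before the pending compaction are exactly the three commits in the actual result, and the instants after it are exactly the three missing ones.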