Posted to commits@hudi.apache.org by "yanxiang (Jira)" <ji...@apache.org> on 2022/05/18 08:53:00 UTC

[jira] [Updated] (HUDI-4119) the first read result is incorrect when Flink upsert-kafka connector is used in Hudi

     [ https://issues.apache.org/jira/browse/HUDI-4119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yanxiang updated HUDI-4119:
---------------------------
    Status: In Progress  (was: Open)

> the first read result is incorrect when Flink upsert-kafka connector is used in Hudi
> ------------------------------------------------------------------------------------
>
>                 Key: HUDI-4119
>                 URL: https://issues.apache.org/jira/browse/HUDI-4119
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: yanxiang
>            Priority: Major
>             Fix For: 0.11.0
>
>
>  The first read result is incorrect when the Flink upsert-kafka connector is used to write into a Hudi table.
>  
>  ETL path: Flink upsert-kafka connector -> Hudi table (MOR table, queried in streaming mode)
>  
> Here is the case:
>  
> 1. First time: write two records with the same primary key into Kafka and insert them into the Hudi table. The query result should be three changelog rows: +I first record, -U first record, +U second record. But the first time I queried the Hudi table, every row was a +I (+I first record, +I first record, +I second record) and there was no update operation.
>  The three +I rows affect Hudi's subsequent ETL process: the result of a GROUP BY is inaccurate.
> 2. Second time: exit the first query and restart the query job on the Hudi table. The query results are now normal: +I first record, -U first record, +U second record.
>  
> Reason:
> There is a bug in the program. When no log file has been generated, the schema does not include the column '_hoodie_operation'. Please refer to the following link for details:
> [https://www.jianshu.com/p/29f9ec5e606e]
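
To illustrate why three +I rows make a downstream GROUP BY inaccurate, here is a minimal Python sketch. It is not Hudi or Flink code: the aggregate() helper, the key "id1" and the values 10 and 20 are hypothetical, and the sketch only assumes that a retract-style aggregation adds +I/+U rows and subtracts -U/-D rows.

```python
from collections import defaultdict


def aggregate(changelog):
    """Apply changelog rows to a per-key running COUNT and SUM.

    Hypothetical helper: +I and +U rows are added, -U and -D rows are
    retracted, which is the assumption this sketch makes about a
    retract-style group-by.
    """
    totals = defaultdict(lambda: {"count": 0, "sum": 0})
    for op, key, value in changelog:
        sign = 1 if op in ("+I", "+U") else -1
        totals[key]["count"] += sign
        totals[key]["sum"] += sign * value
    return dict(totals)


# Changelog the report expects: insert, retract the old value, emit the new one.
expected_rows = [("+I", "id1", 10), ("-U", "id1", 10), ("+U", "id1", 20)]
# Changelog the report observed on the first read: every row is a +I.
observed_rows = [("+I", "id1", 10), ("+I", "id1", 10), ("+I", "id1", 20)]

expected_result = aggregate(expected_rows)
observed_result = aggregate(observed_rows)

print(expected_result)  # {'id1': {'count': 1, 'sum': 20}}
print(observed_result)  # {'id1': {'count': 3, 'sum': 40}}
```

With the expected changelog the key ends up with one row and the latest value; with three +I rows the same key is counted three times and the stale value is never retracted.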



--
This message was sent by Atlassian Jira
(v8.20.7#820007)