Posted to commits@hudi.apache.org by "ZiyueGuan (Jira)" <ji...@apache.org> on 2022/01/06 03:38:00 UTC

[jira] [Updated] (HUDI-3026) HoodieAppendHandle may result in duplicate key for HBase index

     [ https://issues.apache.org/jira/browse/HUDI-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ZiyueGuan updated HUDI-3026:
----------------------------
    Description: 
Problem: the same record key may occur in two file groups when the HBase index is used. These two file groups will have the same fileID prefix. As the HBase index is global, this is unexpected.

How to repro:

We need a table whose records are not sorted in Spark. Let's say we have five records with keys 1, 2, 3, 4, 5 to write. They may be iterated in a different order on each task attempt.

In the first task attempt (attempt 1), we write three records 5, 4, 3 to fileID_1_log.1_attempt1, but this attempt fails. Spark retries with a second task attempt (attempt 2), which writes four records 1, 2, 3, 4 to fileID_1_log.1_attempt2. At that point, canWrite reports that this file group is large enough, so Hudi writes record 5 to fileID_2_log.1_attempt2 and finishes the commit.

When we run compaction, fileID_1_log.1_attempt1 and fileID_1_log.1_attempt2 are compacted together. We finally get 5,4,3 + 1,2,3,4 = 1,2,3,4,5 in fileID_1, while we also have 5 in fileID_2. Record 5 appears in two file groups.
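The scenario above can be simulated with a short, self-contained sketch (the file names and merge logic are illustrative, not Hudi's actual compaction code):

```java
import java.util.*;

// Illustrative repro sketch (not Hudi code): compaction merges every log file
// that shares a fileID prefix, so a leftover file from the failed attempt 1
// resurrects key 5 in fileID_1 even though attempt 2 wrote it to fileID_2.
public class DuplicateKeySketch {
    static Map<String, Set<Integer>> compact(Map<String, List<Integer>> logFiles) {
        Map<String, Set<Integer>> fileGroups = new TreeMap<>();
        logFiles.forEach((name, keys) -> {
            String fileId = name.substring(0, name.indexOf("_log"));
            // Keys are de-duplicated only within a single file group.
            fileGroups.computeIfAbsent(fileId, k -> new TreeSet<>()).addAll(keys);
        });
        return fileGroups;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> logFiles = new LinkedHashMap<>();
        logFiles.put("fileID_1_log.1_attempt1", Arrays.asList(5, 4, 3)); // failed attempt
        logFiles.put("fileID_1_log.1_attempt2", Arrays.asList(1, 2, 3, 4));
        logFiles.put("fileID_2_log.1_attempt2", Arrays.asList(5));
        System.out.println(compact(logFiles));
        // prints {fileID_1=[1, 2, 3, 4, 5], fileID_2=[5]} -> key 5 in two file groups
    }
}
```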

Reason: marker files do not reconcile log files, as the code shows in [https://github.com/apache/hudi/blob/9a2030ab3190acf600ce4820be9a08929595763e/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/HoodieTable.java#L553].

And log files are in fact not fail-safe: data appended by a failed attempt is never cleaned up.
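As a rough illustration of the gap (the method and file names below are hypothetical, not the real HoodieTable logic): marker-based reconciliation removes stray base files from failed attempts, but log files are skipped entirely, which is why attempt 1's data survives.

```java
import java.util.*;

// Hypothetical sketch of the reconciliation gap: base files without a marker
// from the successful attempt are deleted, but log files are never touched.
public class MarkerReconcileSketch {
    static Set<String> reconcile(Set<String> filesOnStorage, Set<String> markers) {
        Set<String> kept = new LinkedHashSet<>();
        for (String file : filesOnStorage) {
            boolean isLogFile = file.contains("_log");
            // Log files are skipped; base files survive only with a marker.
            if (isLogFile || markers.contains(file)) kept.add(file);
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> onStorage = new LinkedHashSet<>(Arrays.asList(
            "fileID_1_log.1_attempt1",     // failed attempt's log file: wrongly kept
            "fileID_1_log.1_attempt2",
            "fileID_3_attempt1.parquet")); // failed attempt's base file: removed
        Set<String> markers = Collections.singleton("fileID_1_log.1_attempt2");
        System.out.println(reconcile(onStorage, markers));
        // prints [fileID_1_log.1_attempt1, fileID_1_log.1_attempt2]
    }
}
```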

I'm not sure whether [~danny0405] has found this problem too, as I noticed canWrite in FlinkAppendHandle had been made to always return true. But it was changed back recently.

Solution:

We could apply a quick fix by making canWrite in HoodieAppendHandle always return true. However, I think there may be a more elegant solution: use the append result to generate the compaction plan rather than listing log files, which gives us more granular control over log blocks instead of log files.
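A minimal sketch of the quick fix, under simplified assumptions (the method below only models the size check; the real HoodieAppendHandle compares an estimated file size against a configured limit):

```java
import java.util.*;

// Simplified model of the quick fix: with the size check in place, a single
// task attempt can spill records into a second file group; forcing canWrite
// to true keeps one attempt's records inside a single file group.
public class CanWriteSketch {
    static Map<String, List<Integer>> write(List<Integer> keys, boolean alwaysCanWrite, int sizeLimit) {
        Map<String, List<Integer>> fileGroups = new LinkedHashMap<>();
        int groupId = 1;
        List<Integer> current = new ArrayList<>();
        fileGroups.put("fileID_" + groupId, current);
        for (int key : keys) {
            boolean canWrite = alwaysCanWrite || current.size() < sizeLimit;
            if (!canWrite) { // roll over to a new file group, as in the repro
                current = new ArrayList<>();
                fileGroups.put("fileID_" + (++groupId), current);
            }
            current.add(key);
        }
        return fileGroups;
    }

    public static void main(String[] args) {
        List<Integer> keys = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(write(keys, false, 4)); // {fileID_1=[1, 2, 3, 4], fileID_2=[5]}
        System.out.println(write(keys, true, 4));  // {fileID_1=[1, 2, 3, 4, 5]}
    }
}
```

The quick fix keeps the change local, at the cost of never size-splitting within one append handle; the plan-based approach avoids that trade-off by tracking what each attempt actually appended.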



> HoodieAppendHandle may result in duplicate key for HBase index
> --------------------------------------------------------------
>
>                 Key: HUDI-3026
>                 URL: https://issues.apache.org/jira/browse/HUDI-3026
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: ZiyueGuan
>            Assignee: ZiyueGuan
>            Priority: Major
>              Labels: pull-request-available
>



--
This message was sent by Atlassian Jira
(v8.20.1#820001)