Posted to commits@hudi.apache.org by "nsivabalan (via GitHub)" <gi...@apache.org> on 2023/04/01 04:23:18 UTC

[GitHub] [hudi] nsivabalan commented on a diff in pull request #8344: [HUDI-5968] Fix global index duplicate when update partition

nsivabalan commented on code in PR #8344:
URL: https://github.com/apache/hudi/pull/8344#discussion_r1155054255


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/simple/HoodieGlobalSimpleIndex.java:
##########
@@ -135,8 +135,8 @@ private <R> HoodieData<HoodieRecord<R>> getTaggedRecords(
               HoodieRecord<R> deleteRecord = new HoodieAvroRecord(new HoodieKey(inputRecord.getRecordKey(), partitionPath), new EmptyHoodieRecordPayload());
               deleteRecord.setCurrentLocation(location);
               deleteRecord.seal();
-              // Tag the incoming record for inserting to the new partition
-              HoodieRecord<R> insertRecord = (HoodieRecord<R>) HoodieIndexUtils.getTaggedRecord(inputRecord, Option.empty());
+              // Tag the incoming record for inserting to the new partition; left unsealed for marking as dedup later
+              HoodieRecord<R> insertRecord = (HoodieRecord<R>) HoodieIndexUtils.getUnsealedTaggedRecord(inputRecord, Option.empty());

Review Comment:
   I feel we are retrofitting the sealing property to meet our goals. Instead, we should map each record to a pair (record, isUpdate boolean) within the flatMap and then use that flag rather than the seal state. I don't want the sealing property to be used for external filtering purposes.
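
   A minimal sketch of the shape this suggestion could take, not the actual Hudi code. `Rec`, `Tagged`, `tag`, and `tagAll` are hypothetical stand-ins (a plain `Map` plays the role of the global index, and Java streams stand in for `HoodieData`). The exact meaning of the `isUpdate` flag is also an assumption here: it is true for records that target an existing location and false for the record inserted into the new partition.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class TaggedPairSketch {

  // Hypothetical stand-in for a Hudi record: only a key and a partition path.
  static final class Rec {
    final String key;
    final String partition;

    Rec(String key, String partition) {
      this.key = key;
      this.partition = partition;
    }
  }

  // The (record, isUpdate) pair: the flag travels with the record,
  // so nothing downstream needs to inspect the seal state.
  static final class Tagged {
    final Rec record;
    final boolean isUpdate;

    Tagged(Rec record, boolean isUpdate) {
      this.record = record;
      this.isUpdate = isUpdate;
    }
  }

  // existingPartition is null when the key is not present in the index.
  static Stream<Tagged> tag(Rec incoming, String existingPartition) {
    if (existingPartition == null) {
      // Key not seen before: plain insert.
      return Stream.of(new Tagged(incoming, false));
    }
    if (existingPartition.equals(incoming.partition)) {
      // Same partition: update in place.
      return Stream.of(new Tagged(incoming, true));
    }
    // Partition changed: delete from the old partition, insert into the new one.
    Rec delete = new Rec(incoming.key, existingPartition);
    return Stream.of(new Tagged(delete, true), new Tagged(incoming, false));
  }

  // Applies tag() to every incoming record inside a flatMap.
  static List<Tagged> tagAll(List<Rec> incoming, Map<String, String> index) {
    return incoming.stream()
        .flatMap(r -> tag(r, index.get(r.key)))
        .collect(Collectors.toList());
  }
}
```

   Any later filtering or dedup step could then read `isUpdate` directly, leaving sealing as a purely internal immutability guard.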



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org