Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/08/18 23:22:13 UTC

[GitHub] [hudi] nsivabalan edited a comment on issue #3394: [SUPPORT] Question on hudi's default behaviour for UPSERT

nsivabalan edited a comment on issue #3394:
URL: https://github.com/apache/hudi/issues/3394#issuecomment-895408837


   There are some nuances here. Ignore the global case and different partitions for now, and just consider how two records are reconciled in general. (In other words, assume there is only one partition, and an update is written to it for a record that already exists in storage.)
   
   I guess you know what preCombine is used for: it combines two records with the same key within the same incoming write batch.
   To reconcile an incoming record with one already in storage, however, Hudi relies on HoodieRecordPayload.combineAndGetUpdateValue().
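   
   The first stage (deduplicating within one incoming batch) can be sketched roughly as below. This is a plain-Python illustration of the idea, not Hudi code: the function name `pre_combine`, the field names `id` and `ts`, and the rule that ties go to the later record in the batch are all assumptions made for the example.

```python
def pre_combine(batch, key_field="id", ordering_field="ts"):
    """Keep one record per key within a single incoming batch.

    Hypothetical helper: the record with the larger ordering value wins;
    on a tie, the record seen later in the batch is kept (an assumption).
    """
    latest = {}
    for rec in batch:
        key = rec[key_field]
        kept = latest.get(key)
        if kept is None or rec[ordering_field] >= kept[ordering_field]:
            latest[key] = rec
    return list(latest.values())


# Two records share key 1; only the one with the higher "ts" survives.
batch = [
    {"id": 1, "ts": 5, "val": "a"},
    {"id": 1, "ts": 9, "val": "b"},
    {"id": 2, "ts": 3, "val": "c"},
]
deduped = pre_combine(batch)
```

   Only the deduplicated records then go on to be reconciled with what is already in storage, which is the separate step handled by the payload class.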
   
   The most commonly used payload implementation is OverwriteWithLatestAvroPayload, which always chooses the incoming record over what's in storage.
   
   But recently we also added another payload implementation called DefaultHoodieRecordPayload. Within combineAndGetUpdateValue(), this payload honors the preCombine field value when reconciling an incoming record with what's in storage.
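   
   To make the difference concrete, here is a small plain-Python model of the two merge behaviours for a late-arriving update. It is a sketch of the semantics described above, not the Hudi implementation: the function names, the record layout, and the `ts` ordering field are hypothetical, and letting the incoming record win on equal ordering values is an assumption.

```python
def overwrite_with_latest(stored, incoming, ordering_field="ts"):
    """Models OverwriteWithLatestAvroPayload as described above:
    the incoming record always replaces the stored one."""
    return incoming


def default_payload(stored, incoming, ordering_field="ts"):
    """Models DefaultHoodieRecordPayload as described above:
    the stored record is kept when its ordering value is strictly larger.
    Ties going to the incoming record is an assumption of this sketch."""
    if stored is None or stored[ordering_field] <= incoming[ordering_field]:
        return incoming
    return stored


stored = {"id": 1, "ts": 10, "val": "newer, already in storage"}
late = {"id": 1, "ts": 7, "val": "older, arrives late"}

# The late record wins under the overwrite behaviour...
result_overwrite = overwrite_with_latest(stored, late)
# ...but loses when the ordering field is honored.
result_default = default_payload(stored, late)
```

   In a Spark datasource write, the payload is typically selected through the `hoodie.datasource.write.payload.class` option (for example `org.apache.hudi.common.model.DefaultHoodieRecordPayload`) together with `hoodie.datasource.write.precombine.field`. Depending on the Hudi version, `hoodie.payload.ordering.field` may also need to be set to the same field, so please check the configuration reference for your release.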
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org