Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/04/28 05:22:16 UTC

[GitHub] [hudi] yihua commented on issue #5442: HUDI does not deduplicate within the same partition

yihua commented on issue #5442:
URL: https://github.com/apache/hudi/issues/5442#issuecomment-1111758104

   @mandar-mw To deduplicate records within and across commits for the INSERT operation, you need to set both of the following configs to `true` (they are `false` by default): [`hoodie.datasource.write.insert.drop.duplicates`](https://hudi.apache.org/docs/configurations#hoodiedatasourcewriteinsertdropduplicates) and [`hoodie.combine.before.insert`](https://hudi.apache.org/docs/configurations#hoodiecombinebeforeinsert).  Let me know if this helps solve your problem.
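
As a rough illustration of the two configs above, here is a minimal sketch of how the writer options might be assembled for a PySpark write. The helper name `hudi_insert_dedup_options`, the table name, the field names (`id`, `ts`, `dt`), and the output path are all hypothetical and not taken from the issue; only the `hoodie.*` keys are actual Hudi configs.

```python
def hudi_insert_dedup_options(table_name, record_key, precombine_field, partition_field):
    """Build Hudi writer options for an INSERT that drops duplicate records.

    Hypothetical helper for illustration; it only assembles a plain dict.
    """
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.operation": "insert",
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        # The two configs from the comment above; both default to false.
        "hoodie.datasource.write.insert.drop.duplicates": "true",
        "hoodie.combine.before.insert": "true",
    }


# Placeholder table and field names; substitute your own.
options = hudi_insert_dedup_options("my_table", "id", "ts", "dt")

# With a Spark DataFrame `df` and the Hudi Spark bundle available
# (assumed, not executed here), the options would be passed like this:
# df.write.format("hudi").options(**options).mode("append").save("/tmp/hudi/my_table")
```

The values are passed as strings because Spark data source options are string-valued.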
