Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/04/18 20:10:10 UTC

[GitHub] [hudi] suryaprasanna commented on issue #5223: [SUPPORT] - HUDI clustering - read issues

suryaprasanna commented on issue #5223:
URL: https://github.com/apache/hudi/issues/5223#issuecomment-1101721223

   @sharathkola 
   I do not have much context on whether AWS has any custom implementation on top of the Hudi code.
   
   In the Spark DAG, the filter block shows 2 output rows. Does that mean duplicate rows are returned?
   If there are duplicates, one problem I can think of is that the completed replacecommit file does not contain partitionToReplaceFileIds; in that case the reader may be treating all 198 files as valid even though only 2 of them are.
   For further investigation, could you share with us the contents of the 20220404094047.commit and 20220404094203.replacecommit files?
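
   As a quick sanity check, here is a minimal sketch (not Hudi's own API; only the partitionToReplaceFileIds field name is taken from the timeline JSON, and the payloads below are illustrative, not a full replacecommit schema) for inspecting whether a replacecommit file under .hoodie/ actually records the replaced file groups. If the mapping is missing or empty, readers cannot tell which file groups were replaced by clustering, so stale files may still be treated as valid:

   ```python
   import json

   def replaced_file_ids(commit_json: str) -> dict:
       """Return the partition -> replaced-file-id mapping from a replacecommit payload."""
       meta = json.loads(commit_json)
       # An absent or empty mapping is the suspect condition described above.
       return meta.get("partitionToReplaceFileIds") or {}

   # Illustrative payloads (hypothetical partition path and file ids):
   healthy = '{"partitionToReplaceFileIds": {"2022/04/04": ["fid-1", "fid-2"]}}'
   broken = '{"partitionToReplaceFileIds": {}}'

   print(replaced_file_ids(healthy))  # prints the mapping
   print(replaced_file_ids(broken))   # prints {} -> replaced files not recorded
   ```

   Running this against the actual 20220404094203.replacecommit contents would show whether the mapping was written.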


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org