Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/05/17 06:03:56 UTC

[GitHub] [hudi] ChandraNarreddy commented on issue #2951: Can Hudi pull data for incremental queries with the latest snapshot alone?

ChandraNarreddy commented on issue #2951:
URL: https://github.com/apache/hudi/issues/2951#issuecomment-842018705


   @n3nash, thanks for the quick reply. A follow-up on the incremental pull: how does Hudi keep track of which records were updated, inserted, or deleted since a particular checkpoint? Does Hudi maintain the state of each record from previous commits even in the latest snapshot (which I thought was the sole purpose of the file versions)? I guess what I am trying to get my head around is how Hudi fetches changes (updates, inserts and deletes) from a particular commit X to the latest snapshot when all the states (file versions?) from X to the current one are no longer around.
   
   A follow-up on hoodie.cleaner.fileversions.retained: how is this setting different from hoodie.cleaner.commits.retained? Are they independent of each other?
   
   Thanks in advance.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org