Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2019/10/06 07:09:09 UTC

[GitHub] [incubator-hudi] bvaradar opened a new pull request #942: [WIP] [HUDI-137] Fix state transitions for Hudi cleaning action

bvaradar opened a new pull request #942: [WIP] [HUDI-137] Fix state transitions for Hudi cleaning action
URL: https://github.com/apache/incubator-hudi/pull/942
 
 
   
   Before this change, the Cleaner deleted old file versions first and only afterwards recorded the deleted files in .clean metadata files.
   With that ordering, file deletions cannot be tracked if the cleaner fails after deleting files but before writing the .clean metadata.
   This is fine for regular file-system view generation, but incremental timeline syncing relies on clean/commit/compaction metadata to keep a consistent file-system view.
   
   Cleaner state transitions are now similar to those of compaction:
   
   1. Requested : HoodieWriteClient.scheduleClean() selects the list of files that need to be deleted and stores them in metadata
   2. Inflight : HoodieWriteClient marks the clean as inflight before it starts deleting
   3. Completed : HoodieWriteClient marks the clean as completed after finishing the deletions according to the cleaner plan
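
The three states above can be sketched as a small state machine. This is only an illustration of the ordering (plan first, then delete, then complete) and is not the code in this PR: `CleanStateSketch`, `CleanAction`, and all of its methods are hypothetical names, and the plan is held in memory rather than written to the timeline.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative only: these types are hypothetical and are not Hudi classes.
public class CleanStateSketch {

    public enum State { REQUESTED, INFLIGHT, COMPLETED }

    public static final class CleanAction {
        // The plan is fixed at scheduling time, before any file is deleted.
        private final List<String> plannedFiles;
        private final List<String> deletedFiles = new ArrayList<>();
        private State state = State.REQUESTED;

        public CleanAction(List<String> plannedFiles) {
            this.plannedFiles = Collections.unmodifiableList(new ArrayList<>(plannedFiles));
        }

        public State state() {
            return state;
        }

        // Files in the plan that have not been deleted yet.
        public List<String> remaining() {
            List<String> rest = new ArrayList<>(plannedFiles);
            rest.removeAll(deletedFiles);
            return rest;
        }

        public void markInflight() {
            requireState(State.REQUESTED);
            state = State.INFLIGHT;
        }

        // Deleting is only allowed while inflight, and only for planned files.
        public void delete(String file) {
            requireState(State.INFLIGHT);
            if (!plannedFiles.contains(file)) {
                throw new IllegalArgumentException("not in clean plan: " + file);
            }
            if (!deletedFiles.contains(file)) {
                deletedFiles.add(file);
            }
        }

        public void markCompleted() {
            requireState(State.INFLIGHT);
            if (!remaining().isEmpty()) {
                throw new IllegalStateException("plan not finished: " + remaining());
            }
            state = State.COMPLETED;
        }

        private void requireState(State expected) {
            if (state != expected) {
                throw new IllegalStateException("expected " + expected + " but was " + state);
            }
        }
    }

    public static void main(String[] args) {
        CleanAction clean = new CleanAction(Arrays.asList("file-1", "file-2"));
        clean.markInflight();
        clean.delete("file-1");
        // If the cleaner failed here, the plan would still say what is left to delete.
        System.out.println("state=" + clean.state() + " remaining=" + clean.remaining());
        clean.delete("file-2");
        clean.markCompleted();
        System.out.println("state=" + clean.state());
    }
}
```

Because the list of files is recorded before the first delete, a failure between the inflight and completed states still leaves enough information to tell which deletions were intended, which is the gap described at the top of this description.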
   
   Follow-up PRs are planned after this one:
   1. HUDI-294 for making cleaner stats use relative paths.
   2. HUDI-137 for similar handling for Rollback
   3. HUDI-80  for making cleaning incremental
   
