You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2019/12/03 08:35:22 UTC

[GitHub] [incubator-hudi] bvaradar commented on issue #1062: [HUDI-294] Delete Paths written in Cleaner plan needs to be relative to partition-path

bvaradar commented on issue #1062: [HUDI-294] Delete Paths written in Cleaner plan needs to be relative to partition-path
URL: https://github.com/apache/incubator-hudi/pull/1062#issuecomment-561057985
 
 
   > > Thanks @leesf : This is a good starting point. You would need to make further changes in
   > > 
   > > 1. HoodieCleanHelper.getDeletePaths need to return only relative paths.
   > > 2. HoodieCopyOnWriteTable.scheduleClean to return only relative paths.
   > > 3. (1) and (2) will ensure HoodieCleanClient.scheduleClean will store cleaner plans with relative paths.
   > > 4. IncrementalTimelineSyncFileSystemView.addCleanInstant needs to handle relative partition paths
   > 
   > Hi @bvaradar . Currently the `HoodieCleanHelper.getDeletePaths` (point 1) and `HoodieCopyOnWriteTable.scheduleClean` (point 2) return relative path already. Also the currently `IncrementalTimelineSyncFileSystemView.addCleanInstant` only handles relative path. Please correct me if I am wrong.
   > 
   > And I addressed other comments.
   
   @leesf : Sorry, my bad regarding scheduleClean comments. I confused HoodieCleanerPlan with HoodieCleanMetadata. HoodieCleanerPlan is newly added as part of 0.5.1 and stores relative paths and do not require migration. HoodieCleanMetadata though requires migration. I looked at  IncrementalTimelineSyncFileSystemView.addCleanInstant and believe this would need more handling. This method calls a common method : IncrementalTimelineSyncFileSystemView.removeFileSlicesForPartition which is shared by rollback/restore code and assumes full path. 
   
   ```
    FileStatus[] statuses = paths.stream().map(p -> {
           FileStatus status = new FileStatus();
           status.setPath(new Path(p));
           return status;
         }).toArray(FileStatus[]::new);
   ```
   To make it easy, you might need to create absolute paths in addCleanInstant before calling removeFileSlicesForPartition. Can you check this part of the code to see if it makes sense.
   
   Thanks for doing this change.
   
   Balaji.V

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services