You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/07/28 04:27:36 UTC

[GitHub] [iceberg] jackye1995 commented on pull request #2841: API: add an action API for rewrite deletes

jackye1995 commented on pull request #2841:
URL: https://github.com/apache/iceberg/pull/2841#issuecomment-887999783


   Thank you! I reviewed the PR. I will paste my major comment also here:
   
   After reading `ConvertEqDeletesStrategy.rewriteDeletes`, I think we can make the `RewriteDeleteStrategy` interface closer to `RewriteStrategy` interface. What we have there is basically the equivalent of `planFileGroups` plus `rewriteFiles` in `RewriteStrategy`. So I would propose we have the following methods in `RewriteDeleteStrategy` to be more aligned:
   
   ```
   Iterable<DeleteFile> selectDeletesToRewrite(Iterable<FileScanTask> dataFiles);
   
   Iterable<List<FileScanTask>> planDeleteGroups(Iterable<DeleteFile> deleteFiles);
   
   Set<DeleteFile> rewriteDeletes(List<DeleteFile> deleteFilesToRewrite);
   ```
   
   And we can get the partition `StructLike` directly from the list of scan tasks instead of passing it through the task pair in `EqualityDeleteRewriter`. In this way, we can also enable partial progress for commits.
   
   Any thoughts?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org