Posted to issues@iceberg.apache.org by "szehon-ho (via GitHub)" <gi...@apache.org> on 2023/05/20 00:21:05 UTC

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #7651: Core: Compacted position delete files should use the max data sequence number of source files

szehon-ho commented on code in PR #7651:
URL: https://github.com/apache/iceberg/pull/7651#discussion_r1199507845


##########
core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java:
##########
@@ -246,12 +246,21 @@ protected void add(DataFile file) {
 
   /** Add a delete file to the new snapshot. */
   protected void add(DeleteFile file) {
-    Preconditions.checkNotNull(file, "Invalid delete file: null");

Review Comment:
   Yeah, I initially tried to move it, but the change was incomplete. I've moved it back to its original location.



##########
core/src/main/java/org/apache/iceberg/actions/RewritePositionDeletesCommitManager.java:
##########
@@ -55,12 +55,13 @@ public void commit(Set<RewritePositionDeletesGroup> fileGroups) {
     RewriteFiles rewriteFiles = table.newRewrite().validateFromSnapshot(startingSnapshotId);
 
     for (RewritePositionDeletesGroup group : fileGroups) {
+      long maxSequenceNumber = group.maxRewrittenDataSequenceNumber();

Review Comment:
   Done, now using the call directly.
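
   For context, the rule named in the PR title (a compacted position delete file takes the max data sequence number of its source files) can be sketched in isolation. This is only a rough illustration under stated assumptions: `SourceDeleteFile` and `maxDataSequenceNumber` are hypothetical names made up for this sketch, not Iceberg types or methods, and the real change goes through `group.maxRewrittenDataSequenceNumber()` as shown in the diff above.

```java
import java.util.List;

public class MaxSequenceSketch {

  /** Hypothetical stand-in for a source position delete file; not an Iceberg class. */
  public static final class SourceDeleteFile {
    private final String path;
    private final long dataSequenceNumber;

    public SourceDeleteFile(String path, long dataSequenceNumber) {
      this.path = path;
      this.dataSequenceNumber = dataSequenceNumber;
    }

    public String path() {
      return path;
    }

    public long dataSequenceNumber() {
      return dataSequenceNumber;
    }
  }

  /** Returns the largest data sequence number among the source files of one group. */
  public static long maxDataSequenceNumber(List<SourceDeleteFile> sources) {
    return sources.stream()
        .mapToLong(SourceDeleteFile::dataSequenceNumber)
        .max()
        .orElseThrow(() -> new IllegalArgumentException("Group has no source files"));
  }

  public static void main(String[] args) {
    // Illustrative file names and sequence numbers only.
    List<SourceDeleteFile> group =
        List.of(
            new SourceDeleteFile("deletes-a.parquet", 3L),
            new SourceDeleteFile("deletes-b.parquet", 7L),
            new SourceDeleteFile("deletes-c.parquet", 5L));

    // The compacted file would be committed with this sequence number.
    System.out.println(maxDataSequenceNumber(group));
  }
}
```

   Failing on an empty group, rather than returning a default such as 0, is a choice made for this sketch so that a missing input is not silently mapped to a valid-looking sequence number.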



##########
api/src/main/java/org/apache/iceberg/RewriteFiles.java:
##########
@@ -97,6 +97,22 @@ default RewriteFiles addFile(DeleteFile deleteFile) {
         this.getClass().getName() + " does not implement addFile");
   }
 
+  /**
+   * Add a new delete file with the given data sequence number.
+   *
+   * <p>This rewrite operation may change the size or layout of the delete files. When applicable,
+   * it is also recommended to discard delete records for files that are no longer part of the table
+   * state. However, the set of applicable delete records must never change.

Review Comment:
   Added a paragraph below.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


For additional commands, e-mail: issues-help@iceberg.apache.org