You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/02/23 18:46:42 UTC

[GitHub] [iceberg] szehon-ho opened a new pull request #4211: Add test for Table::expireSnapshot API removing delete files

szehon-ho opened a new pull request #4211:
URL: https://github.com/apache/iceberg/pull/4211


   From this comment: https://github.com/apache/iceberg/pull/4182#issuecomment-1047472013
   
   There are so far tests for expire-action and remove-orphans Spark Actions for V2 Delete Files, but not yet the Table API expire-snapshots.  This adds that test.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #4211: Add test for Table::expireSnapshot API removing delete files

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #4211:
URL: https://github.com/apache/iceberg/pull/4211#discussion_r815209143



##########
File path: core/src/test/java/org/apache/iceberg/TestRemoveSnapshots.java
##########
@@ -1143,4 +1145,66 @@ public void testExpireWithDefaultSnapshotAge() {
         Sets.newHashSet(firstSnapshot.manifestListLocation(), secondSnapshot.manifestListLocation()),
         deletedFiles);
   }
+
+  @Test
+  public void testExpireWithDeleteFiles() {
+    Assume.assumeTrue("Delete files only supported in V2 spec", formatVersion == 2);
+    table.updateProperties()

Review comment:
       Do we need to change the age here?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on pull request #4211: Add test for Table::expireSnapshot API removing delete files

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on pull request #4211:
URL: https://github.com/apache/iceberg/pull/4211#issuecomment-1057476165


   Thanks @szehon-ho ! This is some great coverage to have!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a change in pull request #4211: Add test for Table::expireSnapshot API removing delete files

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on a change in pull request #4211:
URL: https://github.com/apache/iceberg/pull/4211#discussion_r815214292



##########
File path: core/src/test/java/org/apache/iceberg/TestRemoveSnapshots.java
##########
@@ -1143,4 +1145,66 @@ public void testExpireWithDefaultSnapshotAge() {
         Sets.newHashSet(firstSnapshot.manifestListLocation(), secondSnapshot.manifestListLocation()),
         deletedFiles);
   }
+
+  @Test
+  public void testExpireWithDeleteFiles() {
+    Assume.assumeTrue("Delete files only supported in V2 spec", formatVersion == 2);
+    table.updateProperties()

Review comment:
       Sorry must have slipped in from copy and paste, nice catch, done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #4211: Add test for Table::expireSnapshot API removing delete files

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #4211:
URL: https://github.com/apache/iceberg/pull/4211#discussion_r818157745



##########
File path: core/src/test/java/org/apache/iceberg/TestRemoveSnapshots.java
##########
@@ -1143,4 +1145,63 @@ public void testExpireWithDefaultSnapshotAge() {
         Sets.newHashSet(firstSnapshot.manifestListLocation(), secondSnapshot.manifestListLocation()),
         deletedFiles);
   }
+
+  @Test
+  public void testExpireWithDeleteFiles() {
+    Assume.assumeTrue("Delete files only supported in V2 spec", formatVersion == 2);
+
+    // Data Manifest => File_A
+    table.newAppend()
+        .appendFile(FILE_A)
+        .commit();
+    Snapshot firstSnapshot = table.currentSnapshot();
+
+    // Data Manifest => FILE_A
+    // Delete Manifest => FILE_A_DELETES
+    table.newRowDelta()
+        .addDeletes(FILE_A_DELETES)
+        .commit();
+    Snapshot secondSnapshot = table.currentSnapshot();
+    Assert.assertEquals("Should have 1 data manifests", 1, secondSnapshot.dataManifests().size());
+    Assert.assertEquals("Should have 1 delete manifest", 1, secondSnapshot.deleteManifests().size());
+
+    // FILE_A and FILE_A_DELETES move into "DELETED" state
+    table.newRewrite()
+        .rewriteFiles(
+            ImmutableSet.of(FILE_A), ImmutableSet.of(FILE_A_DELETES), // deleted
+            ImmutableSet.of(FILE_B), ImmutableSet.of(FILE_B_DELETES)) // added
+        .validateFromSnapshot(secondSnapshot.snapshotId())
+        .commit();
+    Snapshot thirdSnapshot = table.currentSnapshot();
+    Set<ManifestFile> manifestOfDeletedFiles = thirdSnapshot.allManifests().stream().filter(
+        ManifestFile::hasDeletedFiles).collect(Collectors.toSet());
+    Assert.assertEquals("Should have two manifest of deleted files", 2,

Review comment:
       nit: manifests of deleted files




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on pull request #4211: Add test for Table::expireSnapshot API removing delete files

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on pull request #4211:
URL: https://github.com/apache/iceberg/pull/4211#issuecomment-1051327965


   @RussellSpitzer  could I ping you for review ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a change in pull request #4211: Add test for Table::expireSnapshot API removing delete files

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on a change in pull request #4211:
URL: https://github.com/apache/iceberg/pull/4211#discussion_r818163634



##########
File path: core/src/test/java/org/apache/iceberg/TestRemoveSnapshots.java
##########
@@ -1143,4 +1145,63 @@ public void testExpireWithDefaultSnapshotAge() {
         Sets.newHashSet(firstSnapshot.manifestListLocation(), secondSnapshot.manifestListLocation()),
         deletedFiles);
   }
+
+  @Test
+  public void testExpireWithDeleteFiles() {
+    Assume.assumeTrue("Delete files only supported in V2 spec", formatVersion == 2);
+
+    // Data Manifest => File_A
+    table.newAppend()
+        .appendFile(FILE_A)
+        .commit();
+    Snapshot firstSnapshot = table.currentSnapshot();
+
+    // Data Manifest => FILE_A
+    // Delete Manifest => FILE_A_DELETES
+    table.newRowDelta()
+        .addDeletes(FILE_A_DELETES)
+        .commit();
+    Snapshot secondSnapshot = table.currentSnapshot();
+    Assert.assertEquals("Should have 1 data manifests", 1, secondSnapshot.dataManifests().size());
+    Assert.assertEquals("Should have 1 delete manifest", 1, secondSnapshot.deleteManifests().size());
+
+    // FILE_A and FILE_A_DELETES move into "DELETED" state
+    table.newRewrite()
+        .rewriteFiles(
+            ImmutableSet.of(FILE_A), ImmutableSet.of(FILE_A_DELETES), // deleted
+            ImmutableSet.of(FILE_B), ImmutableSet.of(FILE_B_DELETES)) // added
+        .validateFromSnapshot(secondSnapshot.snapshotId())
+        .commit();
+    Snapshot thirdSnapshot = table.currentSnapshot();
+    Set<ManifestFile> manifestOfDeletedFiles = thirdSnapshot.allManifests().stream().filter(
+        ManifestFile::hasDeletedFiles).collect(Collectors.toSet());
+    Assert.assertEquals("Should have two manifest of deleted files", 2,

Review comment:
       Done, also fixed another typo the other way on 1165




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer merged pull request #4211: Add test for Table::expireSnapshot API removing delete files

Posted by GitBox <gi...@apache.org>.
RussellSpitzer merged pull request #4211:
URL: https://github.com/apache/iceberg/pull/4211


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org