You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/02/23 18:46:42 UTC
[GitHub] [iceberg] szehon-ho opened a new pull request #4211: Add test for Table::expireSnapshot API removing delete files
szehon-ho opened a new pull request #4211:
URL: https://github.com/apache/iceberg/pull/4211
From this comment: https://github.com/apache/iceberg/pull/4182#issuecomment-1047472013
There are so far tests for expire-action and remove-orphans Spark Actions for V2 Delete Files, but not yet the Table API expire-snapshots. This adds that test.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org
[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #4211: Add test for Table::expireSnapshot API removing delete files
Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #4211:
URL: https://github.com/apache/iceberg/pull/4211#discussion_r815209143
##########
File path: core/src/test/java/org/apache/iceberg/TestRemoveSnapshots.java
##########
@@ -1143,4 +1145,66 @@ public void testExpireWithDefaultSnapshotAge() {
Sets.newHashSet(firstSnapshot.manifestListLocation(), secondSnapshot.manifestListLocation()),
deletedFiles);
}
+
+ @Test
+ public void testExpireWithDeleteFiles() {
+ Assume.assumeTrue("Delete files only supported in V2 spec", formatVersion == 2);
+ table.updateProperties()
Review comment:
Do we need to change the age here?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org
[GitHub] [iceberg] RussellSpitzer commented on pull request #4211: Add test for Table::expireSnapshot API removing delete files
Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on pull request #4211:
URL: https://github.com/apache/iceberg/pull/4211#issuecomment-1057476165
Thanks @szehon-ho ! This is some great coverage to have!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org
[GitHub] [iceberg] szehon-ho commented on a change in pull request #4211: Add test for Table::expireSnapshot API removing delete files
Posted by GitBox <gi...@apache.org>.
szehon-ho commented on a change in pull request #4211:
URL: https://github.com/apache/iceberg/pull/4211#discussion_r815214292
##########
File path: core/src/test/java/org/apache/iceberg/TestRemoveSnapshots.java
##########
@@ -1143,4 +1145,66 @@ public void testExpireWithDefaultSnapshotAge() {
Sets.newHashSet(firstSnapshot.manifestListLocation(), secondSnapshot.manifestListLocation()),
deletedFiles);
}
+
+ @Test
+ public void testExpireWithDeleteFiles() {
+ Assume.assumeTrue("Delete files only supported in V2 spec", formatVersion == 2);
+ table.updateProperties()
Review comment:
Sorry must have slipped in from copy and paste, nice catch, done
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org
[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #4211: Add test for Table::expireSnapshot API removing delete files
Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #4211:
URL: https://github.com/apache/iceberg/pull/4211#discussion_r818157745
##########
File path: core/src/test/java/org/apache/iceberg/TestRemoveSnapshots.java
##########
@@ -1143,4 +1145,63 @@ public void testExpireWithDefaultSnapshotAge() {
Sets.newHashSet(firstSnapshot.manifestListLocation(), secondSnapshot.manifestListLocation()),
deletedFiles);
}
+
+ @Test
+ public void testExpireWithDeleteFiles() {
+ Assume.assumeTrue("Delete files only supported in V2 spec", formatVersion == 2);
+
+ // Data Manifest => File_A
+ table.newAppend()
+ .appendFile(FILE_A)
+ .commit();
+ Snapshot firstSnapshot = table.currentSnapshot();
+
+ // Data Manifest => FILE_A
+ // Delete Manifest => FILE_A_DELETES
+ table.newRowDelta()
+ .addDeletes(FILE_A_DELETES)
+ .commit();
+ Snapshot secondSnapshot = table.currentSnapshot();
+ Assert.assertEquals("Should have 1 data manifests", 1, secondSnapshot.dataManifests().size());
+ Assert.assertEquals("Should have 1 delete manifest", 1, secondSnapshot.deleteManifests().size());
+
+ // FILE_A and FILE_A_DELETES move into "DELETED" state
+ table.newRewrite()
+ .rewriteFiles(
+ ImmutableSet.of(FILE_A), ImmutableSet.of(FILE_A_DELETES), // deleted
+ ImmutableSet.of(FILE_B), ImmutableSet.of(FILE_B_DELETES)) // added
+ .validateFromSnapshot(secondSnapshot.snapshotId())
+ .commit();
+ Snapshot thirdSnapshot = table.currentSnapshot();
+ Set<ManifestFile> manifestOfDeletedFiles = thirdSnapshot.allManifests().stream().filter(
+ ManifestFile::hasDeletedFiles).collect(Collectors.toSet());
+ Assert.assertEquals("Should have two manifest of deleted files", 2,
Review comment:
nit: manifests of deleted files
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org
[GitHub] [iceberg] szehon-ho commented on pull request #4211: Add test for Table::expireSnapshot API removing delete files
Posted by GitBox <gi...@apache.org>.
szehon-ho commented on pull request #4211:
URL: https://github.com/apache/iceberg/pull/4211#issuecomment-1051327965
@RussellSpitzer could I ping you for review ?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org
[GitHub] [iceberg] szehon-ho commented on a change in pull request #4211: Add test for Table::expireSnapshot API removing delete files
Posted by GitBox <gi...@apache.org>.
szehon-ho commented on a change in pull request #4211:
URL: https://github.com/apache/iceberg/pull/4211#discussion_r818163634
##########
File path: core/src/test/java/org/apache/iceberg/TestRemoveSnapshots.java
##########
@@ -1143,4 +1145,63 @@ public void testExpireWithDefaultSnapshotAge() {
Sets.newHashSet(firstSnapshot.manifestListLocation(), secondSnapshot.manifestListLocation()),
deletedFiles);
}
+
+ @Test
+ public void testExpireWithDeleteFiles() {
+ Assume.assumeTrue("Delete files only supported in V2 spec", formatVersion == 2);
+
+ // Data Manifest => File_A
+ table.newAppend()
+ .appendFile(FILE_A)
+ .commit();
+ Snapshot firstSnapshot = table.currentSnapshot();
+
+ // Data Manifest => FILE_A
+ // Delete Manifest => FILE_A_DELETES
+ table.newRowDelta()
+ .addDeletes(FILE_A_DELETES)
+ .commit();
+ Snapshot secondSnapshot = table.currentSnapshot();
+ Assert.assertEquals("Should have 1 data manifests", 1, secondSnapshot.dataManifests().size());
+ Assert.assertEquals("Should have 1 delete manifest", 1, secondSnapshot.deleteManifests().size());
+
+ // FILE_A and FILE_A_DELETES move into "DELETED" state
+ table.newRewrite()
+ .rewriteFiles(
+ ImmutableSet.of(FILE_A), ImmutableSet.of(FILE_A_DELETES), // deleted
+ ImmutableSet.of(FILE_B), ImmutableSet.of(FILE_B_DELETES)) // added
+ .validateFromSnapshot(secondSnapshot.snapshotId())
+ .commit();
+ Snapshot thirdSnapshot = table.currentSnapshot();
+ Set<ManifestFile> manifestOfDeletedFiles = thirdSnapshot.allManifests().stream().filter(
+ ManifestFile::hasDeletedFiles).collect(Collectors.toSet());
+ Assert.assertEquals("Should have two manifest of deleted files", 2,
Review comment:
Done, also fixed another typo the other way on 1165
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org
[GitHub] [iceberg] RussellSpitzer merged pull request #4211: Add test for Table::expireSnapshot API removing delete files
Posted by GitBox <gi...@apache.org>.
RussellSpitzer merged pull request #4211:
URL: https://github.com/apache/iceberg/pull/4211
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org