You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/08/20 21:30:56 UTC

[GitHub] [iceberg] rdblue opened a new pull request #1361: Add more tests for RemoveSnapshots

rdblue opened a new pull request #1361:
URL: https://github.com/apache/iceberg/pull/1361


   This adds some basic tests for `RemoveSnapshots`.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #1361: Add more tests for RemoveSnapshots

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #1361:
URL: https://github.com/apache/iceberg/pull/1361#issuecomment-678445556


   @aokolnychyi, I agree. I think the problem is in the `ParallelIterator` since this failure happened in job planning. I took a long look at it last night and didn't see what would cause it, but more eyes on that part of the code would help.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on pull request #1361: Add more tests for RemoveSnapshots

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on pull request #1361:
URL: https://github.com/apache/iceberg/pull/1361#issuecomment-678440127


   It starts to worry me that we have test failures from time to time with JDK11. I've seen failures in at least 3 different suites and all of them look like a deviation from what was expected.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue merged pull request #1361: Add more tests for RemoveSnapshots

Posted by GitBox <gi...@apache.org>.
rdblue merged pull request #1361:
URL: https://github.com/apache/iceberg/pull/1361


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #1361: Add more tests for RemoveSnapshots

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #1361:
URL: https://github.com/apache/iceberg/pull/1361#issuecomment-678469900


   Thanks for reviewing, @RussellSpitzer & @aokolnychyi!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #1361: Add more tests for RemoveSnapshots

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #1361:
URL: https://github.com/apache/iceberg/pull/1361#issuecomment-678394614


   Looks like the test failure is the flaky Java 11 test: https://travis-ci.org/github/apache/iceberg/jobs/719767149#L2698-L2705
   
   I'm not sure what's going on there, but I've seen that test failing in other runs as well.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1361: Add more tests for RemoveSnapshots

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1361:
URL: https://github.com/apache/iceberg/pull/1361#discussion_r474290437



##########
File path: core/src/test/java/org/apache/iceberg/TestRemoveSnapshots.java
##########
@@ -50,6 +51,257 @@ public TestRemoveSnapshots(int formatVersion) {
     super(formatVersion);
   }
 
+  private long waitUntilAfter(long timestampMillis) {
+    long t = System.currentTimeMillis();
+    while (t <= timestampMillis) {
+      t = System.currentTimeMillis();
+    }
+    return t;
+  }
+
+  @Test
+  public void testExpireOlderThan() {
+    table.newAppend()
+        .appendFile(FILE_A)
+        .commit();
+
+    Snapshot firstSnapshot = table.currentSnapshot();
+
+    waitUntilAfter(table.currentSnapshot().timestampMillis());
+
+    table.newAppend()
+        .appendFile(FILE_B)
+        .commit();
+
+    long snapshotId = table.currentSnapshot().snapshotId();
+
+    long tAfterCommits = waitUntilAfter(table.currentSnapshot().timestampMillis());
+
+    Set<String> deletedFiles = Sets.newHashSet();
+
+    table.expireSnapshots()
+        .expireOlderThan(tAfterCommits)
+        .deleteWith(deletedFiles::add)
+        .commit();
+
+    Assert.assertEquals("Expire should not change current snapshot", snapshotId, table.currentSnapshot().snapshotId());
+    Assert.assertNull("Expire should remove the oldest snapshot", table.snapshot(firstSnapshot.snapshotId()));
+    Assert.assertEquals("Should not remove only the expired manifest list location",

Review comment:
       Should only




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1361: Add more tests for RemoveSnapshots

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1361:
URL: https://github.com/apache/iceberg/pull/1361#discussion_r474290437



##########
File path: core/src/test/java/org/apache/iceberg/TestRemoveSnapshots.java
##########
@@ -50,6 +51,257 @@ public TestRemoveSnapshots(int formatVersion) {
     super(formatVersion);
   }
 
+  private long waitUntilAfter(long timestampMillis) {
+    long t = System.currentTimeMillis();
+    while (t <= timestampMillis) {
+      t = System.currentTimeMillis();
+    }
+    return t;
+  }
+
+  @Test
+  public void testExpireOlderThan() {
+    table.newAppend()
+        .appendFile(FILE_A)
+        .commit();
+
+    Snapshot firstSnapshot = table.currentSnapshot();
+
+    waitUntilAfter(table.currentSnapshot().timestampMillis());
+
+    table.newAppend()
+        .appendFile(FILE_B)
+        .commit();
+
+    long snapshotId = table.currentSnapshot().snapshotId();
+
+    long tAfterCommits = waitUntilAfter(table.currentSnapshot().timestampMillis());
+
+    Set<String> deletedFiles = Sets.newHashSet();
+
+    table.expireSnapshots()
+        .expireOlderThan(tAfterCommits)
+        .deleteWith(deletedFiles::add)
+        .commit();
+
+    Assert.assertEquals("Expire should not change current snapshot", snapshotId, table.currentSnapshot().snapshotId());
+    Assert.assertNull("Expire should remove the oldest snapshot", table.snapshot(firstSnapshot.snapshotId()));
+    Assert.assertEquals("Should not remove only the expired manifest list location",

Review comment:
       Should only remove the expired manifest list location




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on pull request #1361: Add more tests for RemoveSnapshots

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on pull request #1361:
URL: https://github.com/apache/iceberg/pull/1361#issuecomment-677920750


   Probably all of these should be added to the action test suite as well


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1361: Add more tests for RemoveSnapshots

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1361:
URL: https://github.com/apache/iceberg/pull/1361#discussion_r474291294



##########
File path: core/src/test/java/org/apache/iceberg/TestRemoveSnapshots.java
##########
@@ -50,6 +51,257 @@ public TestRemoveSnapshots(int formatVersion) {
     super(formatVersion);
   }
 
+  private long waitUntilAfter(long timestampMillis) {
+    long t = System.currentTimeMillis();
+    while (t <= timestampMillis) {
+      t = System.currentTimeMillis();
+    }
+    return t;
+  }
+
+  @Test
+  public void testExpireOlderThan() {
+    table.newAppend()
+        .appendFile(FILE_A)
+        .commit();
+
+    Snapshot firstSnapshot = table.currentSnapshot();
+
+    waitUntilAfter(table.currentSnapshot().timestampMillis());
+
+    table.newAppend()
+        .appendFile(FILE_B)
+        .commit();
+
+    long snapshotId = table.currentSnapshot().snapshotId();
+
+    long tAfterCommits = waitUntilAfter(table.currentSnapshot().timestampMillis());
+
+    Set<String> deletedFiles = Sets.newHashSet();
+
+    table.expireSnapshots()
+        .expireOlderThan(tAfterCommits)
+        .deleteWith(deletedFiles::add)
+        .commit();
+
+    Assert.assertEquals("Expire should not change current snapshot", snapshotId, table.currentSnapshot().snapshotId());
+    Assert.assertNull("Expire should remove the oldest snapshot", table.snapshot(firstSnapshot.snapshotId()));
+    Assert.assertEquals("Should not remove only the expired manifest list location",
+        Sets.newHashSet(firstSnapshot.manifestListLocation()), deletedFiles);
+  }
+
+  @Test
+  public void testExpireOlderThanWithDelete() {
+    table.newAppend()
+        .appendFile(FILE_A)
+        .commit();
+
+    Snapshot firstSnapshot = table.currentSnapshot();
+    Assert.assertEquals("Should create one manifest",
+        1, firstSnapshot.allManifests().size());
+
+    waitUntilAfter(table.currentSnapshot().timestampMillis());
+
+    table.newDelete()
+        .deleteFile(FILE_A)
+        .commit();
+
+    Snapshot secondSnapshot = table.currentSnapshot();
+    Assert.assertEquals("Should create replace manifest with a rewritten manifest",
+        1, secondSnapshot.allManifests().size());
+
+    table.newAppend()
+        .appendFile(FILE_B)
+        .commit();
+
+    waitUntilAfter(table.currentSnapshot().timestampMillis());
+
+    long snapshotId = table.currentSnapshot().snapshotId();
+
+    long tAfterCommits = waitUntilAfter(table.currentSnapshot().timestampMillis());
+
+    Set<String> deletedFiles = Sets.newHashSet();
+
+    table.expireSnapshots()
+        .expireOlderThan(tAfterCommits)
+        .deleteWith(deletedFiles::add)
+        .commit();
+
+    Assert.assertEquals("Expire should not change current snapshot", snapshotId, table.currentSnapshot().snapshotId());
+    Assert.assertNull("Expire should remove the oldest snapshot", table.snapshot(firstSnapshot.snapshotId()));
+    Assert.assertNull("Expire should remove the second oldest snapshot", table.snapshot(secondSnapshot.snapshotId()));
+
+    Assert.assertEquals("Should remove expired manifest lists and deleted data file",
+        Sets.newHashSet(
+            firstSnapshot.manifestListLocation(), // snapshot expired
+            firstSnapshot.allManifests().get(0).path(), // manifest was rewritten for delete
+            secondSnapshot.manifestListLocation(), // snapshot expired
+            secondSnapshot.allManifests().get(0).path(), // manifest contained only deletes, was dropped
+            FILE_A.path()), // deleted
+        deletedFiles);
+  }
+
+  @Test
+  public void testExpireOlderThanWithDeleteInMergedManifests() {
+    // merge every commit
+    table.updateProperties()
+        .set(TableProperties.MANIFEST_MIN_MERGE_COUNT, "0")
+        .commit();
+
+    table.newAppend()
+        .appendFile(FILE_A)
+        .appendFile(FILE_B)
+        .commit();
+
+    Snapshot firstSnapshot = table.currentSnapshot();
+    Assert.assertEquals("Should create one manifest",
+        1, firstSnapshot.allManifests().size());
+
+    waitUntilAfter(table.currentSnapshot().timestampMillis());
+
+    table.newDelete()
+        .deleteFile(FILE_A) // FILE_B is still in the dataset
+        .commit();
+
+    Snapshot secondSnapshot = table.currentSnapshot();
+    Assert.assertEquals("Should create replace manifest with a rewritten manifest",

Review comment:
       Should replace manifest with a rewritten manifest




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org