Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/11/04 12:07:47 UTC

[GitHub] [iceberg] findepi commented on a diff in pull request #6091: Spark-3.3: Handle statistics file clean up from expireSnapshots action/procedure

findepi commented on code in PR #6091:
URL: https://github.com/apache/iceberg/pull/6091#discussion_r1013950242


##########
core/src/test/java/org/apache/iceberg/TestRemoveSnapshots.java:
##########
@@ -1234,6 +1245,40 @@ public void testMultipleRefsAndCleanExpiredFilesFailsForIncrementalCleanup() {
                 .commit());
   }
 
+  @Test
+  public void testExpireWithStatisticsFiles() throws URISyntaxException, IOException {
+    table.newAppend().appendFile(FILE_A).commit();
+    File statsFileLocation1 = statsFileLocation(table);
+    StatisticsFile statisticsFile1 = writeStatsFileForCurrentSnapshot(table, statsFileLocation1);
+    Assert.assertEquals(
+        "Must match the latest snapshot",
+        table.currentSnapshot().snapshotId(),
+        statisticsFile1.snapshotId());
+
+    table.newAppend().appendFile(FILE_B).commit();
+    File statsFileLocation2 = statsFileLocation(table);
+    StatisticsFile statisticsFile2 = writeStatsFileForCurrentSnapshot(table, statsFileLocation2);
+    Assert.assertEquals(
+        "Must match the latest snapshot",
+        table.currentSnapshot().snapshotId(),
+        statisticsFile2.snapshotId());
+
+    Assert.assertEquals("Should have 2 statistics files", 2, table.statisticsFiles().size());
+
+    table.updateProperties().set(TableProperties.MAX_SNAPSHOT_AGE_MS, "1").commit();
+
+    removeSnapshots(table).commit();
+
+    Assert.assertEquals("Should keep 1 snapshot", 1, Iterables.size(table.snapshots()));
+    Assertions.assertThat(table.statisticsFiles())
+        .hasSize(1)
+        .extracting(StatisticsFile::snapshotId)
+        .as("Should contain only the statistics file of snapshot2")
+        .isEqualTo(Lists.newArrayList(statisticsFile2.snapshotId()));
+    Assertions.assertThat(statsFileLocation1.exists()).isFalse();
+    Assertions.assertThat(statsFileLocation2.exists()).isTrue();

Review Comment:
   Can you please add a test with a stats file shared between snapshots?
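
To illustrate the case being asked about, here is a minimal sketch of the property such a test would probably need to check: a statistics file referenced by more than one snapshot should survive as long as any retained snapshot still points at it. This is a plain-Java model, not Iceberg code. The class `SharedStatsFileSketch`, the method `statsFilesToDelete`, and the snapshot-id-to-path map are all hypothetical names introduced for illustration, and the actual cleanup logic in the PR may differ.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model, NOT Iceberg API: each snapshot id maps to the path of
// its statistics file, and two snapshots may map to the same path.
public class SharedStatsFileSketch {

  // Returns the statistics file paths that would be safe to delete: paths
  // referenced by at least one expired snapshot and by no retained snapshot.
  static Set<String> statsFilesToDelete(
      Map<Long, String> statsPathBySnapshotId, Set<Long> retainedSnapshotIds) {
    Set<String> stillReferenced = new HashSet<>();
    Set<String> candidates = new HashSet<>();
    for (Map.Entry<Long, String> entry : statsPathBySnapshotId.entrySet()) {
      if (retainedSnapshotIds.contains(entry.getKey())) {
        stillReferenced.add(entry.getValue());
      } else {
        candidates.add(entry.getValue());
      }
    }
    // A shared file stays as long as any retained snapshot still uses it.
    candidates.removeAll(stillReferenced);
    return candidates;
  }
}
```

In terms of the test above, the shared-file variant would reuse one statistics file location for both snapshots and then assert that the file still exists after `removeSnapshots(table).commit()` expires the older snapshot.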



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org