Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/07/30 21:54:12 UTC

[GitHub] [hudi] umehrot2 commented on a change in pull request #1870: [HUDI-808] Support cleaning bootstrap source data

umehrot2 commented on a change in pull request #1870:
URL: https://github.com/apache/hudi/pull/1870#discussion_r463291857



##########
File path: hudi-client/src/test/java/org/apache/hudi/table/TestCleaner.java
##########
@@ -885,6 +888,109 @@ private void testKeepLatestCommits(boolean simulateFailureRetry, boolean enableI
         file2P0C1));
   }
 
+  @Test
+  public void testBootstrapSourceFileCleanWithKeepLatestFileVersions() throws IOException {
+    testBootstrapSourceFileClean(HoodieCleaningPolicy.KEEP_LATEST_FILE_VERSIONS);
+  }
+
+  @Test
+  public void testBootstrapSourceFileCleanWithKeepLatestCommits() throws IOException {
+    testBootstrapSourceFileClean(HoodieCleaningPolicy.KEEP_LATEST_COMMITS);
+  }
+
+  /**
+   * Test HoodieTable.clean() with Bootstrap source file clean enable.
+   */
+  @Test
+  private void testBootstrapSourceFileClean(HoodieCleaningPolicy cleaningPolicy) throws IOException {
+    HoodieWriteConfig config = HoodieWriteConfig.newBuilder().withPath(basePath).withAssumeDatePartitioning(true)
+        .withCompactionConfig(HoodieCompactionConfig.newBuilder()
+            .withCleanBootstrapSourceFileEnabled(true)
+            .withCleanerPolicy(cleaningPolicy).retainCommits(1).retainFileVersions(2).build())
+        .build();

Review comment:
       I see some private helper functions in this class, re-used by other tests, that test cleaning based on commits and on file versions. Is it possible to re-use some of those existing helpers here? Maybe introduce another boolean flag for bootstrap that additionally creates the source files and later checks whether they were cleaned up.
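
       A rough sketch of the shape suggested above, for illustration only. Everything in it is hypothetical: `CleanerTestSketch`, `CleanResult`, and `runCleanScenario` are made-up names, and the "cleaner" is a toy in-memory stand-in rather than Hudi's cleaner or the existing helpers in TestCleaner. It only shows how one shared scenario helper could take a boolean bootstrap flag, so the bootstrap tests become thin wrappers over the same code path.

```java
import java.util.ArrayList;
import java.util.List;

public class CleanerTestSketch {

  /** Hypothetical holder for what one clean run removed. */
  static final class CleanResult {
    final List<String> deletedDataFiles = new ArrayList<>();
    final List<String> deletedBootstrapSourceFiles = new ArrayList<>();
  }

  /**
   * Shared scenario helper (hypothetical). Simulates three versions of one file
   * group and keeps the newest `retained` versions. When `bootstrap` is true, the
   * first version is assumed to be backed by a bootstrap source file, which is
   * removed together with that version.
   */
  static CleanResult runCleanScenario(int retained, boolean bootstrap) {
    List<String> versions = new ArrayList<>();
    for (int commit = 1; commit <= 3; commit++) {
      versions.add("file1_c" + commit + ".parquet");
    }
    // Extra setup performed only for the bootstrap variant.
    String sourceFile = bootstrap ? "source/file1.parquet" : null;

    CleanResult result = new CleanResult();
    int toDelete = Math.max(0, versions.size() - retained);
    for (int i = 0; i < toDelete; i++) {
      result.deletedDataFiles.add(versions.get(i));
      // Extra bookkeeping performed only for the bootstrap variant.
      if (i == 0 && sourceFile != null) {
        result.deletedBootstrapSourceFiles.add(sourceFile);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    CleanResult plain = runCleanScenario(2, false);
    CleanResult boot = runCleanScenario(2, true);
    System.out.println("plain: " + plain.deletedDataFiles + " " + plain.deletedBootstrapSourceFiles);
    System.out.println("bootstrap: " + boot.deletedDataFiles + " " + boot.deletedBootstrapSourceFiles);
  }
}
```

       In the real test class, the flag would presumably be threaded through the existing private helpers instead, with the bootstrap-specific assertions guarded by it.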




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org