Posted to commits@hudi.apache.org by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org> on 2023/04/24 04:09:47 UTC

[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

zhuanshenbsj1 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1174758600


##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDWriteClient.java:
##########
@@ -292,7 +292,9 @@ protected void completeCompaction(HoodieCommitMetadata metadata,
   protected HoodieWriteMetadata<JavaRDD<WriteStatus>> compact(String compactionInstantTime, boolean shouldComplete) {
     HoodieSparkTable<T> table = HoodieSparkTable.create(config, context);
     preWrite(compactionInstantTime, WriteOperationType.COMPACT, table.getMetaClient());
-    return tableServiceClient.compact(compactionInstantTime, shouldComplete);
+    HoodieWriteMetadata<JavaRDD<WriteStatus>> compactionMetadata = tableServiceClient.compact(compactionInstantTime, shouldComplete);
+    autoCleanOnCommit();
+    return compactionMetadata;

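For context, this is how the method reads once the change above is applied. It is reconstructed directly from the hunk (the closing brace is implied by the surrounding method), with comments added for clarity:

  protected HoodieWriteMetadata<JavaRDD<WriteStatus>> compact(String compactionInstantTime, boolean shouldComplete) {
    HoodieSparkTable<T> table = HoodieSparkTable.create(config, context);
    preWrite(compactionInstantTime, WriteOperationType.COMPACT, table.getMetaClient());
    // Run the compaction through the table service client, as before.
    HoodieWriteMetadata<JavaRDD<WriteStatus>> compactionMetadata = tableServiceClient.compact(compactionInstantTime, shouldComplete);
    // New in this PR: trigger an automatic clean once the compaction has
    // completed, mirroring what the Flink offline jobs already do.
    autoCleanOnCommit();
    return compactionMetadata;
  }
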
Review Comment:
   Move the clean operation out to the offline job side && add a UT.
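
   If the clean is moved out of SparkRDDWriteClient as suggested, a minimal sketch of the offline job side could look like the following. The class and method here are illustrative, not from the patch; compact(String) and clean() are assumed to be the public write-client APIs:

     import org.apache.hudi.client.SparkRDDWriteClient;

     // Hypothetical driver logic for an offline compaction job.
     class OfflineCompactionDriver {
       void run(SparkRDDWriteClient<?> writeClient, String compactionInstantTime) {
         // Execute and commit the scheduled compaction.
         writeClient.compact(compactionInstantTime);
         // Invoke clean explicitly from the job itself instead of inside
         // SparkRDDWriteClient#compact, as suggested above.
         writeClient.clean();
       }
     }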



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org