Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/08/03 06:08:34 UTC

[GitHub] [hudi] danny0405 commented on a change in pull request #3386: [HUDI-2270] Remove corrupted clean action

danny0405 commented on a change in pull request #3386:
URL: https://github.com/apache/hudi/pull/3386#discussion_r681461078



##########
File path: hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteOperatorCoordinator.java
##########
@@ -329,6 +337,30 @@ private void initInstant(String instant) {
     }, "initialize instant %s", instant);
   }
 
+  public void removeCorruptedCleanAction() {
+    HoodieTableMetaClient client = writeClient.getHoodieTable().getMetaClient();
+    HoodieTimeline cleanerTimeline = client.getActiveTimeline().getCleanerTimeline();
+
+    executor.execute(() -> {
+      LOG.info("Inspecting clean metadata in timeline for corrupted files");
+      cleanerTimeline.getInstants().forEach(instant -> {
+        try {
+          CleanerUtils.getCleanerPlan(client, instant);
+        } catch (AvroRuntimeException e) {
+          LOG.warn("Corruption found. Trying to remove corrupted clean instant file: " + instant);
+          FSUtils.deleteInstantFile(client.getFs(), client.getMetaPath(), instant);
+        } catch (IOException ioe) {
+          if (ioe.getMessage().contains("Not an Avro data file")) {
+            LOG.warn("Corruption found. Trying to remove corrupted clean instant file: " + instant);

Review comment:
       Please move this corrupted clean instant check out of `StreamWriteOperatorCoordinator` and into `CleanFunction`.
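
       To make the suggestion concrete, here is a minimal, self-contained sketch of the inspect-and-delete loop as a standalone helper that a clean function could call once during initialization. It deliberately avoids Hudi's real classes: `PlanReader`, `InstantDeleter`, `CorruptPlanException` and `removeCorrupted` are hypothetical stand-ins (for `CleanerUtils.getCleanerPlan`, `FSUtils.deleteInstantFile` and `AvroRuntimeException` in the hunk above). The null check on the exception message and the rethrow of unrelated I/O errors are assumptions, since the quoted hunk is cut off before that branch ends.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins only; none of these types are Hudi APIs.
class CorruptedCleanSketch {

  /** Stands in for reading the cleaner plan of one instant. */
  interface PlanReader {
    void read(String instant) throws IOException;
  }

  /** Stands in for deleting the instant file from the meta path. */
  interface InstantDeleter {
    void delete(String instant);
  }

  /** Stands in for AvroRuntimeException in the original hunk. */
  static class CorruptPlanException extends RuntimeException {
    CorruptPlanException(String message) {
      super(message);
    }
  }

  /** Tries to read every clean instant and deletes those that look corrupted. */
  static List<String> removeCorrupted(List<String> instants, PlanReader reader, InstantDeleter deleter) {
    List<String> removed = new ArrayList<>();
    for (String instant : instants) {
      try {
        reader.read(instant);
      } catch (CorruptPlanException e) {
        deleter.delete(instant);
        removed.add(instant);
      } catch (IOException ioe) {
        String message = ioe.getMessage();
        // Assumption: guard against a null message before matching on it.
        if (message != null && message.contains("Not an Avro data file")) {
          deleter.delete(instant);
          removed.add(instant);
        } else {
          // Assumption: unrelated I/O failures are surfaced, not swallowed.
          throw new UncheckedIOException(ioe);
        }
      }
    }
    return removed;
  }

  public static void main(String[] args) {
    List<String> deleted = new ArrayList<>();
    PlanReader reader = instant -> {
      if (instant.equals("002")) {
        throw new IOException("Not an Avro data file");
      }
      if (instant.equals("003")) {
        throw new CorruptPlanException("bad header");
      }
    };
    List<String> removed = removeCorrupted(Arrays.asList("001", "002", "003"), reader, deleted::add);
    System.out.println(removed); // prints [002, 003]
  }
}
```

       Keeping the loop behind two small interfaces means it can be unit tested without a file system, and the two identical delete branches in the hunk could share one code path.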




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.