Posted to commits@hudi.apache.org by "danny0405 (via GitHub)" <gi...@apache.org> on 2023/02/11 06:17:16 UTC

[GitHub] [hudi] danny0405 commented on a diff in pull request #6121: [HUDI-4406] Support Flink compaction/clustering write error resolvement to avoid data loss

danny0405 commented on code in PR #6121:
URL: https://github.com/apache/hudi/pull/6121#discussion_r1103540589


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/clustering/ClusteringCommitSink.java:
##########
@@ -119,7 +119,16 @@ private void commitIfNecessary(String instant, List<ClusteringCommitEvent> event
       return;
     }
 
-    if (events.stream().anyMatch(ClusteringCommitEvent::isFailed)) {
+    // here we should take the write errors under consideration
+    // as some write errors might cause data loss when clustering
+    List<WriteStatus> statuses = events.stream()

Review Comment:
   Things are a little different when `events.stream().anyMatch(ClusteringCommitEvent::isFailed)`: the `isFailed` flag always indicates errors related to the service execution, not to the quality of the data records. In that case we should always try to roll back the instant, not just throw.
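
   The distinction the review draws could be sketched as follows. This is a simplified illustration, not the actual Hudi code: `CommitEvent` here is a hypothetical stand-in for `ClusteringCommitEvent` plus its `WriteStatus` list, and the string return values stand in for the real rollback/commit calls.

   ```java
   import java.util.Arrays;
   import java.util.List;

   public class ClusteringCommitSketch {
     // Hypothetical stand-in for ClusteringCommitEvent + its WriteStatus list.
     static class CommitEvent {
       final boolean failed;        // service-execution failure flag
       final long errorRecords;     // per-record write errors from WriteStatus
       CommitEvent(boolean failed, long errorRecords) {
         this.failed = failed;
         this.errorRecords = errorRecords;
       }
       boolean isFailed() { return failed; }
       long getTotalErrorRecords() { return errorRecords; }
     }

     /** Decide how to handle the instant: "ROLLBACK", "ABORT", or "COMMIT". */
     static String decide(List<CommitEvent> events) {
       // isFailed reflects a service-execution error: roll the instant back
       // rather than just throwing, as the reviewer suggests.
       if (events.stream().anyMatch(CommitEvent::isFailed)) {
         return "ROLLBACK";
       }
       // Per-record write errors are a data-quality problem: committing
       // anyway could silently lose the failed records.
       long totalErrors = events.stream()
           .mapToLong(CommitEvent::getTotalErrorRecords)
           .sum();
       return totalErrors > 0 ? "ABORT" : "COMMIT";
     }

     public static void main(String[] args) {
       System.out.println(decide(Arrays.asList(new CommitEvent(true, 0))));
       System.out.println(decide(Arrays.asList(new CommitEvent(false, 3))));
       System.out.println(decide(Arrays.asList(new CommitEvent(false, 0))));
     }
   }
   ```

   The point of the split: a rollback cleans up partially written files for a failed service run, while aborting on write errors prevents a commit that would drop the records that failed to cluster.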



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org