Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/10/29 12:44:34 UTC

[GitHub] [hudi] hj2016 commented on a change in pull request #2188: [HUDI-1347]fix Hbase index partition changes cause data duplication p…

hj2016 commented on a change in pull request #2188:
URL: https://github.com/apache/hudi/pull/2188#discussion_r514229034



##########
File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/index/hbase/SparkHoodieHBaseIndex.java
##########
@@ -480,6 +486,61 @@ private Integer getNumRegionServersAliveForTable() {
   @Override
   public boolean rollbackCommit(String instantTime) {
     // Rollback in HbaseIndex is managed via method {@link #checkIfValidCommit()}
+    synchronized (SparkHoodieHBaseIndex.class) {

Review comment:
       Of course, the checkIfValidCommit method already checks whether an index entry's commitTime belongs to a valid commit, so stale entries have no effect on writes after a rollback. However, if the entries from the last failed commit are not deleted, incorrect data can remain in the HBase index. My concern is keeping the data and the index contents consistent. Would it be possible to add a configuration option here so that users can choose? Or is it better not to delete at all?
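
To make the configuration idea concrete, here is a minimal sketch, not the code in this pull request. It assumes a hypothetical boolean option (`deleteOnRollback`, not an existing Hudi config key) and uses an in-memory map as a stand-in for the HBase table, so the class name, the `put` helper, and the `size` helper are all illustrative only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the index: record key -> commit time that wrote the entry.
// A real implementation would talk to HBase; this map only illustrates the control flow.
class RollbackSketch {
  private final Map<String, String> index = new HashMap<>();

  // Hypothetical option: when false, rollback leaves entries in place and relies on
  // a validity check at read time (the role checkIfValidCommit plays in the comment above).
  private final boolean deleteOnRollback;

  RollbackSketch(boolean deleteOnRollback) {
    this.deleteOnRollback = deleteOnRollback;
  }

  synchronized void put(String recordKey, String commitTime) {
    index.put(recordKey, commitTime);
  }

  // Returns true in both modes; only the side effect on the index differs.
  synchronized boolean rollbackCommit(String instantTime) {
    if (deleteOnRollback) {
      // Remove every entry written by the commit being rolled back.
      index.values().removeIf(instantTime::equals);
    }
    return true;
  }

  synchronized int size() {
    return index.size();
  }
}
```

With the option off, a rollback leaves the index untouched, as it was before this change; with it on, entries written by the rolled-back commit are removed so the index matches the data.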




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org