Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/08/07 10:22:56 UTC

[GitHub] [hudi] prashantwason opened a new pull request #3428: [HUDI-2286] Handle the case of failed deltacommit on the metadata table.

prashantwason opened a new pull request #3428:
URL: https://github.com/apache/hudi/pull/3428


   ## What is the purpose of the pull request
   
   A failed deltacommit on the metadata table is automatically rolled back. If the failed commit was "t10", the rollback happens during the next operation, at "t11". After the rollback, when we sync the dataset to the metadata table, we should look for all unsynced instants, including t11. The current code ignores t11 because the latest commit timestamp on the metadata table is t11 (due to the rollback).
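
The boundary problem described above can be pictured with a small, self-contained sketch. It does not use any Hudi classes: the class and method names (`UnsyncedInstantsSketch`, `strictlyAfter`, `atOrAfter`) are hypothetical, instants are plain strings compared lexicographically, and the inclusive comparison is only one possible way to pick up t11, not necessarily the change made in this pull request.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only; none of these names come from Hudi.
public class UnsyncedInstantsSketch {

  // Exclusive selection: keeps dataset instants strictly newer than the
  // latest timestamp seen on the metadata table.
  static List<String> strictlyAfter(List<String> datasetInstants, String latestMetadataInstant) {
    return datasetInstants.stream()
        .filter(ts -> ts.compareTo(latestMetadataInstant) > 0)
        .collect(Collectors.toList());
  }

  // Inclusive selection: also keeps the instant whose timestamp equals the
  // latest timestamp on the metadata table (assumed here to be a rollback).
  static List<String> atOrAfter(List<String> datasetInstants, String latestMetadataInstant) {
    return datasetInstants.stream()
        .filter(ts -> ts.compareTo(latestMetadataInstant) >= 0)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Assumed dataset timeline; the rollback on the metadata table happened at t11.
    List<String> datasetInstants = List.of("t09", "t10", "t11");
    String latestOnMetadataTable = "t11";

    // Exclusive comparison skips t11 and prints []
    System.out.println(strictlyAfter(datasetInstants, latestOnMetadataTable));
    // Inclusive comparison picks up t11 and prints [t11]
    System.out.println(atOrAfter(datasetInstants, latestOnMetadataTable));
  }
}
```

The sketch only shows the comparison at the t11 boundary; how the rolled-back t10 is handled is outside its scope.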
   
   ## Brief change log
   
   
   ## Verify this pull request
   
   A unit test has been added.
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] leesf commented on a change in pull request #3428: [HUDI-2286] Handle the case of failed deltacommit on the metadata table.

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #3428:
URL: https://github.com/apache/hudi/pull/3428#discussion_r686940740



##########
File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieBackedMetadata.java
##########
@@ -857,6 +859,62 @@ public void testMetadataOutOfSync() throws Exception {
     validateMetadata(unsyncedClient);
   }
 
+  /**
+   * Test that failure to perform deltacommit on the metadata table does not lead to missed sync.
+   */
+  @Test
+  public void testMetdataTableCommitFailure() throws Exception {
+    init(HoodieTableType.COPY_ON_WRITE);
+    HoodieSparkEngineContext engineContext = new HoodieSparkEngineContext(jsc);
+
+    try (SparkRDDWriteClient client = new SparkRDDWriteClient(engineContext, getWriteConfig(true, true))) {
+      // Write 1
+      String newCommitTime = "001";
+      List<HoodieRecord> records = dataGen.generateInserts(newCommitTime, 20);
+      client.startCommitWithTime(newCommitTime);
+      List<WriteStatus> writeStatuses = client.bulkInsert(jsc.parallelize(records, 1), newCommitTime).collect();
+      assertNoWriteErrors(writeStatuses);
+
+      // Write 2
+      newCommitTime = "002";
+      client.startCommitWithTime(newCommitTime);
+      records = dataGen.generateInserts(newCommitTime, 20);
+      writeStatuses = client.insert(jsc.parallelize(records, 1), newCommitTime).collect();
+      assertNoWriteErrors(writeStatuses);
+    }
+
+    // At this time both commits 001 and 002 must be synced to the metadata table
+    HoodieTableMetaClient metadataMetaClient = HoodieTableMetaClient.builder().setConf(hadoopConf).setBasePath(metadataTableBasePath).build();
+    HoodieActiveTimeline timeline = metadataMetaClient.getActiveTimeline();
+    assertTrue(timeline.containsInstant(new HoodieInstant(false, HoodieTimeline.DELTA_COMMIT_ACTION, "001")));
+    assertTrue(timeline.containsInstant(new HoodieInstant(false, HoodieTimeline.DELTA_COMMIT_ACTION, "002")));
+
+    // Delete the 002 deltacommit completed instant to make it inflight
+    FileCreateUtils.deleteDeltaCommit(metadataTableBasePath, "002");

Review comment:
       @vinothchandar However, if 002 failed, it should not be synced to the metadata table, right? I see that it was synced. Should we just create 002 as inflight to prevent it from being synced to the metadata table?







[GitHub] [hudi] hudi-bot commented on pull request #3428: [HUDI-2286] Handle the case of failed deltacommit on the metadata table.

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3428:
URL: https://github.com/apache/hudi/pull/3428#issuecomment-894719689


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "d667125f5968743f4842d5de716cd4f734a5e0f4",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1464",
       "triggerID" : "d667125f5968743f4842d5de716cd4f734a5e0f4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b517d8efd4ac5af5f75dac68e26a5c409497b355",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "b517d8efd4ac5af5f75dac68e26a5c409497b355",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d667125f5968743f4842d5de716cd4f734a5e0f4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1464) 
   * b517d8efd4ac5af5f75dac68e26a5c409497b355 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3428: [HUDI-2286] Handle the case of failed deltacommit on the metadata table.

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3428:
URL: https://github.com/apache/hudi/pull/3428#issuecomment-894636270


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "d667125f5968743f4842d5de716cd4f734a5e0f4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1464",
       "triggerID" : "d667125f5968743f4842d5de716cd4f734a5e0f4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b517d8efd4ac5af5f75dac68e26a5c409497b355",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1469",
       "triggerID" : "b517d8efd4ac5af5f75dac68e26a5c409497b355",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * b517d8efd4ac5af5f75dac68e26a5c409497b355 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1469) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #3428: [HUDI-2286] Handle the case of failed deltacommit on the metadata table.

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3428:
URL: https://github.com/apache/hudi/pull/3428#issuecomment-894636270


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "d667125f5968743f4842d5de716cd4f734a5e0f4",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "d667125f5968743f4842d5de716cd4f734a5e0f4",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d667125f5968743f4842d5de716cd4f734a5e0f4 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] leesf commented on a change in pull request #3428: [HUDI-2286] Handle the case of failed deltacommit on the metadata table.

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #3428:
URL: https://github.com/apache/hudi/pull/3428#discussion_r685266163



##########
File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieBackedMetadata.java
##########
@@ -857,6 +859,62 @@ public void testMetadataOutOfSync() throws Exception {
     validateMetadata(unsyncedClient);
   }
 
+  /**
+   * Test that failure to perform deltacommit on the metadata table does not lead to missed sync.
+   */
+  @Test
+  public void testMetdataTableCommitFailure() throws Exception {
+    init(HoodieTableType.COPY_ON_WRITE);
+    HoodieSparkEngineContext engineContext = new HoodieSparkEngineContext(jsc);
+
+    try (SparkRDDWriteClient client = new SparkRDDWriteClient(engineContext, getWriteConfig(true, true))) {
+      // Write 1
+      String newCommitTime = "001";
+      List<HoodieRecord> records = dataGen.generateInserts(newCommitTime, 20);
+      client.startCommitWithTime(newCommitTime);
+      List<WriteStatus> writeStatuses = client.bulkInsert(jsc.parallelize(records, 1), newCommitTime).collect();
+      assertNoWriteErrors(writeStatuses);
+
+      // Write 2
+      newCommitTime = "002";
+      client.startCommitWithTime(newCommitTime);
+      records = dataGen.generateInserts(newCommitTime, 20);
+      writeStatuses = client.insert(jsc.parallelize(records, 1), newCommitTime).collect();
+      assertNoWriteErrors(writeStatuses);
+    }
+
+    // At this time both commits 001 and 002 must be synced to the metadata table
+    HoodieTableMetaClient metadataMetaClient = HoodieTableMetaClient.builder().setConf(hadoopConf).setBasePath(metadataTableBasePath).build();
+    HoodieActiveTimeline timeline = metadataMetaClient.getActiveTimeline();
+    assertTrue(timeline.containsInstant(new HoodieInstant(false, HoodieTimeline.DELTA_COMMIT_ACTION, "001")));
+    assertTrue(timeline.containsInstant(new HoodieInstant(false, HoodieTimeline.DELTA_COMMIT_ACTION, "002")));
+
+    // Delete the 002 deltacommit completed instant to make it inflight
+    FileCreateUtils.deleteDeltaCommit(metadataTableBasePath, "002");

Review comment:
       Could you please clarify in which situation `001` and `002` would be committed and synced to the metadata table successfully, and `002` would then become inflight?







[GitHub] [hudi] vinothchandar commented on a change in pull request #3428: [HUDI-2286] Handle the case of failed deltacommit on the metadata table.

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #3428:
URL: https://github.com/apache/hudi/pull/3428#discussion_r686389090



##########
File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieBackedMetadata.java
##########
@@ -857,6 +859,62 @@ public void testMetadataOutOfSync() throws Exception {
     validateMetadata(unsyncedClient);
   }
 
+  /**
+   * Test that failure to perform deltacommit on the metadata table does not lead to missed sync.
+   */
+  @Test
+  public void testMetdataTableCommitFailure() throws Exception {
+    init(HoodieTableType.COPY_ON_WRITE);
+    HoodieSparkEngineContext engineContext = new HoodieSparkEngineContext(jsc);
+
+    try (SparkRDDWriteClient client = new SparkRDDWriteClient(engineContext, getWriteConfig(true, true))) {
+      // Write 1
+      String newCommitTime = "001";
+      List<HoodieRecord> records = dataGen.generateInserts(newCommitTime, 20);

Review comment:
       Wondering if we can use any existing test framework helpers to do these commits. It feels like this is repeated a lot in this file.







[GitHub] [hudi] hudi-bot edited a comment on pull request #3428: [HUDI-2286] Handle the case of failed deltacommit on the metadata table.

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3428:
URL: https://github.com/apache/hudi/pull/3428#issuecomment-894636270


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "d667125f5968743f4842d5de716cd4f734a5e0f4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1464",
       "triggerID" : "d667125f5968743f4842d5de716cd4f734a5e0f4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b517d8efd4ac5af5f75dac68e26a5c409497b355",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1469",
       "triggerID" : "b517d8efd4ac5af5f75dac68e26a5c409497b355",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * b517d8efd4ac5af5f75dac68e26a5c409497b355 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1469) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3428: [HUDI-2286] Handle the case of failed deltacommit on the metadata table.

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3428:
URL: https://github.com/apache/hudi/pull/3428#issuecomment-894636270


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "d667125f5968743f4842d5de716cd4f734a5e0f4",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1464",
       "triggerID" : "d667125f5968743f4842d5de716cd4f734a5e0f4",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d667125f5968743f4842d5de716cd4f734a5e0f4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1464) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] vinothchandar merged pull request #3428: [HUDI-2286] Handle the case of failed deltacommit on the metadata table.

Posted by GitBox <gi...@apache.org>.
vinothchandar merged pull request #3428:
URL: https://github.com/apache/hudi/pull/3428


   





[GitHub] [hudi] vinothchandar commented on a change in pull request #3428: [HUDI-2286] Handle the case of failed deltacommit on the metadata table.

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #3428:
URL: https://github.com/apache/hudi/pull/3428#discussion_r686389337



##########
File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieBackedMetadata.java
##########
@@ -857,6 +859,62 @@ public void testMetadataOutOfSync() throws Exception {
     validateMetadata(unsyncedClient);
   }
 
+  /**
+   * Test that failure to perform deltacommit on the metadata table does not lead to missed sync.
+   */
+  @Test
+  public void testMetdataTableCommitFailure() throws Exception {
+    init(HoodieTableType.COPY_ON_WRITE);
+    HoodieSparkEngineContext engineContext = new HoodieSparkEngineContext(jsc);
+
+    try (SparkRDDWriteClient client = new SparkRDDWriteClient(engineContext, getWriteConfig(true, true))) {
+      // Write 1
+      String newCommitTime = "001";
+      List<HoodieRecord> records = dataGen.generateInserts(newCommitTime, 20);
+      client.startCommitWithTime(newCommitTime);
+      List<WriteStatus> writeStatuses = client.bulkInsert(jsc.parallelize(records, 1), newCommitTime).collect();
+      assertNoWriteErrors(writeStatuses);
+
+      // Write 2
+      newCommitTime = "002";
+      client.startCommitWithTime(newCommitTime);
+      records = dataGen.generateInserts(newCommitTime, 20);
+      writeStatuses = client.insert(jsc.parallelize(records, 1), newCommitTime).collect();
+      assertNoWriteErrors(writeStatuses);
+    }
+
+    // At this time both commits 001 and 002 must be synced to the metadata table
+    HoodieTableMetaClient metadataMetaClient = HoodieTableMetaClient.builder().setConf(hadoopConf).setBasePath(metadataTableBasePath).build();
+    HoodieActiveTimeline timeline = metadataMetaClient.getActiveTimeline();
+    assertTrue(timeline.containsInstant(new HoodieInstant(false, HoodieTimeline.DELTA_COMMIT_ACTION, "001")));
+    assertTrue(timeline.containsInstant(new HoodieInstant(false, HoodieTimeline.DELTA_COMMIT_ACTION, "002")));
+
+    // Delete the 002 deltacommit completed instant to make it inflight
+    FileCreateUtils.deleteDeltaCommit(metadataTableBasePath, "002");

Review comment:
       This is just simulating a failure, as if 002 failed, by deleting the delta commit file. 
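
A minimal, self-contained sketch of the idea in this comment, assuming a simplified model of the timeline in which each instant is represented by marker files named `<time>.deltacommit.requested`, `<time>.deltacommit.inflight` and `<time>.deltacommit` (the completed marker). The class `TimelineSketch` and its helper methods are hypothetical and written only for illustration; they are not Hudi APIs and do not reproduce what `FileCreateUtils.deleteDeltaCommit` does internally. The sketch only shows why removing the completed marker makes "002" look like a deltacommit that never finished.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only: models a timeline as marker files in a temp directory.
class TimelineSketch {
  // Assumption: a completed deltacommit is marked by a file ending in ".deltacommit".
  static final String COMPLETED_SUFFIX = ".deltacommit";

  // Creates a temporary directory containing the given (empty) marker files.
  static Path newTimeline(String... fileNames) {
    try {
      Path dir = Files.createTempDirectory("timeline-sketch");
      for (String name : fileNames) {
        Files.createFile(dir.resolve(name));
      }
      return dir;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Returns the sorted instant times that have a completed deltacommit marker.
  static List<String> completedDeltaCommits(Path dir) {
    try (Stream<Path> files = Files.list(dir)) {
      return files.map(p -> p.getFileName().toString())
          .filter(n -> n.endsWith(COMPLETED_SUFFIX))
          .map(n -> n.substring(0, n.length() - COMPLETED_SUFFIX.length()))
          .sorted()
          .collect(Collectors.toList());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Simulates a failed deltacommit by deleting only the completed marker;
  // the requested/inflight markers stay, so the instant appears unfinished.
  static void simulateFailedDeltaCommit(Path dir, String instantTime) {
    try {
      Files.delete(dir.resolve(instantTime + COMPLETED_SUFFIX));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path timeline = newTimeline(
        "001.deltacommit.requested", "001.deltacommit.inflight", "001.deltacommit",
        "002.deltacommit.requested", "002.deltacommit.inflight", "002.deltacommit");
    System.out.println(completedDeltaCommits(timeline)); // prints [001, 002]
    simulateFailedDeltaCommit(timeline, "002");
    System.out.println(completedDeltaCommits(timeline)); // prints [001]
  }
}
```

After the deletion, only "001" is reported as completed, which is the state the test relies on before exercising the rollback and re-sync path.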



