Posted to commits@hudi.apache.org by "pushpavanthar (via GitHub)" <gi...@apache.org> on 2023/02/02 18:32:36 UTC

[GitHub] [hudi] pushpavanthar opened a new pull request, #7828: [HUDI-5686] Fixes data loss due to rollbacks

pushpavanthar opened a new pull request, #7828:
URL: https://github.com/apache/hudi/pull/7828

   ### Change Logs
   
   _Describe context and summary for this change. Highlight if any code was copied._
   
   Refer to JIRA HUDI-5686 for detailed description of the issue.
   The approach here is to avoid creating new instants (timestamps) for rollbacks when the timeline contains incomplete commits left behind by previous runs. Instead of creating a new instant, I reuse the timestamp of the commit being rolled back to create the rollback instant, so that it abides by the chronological order of the commits.
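
A minimal sketch of the idea, not the actual Hudi code: it isolates only the choice of the rollback timestamp made in the diff under review. `java.util.Optional` stands in for Hudi's `Option`, the class and method names are hypothetical, the timestamps are made-up strings, and the "new instant time" is passed in as a parameter rather than obtained from `HoodieActiveTimeline.createNewInstantTime()`.

```java
import java.util.Optional;

public class RollbackInstantTimeSketch {

    // Behaviour before the change: resume a pending rollback if one exists,
    // otherwise use a freshly created instant time for the rollback.
    public static String beforeChange(Optional<String> pendingRollbackTime, String newInstantTime) {
        return pendingRollbackTime.orElse(newInstantTime);
    }

    // Behaviour proposed in this PR: resume a pending rollback if one exists,
    // otherwise reuse the timestamp of the commit that is being rolled back.
    public static String afterChange(Optional<String> pendingRollbackTime, String commitInstantTime) {
        return pendingRollbackTime.orElse(commitInstantTime);
    }

    public static void main(String[] args) {
        String failedCommit = "20230202180000"; // illustrative timestamp of the incomplete commit
        String now = "20230202183000";          // illustrative "new instant time"

        // No pending rollback: the two variants pick different timestamps.
        System.out.println(beforeChange(Optional.empty(), now));
        System.out.println(afterChange(Optional.empty(), failedCommit));

        // A pending rollback exists: both variants resume it unchanged.
        System.out.println(afterChange(Optional.of("20230202181500"), failedCommit));
    }
}
```

In both variants an already pending rollback keeps its own timestamp; the only difference is the fallback used when no pending rollback exists.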
   
   ### Impact
   
   _Describe any public API or user-facing feature change or any performance impact._
   No breaking changes
   
   ### Risk level (write none, low medium or high below)
   
   _If medium or high, explain what verification was done to mitigate the risks._
   Low, verified data correctness against source database in production for 50+ HoodieDeltaStreamer jobs running in both batch and continuous modes. 
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
     ticket number here and follow the [instruction](https://hudi.apache.org/contribute/developer-setup#website) to make
     changes to the website._
   
   ### Contributor's checklist
   
   - [x] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [x] Change Logs and Impact were stated clearly
   - [x] Adequate tests were added if applicable
   - [x] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] danny0405 commented on a diff in pull request #7828: [HUDI-5686] Fixes data loss due to rollbacks

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #7828:
URL: https://github.com/apache/hudi/pull/7828#discussion_r1095419184


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieTableServiceClient.java:
##########
@@ -734,7 +734,7 @@ protected List<String> getInstantsToRollback(HoodieTableMetaClient metaClient, H
   @Deprecated
   public boolean rollback(final String commitInstantTime, Option<HoodiePendingRollbackInfo> pendingRollbackInfo, boolean skipLocking) throws HoodieRollbackException {
     LOG.info("Begin rollback of instant " + commitInstantTime);
-    final String rollbackInstantTime = pendingRollbackInfo.map(entry -> entry.getRollbackInstant().getTimestamp()).orElse(HoodieActiveTimeline.createNewInstantTime());
+    final String rollbackInstantTime = pendingRollbackInfo.map(entry -> entry.getRollbackInstant().getTimestamp()).orElse(commitInstantTime);
     final Timer.Context timerContext = this.metrics.getRollbackCtx();

Review Comment:
   I'm confused: why roll back the instant itself under the current transaction?





[GitHub] [hudi] hudi-bot commented on pull request #7828: [HUDI-5686] Fixes data loss due to rollbacks

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7828:
URL: https://github.com/apache/hudi/pull/7828#issuecomment-1416826551

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f4cebdfce11e3cecf213da41fc88f82e2565d0e1",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14880",
       "triggerID" : "f4cebdfce11e3cecf213da41fc88f82e2565d0e1",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f84a6df3dc222fc71a10b8b1771c67d73d727c83",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14922",
       "triggerID" : "f84a6df3dc222fc71a10b8b1771c67d73d727c83",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2ec476d8ac621dcb87b10cad701f6bb3febac8a4",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "2ec476d8ac621dcb87b10cad701f6bb3febac8a4",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f84a6df3dc222fc71a10b8b1771c67d73d727c83 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14922) 
   * 2ec476d8ac621dcb87b10cad701f6bb3febac8a4 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] pushpavanthar commented on a diff in pull request #7828: [HUDI-5686] Fixes data loss due to rollbacks

Posted by "pushpavanthar (via GitHub)" <gi...@apache.org>.
pushpavanthar commented on code in PR #7828:
URL: https://github.com/apache/hudi/pull/7828#discussion_r1095685165


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieTableServiceClient.java:
##########
@@ -734,7 +734,7 @@ protected List<String> getInstantsToRollback(HoodieTableMetaClient metaClient, H
   @Deprecated
   public boolean rollback(final String commitInstantTime, Option<HoodiePendingRollbackInfo> pendingRollbackInfo, boolean skipLocking) throws HoodieRollbackException {
     LOG.info("Begin rollback of instant " + commitInstantTime);
-    final String rollbackInstantTime = pendingRollbackInfo.map(entry -> entry.getRollbackInstant().getTimestamp()).orElse(HoodieActiveTimeline.createNewInstantTime());
+    final String rollbackInstantTime = pendingRollbackInfo.map(entry -> entry.getRollbackInstant().getTimestamp()).orElse(commitInstantTime);
     final Timer.Context timerContext = this.metrics.getRollbackCtx();

Review Comment:
   Hi @danny0405, I just realised that picking the timestamp from the instant itself will make features like `rollbackToInstant` fail. This approach does not place the rollback instant in chronological order in the timeline.
   I'll update the PR to use an approach similar to the method below, which takes care of rollbacks first, before creating the instant for the commit:
   `org.apache.hudi.client.BaseHoodieWriteClient#startCommit(java.lang.String, org.apache.hudi.common.table.HoodieTableMetaClient)`
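
The ordering problem described in this comment can be illustrated with a small sketch. It is a simplified model rather than Hudi code: the `Instant` class, the `scenario` helper, and the short timestamps are all hypothetical, and the sketch assumes a timeline that is simply sorted by timestamp string.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class TimelineOrderSketch {

    // Hypothetical, heavily simplified instant: a timestamp plus an action name.
    public static final class Instant {
        final String timestamp;
        final String action;

        public Instant(String timestamp, String action) {
            this.timestamp = timestamp;
            this.action = action;
        }
    }

    // Sorts a copy of the instants by timestamp (List.sort is stable, so equal
    // timestamps keep their insertion order) and returns "timestamp.action" labels.
    public static List<String> orderedLabels(List<Instant> instants) {
        List<Instant> sorted = new ArrayList<>(instants);
        sorted.sort(Comparator.comparing((Instant i) -> i.timestamp));
        List<String> labels = new ArrayList<>();
        for (Instant i : sorted) {
            labels.add(i.timestamp + "." + i.action);
        }
        return labels;
    }

    // Events in the order they really happened: commit 001 fails, commit 002
    // completes, and only afterwards is 001 rolled back with the given timestamp.
    public static List<String> scenario(String rollbackTimestamp) {
        List<Instant> events = new ArrayList<>();
        events.add(new Instant("001", "commit"));
        events.add(new Instant("002", "commit"));
        events.add(new Instant(rollbackTimestamp, "rollback"));
        return orderedLabels(events);
    }

    public static void main(String[] args) {
        // Fresh timestamp: the rollback appears last, matching the real order of events.
        System.out.println(scenario("003")); // [001.commit, 002.commit, 003.rollback]
        // Reused commit timestamp: the rollback sorts ahead of commit 002,
        // although it actually ran after it.
        System.out.println(scenario("001")); // [001.commit, 001.rollback, 002.commit]
    }
}
```

With the reused timestamp, the rollback is recorded as if it had happened before commit 002, which is the chronological mismatch the comment refers to.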





[GitHub] [hudi] pushpavanthar commented on pull request #7828: [HUDI-5686] Fixes data loss due to rollbacks

Posted by "pushpavanthar (via GitHub)" <gi...@apache.org>.
pushpavanthar commented on PR #7828:
URL: https://github.com/apache/hudi/pull/7828#issuecomment-1416831210

   @hudi-bot run azure




[GitHub] [hudi] hudi-bot commented on pull request #7828: [HUDI-5686] Fixes data loss due to rollbacks

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7828:
URL: https://github.com/apache/hudi/pull/7828#issuecomment-1414400144

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f4cebdfce11e3cecf213da41fc88f82e2565d0e1",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14880",
       "triggerID" : "f4cebdfce11e3cecf213da41fc88f82e2565d0e1",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f4cebdfce11e3cecf213da41fc88f82e2565d0e1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14880) 
   




[GitHub] [hudi] hudi-bot commented on pull request #7828: [HUDI-5686] Fixes data loss due to rollbacks

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7828:
URL: https://github.com/apache/hudi/pull/7828#issuecomment-1416724996

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f4cebdfce11e3cecf213da41fc88f82e2565d0e1",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14880",
       "triggerID" : "f4cebdfce11e3cecf213da41fc88f82e2565d0e1",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f84a6df3dc222fc71a10b8b1771c67d73d727c83",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14922",
       "triggerID" : "f84a6df3dc222fc71a10b8b1771c67d73d727c83",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f84a6df3dc222fc71a10b8b1771c67d73d727c83 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14922) 
   




[GitHub] [hudi] hudi-bot commented on pull request #7828: [HUDI-5686] Fixes data loss due to rollbacks

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7828:
URL: https://github.com/apache/hudi/pull/7828#issuecomment-1414204438

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f4cebdfce11e3cecf213da41fc88f82e2565d0e1",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14880",
       "triggerID" : "f4cebdfce11e3cecf213da41fc88f82e2565d0e1",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f4cebdfce11e3cecf213da41fc88f82e2565d0e1 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14880) 
   




[GitHub] [hudi] hudi-bot commented on pull request #7828: [HUDI-5686] Fixes data loss due to rollbacks

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7828:
URL: https://github.com/apache/hudi/pull/7828#issuecomment-1418836902

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f4cebdfce11e3cecf213da41fc88f82e2565d0e1",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14880",
       "triggerID" : "f4cebdfce11e3cecf213da41fc88f82e2565d0e1",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f84a6df3dc222fc71a10b8b1771c67d73d727c83",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14922",
       "triggerID" : "f84a6df3dc222fc71a10b8b1771c67d73d727c83",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2ec476d8ac621dcb87b10cad701f6bb3febac8a4",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14929",
       "triggerID" : "2ec476d8ac621dcb87b10cad701f6bb3febac8a4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2ec476d8ac621dcb87b10cad701f6bb3febac8a4",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14953",
       "triggerID" : "1418608661",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 2ec476d8ac621dcb87b10cad701f6bb3febac8a4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14929) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14953) 
   




[GitHub] [hudi] hudi-bot commented on pull request #7828: [HUDI-5686] Fixes data loss due to rollbacks

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7828:
URL: https://github.com/apache/hudi/pull/7828#issuecomment-1416694522

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f4cebdfce11e3cecf213da41fc88f82e2565d0e1",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14880",
       "triggerID" : "f4cebdfce11e3cecf213da41fc88f82e2565d0e1",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f84a6df3dc222fc71a10b8b1771c67d73d727c83",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "f84a6df3dc222fc71a10b8b1771c67d73d727c83",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f4cebdfce11e3cecf213da41fc88f82e2565d0e1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14880) 
   * f84a6df3dc222fc71a10b8b1771c67d73d727c83 UNKNOWN
   




[GitHub] [hudi] danny0405 commented on pull request #7828: [HUDI-5686] Fixes data loss due to rollbacks

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on PR #7828:
URL: https://github.com/apache/hudi/pull/7828#issuecomment-1421922700

   Closing this because it has already been fixed; feel free to re-open if you find any issues.




[GitHub] [hudi] hudi-bot commented on pull request #7828: [HUDI-5686] Fixes data loss due to rollbacks

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7828:
URL: https://github.com/apache/hudi/pull/7828#issuecomment-1418384289

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f4cebdfce11e3cecf213da41fc88f82e2565d0e1",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14880",
       "triggerID" : "f4cebdfce11e3cecf213da41fc88f82e2565d0e1",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f84a6df3dc222fc71a10b8b1771c67d73d727c83",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14922",
       "triggerID" : "f84a6df3dc222fc71a10b8b1771c67d73d727c83",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2ec476d8ac621dcb87b10cad701f6bb3febac8a4",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14929",
       "triggerID" : "2ec476d8ac621dcb87b10cad701f6bb3febac8a4",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2ec476d8ac621dcb87b10cad701f6bb3febac8a4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14929) 
   




[GitHub] [hudi] hudi-bot commented on pull request #7828: [HUDI-5686] Fixes data loss due to rollbacks

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7828:
URL: https://github.com/apache/hudi/pull/7828#issuecomment-1418613470

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f4cebdfce11e3cecf213da41fc88f82e2565d0e1",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14880",
       "triggerID" : "f4cebdfce11e3cecf213da41fc88f82e2565d0e1",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f84a6df3dc222fc71a10b8b1771c67d73d727c83",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14922",
       "triggerID" : "f84a6df3dc222fc71a10b8b1771c67d73d727c83",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2ec476d8ac621dcb87b10cad701f6bb3febac8a4",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14929",
       "triggerID" : "2ec476d8ac621dcb87b10cad701f6bb3febac8a4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2ec476d8ac621dcb87b10cad701f6bb3febac8a4",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14953",
       "triggerID" : "1418608661",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 2ec476d8ac621dcb87b10cad701f6bb3febac8a4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14929) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14953) 
   






[GitHub] [hudi] pushpavanthar commented on pull request #7828: [HUDI-5686] Fixes data loss due to rollbacks

Posted by "pushpavanthar (via GitHub)" <gi...@apache.org>.
pushpavanthar commented on PR #7828:
URL: https://github.com/apache/hudi/pull/7828#issuecomment-1414193238

   @codope kindly review




[GitHub] [hudi] hudi-bot commented on pull request #7828: [HUDI-5686] Fixes data loss due to rollbacks

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7828:
URL: https://github.com/apache/hudi/pull/7828#issuecomment-1414196399

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f4cebdfce11e3cecf213da41fc88f82e2565d0e1",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "f4cebdfce11e3cecf213da41fc88f82e2565d0e1",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f4cebdfce11e3cecf213da41fc88f82e2565d0e1 UNKNOWN
   




[GitHub] [hudi] hudi-bot commented on pull request #7828: [HUDI-5686] Fixes data loss due to rollbacks

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7828:
URL: https://github.com/apache/hudi/pull/7828#issuecomment-1416695478

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f4cebdfce11e3cecf213da41fc88f82e2565d0e1",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14880",
       "triggerID" : "f4cebdfce11e3cecf213da41fc88f82e2565d0e1",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f84a6df3dc222fc71a10b8b1771c67d73d727c83",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14922",
       "triggerID" : "f84a6df3dc222fc71a10b8b1771c67d73d727c83",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f4cebdfce11e3cecf213da41fc88f82e2565d0e1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14880) 
   * f84a6df3dc222fc71a10b8b1771c67d73d727c83 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14922) 
   




[GitHub] [hudi] pratyakshsharma commented on pull request #7828: [HUDI-5686] Fixes data loss due to rollbacks

Posted by "pratyakshsharma (via GitHub)" <gi...@apache.org>.
pratyakshsharma commented on PR #7828:
URL: https://github.com/apache/hudi/pull/7828#issuecomment-1418608661

   @hudi-bot run azure




[GitHub] [hudi] nsivabalan commented on pull request #7828: [HUDI-5686] Fixes data loss due to rollbacks

Posted by "nsivabalan (via GitHub)" <gi...@apache.org>.
nsivabalan commented on PR #7828:
URL: https://github.com/apache/hudi/pull/7828#issuecomment-1421714415

   Can you close out the patch if it's not valid? I assume you are testing it with Hudi 0.12.0 or a higher version.
   




[GitHub] [hudi] danny0405 commented on pull request #7828: [HUDI-5686] Fixes data loss due to rollbacks

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on PR #7828:
URL: https://github.com/apache/hudi/pull/7828#issuecomment-1418780986

   The data loss is not caused by the rollback; the timeline server refresh is problematic in release 0.11.x. I have put a fix in release 0.12.0: https://github.com/apache/hudi/pull/6179






[GitHub] [hudi] pushpavanthar commented on pull request #7828: [HUDI-5686] Fixes data loss due to rollbacks

Posted by "pushpavanthar (via GitHub)" <gi...@apache.org>.
pushpavanthar commented on PR #7828:
URL: https://github.com/apache/hudi/pull/7828#issuecomment-1418585790

   @danny0405 @nsivabalan can you please take a look at the revised changes?
   




[GitHub] [hudi] danny0405 closed pull request #7828: [HUDI-5686] Fixes data loss due to rollbacks

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 closed pull request #7828: [HUDI-5686] Fixes data loss due to rollbacks
URL: https://github.com/apache/hudi/pull/7828

