Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/02/25 17:55:46 UTC

[GitHub] [hudi] guanziyue opened a new pull request #4913: [WIP][HUDI-1517] create marker file for every log file

guanziyue opened a new pull request #4913:
URL: https://github.com/apache/hudi/pull/4913


   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.*
   
   ## What is the purpose of the pull request
   Create a marker file for every log file. This improves the performance of log file rollback when marker-file-based rollback is enabled. It also helps clean up partially written log files that can safely be deleted. HUDI-3026 depends on this change.
   
   https://issues.apache.org/jira/browse/HUDI-1517
   
   ## Brief change log
   
   Add a callback to HoodieLogFormatWriter so that marker file creation can be injected whenever a log file is created.
   
   Change the reconcile logic in HoodieTable so that partially created log files can be deleted.
   
   Add rollback logic in markerFileRollbackStrategy so that marker files can be used to roll back log blocks.
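
The callback idea in the change log can be illustrated with a small, language-neutral sketch. This is not Hudi code: `LogWriter`, `on_create`, the `rollover` flag and the `.marker` suffix are hypothetical names chosen for illustration, and the file naming only imitates the pattern used later in this thread. The sketch assumes the callback runs before the new file is considered open, so that a crash leaves a marker without a file rather than a file without a marker.

```python
from typing import Callable, List, Optional


class LogWriter:
    """Toy stand-in for a log writer that can roll over to a new file."""

    def __init__(self, file_id: str, instant: str, write_token: str,
                 on_create: Optional[Callable[[str], None]] = None):
        self.file_id = file_id
        self.instant = instant
        self.write_token = write_token
        self.on_create = on_create      # hook injected by the caller
        self.version = 0
        self.current: Optional[str] = None

    def _open_new_file(self) -> str:
        # Build the next log file name and notify the hook before using it.
        self.version += 1
        name = f".{self.file_id}_{self.instant}.log.{self.version}_{self.write_token}"
        if self.on_create is not None:
            self.on_create(name)
        self.current = name
        return name

    def append(self, block: bytes, rollover: bool = False) -> str:
        # Open a file on first use or when the caller asks for a rollover.
        if self.current is None or rollover:
            self._open_new_file()
        return self.current


# The caller injects marker creation; here a marker is just a recorded name.
markers: List[str] = []
writer = LogWriter("fileId", "instantTime", "0-0-1",
                   on_create=lambda name: markers.append(name + ".marker"))
writer.append(b"block-1")                 # creates log.1 and its marker
writer.append(b"block-2")                 # same file, no new marker
writer.append(b"block-3", rollover=True)  # creates log.2 and its marker
```

With the hook in place, every log file that was ever opened has a matching marker, which is what the reconcile and rollback steps rely on.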
   
   ## Verify this pull request
   
   TBD: tests are needed to cover this change.
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1053703958


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326",
       "triggerID" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328",
       "triggerID" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6341",
       "triggerID" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6816a4b47b88108172b46fece160e4e078345687",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6345",
       "triggerID" : "6816a4b47b88108172b46fece160e4e078345687",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4306f2a75ddbb6c0eca86d8fcf35c889d923b557",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6372",
       "triggerID" : "4306f2a75ddbb6c0eca86d8fcf35c889d923b557",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 4306f2a75ddbb6c0eca86d8fcf35c889d923b557 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6372) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1066153819


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326",
       "triggerID" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328",
       "triggerID" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6341",
       "triggerID" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6816a4b47b88108172b46fece160e4e078345687",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6345",
       "triggerID" : "6816a4b47b88108172b46fece160e4e078345687",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4306f2a75ddbb6c0eca86d8fcf35c889d923b557",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6372",
       "triggerID" : "4306f2a75ddbb6c0eca86d8fcf35c889d923b557",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0c81e53f83757b49d9e5f5df3bbeebdd076b99b5",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6373",
       "triggerID" : "0c81e53f83757b49d9e5f5df3bbeebdd076b99b5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f0c65b8a748be13de662c8d438da8d30c04f9055",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6895",
       "triggerID" : "f0c65b8a748be13de662c8d438da8d30c04f9055",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6560aaa7f15eed0a585b03d92d37127a34791b74",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "6560aaa7f15eed0a585b03d92d37127a34791b74",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f0c65b8a748be13de662c8d438da8d30c04f9055 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6895) 
   * 6560aaa7f15eed0a585b03d92d37127a34791b74 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1066163017


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326",
       "triggerID" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328",
       "triggerID" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6341",
       "triggerID" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6816a4b47b88108172b46fece160e4e078345687",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6345",
       "triggerID" : "6816a4b47b88108172b46fece160e4e078345687",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4306f2a75ddbb6c0eca86d8fcf35c889d923b557",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6372",
       "triggerID" : "4306f2a75ddbb6c0eca86d8fcf35c889d923b557",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0c81e53f83757b49d9e5f5df3bbeebdd076b99b5",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6373",
       "triggerID" : "0c81e53f83757b49d9e5f5df3bbeebdd076b99b5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f0c65b8a748be13de662c8d438da8d30c04f9055",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6895",
       "triggerID" : "f0c65b8a748be13de662c8d438da8d30c04f9055",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6560aaa7f15eed0a585b03d92d37127a34791b74",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6896",
       "triggerID" : "6560aaa7f15eed0a585b03d92d37127a34791b74",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f0c65b8a748be13de662c8d438da8d30c04f9055 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6895) 
   * 6560aaa7f15eed0a585b03d92d37127a34791b74 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6896) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] nsivabalan commented on pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1065207954


   sorry. will review in a day or two.





[GitHub] [hudi] hudi-bot commented on pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1066152922


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326",
       "triggerID" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328",
       "triggerID" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6341",
       "triggerID" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6816a4b47b88108172b46fece160e4e078345687",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6345",
       "triggerID" : "6816a4b47b88108172b46fece160e4e078345687",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4306f2a75ddbb6c0eca86d8fcf35c889d923b557",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6372",
       "triggerID" : "4306f2a75ddbb6c0eca86d8fcf35c889d923b557",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0c81e53f83757b49d9e5f5df3bbeebdd076b99b5",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6373",
       "triggerID" : "0c81e53f83757b49d9e5f5df3bbeebdd076b99b5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f0c65b8a748be13de662c8d438da8d30c04f9055",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6895",
       "triggerID" : "f0c65b8a748be13de662c8d438da8d30c04f9055",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0c81e53f83757b49d9e5f5df3bbeebdd076b99b5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6373) 
   * f0c65b8a748be13de662c8d438da8d30c04f9055 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6895) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051071968


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 5add7ebb8081671f5816d82d4472d5ea1f7d8338 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051565083


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326",
       "triggerID" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328",
       "triggerID" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6341",
       "triggerID" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328) 
   * 875ec8b00cd379e669498fe7575503b192f0de5e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6341) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051564339


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326",
       "triggerID" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328",
       "triggerID" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328) 
   * 875ec8b00cd379e669498fe7575503b192f0de5e UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051564339


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326",
       "triggerID" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328",
       "triggerID" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328) 
   * 875ec8b00cd379e669498fe7575503b192f0de5e UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051223344


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326",
       "triggerID" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328",
       "triggerID" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051074112


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326",
       "triggerID" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 5add7ebb8081671f5816d82d4472d5ea1f7d8338 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1066153385


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326",
       "triggerID" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328",
       "triggerID" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6341",
       "triggerID" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6816a4b47b88108172b46fece160e4e078345687",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6345",
       "triggerID" : "6816a4b47b88108172b46fece160e4e078345687",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4306f2a75ddbb6c0eca86d8fcf35c889d923b557",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6372",
       "triggerID" : "4306f2a75ddbb6c0eca86d8fcf35c889d923b557",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0c81e53f83757b49d9e5f5df3bbeebdd076b99b5",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6373",
       "triggerID" : "0c81e53f83757b49d9e5f5df3bbeebdd076b99b5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f0c65b8a748be13de662c8d438da8d30c04f9055",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6895",
       "triggerID" : "f0c65b8a748be13de662c8d438da8d30c04f9055",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6560aaa7f15eed0a585b03d92d37127a34791b74",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "6560aaa7f15eed0a585b03d92d37127a34791b74",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0c81e53f83757b49d9e5f5df3bbeebdd076b99b5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6373) 
   * f0c65b8a748be13de662c8d438da8d30c04f9055 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6895) 
   * 6560aaa7f15eed0a585b03d92d37127a34791b74 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1066152375


   ## CI report:
   
   * 0c81e53f83757b49d9e5f5df3bbeebdd076b99b5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6373) 
   * f0c65b8a748be13de662c8d438da8d30c04f9055 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] guanziyue commented on pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
guanziyue commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1065343231


   @nsivabalan By the way, the following case could be handled more efficiently if we had a marker for each log file. Looking forward to your opinion.
   Within one commit:
   Task attempt 1 creates a log file .fileId_instantTime.log.2_0-0-1, but this task attempt fails.
   Task attempt 2 creates a new log file .fileId_instantTime.log.2_0-0-2. This task attempt succeeds, and .fileId_instantTime.log.2_0-0-2 is returned in the writeStat.
   
   Before: log files are fail-safe, so we preserve both of them, and handling the content twice should be idempotent.
   Now: we can delete .fileId_instantTime.log.2_0-0-1 in the 'reconcileAgainstMarkers' phase because it is an invalid file, which improves performance slightly. However, with this approach log files are no longer treated as fully fail-safe.
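
As a rough illustration of the reconcile step described above, here is a minimal sketch. It assumes that every log file gets a marker and that only files from successful task attempts appear in the write stats. The class `LogFileReconcileSketch` and the method `invalidFiles` are hypothetical names for illustration, not part of Hudi.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper (not Hudi code): finds files that have a marker but were never committed. */
final class LogFileReconcileSketch {

  /**
   * Returns the files left behind by failed task attempts: every file that has
   * a marker but is absent from the write stats of the successful attempts.
   */
  static Set<String> invalidFiles(List<String> filesWithMarkers, List<String> filesInWriteStats) {
    Set<String> invalid = new TreeSet<>(filesWithMarkers);
    invalid.removeAll(new HashSet<>(filesInWriteStats));
    return invalid;
  }

  public static void main(String[] args) {
    List<String> filesWithMarkers = Arrays.asList(
        ".fileId_instantTime.log.2_0-0-1",  // written by the failed attempt 1
        ".fileId_instantTime.log.2_0-0-2"); // written by the successful attempt 2
    List<String> filesInWriteStats = Arrays.asList(".fileId_instantTime.log.2_0-0-2");

    // Only the file from the failed attempt is a candidate for deletion.
    System.out.println(invalidFiles(filesWithMarkers, filesInWriteStats));
  }
}
```

In this sketch only the file from attempt 1 is reported, which matches the "Now" case: it could be deleted during reconciliation instead of being kept and read twice.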





[GitHub] [hudi] nsivabalan commented on a change in pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#discussion_r824879591



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/MarkerBasedRollbackStrategy.java
##########
@@ -73,54 +80,110 @@ public MarkerBasedRollbackStrategy(HoodieTable<?, ?, ?, ?> table, HoodieEngineCo
       List<String> markerPaths = MarkerBasedRollbackUtils.getAllMarkerPaths(
           table, context, instantToRollback.getTimestamp(), config.getRollbackParallelism());
       int parallelism = Math.max(Math.min(markerPaths.size(), config.getRollbackParallelism()), 1);
-      return context.map(markerPaths, markerFilePath -> {
-        String typeStr = markerFilePath.substring(markerFilePath.lastIndexOf(".") + 1);
-        IOType type = IOType.valueOf(typeStr);
-        switch (type) {
-          case MERGE:
-          case CREATE:
-            String fileToDelete = WriteMarkers.stripMarkerSuffix(markerFilePath);
-            Path fullDeletePath = new Path(basePath, fileToDelete);
-            String partitionPath = FSUtils.getRelativePartitionPath(new Path(basePath), fullDeletePath.getParent());
-            return new HoodieRollbackRequest(partitionPath, EMPTY_STRING, EMPTY_STRING,
-                Collections.singletonList(fullDeletePath.toString()),
-                Collections.emptyMap());
-          case APPEND:
-            // NOTE: This marker file-path does NOT correspond to a log-file, but rather is a phony
-            //       path serving as a "container" for the following components:
-            //          - Base file's file-id
-            //          - Base file's commit instant
-            //          - Partition path
-            return getRollbackRequestForAppend(WriteMarkers.stripMarkerSuffix(markerFilePath));
-          default:
-            throw new HoodieRollbackException("Unknown marker type, during rollback of " + instantToRollback);
-        }
-      }, parallelism);
+      return context.mapToPairAndReduceByKey(markerPaths,
+          // generate rollback request per marker file
+          getRollbackReqGenerateFunction(instantToRollback),
+          // NOTE: Since we're rolling back incomplete Delta Commit, it only could have appended its
+          //       block to the latest log-file. But we cannot simply get the latest log-file by one marker file.
+          //       So compare log-files in the same fileGroup and get the latest one.
+          getRollbackReqCombineFunction(), parallelism);
     } catch (Exception e) {
       throw new HoodieRollbackException("Error rolling back using marker files written for " + instantToRollback, e);
     }
   }
 
-  protected HoodieRollbackRequest getRollbackRequestForAppend(String markerFilePath) throws IOException {
-    Path baseFilePathForAppend = new Path(basePath, markerFilePath);
-    String fileId = FSUtils.getFileIdFromFilePath(baseFilePathForAppend);
-    String baseCommitTime = FSUtils.getCommitTime(baseFilePathForAppend.getName());
-    String relativePartitionPath = FSUtils.getRelativePartitionPath(new Path(basePath), baseFilePathForAppend.getParent());
-    Path partitionPath = FSUtils.getPartitionPath(config.getBasePath(), relativePartitionPath);
-
-    // NOTE: Since we're rolling back incomplete Delta Commit, it only could have appended its
-    //       block to the latest log-file
-    // TODO(HUDI-1517) use provided marker-file's path instead
-    HoodieLogFile latestLogFile = FSUtils.getLatestLogFile(table.getMetaClient().getFs(), partitionPath, fileId,
-        HoodieFileFormat.HOODIE_LOG.getFileExtension(), baseCommitTime).get();
-
-    // NOTE: Marker's don't carry information about the cumulative size of the blocks that have been appended,
-    //       therefore we simply stub this value.
-    Map<String, Long> logFilesWithBlocsToRollback =
-        Collections.singletonMap(latestLogFile.getFileStatus().getPath().toString(), -1L);
+  private SerializablePairFunction<String, Pair<String, String>, HoodieRollbackRequest> getRollbackReqGenerateFunction(
+      HoodieInstant instantToRollback) {
+    return markerFilePath -> {
+      String typeStr = markerFilePath.substring(markerFilePath.lastIndexOf(".") + 1);
+      IOType type = IOType.valueOf(typeStr);
+      String partitionFilePath = WriteMarkers.stripMarkerSuffix(markerFilePath);
+      Path fullFilePath = new Path(basePath, partitionFilePath);
+      String partitionPath = FSUtils.getRelativePartitionPath(new Path(basePath), fullFilePath.getParent());
+      switch (type) {
+        case MERGE:
+        case CREATE:
+          HoodieBaseFile baseFileToDelete = new HoodieBaseFile(fullFilePath.toString());
+          String fileId = baseFileToDelete.getFileId();
+          String baseInstantTime = baseFileToDelete.getCommitTime();
+          return Pair.of(Pair.of(partitionPath, fileId),
+              new HoodieRollbackRequest(partitionPath, fileId, baseInstantTime,
+                  Collections.singletonList(fullFilePath.toString()),
+                  Collections.emptyMap()));
+        case APPEND:
+          HoodieRollbackRequest rollbackRequestForAppend = getRollbackRequestForAppend(partitionFilePath);
+          return Pair.of(Pair.of(partitionPath, rollbackRequestForAppend.getFileId()),
+              rollbackRequestForAppend);
+        default:
+          throw new HoodieRollbackException("Unknown marker type, during rollback of " + instantToRollback);
+      }
+    };
+  }
+
+  private SerializableBiFunction<HoodieRollbackRequest, HoodieRollbackRequest, HoodieRollbackRequest> getRollbackReqCombineFunction() {
+    return (rollbackReq1, rollbackReq2) -> {
+      List<String> filesToBeDeleted = new LinkedList<>();
+      filesToBeDeleted.addAll(rollbackReq1.getFilesToBeDeleted());
+      filesToBeDeleted.addAll(rollbackReq2.getFilesToBeDeleted());
+      final Comparator<HoodieLogFile> logFileComparator = HoodieLogFile.getLogFileComparator();
+      HoodieLogFile latestLogFile = null;
+      long latestLogFileLen = -1;
+
+      for (Map.Entry<String, Long> pathLengthPair : rollbackReq1.getLogBlocksToBeDeleted().entrySet()) {
+        HoodieLogFile candidateLogFile = new HoodieLogFile(pathLengthPair.getKey());
+        if (latestLogFile == null || logFileComparator.compare(latestLogFile, candidateLogFile) >= 0) {
+          latestLogFile = candidateLogFile;
+          latestLogFileLen = pathLengthPair.getValue();
+        }
+      }
 
+      for (Map.Entry<String, Long> pathLengthPair : rollbackReq2.getLogBlocksToBeDeleted().entrySet()) {
+        HoodieLogFile candidateLogFile = new HoodieLogFile(pathLengthPair.getKey());
+        if (latestLogFile == null || logFileComparator.compare(latestLogFile, candidateLogFile) >= 0) {
+          latestLogFile = candidateLogFile;
+          latestLogFileLen = pathLengthPair.getValue();
+        }
+      }
+      return new HoodieRollbackRequest(rollbackReq1.getPartitionPath(), rollbackReq1.getFileId(),
+          rollbackReq1.getLatestBaseInstant(), filesToBeDeleted,
+          latestLogFile == null ? Collections.emptyMap() :
+              Collections.singletonMap(latestLogFile.getPath().toString(), latestLogFileLen));
+    };
+  }
+
+  protected HoodieRollbackRequest getRollbackRequestForAppend(String markerFilePath) {
+    Path filePath = new Path(basePath, markerFilePath);

Review comment:
       Correct me if my understanding is wrong, but I thought this would get much simpler once we start generating a marker file per log file.
   Just from the marker file path name, can't we deduce the log file that needs to be rolled back?
   Prior to this patch, we didn't create marker files with log file info, and hence we had to deduce the latest log file. But now, with this patch, wouldn't that be simplified?
   Can you help me understand?
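
To make the question concrete, here is a minimal sketch of recovering the log file path from a per-log-file marker path. It assumes markers are named `<relative file path>.marker.<IOType>`; the class `MarkerPathSketch` and its methods are hypothetical stand-ins for illustration, and only the IO type parsing mirrors the diff above.

```java
/** Hypothetical helper (not Hudi's API): recovers the file path encoded in a marker path. */
final class MarkerPathSketch {

  // Assumed marker naming convention: "<relative file path>.marker.<IOType>".
  private static final String MARKER_EXTN = ".marker";

  /** Returns the IO type suffix of the marker, e.g. "APPEND". */
  static String ioType(String markerPath) {
    return markerPath.substring(markerPath.lastIndexOf('.') + 1);
  }

  /** Strips ".marker.<IOType>" to get the relative path of the file the marker stands for. */
  static String stripMarkerSuffix(String markerPath) {
    int idx = markerPath.lastIndexOf(MARKER_EXTN);
    if (idx < 0) {
      throw new IllegalArgumentException("Not a marker path: " + markerPath);
    }
    return markerPath.substring(0, idx);
  }

  public static void main(String[] args) {
    String marker = "2022/02/25/.fileId_instantTime.log.2_0-0-1.marker.APPEND";
    // With one marker per log file, the log file to roll back is named by the marker itself.
    System.out.println(ioType(marker) + " -> " + stripMarkerSuffix(marker));
  }
}
```

Under that assumption, an APPEND marker identifies its log file directly, so no listing of the file group would be needed to find it.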
   


Review comment:
       Or am I missing something?







[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1066152922


   ## CI report:
   
   * 0c81e53f83757b49d9e5f5df3bbeebdd076b99b5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6373) 
   * f0c65b8a748be13de662c8d438da8d30c04f9055 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6895) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051783953


   ## CI report:
   
   * 875ec8b00cd379e669498fe7575503b192f0de5e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6341) 
   * 6816a4b47b88108172b46fece160e4e078345687 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051595547


   ## CI report:
   
   * 875ec8b00cd379e669498fe7575503b192f0de5e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6341) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1053747305


   ## CI report:
   
   * 4306f2a75ddbb6c0eca86d8fcf35c889d923b557 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6372) 
   * 0c81e53f83757b49d9e5f5df3bbeebdd076b99b5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6373) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1053681646


   ## CI report:
   
   * 6816a4b47b88108172b46fece160e4e078345687 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6345) 
   * 4306f2a75ddbb6c0eca86d8fcf35c889d923b557 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1053682593


   ## CI report:
   
   * 6816a4b47b88108172b46fece160e4e078345687 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6345) 
   * 4306f2a75ddbb6c0eca86d8fcf35c889d923b557 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6372) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051082324


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326",
       "triggerID" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 5add7ebb8081671f5816d82d4472d5ea1f7d8338 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326) 
   * ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051595547


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326",
       "triggerID" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328",
       "triggerID" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6341",
       "triggerID" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 875ec8b00cd379e669498fe7575503b192f0de5e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6341) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051116297


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326",
       "triggerID" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328",
       "triggerID" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 5add7ebb8081671f5816d82d4472d5ea1f7d8338 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326) 
   * ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051116297


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326",
       "triggerID" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328",
       "triggerID" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 5add7ebb8081671f5816d82d4472d5ea1f7d8338 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326) 
   * ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1053784297


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326",
       "triggerID" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328",
       "triggerID" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6341",
       "triggerID" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6816a4b47b88108172b46fece160e4e078345687",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6345",
       "triggerID" : "6816a4b47b88108172b46fece160e4e078345687",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4306f2a75ddbb6c0eca86d8fcf35c889d923b557",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6372",
       "triggerID" : "4306f2a75ddbb6c0eca86d8fcf35c889d923b557",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0c81e53f83757b49d9e5f5df3bbeebdd076b99b5",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6373",
       "triggerID" : "0c81e53f83757b49d9e5f5df3bbeebdd076b99b5",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0c81e53f83757b49d9e5f5df3bbeebdd076b99b5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6373) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1053747305


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326",
       "triggerID" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328",
       "triggerID" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6341",
       "triggerID" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6816a4b47b88108172b46fece160e4e078345687",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6345",
       "triggerID" : "6816a4b47b88108172b46fece160e4e078345687",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4306f2a75ddbb6c0eca86d8fcf35c889d923b557",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6372",
       "triggerID" : "4306f2a75ddbb6c0eca86d8fcf35c889d923b557",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0c81e53f83757b49d9e5f5df3bbeebdd076b99b5",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6373",
       "triggerID" : "0c81e53f83757b49d9e5f5df3bbeebdd076b99b5",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 4306f2a75ddbb6c0eca86d8fcf35c889d923b557 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6372) 
   * 0c81e53f83757b49d9e5f5df3bbeebdd076b99b5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6373) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051074112


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326",
       "triggerID" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 5add7ebb8081671f5816d82d4472d5ea1f7d8338 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051071968


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 5add7ebb8081671f5816d82d4472d5ea1f7d8338 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051223344


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326",
       "triggerID" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328",
       "triggerID" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] guanziyue removed a comment on pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
guanziyue removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1065343231


   @nsivabalan By the way, the following case could be handled more efficiently with this PR. Looking forward to your opinion.
   In one commit:
   Task attempt 1 creates a log file .fileId_instantTime.log.2_0-0-1, but this task attempt fails.
   Task attempt 2 creates a new log file .fileId_instantTime.log.2_0-0-2. This task attempt succeeds, and .fileId_instantTime.log.2_0-0-2 is returned in writeStat.
   
   Before: log files are treated as fail-safe, so we preserve both of them, and handling the content twice should be idempotent.
   Now: we can delete .fileId_instantTime.log.2_0-0-1 in the 'reconcileAgainstMarkers' phase because it is an invalid file, which improves performance a little. However, with this approach log files are no longer treated as fully fail-safe.





[GitHub] [hudi] guanziyue edited a comment on pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
guanziyue edited a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1065343231


   @nsivabalan By the way, the following case could be handled more efficiently with this PR. Looking forward to your opinion.
   In one commit:
   Task attempt 1 creates a log file .fileId_instantTime.log.2_0-0-1, but this task attempt fails.
   Task attempt 2 creates a new log file .fileId_instantTime.log.2_0-0-2. This task attempt succeeds, and .fileId_instantTime.log.2_0-0-2 is returned in writeStat.
   
   Before: log files are treated as fail-safe, so we preserve both of them, and handling the content twice should be idempotent.
   Now: we can delete .fileId_instantTime.log.2_0-0-1 in the 'reconcileAgainstMarkers' phase because it is an invalid file, which improves performance a little. However, with this approach log files are no longer treated as fully fail-safe.
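
To make the scenario above concrete, here is a minimal sketch of the reconcile idea. It is not the code in this PR: the class `ReconcileSketch` and the method `invalidLogFiles` are hypothetical names, and file names stand in for real paths. It assumes we already know which log files have a marker for this instant and which log files were reported in writeStat by successful task attempts.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ReconcileSketch {

  // Hypothetical helper: a log file that has a marker but was not reported
  // in any writeStat was left behind by a failed task attempt.
  static Set<String> invalidLogFiles(List<String> logFilesWithMarkers,
                                     List<String> logFilesInWriteStats) {
    Set<String> invalid = new LinkedHashSet<>(logFilesWithMarkers);
    invalid.removeAll(logFilesInWriteStats);
    return invalid;
  }

  public static void main(String[] args) {
    // Attempt 1 failed, attempt 2 succeeded; both created a marker.
    List<String> withMarkers = Arrays.asList(
        ".fileId_instantTime.log.2_0-0-1",
        ".fileId_instantTime.log.2_0-0-2");
    List<String> inWriteStats = Arrays.asList(
        ".fileId_instantTime.log.2_0-0-2");

    // Candidates that a reconcile step could delete.
    System.out.println(invalidLogFiles(withMarkers, inWriteStats));
  }
}
```

The sketch only computes the candidates; whether deleting them is acceptable is exactly the fail-safe trade-off raised in the comment.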





[GitHub] [hudi] hudi-bot commented on pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1066173400


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326",
       "triggerID" : "5add7ebb8081671f5816d82d4472d5ea1f7d8338",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328",
       "triggerID" : "ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6341",
       "triggerID" : "875ec8b00cd379e669498fe7575503b192f0de5e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6816a4b47b88108172b46fece160e4e078345687",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6345",
       "triggerID" : "6816a4b47b88108172b46fece160e4e078345687",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4306f2a75ddbb6c0eca86d8fcf35c889d923b557",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6372",
       "triggerID" : "4306f2a75ddbb6c0eca86d8fcf35c889d923b557",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0c81e53f83757b49d9e5f5df3bbeebdd076b99b5",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6373",
       "triggerID" : "0c81e53f83757b49d9e5f5df3bbeebdd076b99b5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f0c65b8a748be13de662c8d438da8d30c04f9055",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6895",
       "triggerID" : "f0c65b8a748be13de662c8d438da8d30c04f9055",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6560aaa7f15eed0a585b03d92d37127a34791b74",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6896",
       "triggerID" : "6560aaa7f15eed0a585b03d92d37127a34791b74",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 6560aaa7f15eed0a585b03d92d37127a34791b74 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6896) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] guanziyue commented on a change in pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
guanziyue commented on a change in pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#discussion_r824933880



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/MarkerBasedRollbackStrategy.java
##########
@@ -73,54 +80,110 @@ public MarkerBasedRollbackStrategy(HoodieTable<?, ?, ?, ?> table, HoodieEngineCo
       List<String> markerPaths = MarkerBasedRollbackUtils.getAllMarkerPaths(
           table, context, instantToRollback.getTimestamp(), config.getRollbackParallelism());
       int parallelism = Math.max(Math.min(markerPaths.size(), config.getRollbackParallelism()), 1);
-      return context.map(markerPaths, markerFilePath -> {
-        String typeStr = markerFilePath.substring(markerFilePath.lastIndexOf(".") + 1);
-        IOType type = IOType.valueOf(typeStr);
-        switch (type) {
-          case MERGE:
-          case CREATE:
-            String fileToDelete = WriteMarkers.stripMarkerSuffix(markerFilePath);
-            Path fullDeletePath = new Path(basePath, fileToDelete);
-            String partitionPath = FSUtils.getRelativePartitionPath(new Path(basePath), fullDeletePath.getParent());
-            return new HoodieRollbackRequest(partitionPath, EMPTY_STRING, EMPTY_STRING,
-                Collections.singletonList(fullDeletePath.toString()),
-                Collections.emptyMap());
-          case APPEND:
-            // NOTE: This marker file-path does NOT correspond to a log-file, but rather is a phony
-            //       path serving as a "container" for the following components:
-            //          - Base file's file-id
-            //          - Base file's commit instant
-            //          - Partition path
-            return getRollbackRequestForAppend(WriteMarkers.stripMarkerSuffix(markerFilePath));
-          default:
-            throw new HoodieRollbackException("Unknown marker type, during rollback of " + instantToRollback);
-        }
-      }, parallelism);
+      return context.mapToPairAndReduceByKey(markerPaths,
+          // generate rollback request per marker file
+          getRollbackReqGenerateFunction(instantToRollback),
+          // NOTE: Since we're rolling back incomplete Delta Commit, it only could have appended its
+          //       block to the latest log-file. But we cannot simply get the latest log-file by one marker file.
+          //       So compare log-files in the same fileGroup and get the latest one.
+          getRollbackReqCombineFunction(), parallelism);
     } catch (Exception e) {
       throw new HoodieRollbackException("Error rolling back using marker files written for " + instantToRollback, e);
     }
   }
 
-  protected HoodieRollbackRequest getRollbackRequestForAppend(String markerFilePath) throws IOException {
-    Path baseFilePathForAppend = new Path(basePath, markerFilePath);
-    String fileId = FSUtils.getFileIdFromFilePath(baseFilePathForAppend);
-    String baseCommitTime = FSUtils.getCommitTime(baseFilePathForAppend.getName());
-    String relativePartitionPath = FSUtils.getRelativePartitionPath(new Path(basePath), baseFilePathForAppend.getParent());
-    Path partitionPath = FSUtils.getPartitionPath(config.getBasePath(), relativePartitionPath);
-
-    // NOTE: Since we're rolling back incomplete Delta Commit, it only could have appended its
-    //       block to the latest log-file
-    // TODO(HUDI-1517) use provided marker-file's path instead
-    HoodieLogFile latestLogFile = FSUtils.getLatestLogFile(table.getMetaClient().getFs(), partitionPath, fileId,
-        HoodieFileFormat.HOODIE_LOG.getFileExtension(), baseCommitTime).get();
-
-    // NOTE: Marker's don't carry information about the cumulative size of the blocks that have been appended,
-    //       therefore we simply stub this value.
-    Map<String, Long> logFilesWithBlocsToRollback =
-        Collections.singletonMap(latestLogFile.getFileStatus().getPath().toString(), -1L);
+  private SerializablePairFunction<String, Pair<String, String>, HoodieRollbackRequest> getRollbackReqGenerateFunction(
+      HoodieInstant instantToRollback) {
+    return markerFilePath -> {
+      String typeStr = markerFilePath.substring(markerFilePath.lastIndexOf(".") + 1);
+      IOType type = IOType.valueOf(typeStr);
+      String partitionFilePath = WriteMarkers.stripMarkerSuffix(markerFilePath);
+      Path fullFilePath = new Path(basePath, partitionFilePath);
+      String partitionPath = FSUtils.getRelativePartitionPath(new Path(basePath), fullFilePath.getParent());
+      switch (type) {
+        case MERGE:
+        case CREATE:
+          HoodieBaseFile baseFileToDelete = new HoodieBaseFile(fullFilePath.toString());
+          String fileId = baseFileToDelete.getFileId();
+          String baseInstantTime = baseFileToDelete.getCommitTime();
+          return Pair.of(Pair.of(partitionPath, fileId),
+              new HoodieRollbackRequest(partitionPath, fileId, baseInstantTime,
+                  Collections.singletonList(fullFilePath.toString()),
+                  Collections.emptyMap()));
+        case APPEND:
+          HoodieRollbackRequest rollbackRequestForAppend = getRollbackRequestForAppend(partitionFilePath);
+          return Pair.of(Pair.of(partitionPath, rollbackRequestForAppend.getFileId()),
+              rollbackRequestForAppend);
+        default:
+          throw new HoodieRollbackException("Unknown marker type, during rollback of " + instantToRollback);
+      }
+    };
+  }
+
+  private SerializableBiFunction<HoodieRollbackRequest, HoodieRollbackRequest, HoodieRollbackRequest> getRollbackReqCombineFunction() {
+    return (rollbackReq1, rollbackReq2) -> {
+      List<String> filesToBeDeleted = new LinkedList<>();
+      filesToBeDeleted.addAll(rollbackReq1.getFilesToBeDeleted());
+      filesToBeDeleted.addAll(rollbackReq2.getFilesToBeDeleted());
+      final Comparator<HoodieLogFile> logFileComparator = HoodieLogFile.getLogFileComparator();
+      HoodieLogFile latestLogFile = null;
+      long latestLogFileLen = -1;
+
+      for (Map.Entry<String, Long> pathLengthPair : rollbackReq1.getLogBlocksToBeDeleted().entrySet()) {
+        HoodieLogFile candidateLogFile = new HoodieLogFile(pathLengthPair.getKey());
+        if (latestLogFile == null || logFileComparator.compare(latestLogFile, candidateLogFile) >= 0) {
+          latestLogFile = candidateLogFile;
+          latestLogFileLen = pathLengthPair.getValue();
+        }
+      }
 
+      for (Map.Entry<String, Long> pathLengthPair : rollbackReq2.getLogBlocksToBeDeleted().entrySet()) {
+        HoodieLogFile candidateLogFile = new HoodieLogFile(pathLengthPair.getKey());
+        if (latestLogFile == null || logFileComparator.compare(latestLogFile, candidateLogFile) >= 0) {
+          latestLogFile = candidateLogFile;
+          latestLogFileLen = pathLengthPair.getValue();
+        }
+      }
+      return new HoodieRollbackRequest(rollbackReq1.getPartitionPath(), rollbackReq1.getFileId(),
+          rollbackReq1.getLatestBaseInstant(), filesToBeDeleted,
+          latestLogFile == null ? Collections.emptyMap() :
+              Collections.singletonMap(latestLogFile.getPath().toString(), latestLogFileLen));
+    };
+  }
+
+  protected HoodieRollbackRequest getRollbackRequestForAppend(String markerFilePath) {
+    Path filePath = new Path(basePath, markerFilePath);

Review comment:
       Hi nsivabalan, I think the logic actually gets simpler, because we no longer need to list log files. However, the code looks more complex, for two reasons.
   1. Marker file generation changes in 0.11. I chose to stay backward compatible rather than regenerate markers in UpgradeAndDownGrade, so all of the existing code was preserved.
   2. To deduce the latest log file: a single commit can generate two log files because of rollover, which gives us two log files and two markers. Only one of them is the latest log file, so I had to write some code to find it.
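
   A minimal sketch of the "find the latest log file among several markers" step described above. `LogFileRef` and `LatestLogFilePicker` are hypothetical stand-ins and not Hudi classes; the sketch assumes log files are ordered by base instant and then by log version, with an ascending comparator.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a log file referenced by an APPEND marker; not a Hudi class.
final class LogFileRef {
  final String baseInstant;
  final int version;

  LogFileRef(String baseInstant, int version) {
    this.baseInstant = baseInstant;
    this.version = version;
  }
}

final class LatestLogFilePicker {
  // Assumed ordering: older base instant first, then lower log version.
  static final Comparator<LogFileRef> ASCENDING =
      Comparator.comparing((LogFileRef f) -> f.baseInstant)
          .thenComparingInt(f -> f.version);

  // Replaces the current pick only when the candidate sorts strictly after it.
  static LogFileRef latest(List<LogFileRef> candidates) {
    LogFileRef latest = null;
    for (LogFileRef candidate : candidates) {
      if (latest == null || ASCENDING.compare(latest, candidate) < 0) {
        latest = candidate;
      }
    }
    return latest;
  }

  public static void main(String[] args) {
    // Two log files written by one commit because of rollover.
    List<LogFileRef> fromMarkers = Arrays.asList(
        new LogFileRef("20220225175546", 1),
        new LogFileRef("20220225175546", 2));
    System.out.println(latest(fromMarkers).version);
  }
}
```

   Which way the comparison has to point depends on whether the comparator sorts ascending or descending; the sketch assumes ascending, so the candidate wins when `compare(latest, candidate) < 0`.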







[GitHub] [hudi] guanziyue commented on a change in pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
guanziyue commented on a change in pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#discussion_r824933880



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/MarkerBasedRollbackStrategy.java
##########
@@ -73,54 +80,110 @@ public MarkerBasedRollbackStrategy(HoodieTable<?, ?, ?, ?> table, HoodieEngineCo
       List<String> markerPaths = MarkerBasedRollbackUtils.getAllMarkerPaths(
           table, context, instantToRollback.getTimestamp(), config.getRollbackParallelism());
       int parallelism = Math.max(Math.min(markerPaths.size(), config.getRollbackParallelism()), 1);
-      return context.map(markerPaths, markerFilePath -> {
-        String typeStr = markerFilePath.substring(markerFilePath.lastIndexOf(".") + 1);
-        IOType type = IOType.valueOf(typeStr);
-        switch (type) {
-          case MERGE:
-          case CREATE:
-            String fileToDelete = WriteMarkers.stripMarkerSuffix(markerFilePath);
-            Path fullDeletePath = new Path(basePath, fileToDelete);
-            String partitionPath = FSUtils.getRelativePartitionPath(new Path(basePath), fullDeletePath.getParent());
-            return new HoodieRollbackRequest(partitionPath, EMPTY_STRING, EMPTY_STRING,
-                Collections.singletonList(fullDeletePath.toString()),
-                Collections.emptyMap());
-          case APPEND:
-            // NOTE: This marker file-path does NOT correspond to a log-file, but rather is a phony
-            //       path serving as a "container" for the following components:
-            //          - Base file's file-id
-            //          - Base file's commit instant
-            //          - Partition path
-            return getRollbackRequestForAppend(WriteMarkers.stripMarkerSuffix(markerFilePath));
-          default:
-            throw new HoodieRollbackException("Unknown marker type, during rollback of " + instantToRollback);
-        }
-      }, parallelism);
+      return context.mapToPairAndReduceByKey(markerPaths,
+          // generate rollback request per marker file
+          getRollbackReqGenerateFunction(instantToRollback),
+          // NOTE: Since we're rolling back incomplete Delta Commit, it only could have appended its
+          //       block to the latest log-file. But we cannot simply get the latest log-file by one marker file.
+          //       So compare log-files in the same fileGroup and get the latest one.
+          getRollbackReqCombineFunction(), parallelism);
     } catch (Exception e) {
       throw new HoodieRollbackException("Error rolling back using marker files written for " + instantToRollback, e);
     }
   }
 
-  protected HoodieRollbackRequest getRollbackRequestForAppend(String markerFilePath) throws IOException {
-    Path baseFilePathForAppend = new Path(basePath, markerFilePath);
-    String fileId = FSUtils.getFileIdFromFilePath(baseFilePathForAppend);
-    String baseCommitTime = FSUtils.getCommitTime(baseFilePathForAppend.getName());
-    String relativePartitionPath = FSUtils.getRelativePartitionPath(new Path(basePath), baseFilePathForAppend.getParent());
-    Path partitionPath = FSUtils.getPartitionPath(config.getBasePath(), relativePartitionPath);
-
-    // NOTE: Since we're rolling back incomplete Delta Commit, it only could have appended its
-    //       block to the latest log-file
-    // TODO(HUDI-1517) use provided marker-file's path instead
-    HoodieLogFile latestLogFile = FSUtils.getLatestLogFile(table.getMetaClient().getFs(), partitionPath, fileId,
-        HoodieFileFormat.HOODIE_LOG.getFileExtension(), baseCommitTime).get();
-
-    // NOTE: Marker's don't carry information about the cumulative size of the blocks that have been appended,
-    //       therefore we simply stub this value.
-    Map<String, Long> logFilesWithBlocsToRollback =
-        Collections.singletonMap(latestLogFile.getFileStatus().getPath().toString(), -1L);
+  private SerializablePairFunction<String, Pair<String, String>, HoodieRollbackRequest> getRollbackReqGenerateFunction(
+      HoodieInstant instantToRollback) {
+    return markerFilePath -> {
+      String typeStr = markerFilePath.substring(markerFilePath.lastIndexOf(".") + 1);
+      IOType type = IOType.valueOf(typeStr);
+      String partitionFilePath = WriteMarkers.stripMarkerSuffix(markerFilePath);
+      Path fullFilePath = new Path(basePath, partitionFilePath);
+      String partitionPath = FSUtils.getRelativePartitionPath(new Path(basePath), fullFilePath.getParent());
+      switch (type) {
+        case MERGE:
+        case CREATE:
+          HoodieBaseFile baseFileToDelete = new HoodieBaseFile(fullFilePath.toString());
+          String fileId = baseFileToDelete.getFileId();
+          String baseInstantTime = baseFileToDelete.getCommitTime();
+          return Pair.of(Pair.of(partitionPath, fileId),
+              new HoodieRollbackRequest(partitionPath, fileId, baseInstantTime,
+                  Collections.singletonList(fullFilePath.toString()),
+                  Collections.emptyMap()));
+        case APPEND:
+          HoodieRollbackRequest rollbackRequestForAppend = getRollbackRequestForAppend(partitionFilePath);
+          return Pair.of(Pair.of(partitionPath, rollbackRequestForAppend.getFileId()),
+              rollbackRequestForAppend);
+        default:
+          throw new HoodieRollbackException("Unknown marker type, during rollback of " + instantToRollback);
+      }
+    };
+  }
+
+  private SerializableBiFunction<HoodieRollbackRequest, HoodieRollbackRequest, HoodieRollbackRequest> getRollbackReqCombineFunction() {
+    return (rollbackReq1, rollbackReq2) -> {
+      List<String> filesToBeDeleted = new LinkedList<>();
+      filesToBeDeleted.addAll(rollbackReq1.getFilesToBeDeleted());
+      filesToBeDeleted.addAll(rollbackReq2.getFilesToBeDeleted());
+      final Comparator<HoodieLogFile> logFileComparator = HoodieLogFile.getLogFileComparator();
+      HoodieLogFile latestLogFile = null;
+      long latestLogFileLen = -1;
+
+      for (Map.Entry<String, Long> pathLengthPair : rollbackReq1.getLogBlocksToBeDeleted().entrySet()) {
+        HoodieLogFile candidateLogFile = new HoodieLogFile(pathLengthPair.getKey());
+        if (latestLogFile == null || logFileComparator.compare(latestLogFile, candidateLogFile) >= 0) {
+          latestLogFile = candidateLogFile;
+          latestLogFileLen = pathLengthPair.getValue();
+        }
+      }
 
+      for (Map.Entry<String, Long> pathLengthPair : rollbackReq2.getLogBlocksToBeDeleted().entrySet()) {
+        HoodieLogFile candidateLogFile = new HoodieLogFile(pathLengthPair.getKey());
+        if (latestLogFile == null || logFileComparator.compare(latestLogFile, candidateLogFile) >= 0) {
+          latestLogFile = candidateLogFile;
+          latestLogFileLen = pathLengthPair.getValue();
+        }
+      }
+      return new HoodieRollbackRequest(rollbackReq1.getPartitionPath(), rollbackReq1.getFileId(),
+          rollbackReq1.getLatestBaseInstant(), filesToBeDeleted,
+          latestLogFile == null ? Collections.emptyMap() :
+              Collections.singletonMap(latestLogFile.getPath().toString(), latestLogFileLen));
+    };
+  }
+
+  protected HoodieRollbackRequest getRollbackRequestForAppend(String markerFilePath) {
+    Path filePath = new Path(basePath, markerFilePath);

Review comment:
       Hi nsivabalan, I think the logic actually gets simpler, because we no longer need to list log files. However, the code looks more complex, for two reasons.
   1. Marker file generation changes. I chose to stay backward compatible rather than regenerate markers in UpgradeAndDownGrade.
   2. Consider the case where one commit generates two log files because of rollover, so we have two log files and two markers. The current rollback mechanism treats the log files as a stack of log blocks rather than an array of log blocks. I need to append the command block to the end of the latest log file instead of inserting several command blocks into the middle of this stack, although such inserting also seems to work. So there is a lot of code to compare markers.
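
   The "stack of log blocks" view can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `ToyLogStack` is hypothetical, it flattens all log files of one file group into a single list, and it is not how Hudi's log reader is implemented.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model, not Hudi code: the blocks of one file group in write order,
// concatenated across all of its log files.
final class ToyLogStack {
  private static final class Block {
    final String instant;         // instant that wrote this block
    final String rollbackTarget;  // set only on a rollback command block

    Block(String instant, String rollbackTarget) {
      this.instant = instant;
      this.rollbackTarget = rollbackTarget;
    }
  }

  private final List<Block> blocks = new ArrayList<>();

  void appendData(String instant) {
    blocks.add(new Block(instant, null));
  }

  // The command block always goes on top of the stack, i.e. at the end of the latest log file.
  void appendRollbackCommand(String instant, String targetInstant) {
    blocks.add(new Block(instant, targetInstant));
  }

  // Scans from newest to oldest; a command block hides every older block of its target instant.
  List<String> visibleDataInstants() {
    Set<String> rolledBack = new HashSet<>();
    List<String> visible = new ArrayList<>();
    for (int i = blocks.size() - 1; i >= 0; i--) {
      Block block = blocks.get(i);
      if (block.rollbackTarget != null) {
        rolledBack.add(block.rollbackTarget);
      } else if (!rolledBack.contains(block.instant)) {
        visible.add(0, block.instant);
      }
    }
    return visible;
  }

  public static void main(String[] args) {
    ToyLogStack log = new ToyLogStack();
    log.appendData("001");
    log.appendData("002");  // written to the first log file
    log.appendData("002");  // written after rollover to a second log file
    log.appendRollbackCommand("003", "002");
    System.out.println(log.visibleDataInstants());
  }
}
```

   In this model a single command block at the top is enough to hide both blocks of the failed commit, which is why only the latest log file needs to be identified.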







[GitHub] [hudi] hudi-bot commented on pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1066153385


   ## CI report:
   
   * 0c81e53f83757b49d9e5f5df3bbeebdd076b99b5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6373) 
   * f0c65b8a748be13de662c8d438da8d30c04f9055 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6895) 
   * 6560aaa7f15eed0a585b03d92d37127a34791b74 UNKNOWN
   





[GitHub] [hudi] hudi-bot commented on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051787441


   ## CI report:
   
   * 875ec8b00cd379e669498fe7575503b192f0de5e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6341) 
   * 6816a4b47b88108172b46fece160e4e078345687 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6345) 
   





[GitHub] [hudi] hudi-bot commented on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1053682593


   ## CI report:
   
   * 6816a4b47b88108172b46fece160e4e078345687 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6345) 
   * 4306f2a75ddbb6c0eca86d8fcf35c889d923b557 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6372) 
   





[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1053703958


   ## CI report:
   
   * 4306f2a75ddbb6c0eca86d8fcf35c889d923b557 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6372) 
   





[GitHub] [hudi] guanziyue commented on a change in pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
guanziyue commented on a change in pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#discussion_r824933880



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/MarkerBasedRollbackStrategy.java
##########
@@ -73,54 +80,110 @@ public MarkerBasedRollbackStrategy(HoodieTable<?, ?, ?, ?> table, HoodieEngineCo
       List<String> markerPaths = MarkerBasedRollbackUtils.getAllMarkerPaths(
           table, context, instantToRollback.getTimestamp(), config.getRollbackParallelism());
       int parallelism = Math.max(Math.min(markerPaths.size(), config.getRollbackParallelism()), 1);
-      return context.map(markerPaths, markerFilePath -> {
-        String typeStr = markerFilePath.substring(markerFilePath.lastIndexOf(".") + 1);
-        IOType type = IOType.valueOf(typeStr);
-        switch (type) {
-          case MERGE:
-          case CREATE:
-            String fileToDelete = WriteMarkers.stripMarkerSuffix(markerFilePath);
-            Path fullDeletePath = new Path(basePath, fileToDelete);
-            String partitionPath = FSUtils.getRelativePartitionPath(new Path(basePath), fullDeletePath.getParent());
-            return new HoodieRollbackRequest(partitionPath, EMPTY_STRING, EMPTY_STRING,
-                Collections.singletonList(fullDeletePath.toString()),
-                Collections.emptyMap());
-          case APPEND:
-            // NOTE: This marker file-path does NOT correspond to a log-file, but rather is a phony
-            //       path serving as a "container" for the following components:
-            //          - Base file's file-id
-            //          - Base file's commit instant
-            //          - Partition path
-            return getRollbackRequestForAppend(WriteMarkers.stripMarkerSuffix(markerFilePath));
-          default:
-            throw new HoodieRollbackException("Unknown marker type, during rollback of " + instantToRollback);
-        }
-      }, parallelism);
+      return context.mapToPairAndReduceByKey(markerPaths,
+          // generate rollback request per marker file
+          getRollbackReqGenerateFunction(instantToRollback),
+          // NOTE: Since we're rolling back incomplete Delta Commit, it only could have appended its
+          //       block to the latest log-file. But we cannot simply get the latest log-file by one marker file.
+          //       So compare log-files in the same fileGroup and get the latest one.
+          getRollbackReqCombineFunction(), parallelism);
     } catch (Exception e) {
       throw new HoodieRollbackException("Error rolling back using marker files written for " + instantToRollback, e);
     }
   }
 
-  protected HoodieRollbackRequest getRollbackRequestForAppend(String markerFilePath) throws IOException {
-    Path baseFilePathForAppend = new Path(basePath, markerFilePath);
-    String fileId = FSUtils.getFileIdFromFilePath(baseFilePathForAppend);
-    String baseCommitTime = FSUtils.getCommitTime(baseFilePathForAppend.getName());
-    String relativePartitionPath = FSUtils.getRelativePartitionPath(new Path(basePath), baseFilePathForAppend.getParent());
-    Path partitionPath = FSUtils.getPartitionPath(config.getBasePath(), relativePartitionPath);
-
-    // NOTE: Since we're rolling back incomplete Delta Commit, it only could have appended its
-    //       block to the latest log-file
-    // TODO(HUDI-1517) use provided marker-file's path instead
-    HoodieLogFile latestLogFile = FSUtils.getLatestLogFile(table.getMetaClient().getFs(), partitionPath, fileId,
-        HoodieFileFormat.HOODIE_LOG.getFileExtension(), baseCommitTime).get();
-
-    // NOTE: Marker's don't carry information about the cumulative size of the blocks that have been appended,
-    //       therefore we simply stub this value.
-    Map<String, Long> logFilesWithBlocsToRollback =
-        Collections.singletonMap(latestLogFile.getFileStatus().getPath().toString(), -1L);
+  private SerializablePairFunction<String, Pair<String, String>, HoodieRollbackRequest> getRollbackReqGenerateFunction(
+      HoodieInstant instantToRollback) {
+    return markerFilePath -> {
+      String typeStr = markerFilePath.substring(markerFilePath.lastIndexOf(".") + 1);
+      IOType type = IOType.valueOf(typeStr);
+      String partitionFilePath = WriteMarkers.stripMarkerSuffix(markerFilePath);
+      Path fullFilePath = new Path(basePath, partitionFilePath);
+      String partitionPath = FSUtils.getRelativePartitionPath(new Path(basePath), fullFilePath.getParent());
+      switch (type) {
+        case MERGE:
+        case CREATE:
+          HoodieBaseFile baseFileToDelete = new HoodieBaseFile(fullFilePath.toString());
+          String fileId = baseFileToDelete.getFileId();
+          String baseInstantTime = baseFileToDelete.getCommitTime();
+          return Pair.of(Pair.of(partitionPath, fileId),
+              new HoodieRollbackRequest(partitionPath, fileId, baseInstantTime,
+                  Collections.singletonList(fullFilePath.toString()),
+                  Collections.emptyMap()));
+        case APPEND:
+          HoodieRollbackRequest rollbackRequestForAppend = getRollbackRequestForAppend(partitionFilePath);
+          return Pair.of(Pair.of(partitionPath, rollbackRequestForAppend.getFileId()),
+              rollbackRequestForAppend);
+        default:
+          throw new HoodieRollbackException("Unknown marker type, during rollback of " + instantToRollback);
+      }
+    };
+  }
+
+  private SerializableBiFunction<HoodieRollbackRequest, HoodieRollbackRequest, HoodieRollbackRequest> getRollbackReqCombineFunction() {
+    return (rollbackReq1, rollbackReq2) -> {
+      List<String> filesToBeDeleted = new LinkedList<>();
+      filesToBeDeleted.addAll(rollbackReq1.getFilesToBeDeleted());
+      filesToBeDeleted.addAll(rollbackReq2.getFilesToBeDeleted());
+      final Comparator<HoodieLogFile> logFileComparator = HoodieLogFile.getLogFileComparator();
+      HoodieLogFile latestLogFile = null;
+      long latestLogFileLen = -1;
+
+      for (Map.Entry<String, Long> pathLengthPair : rollbackReq1.getLogBlocksToBeDeleted().entrySet()) {
+        HoodieLogFile candidateLogFile = new HoodieLogFile(pathLengthPair.getKey());
+        if (latestLogFile == null || logFileComparator.compare(latestLogFile, candidateLogFile) >= 0) {
+          latestLogFile = candidateLogFile;
+          latestLogFileLen = pathLengthPair.getValue();
+        }
+      }
 
+      for (Map.Entry<String, Long> pathLengthPair : rollbackReq2.getLogBlocksToBeDeleted().entrySet()) {
+        HoodieLogFile candidateLogFile = new HoodieLogFile(pathLengthPair.getKey());
+        if (latestLogFile == null || logFileComparator.compare(latestLogFile, candidateLogFile) >= 0) {
+          latestLogFile = candidateLogFile;
+          latestLogFileLen = pathLengthPair.getValue();
+        }
+      }
+      return new HoodieRollbackRequest(rollbackReq1.getPartitionPath(), rollbackReq1.getFileId(),
+          rollbackReq1.getLatestBaseInstant(), filesToBeDeleted,
+          latestLogFile == null ? Collections.emptyMap() :
+              Collections.singletonMap(latestLogFile.getPath().toString(), latestLogFileLen));
+    };
+  }
+
+  protected HoodieRollbackRequest getRollbackRequestForAppend(String markerFilePath) {
+    Path filePath = new Path(basePath, markerFilePath);

Review comment:
       Hi nsivabalan, I think the logic actually gets simpler, because we no longer need to list log files. However, the code looks more complex, for two reasons.
   1. Marker file generation changes in 0.11. I chose to stay backward compatible rather than regenerate markers in UpgradeAndDownGrade, so the previous code was preserved.
   2. To deduce the latest log file: a single commit may generate more than one log file because of rollover, and only one of them is the latest. I had to write some code to find it. This is all in-memory logic.
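
   The in-memory grouping can be sketched with plain Java streams: one request per marker, keyed by (partition path, file id), with requests of the same file group merged pairwise. `ToyRollbackRequest` and `ReduceByFileGroup` are hypothetical and much simpler than `HoodieRollbackRequest` and `mapToPairAndReduceByKey`; the sketch only merges the lists of files to delete.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical, simplified request; the real HoodieRollbackRequest carries more fields.
final class ToyRollbackRequest {
  final String partitionPath;
  final String fileId;
  final List<String> filesToDelete;

  ToyRollbackRequest(String partitionPath, String fileId, List<String> filesToDelete) {
    this.partitionPath = partitionPath;
    this.fileId = fileId;
    this.filesToDelete = filesToDelete;
  }

  // Merges two requests of the same file group by concatenating their delete lists.
  static ToyRollbackRequest combine(ToyRollbackRequest a, ToyRollbackRequest b) {
    List<String> merged = new ArrayList<>(a.filesToDelete);
    merged.addAll(b.filesToDelete);
    return new ToyRollbackRequest(a.partitionPath, a.fileId, merged);
  }
}

final class ReduceByFileGroup {
  // Key identifying a file group: (partition path, file id).
  static Map.Entry<String, String> key(ToyRollbackRequest request) {
    return new SimpleImmutableEntry<>(request.partitionPath, request.fileId);
  }

  // Map each per-marker request to its key, then reduce requests that share a key.
  static Map<Map.Entry<String, String>, ToyRollbackRequest> reduce(List<ToyRollbackRequest> perMarker) {
    return perMarker.stream().collect(
        Collectors.toMap(ReduceByFileGroup::key, Function.identity(), ToyRollbackRequest::combine));
  }

  public static void main(String[] args) {
    List<ToyRollbackRequest> perMarker = Arrays.asList(
        new ToyRollbackRequest("2022/02/25", "file-1", Arrays.asList("a.parquet")),
        new ToyRollbackRequest("2022/02/25", "file-1", Arrays.asList("b.parquet")),
        new ToyRollbackRequest("2022/02/26", "file-2", Arrays.asList("c.parquet")));
    System.out.println(reduce(perMarker).size());
  }
}
```

   No file listing is involved: everything the reduce step needs comes from the marker paths themselves.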







[GitHub] [hudi] guanziyue commented on a change in pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
guanziyue commented on a change in pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#discussion_r824933880



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/MarkerBasedRollbackStrategy.java
##########
@@ -73,54 +80,110 @@ public MarkerBasedRollbackStrategy(HoodieTable<?, ?, ?, ?> table, HoodieEngineCo
       List<String> markerPaths = MarkerBasedRollbackUtils.getAllMarkerPaths(
           table, context, instantToRollback.getTimestamp(), config.getRollbackParallelism());
       int parallelism = Math.max(Math.min(markerPaths.size(), config.getRollbackParallelism()), 1);
-      return context.map(markerPaths, markerFilePath -> {
-        String typeStr = markerFilePath.substring(markerFilePath.lastIndexOf(".") + 1);
-        IOType type = IOType.valueOf(typeStr);
-        switch (type) {
-          case MERGE:
-          case CREATE:
-            String fileToDelete = WriteMarkers.stripMarkerSuffix(markerFilePath);
-            Path fullDeletePath = new Path(basePath, fileToDelete);
-            String partitionPath = FSUtils.getRelativePartitionPath(new Path(basePath), fullDeletePath.getParent());
-            return new HoodieRollbackRequest(partitionPath, EMPTY_STRING, EMPTY_STRING,
-                Collections.singletonList(fullDeletePath.toString()),
-                Collections.emptyMap());
-          case APPEND:
-            // NOTE: This marker file-path does NOT correspond to a log-file, but rather is a phony
-            //       path serving as a "container" for the following components:
-            //          - Base file's file-id
-            //          - Base file's commit instant
-            //          - Partition path
-            return getRollbackRequestForAppend(WriteMarkers.stripMarkerSuffix(markerFilePath));
-          default:
-            throw new HoodieRollbackException("Unknown marker type, during rollback of " + instantToRollback);
-        }
-      }, parallelism);
+      return context.mapToPairAndReduceByKey(markerPaths,
+          // generate rollback request per marker file
+          getRollbackReqGenerateFunction(instantToRollback),
+          // NOTE: Since we're rolling back incomplete Delta Commit, it only could have appended its
+          //       block to the latest log-file. But we cannot simply get the latest log-file by one marker file.
+          //       So compare log-files in the same fileGroup and get the latest one.
+          getRollbackReqCombineFunction(), parallelism);
     } catch (Exception e) {
       throw new HoodieRollbackException("Error rolling back using marker files written for " + instantToRollback, e);
     }
   }
 
-  protected HoodieRollbackRequest getRollbackRequestForAppend(String markerFilePath) throws IOException {
-    Path baseFilePathForAppend = new Path(basePath, markerFilePath);
-    String fileId = FSUtils.getFileIdFromFilePath(baseFilePathForAppend);
-    String baseCommitTime = FSUtils.getCommitTime(baseFilePathForAppend.getName());
-    String relativePartitionPath = FSUtils.getRelativePartitionPath(new Path(basePath), baseFilePathForAppend.getParent());
-    Path partitionPath = FSUtils.getPartitionPath(config.getBasePath(), relativePartitionPath);
-
-    // NOTE: Since we're rolling back incomplete Delta Commit, it only could have appended its
-    //       block to the latest log-file
-    // TODO(HUDI-1517) use provided marker-file's path instead
-    HoodieLogFile latestLogFile = FSUtils.getLatestLogFile(table.getMetaClient().getFs(), partitionPath, fileId,
-        HoodieFileFormat.HOODIE_LOG.getFileExtension(), baseCommitTime).get();
-
-    // NOTE: Marker's don't carry information about the cumulative size of the blocks that have been appended,
-    //       therefore we simply stub this value.
-    Map<String, Long> logFilesWithBlocsToRollback =
-        Collections.singletonMap(latestLogFile.getFileStatus().getPath().toString(), -1L);
+  private SerializablePairFunction<String, Pair<String, String>, HoodieRollbackRequest> getRollbackReqGenerateFunction(
+      HoodieInstant instantToRollback) {
+    return markerFilePath -> {
+      String typeStr = markerFilePath.substring(markerFilePath.lastIndexOf(".") + 1);
+      IOType type = IOType.valueOf(typeStr);
+      String partitionFilePath = WriteMarkers.stripMarkerSuffix(markerFilePath);
+      Path fullFilePath = new Path(basePath, partitionFilePath);
+      String partitionPath = FSUtils.getRelativePartitionPath(new Path(basePath), fullFilePath.getParent());
+      switch (type) {
+        case MERGE:
+        case CREATE:
+          HoodieBaseFile baseFileToDelete = new HoodieBaseFile(fullFilePath.toString());
+          String fileId = baseFileToDelete.getFileId();
+          String baseInstantTime = baseFileToDelete.getCommitTime();
+          return Pair.of(Pair.of(partitionPath, fileId),
+              new HoodieRollbackRequest(partitionPath, fileId, baseInstantTime,
+                  Collections.singletonList(fullFilePath.toString()),
+                  Collections.emptyMap()));
+        case APPEND:
+          HoodieRollbackRequest rollbackRequestForAppend = getRollbackRequestForAppend(partitionFilePath);
+          return Pair.of(Pair.of(partitionPath, rollbackRequestForAppend.getFileId()),
+              rollbackRequestForAppend);
+        default:
+          throw new HoodieRollbackException("Unknown marker type, during rollback of " + instantToRollback);
+      }
+    };
+  }
+
+  private SerializableBiFunction<HoodieRollbackRequest, HoodieRollbackRequest, HoodieRollbackRequest> getRollbackReqCombineFunction() {
+    return (rollbackReq1, rollbackReq2) -> {
+      List<String> filesToBeDeleted = new LinkedList<>();
+      filesToBeDeleted.addAll(rollbackReq1.getFilesToBeDeleted());
+      filesToBeDeleted.addAll(rollbackReq2.getFilesToBeDeleted());
+      final Comparator<HoodieLogFile> logFileComparator = HoodieLogFile.getLogFileComparator();
+      HoodieLogFile latestLogFile = null;
+      long latestLogFileLen = -1;
+
+      for (Map.Entry<String, Long> pathLengthPair : rollbackReq1.getLogBlocksToBeDeleted().entrySet()) {
+        HoodieLogFile candidateLogFile = new HoodieLogFile(pathLengthPair.getKey());
+        if (latestLogFile == null || logFileComparator.compare(latestLogFile, candidateLogFile) >= 0) {
+          latestLogFile = candidateLogFile;
+          latestLogFileLen = pathLengthPair.getValue();
+        }
+      }
 
+      for (Map.Entry<String, Long> pathLengthPair : rollbackReq2.getLogBlocksToBeDeleted().entrySet()) {
+        HoodieLogFile candidateLogFile = new HoodieLogFile(pathLengthPair.getKey());
+        if (latestLogFile == null || logFileComparator.compare(latestLogFile, candidateLogFile) >= 0) {
+          latestLogFile = candidateLogFile;
+          latestLogFileLen = pathLengthPair.getValue();
+        }
+      }
+      return new HoodieRollbackRequest(rollbackReq1.getPartitionPath(), rollbackReq1.getFileId(),
+          rollbackReq1.getLatestBaseInstant(), filesToBeDeleted,
+          latestLogFile == null ? Collections.emptyMap() :
+              Collections.singletonMap(latestLogFile.getPath().toString(), latestLogFileLen));
+    };
+  }
+
+  protected HoodieRollbackRequest getRollbackRequestForAppend(String markerFilePath) {
+    Path filePath = new Path(basePath, markerFilePath);

Review comment:
       Hi nsivabalan, I think this actually simplifies things, because we no longer need a file listing of log files. However, the code does look more complex, for two reasons.
   1. Because the way marker files are generated changes in 0.11, I chose to stay backward compatible rather than regenerate markers in UpgradeAndDownGrade, so all the existing code was preserved.
   2. Consider this case: we generate two log files in one commit due to rollover, so we have two log files and two markers. Under the current rollback mechanism, log files are treated as a stack of log blocks rather than an array of log blocks. I need to append a command block to the end of the latest log file instead of inserting several command blocks into this stack, although such insertion would seemingly also work. That is why there is a lot of code for marker comparison.
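
The "stack of log blocks" point can be illustrated with a toy model. The sketch below is a simplification under explicit assumptions and is not Hudi's log reader: `Block`, `dataBlock`, `rollbackCommand`, and `visibleInstants` are hypothetical names. It assumes that blocks are read in order across all log files of a file group and that a rollback command block invalidates the blocks on top of the stack that were written by the targeted instant.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

final class LogBlockStackSketch {

  // Hypothetical block model: a data block has no rollback target, a command block has one.
  static final class Block {
    final String instant;
    final String rollbackTarget;

    private Block(String instant, String rollbackTarget) {
      this.instant = instant;
      this.rollbackTarget = rollbackTarget;
    }

    static Block dataBlock(String instant) {
      return new Block(instant, null);
    }

    static Block rollbackCommand(String instant, String targetInstant) {
      return new Block(instant, targetInstant);
    }
  }

  // Replays all log files in order and returns the instants of the blocks that stay visible.
  static List<String> visibleInstants(List<List<Block>> logFilesInOrder) {
    Deque<Block> stack = new ArrayDeque<>();
    for (List<Block> logFile : logFilesInOrder) {
      for (Block block : logFile) {
        if (block.rollbackTarget == null) {
          stack.addLast(block);
          continue;
        }
        // Assumed semantics: pop every block on top of the stack written by the target instant.
        while (!stack.isEmpty() && stack.peekLast().instant.equals(block.rollbackTarget)) {
          stack.pollLast();
        }
      }
    }
    List<String> visible = new ArrayList<>();
    for (Block block : stack) {
      visible.add(block.instant);
    }
    return visible;
  }

  public static void main(String[] args) {
    // Instant "002" rolled over from the first log file into the second one.
    List<Block> firstLogFile = Arrays.asList(Block.dataBlock("001"), Block.dataBlock("002"));
    List<Block> secondLogFile = Arrays.asList(Block.dataBlock("002"), Block.rollbackCommand("003", "002"));
    System.out.println(visibleInstants(Arrays.asList(firstLogFile, secondLogFile)));
  }
}
```

Under these assumptions, a single command block at the end of the latest log file is enough to invalidate the blocks in both files, which is why the rollback only needs to know which log file is the latest.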







[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051084435


   ## CI report:
   
   * 5add7ebb8081671f5816d82d4472d5ea1f7d8338 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326) 
   * ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1053746046


   ## CI report:
   
   * 4306f2a75ddbb6c0eca86d8fcf35c889d923b557 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6372) 
   * 0c81e53f83757b49d9e5f5df3bbeebdd076b99b5 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] guanziyue commented on pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
guanziyue commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1053786323


   Hi @yihua
   This PR is ready for review. Could you take a look when you have some spare time?





[GitHub] [hudi] YuweiXiao commented on a change in pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
YuweiXiao commented on a change in pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#discussion_r824318662



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieAppendHandle.java
##########
@@ -113,22 +116,37 @@
   // Header metadata for a log block
   protected final Map<HeaderMetadataType, String> header = new HashMap<>();
   private SizeEstimator<HoodieRecord> sizeEstimator;
+  protected final WriteMarkers writeMarkers;
+  private final IOType ioType;
 
   private Properties recordProperties = new Properties();
 
   public HoodieAppendHandle(HoodieWriteConfig config, String instantTime, HoodieTable<T, I, K, O> hoodieTable,
-                            String partitionPath, String fileId, Iterator<HoodieRecord<T>> recordItr, TaskContextSupplier taskContextSupplier) {
+                            String partitionPath, String fileId, Iterator<HoodieRecord<T>> recordItr,
+                            TaskContextSupplier taskContextSupplier, IOType ioType) {
     super(config, instantTime, partitionPath, fileId, hoodieTable, taskContextSupplier);
     this.fileId = fileId;
     this.recordItr = recordItr;
     sizeEstimator = new DefaultSizeEstimator();
     this.statuses = new ArrayList<>();
     this.recordProperties.putAll(config.getProps());
+    this.writeMarkers = WriteMarkersFactory.get(config.getMarkersType(), hoodieTable, instantTime);
+    this.ioType = ioType;
   }
 
+  // constructor used for creating new file group
   public HoodieAppendHandle(HoodieWriteConfig config, String instantTime, HoodieTable<T, I, K, O> hoodieTable,
                             String partitionPath, String fileId, TaskContextSupplier sparkTaskContextSupplier) {
-    this(config, instantTime, hoodieTable, partitionPath, fileId, null, sparkTaskContextSupplier);
+    this(config, instantTime, hoodieTable, partitionPath, fileId, null, sparkTaskContextSupplier,
+        IOType.CREATE);

Review comment:
       Just curious: in which cases do we use IOType.CREATE for an AppendHandle?







[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051787441


   ## CI report:
   
   * 875ec8b00cd379e669498fe7575503b192f0de5e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6341) 
   * 6816a4b47b88108172b46fece160e4e078345687 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6345) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051907113


   ## CI report:
   
   * 6816a4b47b88108172b46fece160e4e078345687 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6345) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051082324


   ## CI report:
   
   * 5add7ebb8081671f5816d82d4472d5ea1f7d8338 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326) 
   * ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051783953


   ## CI report:
   
   * 875ec8b00cd379e669498fe7575503b192f0de5e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6341) 
   * 6816a4b47b88108172b46fece160e4e078345687 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] guanziyue commented on a change in pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
guanziyue commented on a change in pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#discussion_r824411709



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieAppendHandle.java
##########
@@ -113,22 +116,37 @@
   // Header metadata for a log block
   protected final Map<HeaderMetadataType, String> header = new HashMap<>();
   private SizeEstimator<HoodieRecord> sizeEstimator;
+  protected final WriteMarkers writeMarkers;
+  private final IOType ioType;
 
   private Properties recordProperties = new Properties();
 
   public HoodieAppendHandle(HoodieWriteConfig config, String instantTime, HoodieTable<T, I, K, O> hoodieTable,
-                            String partitionPath, String fileId, Iterator<HoodieRecord<T>> recordItr, TaskContextSupplier taskContextSupplier) {
+                            String partitionPath, String fileId, Iterator<HoodieRecord<T>> recordItr,
+                            TaskContextSupplier taskContextSupplier, IOType ioType) {
     super(config, instantTime, partitionPath, fileId, hoodieTable, taskContextSupplier);
     this.fileId = fileId;
     this.recordItr = recordItr;
     sizeEstimator = new DefaultSizeEstimator();
     this.statuses = new ArrayList<>();
     this.recordProperties.putAll(config.getProps());
+    this.writeMarkers = WriteMarkersFactory.get(config.getMarkersType(), hoodieTable, instantTime);
+    this.ioType = ioType;
   }
 
+  // constructor used for creating new file group
   public HoodieAppendHandle(HoodieWriteConfig config, String instantTime, HoodieTable<T, I, K, O> hoodieTable,
                             String partitionPath, String fileId, TaskContextSupplier sparkTaskContextSupplier) {
-    this(config, instantTime, hoodieTable, partitionPath, fileId, null, sparkTaskContextSupplier);
+    this(config, instantTime, hoodieTable, partitionPath, fileId, null, sparkTaskContextSupplier,
+        IOType.CREATE);

Review comment:
       For indexes that have the canIndexLogFiles attribute. Currently, HBaseIndex, the Flink state index, and the in-memory index have this attribute.
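
One way to read this answer is sketched below. It is a hypothetical simplification, not code from this PR or from Hudi: `IoType` only mirrors the `IOType` values visible in the diff, while `markerTypeFor` and its two boolean parameters are invented for illustration. The assumption is that an append handle starts a new file group (and so writes a CREATE marker) only when the index can look up records in log files; otherwise it appends to an existing file group.

```java
final class AppendMarkerTypeSketch {

  // Mirrors the marker types that appear in the diff (MERGE, CREATE, APPEND).
  enum IoType { CREATE, MERGE, APPEND }

  // Hypothetical decision helper: which marker type an append handle would record.
  static IoType markerTypeFor(boolean canIndexLogFiles, boolean fileGroupExists) {
    if (fileGroupExists) {
      // Appending log blocks to a file group that is already there.
      return IoType.APPEND;
    }
    if (!canIndexLogFiles) {
      // Assumption: without log-file indexing, new file groups start from a base file,
      // so an append handle is never asked to create one.
      throw new IllegalStateException("append handle cannot create a file group for this index");
    }
    // The first log file of a brand-new file group.
    return IoType.CREATE;
  }

  public static void main(String[] args) {
    System.out.println(markerTypeFor(true, false));
    System.out.println(markerTypeFor(true, true));
  }
}
```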







[GitHub] [hudi] hudi-bot commented on pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1066152375


   ## CI report:
   
   * 0c81e53f83757b49d9e5f5df3bbeebdd076b99b5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6373) 
   * f0c65b8a748be13de662c8d438da8d30c04f9055 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1053681646


   ## CI report:
   
   * 6816a4b47b88108172b46fece160e4e078345687 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6345) 
   * 4306f2a75ddbb6c0eca86d8fcf35c889d923b557 UNKNOWN
   





[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1053746046


   ## CI report:
   
   * 4306f2a75ddbb6c0eca86d8fcf35c889d923b557 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6372) 
   * 0c81e53f83757b49d9e5f5df3bbeebdd076b99b5 UNKNOWN
   





[GitHub] [hudi] guanziyue commented on a change in pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
guanziyue commented on a change in pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#discussion_r824933880



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/MarkerBasedRollbackStrategy.java
##########
@@ -73,54 +80,110 @@ public MarkerBasedRollbackStrategy(HoodieTable<?, ?, ?, ?> table, HoodieEngineCo
       List<String> markerPaths = MarkerBasedRollbackUtils.getAllMarkerPaths(
           table, context, instantToRollback.getTimestamp(), config.getRollbackParallelism());
       int parallelism = Math.max(Math.min(markerPaths.size(), config.getRollbackParallelism()), 1);
-      return context.map(markerPaths, markerFilePath -> {
-        String typeStr = markerFilePath.substring(markerFilePath.lastIndexOf(".") + 1);
-        IOType type = IOType.valueOf(typeStr);
-        switch (type) {
-          case MERGE:
-          case CREATE:
-            String fileToDelete = WriteMarkers.stripMarkerSuffix(markerFilePath);
-            Path fullDeletePath = new Path(basePath, fileToDelete);
-            String partitionPath = FSUtils.getRelativePartitionPath(new Path(basePath), fullDeletePath.getParent());
-            return new HoodieRollbackRequest(partitionPath, EMPTY_STRING, EMPTY_STRING,
-                Collections.singletonList(fullDeletePath.toString()),
-                Collections.emptyMap());
-          case APPEND:
-            // NOTE: This marker file-path does NOT correspond to a log-file, but rather is a phony
-            //       path serving as a "container" for the following components:
-            //          - Base file's file-id
-            //          - Base file's commit instant
-            //          - Partition path
-            return getRollbackRequestForAppend(WriteMarkers.stripMarkerSuffix(markerFilePath));
-          default:
-            throw new HoodieRollbackException("Unknown marker type, during rollback of " + instantToRollback);
-        }
-      }, parallelism);
+      return context.mapToPairAndReduceByKey(markerPaths,
+          // generate rollback request per marker file
+          getRollbackReqGenerateFunction(instantToRollback),
+          // NOTE: Since we're rolling back incomplete Delta Commit, it only could have appended its
+          //       block to the latest log-file. But we cannot simply get the latest log-file by one marker file.
+          //       So compare log-files in the same fileGroup and get the latest one.
+          getRollbackReqCombineFunction(), parallelism);
     } catch (Exception e) {
       throw new HoodieRollbackException("Error rolling back using marker files written for " + instantToRollback, e);
     }
   }
 
-  protected HoodieRollbackRequest getRollbackRequestForAppend(String markerFilePath) throws IOException {
-    Path baseFilePathForAppend = new Path(basePath, markerFilePath);
-    String fileId = FSUtils.getFileIdFromFilePath(baseFilePathForAppend);
-    String baseCommitTime = FSUtils.getCommitTime(baseFilePathForAppend.getName());
-    String relativePartitionPath = FSUtils.getRelativePartitionPath(new Path(basePath), baseFilePathForAppend.getParent());
-    Path partitionPath = FSUtils.getPartitionPath(config.getBasePath(), relativePartitionPath);
-
-    // NOTE: Since we're rolling back incomplete Delta Commit, it only could have appended its
-    //       block to the latest log-file
-    // TODO(HUDI-1517) use provided marker-file's path instead
-    HoodieLogFile latestLogFile = FSUtils.getLatestLogFile(table.getMetaClient().getFs(), partitionPath, fileId,
-        HoodieFileFormat.HOODIE_LOG.getFileExtension(), baseCommitTime).get();
-
-    // NOTE: Marker's don't carry information about the cumulative size of the blocks that have been appended,
-    //       therefore we simply stub this value.
-    Map<String, Long> logFilesWithBlocsToRollback =
-        Collections.singletonMap(latestLogFile.getFileStatus().getPath().toString(), -1L);
+  private SerializablePairFunction<String, Pair<String, String>, HoodieRollbackRequest> getRollbackReqGenerateFunction(
+      HoodieInstant instantToRollback) {
+    return markerFilePath -> {
+      String typeStr = markerFilePath.substring(markerFilePath.lastIndexOf(".") + 1);
+      IOType type = IOType.valueOf(typeStr);
+      String partitionFilePath = WriteMarkers.stripMarkerSuffix(markerFilePath);
+      Path fullFilePath = new Path(basePath, partitionFilePath);
+      String partitionPath = FSUtils.getRelativePartitionPath(new Path(basePath), fullFilePath.getParent());
+      switch (type) {
+        case MERGE:
+        case CREATE:
+          HoodieBaseFile baseFileToDelete = new HoodieBaseFile(fullFilePath.toString());
+          String fileId = baseFileToDelete.getFileId();
+          String baseInstantTime = baseFileToDelete.getCommitTime();
+          return Pair.of(Pair.of(partitionPath, fileId),
+              new HoodieRollbackRequest(partitionPath, fileId, baseInstantTime,
+                  Collections.singletonList(fullFilePath.toString()),
+                  Collections.emptyMap()));
+        case APPEND:
+          HoodieRollbackRequest rollbackRequestForAppend = getRollbackRequestForAppend(partitionFilePath);
+          return Pair.of(Pair.of(partitionPath, rollbackRequestForAppend.getFileId()),
+              rollbackRequestForAppend);
+        default:
+          throw new HoodieRollbackException("Unknown marker type, during rollback of " + instantToRollback);
+      }
+    };
+  }
+
+  private SerializableBiFunction<HoodieRollbackRequest, HoodieRollbackRequest, HoodieRollbackRequest> getRollbackReqCombineFunction() {
+    return (rollbackReq1, rollbackReq2) -> {
+      List<String> filesToBeDeleted = new LinkedList<>();
+      filesToBeDeleted.addAll(rollbackReq1.getFilesToBeDeleted());
+      filesToBeDeleted.addAll(rollbackReq2.getFilesToBeDeleted());
+      final Comparator<HoodieLogFile> logFileComparator = HoodieLogFile.getLogFileComparator();
+      HoodieLogFile latestLogFile = null;
+      long latestLogFileLen = -1;
+
+      for (Map.Entry<String, Long> pathLengthPair : rollbackReq1.getLogBlocksToBeDeleted().entrySet()) {
+        HoodieLogFile candidateLogFile = new HoodieLogFile(pathLengthPair.getKey());
+        if (latestLogFile == null || logFileComparator.compare(latestLogFile, candidateLogFile) >= 0) {
+          latestLogFile = candidateLogFile;
+          latestLogFileLen = pathLengthPair.getValue();
+        }
+      }
 
+      for (Map.Entry<String, Long> pathLengthPair : rollbackReq2.getLogBlocksToBeDeleted().entrySet()) {
+        HoodieLogFile candidateLogFile = new HoodieLogFile(pathLengthPair.getKey());
+        if (latestLogFile == null || logFileComparator.compare(latestLogFile, candidateLogFile) >= 0) {
+          latestLogFile = candidateLogFile;
+          latestLogFileLen = pathLengthPair.getValue();
+        }
+      }
+      return new HoodieRollbackRequest(rollbackReq1.getPartitionPath(), rollbackReq1.getFileId(),
+          rollbackReq1.getLatestBaseInstant(), filesToBeDeleted,
+          latestLogFile == null ? Collections.emptyMap() :
+              Collections.singletonMap(latestLogFile.getPath().toString(), latestLogFileLen));
+    };
+  }
+
+  protected HoodieRollbackRequest getRollbackRequestForAppend(String markerFilePath) {
+    Path filePath = new Path(basePath, markerFilePath);

Review comment:
       Hi nsivabalan, I think the logic is actually simpler now, because we no longer need to list log files. The code does look more complex, though, for two reasons.
   1. Because the way marker files are generated changes in 0.11, I chose to stay backward compatible rather than regenerate markers in UpgradeAndDownGrade.
   2. Consider this case: a single commit produces two log files because of rollover, so we have two log files and two markers. Under the current rollback mechanism, log files are treated as a stack of log blocks rather than an array of log blocks. I need to append the command block to the end of the latest log file instead of inserting several command blocks into the stack, although such inserts would probably also work. That is why there is so much code for comparing markers.
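
The second point can be sketched in isolation. This is a minimal illustration, not the code in this PR: `LogFileRef`, `latest`, and `latestPerFileGroup` are hypothetical names rather than Hudi APIs, and the sketch assumes that a higher log version always means a later log file, which is a simplification of whatever ordering `HoodieLogFile.getLogFileComparator()` applies.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LatestLogFileSketch {

  /** Hypothetical, simplified stand-in for a log file referenced by one APPEND marker. */
  public static final class LogFileRef {
    public final String partitionPath;
    public final String fileId;
    public final int version; // assumption: higher version == later log file
    public final String path;

    public LogFileRef(String partitionPath, String fileId, int version, String path) {
      this.partitionPath = partitionPath;
      this.fileId = fileId;
      this.version = version;
      this.path = path;
    }
  }

  /** Combine step: of two log files in the same file group, keep the later one. */
  public static LogFileRef latest(LogFileRef a, LogFileRef b) {
    return a.version >= b.version ? a : b;
  }

  /** Group marker-derived log files by (partition, fileId) and reduce each group to one file. */
  public static Map<String, LogFileRef> latestPerFileGroup(List<LogFileRef> refs) {
    Map<String, LogFileRef> result = new LinkedHashMap<>();
    for (LogFileRef ref : refs) {
      String fileGroupKey = ref.partitionPath + "/" + ref.fileId;
      // merge() stores the first value for a key and combines later ones with latest().
      result.merge(fileGroupKey, ref, LatestLogFileSketch::latest);
    }
    return result;
  }

  public static void main(String[] args) {
    // Two markers for the same file group, as produced by a rollover within one commit.
    List<LogFileRef> refs = Arrays.asList(
        new LogFileRef("2022/02/25", "fg-1", 1, "2022/02/25/.fg-1_001.log.1_1-0-1"),
        new LogFileRef("2022/02/25", "fg-1", 2, "2022/02/25/.fg-1_001.log.2_1-0-1"));
    for (LogFileRef ref : latestPerFileGroup(refs).values()) {
      // Only this file would receive the rollback command block.
      System.out.println(ref.path);
    }
  }
}
```

Reducing per file group means the rollback appends a single command block to one log file, which matches the stack view of log blocks described above.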







[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051565083


   ## CI report:
   
   * ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6328) 
   * 875ec8b00cd379e669498fe7575503b192f0de5e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6341) 
   





[GitHub] [hudi] hudi-bot removed a comment on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051907113


   ## CI report:
   
   * 6816a4b47b88108172b46fece160e4e078345687 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6345) 
   





[GitHub] [hudi] hudi-bot commented on pull request #4913: [WIP][HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1051084435


   ## CI report:
   
   * 5add7ebb8081671f5816d82d4472d5ea1f7d8338 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6326) 
   * ea1621d1d17e2c85fe9f69f6b39aaa08f61871d7 UNKNOWN
   





[GitHub] [hudi] hudi-bot commented on pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1053784297


   ## CI report:
   
   * 0c81e53f83757b49d9e5f5df3bbeebdd076b99b5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6373) 
   





[GitHub] [hudi] hudi-bot commented on pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1066163017


   ## CI report:
   
   * f0c65b8a748be13de662c8d438da8d30c04f9055 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6895) 
   * 6560aaa7f15eed0a585b03d92d37127a34791b74 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6896) 
   





[GitHub] [hudi] hudi-bot commented on pull request #4913: [HUDI-1517] create marker file for every log file

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4913:
URL: https://github.com/apache/hudi/pull/4913#issuecomment-1066153819


   ## CI report:
   
   * f0c65b8a748be13de662c8d438da8d30c04f9055 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6895) 
   * 6560aaa7f15eed0a585b03d92d37127a34791b74 UNKNOWN
   

