Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/02/23 03:40:55 UTC

[GitHub] [hudi] alexeykudinkin opened a new pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

alexeykudinkin opened a new pull request #4877:
URL: https://github.com/apache/hudi/pull/4877


   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.*
   
   ## What is the purpose of the pull request
   
   Refactoring Spark DataSource Relations to avoid code duplication. The following Relations were in scope:
   
    - `BaseFileOnlyViewRelation`
    - `MergeOnReadSnapshotRelation`
    - `MergeOnReadIncrementalRelation`
   
   ## Brief change log
   
   See above
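   
   For illustration, a minimal, hypothetical sketch of the deduplication pattern being applied: shared scan
   orchestration is lifted into a common base class, and only the relation-specific steps stay abstract. The
   class and method names below are simplified placeholders, not the actual Hudi classes touched by this PR.
   
   ```scala
   // Hypothetical, simplified sketch; not the actual Hudi class hierarchy.
   abstract class BaseRelationSketch {
   
     // Shared scan orchestration lives once in the base class.
     final def buildScan(requiredColumns: Seq[String]): Seq[String] = {
       val columns = appendMandatoryColumns(requiredColumns)
       val splits  = collectFileSplits()
       composeRows(splits, columns)
     }
   
     // Only the relation-specific steps remain abstract.
     protected def collectFileSplits(): Seq[String]
     protected def composeRows(splits: Seq[String], columns: Seq[String]): Seq[String]
   
     // Example of helper logic that previously had to be duplicated in each relation.
     protected def appendMandatoryColumns(requested: Seq[String]): Seq[String] =
       (requested :+ "_hoodie_record_key").distinct
   }
   
   // A concrete relation then only describes how it lists file splits and composes rows.
   class BaseFileOnlySketch extends BaseRelationSketch {
     override protected def collectFileSplits(): Seq[String] =
       Seq("partition=2022-02-23/base-file-1.parquet")
   
     override protected def composeRows(splits: Seq[String], columns: Seq[String]): Seq[String] =
       splits.map(split => s"$split -> ${columns.mkString(", ")}")
   }
   ```
   
   Each of the three relations listed above then only overrides the relation-specific pieces instead of
   re-implementing the whole scan path.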
   
   ## Verify this pull request
   
   This pull request is already covered by existing tests, such as *(please describe tests)*.
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1065784582


   ## CI report:
   
   * 2940f46a133ca3142f7ebb26b8c6f20583d7f395 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814) 
   * c71cfab947fd81c1aa63a0b8d52f70fa194ade5b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1050326745


   ## CI report:
   
   * 2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281) 
   * d875e412abc29bf6a0e8a6fa7bef747ded15d60b UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1050367407


   ## CI report:
   
   * d875e412abc29bf6a0e8a6fa7bef747ded15d60b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1068567116


   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 920d8e63ac2343edc09a55f35658343eaac613df Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955) 
   * 4c0d61041d5e7fef677af1829f0892f3e46f228e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1068661514


   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 4c0d61041d5e7fef677af1829f0892f3e46f228e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980) 
   * be2605813c0f58e6226ca75c25c2bf40574c7a5d UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] xushiyan commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
xushiyan commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1072945902


   Manually tested this patch on Spark 3.2.1 using the quickstart examples, and it passed. Landing this.
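   
   For reference, a quickstart-style round trip of the kind referred to above looks roughly like the sketch
   below when run in `spark-shell`; the bundle coordinates, table name, and local path are assumptions used
   for illustration, not part of this PR.
   
   ```scala
   // Launch (coordinates are illustrative):
   //   spark-shell --packages org.apache.hudi:hudi-spark3.2-bundle_2.12:<version>
   import org.apache.spark.sql.SaveMode
   import spark.implicits._
   
   val basePath = "file:///tmp/hudi_smoke_test"   // assumed local path
   
   // Write a tiny COW table through the Hudi datasource.
   Seq((1, "trip-1", "2022-03-19", 10.0), (2, "trip-2", "2022-03-19", 25.0))
     .toDF("id", "name", "dt", "fare")
     .write.format("hudi")
     .option("hoodie.table.name", "hudi_smoke_test")
     .option("hoodie.datasource.write.recordkey.field", "id")
     .option("hoodie.datasource.write.precombine.field", "dt")
     .option("hoodie.datasource.write.partitionpath.field", "dt")
     .mode(SaveMode.Overwrite)
     .save(basePath)
   
   // Reading the table back exercises the refactored DataSource relations.
   spark.read.format("hudi").load(basePath).show(false)
   ```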





[GitHub] [hudi] XuQianJin-Stars commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
XuQianJin-Stars commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1072911431


   +1 LGTM





[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1064716174


   ## CI report:
   
   * d875e412abc29bf6a0e8a6fa7bef747ded15d60b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284) 
   * 2940f46a133ca3142f7ebb26b8c6f20583d7f395 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1050367407


   ## CI report:
   
   * d875e412abc29bf6a0e8a6fa7bef747ded15d60b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1048425436


   ## CI report:
   
   * af2c97711d7ff871445d0a78ca4d20b4f05dbd5d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1048455568


   ## CI report:
   
   * af2c97711d7ff871445d0a78ca4d20b4f05dbd5d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1068703912


   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 4c0d61041d5e7fef677af1829f0892f3e46f228e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6988) 
   * be2605813c0f58e6226ca75c25c2bf40574c7a5d UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] alexeykudinkin commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067572461


   @hudi-bot run azure





[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on a change in pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#discussion_r825222033



##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/BaseFileOnlyRelation.scala
##########
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi
+
+import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.fs.Path
+import org.apache.hudi.HoodieBaseRelation.createBaseFileReader
+import org.apache.hudi.common.table.HoodieTableMetaClient
+import org.apache.spark.sql.SQLContext
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.expressions.Expression
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.sources.{BaseRelation, Filter}
+import org.apache.spark.sql.types.StructType
+
+/**
+ * [[BaseRelation]] implementation reading only the base files of Hudi tables, essentially supporting the following
+ * query modes:
+ * <ul>
+ * <li>For COW tables: Snapshot</li>
+ * <li>For MOR tables: Read-optimized</li>
+ * </ul>
+ *
+ * NOTE: The reason this Relation is used in lieu of Spark's default [[HadoopFsRelation]] is primarily due to the
+ * fact that the latter injects the real partition path as the value of the partition field, which Hudi ultimately
+ * persists as part of the record payload. In some cases, however, the partition path might not be equal to the
+ * verbatim value of the partition-path field (when a custom [[KeyGenerator]] is used), therefore leading to
+ * incorrect partition field values being written
+ */
+class BaseFileOnlyRelation(sqlContext: SQLContext,

Review comment:
       Interesting, that's a good point. Let me try to do it in exactly that order next time.
   
   Weirdly enough, I'm not sure why the order should matter in that case.
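   
   As a side note, a hypothetical illustration of the partition-path mismatch called out in the Scaladoc above
   (the values and the timestamp-based key-generator behavior are assumptions, not taken from this PR):
   
   ```scala
   object PartitionPathMismatchExample extends App {
     // Value of the partition field as stored inside the record payload
     val partitionFieldValue = "2022-02-23 03:40:55"
     // Physical partition path produced by a (hypothetical) timestamp-based KeyGenerator
     val partitionPath = "2022/02/23"
   
     // HadoopFsRelation would inject the directory name back into the partition column,
     // silently replacing the original field value, which is why a dedicated relation is needed.
     assert(partitionFieldValue != partitionPath)
     println(s"field value: $partitionFieldValue, partition path: $partitionPath")
   }
   ```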

##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieDataSourceHelper.scala
##########
@@ -65,20 +65,6 @@ object HoodieDataSourceHelper extends PredicateHelper {
     }
   }
 
-  /**
-   * Extract the required schema from [[InternalRow]]
-   */
-  def extractRequiredSchema(

Review comment:
       Correct

##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/HoodieHadoopFSUtils.scala
##########
@@ -0,0 +1,370 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark
+
+import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.fs.viewfs.ViewFileSystem
+import org.apache.hadoop.fs._
+import org.apache.hadoop.hdfs.DistributedFileSystem
+import org.apache.spark.internal.Logging
+import org.apache.spark.metrics.source.HiveCatalogMetrics
+import org.apache.spark.util.SerializableConfiguration
+
+import java.io.FileNotFoundException
+import scala.collection.mutable
+
+/**
+ * NOTE: This class is a replica of HadoopFSUtils from Spark 3.2.1, with the following adjustments:
+ *
+ *    - Filtering of the listed files is adjusted to keep files starting w/ "." (so that Hoodie Delta Log
+ *    files are included)
+ */
+object HoodieHadoopFSUtils extends Logging {
+  /**
+   * Lists a collection of paths recursively. Picks the listing strategy adaptively depending
+   * on the number of paths to list.
+   *
+   * This may only be called on the driver.
+   *
+   * @param sc                   Spark context used to run parallel listing.
+   * @param paths                Input paths to list
+   * @param hadoopConf           Hadoop configuration
+   * @param filter               Path filter used to exclude leaf files from result
+   * @param ignoreMissingFiles   Ignore missing files that happen during recursive listing
+   *                             (e.g., due to race conditions)
+   * @param ignoreLocality       Whether to fetch data locality info when listing leaf files. If false,
+   *                             this will return `FileStatus` without `BlockLocation` info.
+   * @param parallelismThreshold The threshold to enable parallelism. If the number of input paths
+   *                             is smaller than this value, this will fallback to use
+   *                             sequential listing.
+   * @param parallelismMax       The maximum parallelism for listing. If the number of input paths is
+   *                             larger than this value, parallelism will be throttled to this value
+   *                             to avoid generating too many tasks.
+   * @return for each input path, the set of discovered files for the path
+   */
+  def parallelListLeafFiles(sc: SparkContext,
+                            paths: Seq[Path],
+                            hadoopConf: Configuration,
+                            filter: PathFilter,
+                            ignoreMissingFiles: Boolean,
+                            ignoreLocality: Boolean,
+                            parallelismThreshold: Int,
+                            parallelismMax: Int): Seq[(Path, Seq[FileStatus])] = {
+    parallelListLeafFilesInternal(sc, paths, hadoopConf, filter, isRootLevel = true,
+      ignoreMissingFiles, ignoreLocality, parallelismThreshold, parallelismMax)
+  }
+
+  // scalastyle:off parameter.number
+  private def parallelListLeafFilesInternal(sc: SparkContext,
+                                            paths: Seq[Path],
+                                            hadoopConf: Configuration,
+                                            filter: PathFilter,
+                                            isRootLevel: Boolean,
+                                            ignoreMissingFiles: Boolean,
+                                            ignoreLocality: Boolean,
+                                            parallelismThreshold: Int,
+                                            parallelismMax: Int): Seq[(Path, Seq[FileStatus])] = {
+
+    // Short-circuits parallel listing when serial listing is likely to be faster.
+    if (paths.size <= parallelismThreshold) {
+      // scalastyle:off return
+      return paths.map { path =>
+        val leafFiles = listLeafFiles(
+          path,
+          hadoopConf,
+          filter,
+          Some(sc),
+          ignoreMissingFiles = ignoreMissingFiles,
+          ignoreLocality = ignoreLocality,
+          isRootPath = isRootLevel,
+          parallelismThreshold = parallelismThreshold,
+          parallelismMax = parallelismMax)
+        (path, leafFiles)
+      }
+      // scalastyle:on return
+    }
+
+    logInfo(s"Listing leaf files and directories in parallel under ${paths.length} paths." +
+      s" The first several paths are: ${paths.take(10).mkString(", ")}.")
+    HiveCatalogMetrics.incrementParallelListingJobCount(1)
+
+    val serializableConfiguration = new SerializableConfiguration(hadoopConf)
+    val serializedPaths = paths.map(_.toString)
+
+    // Set the number of parallelism to prevent following file listing from generating many tasks
+    // in case of large #defaultParallelism.
+    val numParallelism = Math.min(paths.size, parallelismMax)
+
+    val previousJobDescription = sc.getLocalProperty(SparkContext.SPARK_JOB_DESCRIPTION)
+    val statusMap = try {
+      val description = paths.size match {
+        case 0 =>
+          "Listing leaf files and directories 0 paths"
+        case 1 =>
+          s"Listing leaf files and directories for 1 path:<br/>${paths(0)}"
+        case s =>
+          s"Listing leaf files and directories for $s paths:<br/>${paths(0)}, ..."
+      }
+      sc.setJobDescription(description)
+      sc
+        .parallelize(serializedPaths, numParallelism)
+        .mapPartitions { pathStrings =>
+          val hadoopConf = serializableConfiguration.value
+          pathStrings.map(new Path(_)).toSeq.map { path =>
+            val leafFiles = listLeafFiles(
+              path = path,
+              hadoopConf = hadoopConf,
+              filter = filter,
+              contextOpt = None, // Can't execute parallel scans on workers
+              ignoreMissingFiles = ignoreMissingFiles,
+              ignoreLocality = ignoreLocality,
+              isRootPath = isRootLevel,
+              parallelismThreshold = Int.MaxValue,
+              parallelismMax = 0)
+            (path, leafFiles)
+          }.iterator
+        }.map { case (path, statuses) =>
+        val serializableStatuses = statuses.map { status =>
+          // Turn FileStatus into SerializableFileStatus so we can send it back to the driver
+          val blockLocations = status match {
+            case f: LocatedFileStatus =>
+              f.getBlockLocations.map { loc =>
+                SerializableBlockLocation(
+                  loc.getNames,
+                  loc.getHosts,
+                  loc.getOffset,
+                  loc.getLength)
+              }
+
+            case _ =>
+              Array.empty[SerializableBlockLocation]
+          }
+
+          SerializableFileStatus(
+            status.getPath.toString,
+            status.getLen,
+            status.isDirectory,
+            status.getReplication,
+            status.getBlockSize,
+            status.getModificationTime,
+            status.getAccessTime,
+            blockLocations)
+        }
+        (path.toString, serializableStatuses)
+      }.collect()
+    } finally {
+      sc.setJobDescription(previousJobDescription)
+    }
+
+    // turn SerializableFileStatus back to Status
+    statusMap.map { case (path, serializableStatuses) =>
+      val statuses = serializableStatuses.map { f =>
+        val blockLocations = f.blockLocations.map { loc =>
+          new BlockLocation(loc.names, loc.hosts, loc.offset, loc.length)
+        }
+        new LocatedFileStatus(
+          new FileStatus(
+            f.length, f.isDir, f.blockReplication, f.blockSize, f.modificationTime,
+            new Path(f.path)),
+          blockLocations)
+      }
+      (new Path(path), statuses)
+    }
+  }
+  // scalastyle:on parameter.number
+
+  // scalastyle:off parameter.number
+  /**
+   * Lists a single filesystem path recursively. If a `SparkContext` object is specified, this
+   * function may launch Spark jobs to parallelize listing based on `parallelismThreshold`.
+   *
+   * If sessionOpt is None, this may be called on executors.
+   *
+   * @return all children of path that match the specified filter.
+   */
+  private def listLeafFiles(path: Path,
+                            hadoopConf: Configuration,
+                            filter: PathFilter,
+                            contextOpt: Option[SparkContext],
+                            ignoreMissingFiles: Boolean,
+                            ignoreLocality: Boolean,
+                            isRootPath: Boolean,
+                            parallelismThreshold: Int,
+                            parallelismMax: Int): Seq[FileStatus] = {
+
+    logTrace(s"Listing $path")
+    val fs = path.getFileSystem(hadoopConf)
+
+    // Note that statuses only include FileStatus for the files and dirs directly under path,
+    // and does not include anything else recursively.
+    val statuses: Array[FileStatus] = try {
+      fs match {
+        // DistributedFileSystem overrides listLocatedStatus to make 1 single call to namenode
+        // to retrieve the file status with the file block location. The reason to still fallback
+        // to listStatus is because the default implementation would potentially throw a
+        // FileNotFoundException which is better handled by doing the lookups manually below.
+        case (_: DistributedFileSystem | _: ViewFileSystem) if !ignoreLocality =>
+          val remoteIter = fs.listLocatedStatus(path)
+          new Iterator[LocatedFileStatus]() {
+            def next(): LocatedFileStatus = remoteIter.next
+
+            def hasNext(): Boolean = remoteIter.hasNext
+          }.toArray
+        case _ => fs.listStatus(path)
+      }
+    } catch {
+      // If we are listing a root path for SQL (e.g. a top level directory of a table), we need to
+      // ignore FileNotFoundExceptions during this root level of the listing because
+      //
+      //  (a) certain code paths might construct an InMemoryFileIndex with root paths that
+      //      might not exist (i.e. not all callers are guaranteed to have checked
+      //      path existence prior to constructing InMemoryFileIndex) and,
+      //  (b) we need to ignore deleted root paths during REFRESH TABLE, otherwise we break
+      //      existing behavior and break the ability to drop SessionCatalog tables when tables'
+      //      root directories have been deleted (which breaks a number of Spark's own tests).
+      //
+      // If we are NOT listing a root path then a FileNotFoundException here means that the
+      // directory was present in a previous level of file listing but is absent in this
+      // listing, likely indicating a race condition (e.g. concurrent table overwrite or S3
+      // list inconsistency).
+      //
+      // The trade-off in supporting existing behaviors / use-cases is that we won't be
+      // able to detect race conditions involving root paths being deleted during
+      // InMemoryFileIndex construction. However, it's still a net improvement to detect and
+      // fail-fast on the non-root cases. For more info see the SPARK-27676 review discussion.
+      case _: FileNotFoundException if isRootPath || ignoreMissingFiles =>
+        logWarning(s"The directory $path was not found. Was it deleted very recently?")
+        Array.empty[FileStatus]
+    }
+
+    val filteredStatuses =
+      statuses.filterNot(status => shouldFilterOutPathName(status.getPath.getName))
+
+    val allLeafStatuses = {
+      val (dirs, topLevelFiles) = filteredStatuses.partition(_.isDirectory)
+      val nestedFiles: Seq[FileStatus] = contextOpt match {
+        case Some(context) if dirs.size > parallelismThreshold =>
+          parallelListLeafFilesInternal(
+            context,
+            dirs.map(_.getPath),
+            hadoopConf = hadoopConf,
+            filter = filter,
+            isRootLevel = false,
+            ignoreMissingFiles = ignoreMissingFiles,
+            ignoreLocality = ignoreLocality,
+            parallelismThreshold = parallelismThreshold,
+            parallelismMax = parallelismMax
+          ).flatMap(_._2)
+        case _ =>
+          dirs.flatMap { dir =>
+            listLeafFiles(
+              path = dir.getPath,
+              hadoopConf = hadoopConf,
+              filter = filter,
+              contextOpt = contextOpt,
+              ignoreMissingFiles = ignoreMissingFiles,
+              ignoreLocality = ignoreLocality,
+              isRootPath = false,
+              parallelismThreshold = parallelismThreshold,
+              parallelismMax = parallelismMax)
+          }
+      }
+      val allFiles = topLevelFiles ++ nestedFiles
+      if (filter != null) allFiles.filter(f => filter.accept(f.getPath)) else allFiles
+    }
+
+    val missingFiles = mutable.ArrayBuffer.empty[String]
+    val resolvedLeafStatuses = allLeafStatuses.flatMap {
+      case f: LocatedFileStatus =>
+        Some(f)
+
+      // NOTE:
+      //
+      // - Although S3/S3A/S3N file system can be quite slow for remote file metadata
+      //   operations, calling `getFileBlockLocations` does no harm here since these file system
+      //   implementations don't actually issue RPC for this method.
+      //
+      // - Here we are calling `getFileBlockLocations` in a sequential manner, but it should not
+      //   be a big deal since we always use `parallelListLeafFiles` when the number of
+      //   paths exceeds the threshold.
+      case f if !ignoreLocality =>
+        // The other constructor of LocatedFileStatus will call FileStatus.getPermission(),
+        // which is very slow on some file systems (e.g. RawLocalFileSystem, which launches a
+        // subprocess and parses the stdout).
+        try {
+          val locations = fs.getFileBlockLocations(f, 0, f.getLen).map { loc =>
+            // Store BlockLocation objects to consume less memory
+            if (loc.getClass == classOf[BlockLocation]) {
+              loc
+            } else {
+              new BlockLocation(loc.getNames, loc.getHosts, loc.getOffset, loc.getLength)
+            }
+          }
+          val lfs = new LocatedFileStatus(f.getLen, f.isDirectory, f.getReplication, f.getBlockSize,
+            f.getModificationTime, 0, null, null, null, null, f.getPath, locations)
+          if (f.isSymlink) {
+            lfs.setSymlink(f.getSymlink)
+          }
+          Some(lfs)
+        } catch {
+          case _: FileNotFoundException if ignoreMissingFiles =>
+            missingFiles += f.getPath.toString
+            None
+        }
+
+      case f => Some(f)
+    }
+
+    if (missingFiles.nonEmpty) {
+      logWarning(
+        s"the following files were missing during file scan:\n  ${missingFiles.mkString("\n  ")}")
+    }
+
+    resolvedLeafStatuses
+  }
+  // scalastyle:on parameter.number
+
+  /** A serializable variant of HDFS's BlockLocation. This is required by Hadoop 2.7. */
+  private case class SerializableBlockLocation(names: Array[String],
+                                               hosts: Array[String],
+                                               offset: Long,
+                                               length: Long)
+
+  /** A serializable variant of HDFS's FileStatus. This is required by Hadoop 2.7. */
+  private case class SerializableFileStatus(path: String,
+                                            length: Long,
+                                            isDir: Boolean,
+                                            blockReplication: Short,
+                                            blockSize: Long,
+                                            modificationTime: Long,
+                                            accessTime: Long,
+                                            blockLocations: Array[SerializableBlockLocation])
+
+  /** Checks if we should filter out this path name. */
+  def shouldFilterOutPathName(pathName: String): Boolean = {

Review comment:
       This is the only thing that changed compared to Spark's `HadoopFSUtils`
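   
   A plausible sketch of that adjustment, assuming it simply drops Spark's "."-prefix exclusion while keeping
   the remaining rules (illustrative only; the exact body is not shown in this hunk):
   
   ```scala
   object PathNameFilterSketch {
     /** Sketch: keep "."-prefixed names (Hudi delta log files), otherwise mirror Spark's exclusions. */
     def shouldFilterOutPathName(pathName: String): Boolean = {
       // Exclude Spark/Hadoop internal markers, but deliberately NOT dot-prefixed files,
       // since Hudi delta log files start with "."
       val exclude = (pathName.startsWith("_") && !pathName.contains("=")) ||
         pathName.endsWith("._COPYING_")
       // Parquet summary metadata files must remain visible to readers
       val include = pathName.startsWith("_common_metadata") || pathName.startsWith("_metadata")
       exclude && !include
     }
   }
   ```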

##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieBaseRelation.scala
##########
@@ -130,22 +158,110 @@ abstract class HoodieBaseRelation(val sqlContext: SQLContext,
    * NOTE: DO NOT OVERRIDE THIS METHOD
    */
   override final def buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row] = {
+    // NOTE: In case the list of requested columns doesn't contain the Primary Key one, we
+    //       have to add it explicitly so that
+    //          - Merging could be performed correctly
+    //          - In case 0 columns are to be fetched (for example, when doing {@code count()} on Spark's
+    //          [[Dataset]]), Spark still fetches the rows needed to execute the query correctly
+    //
+    //       It's okay to return columns that have not been requested by the caller, as those will nevertheless be
+    //       filtered out upstream
+    val fetchedColumns: Array[String] = appendMandatoryColumns(requiredColumns)
+
+    val (requiredAvroSchema, requiredStructSchema) =
+      HoodieSparkUtils.getRequiredSchema(tableAvroSchema, fetchedColumns)
+
+    val filterExpressions = convertToExpressions(filters)
+    val (partitionFilters, dataFilters) = filterExpressions.partition(isPartitionPredicate)
+
+    val fileSplits = collectFileSplits(partitionFilters, dataFilters)
+
+    val partitionSchema = StructType(Nil)
+    val tableSchema = HoodieTableSchema(tableStructSchema, tableAvroSchema.toString)
+    val requiredSchema = HoodieTableSchema(requiredStructSchema, requiredAvroSchema.toString)
+
     // Here we rely on a type erasure, to workaround inherited API restriction and pass [[RDD[InternalRow]]] back as [[RDD[Row]]]
     // Please check [[needConversion]] scala-doc for more details
-    doBuildScan(requiredColumns, filters).asInstanceOf[RDD[Row]]
+    composeRDD(fileSplits, partitionSchema, tableSchema, requiredSchema, filters).asInstanceOf[RDD[Row]]
   }
 
-  protected def doBuildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[InternalRow]
+  // TODO scala-doc
+  protected def composeRDD(fileSplits: Seq[FileSplit],
+                           partitionSchema: StructType,
+                           tableSchema: HoodieTableSchema,
+                           requiredSchema: HoodieTableSchema,
+                           filters: Array[Filter]): HoodieUnsafeRDD
+
+  // TODO scala-doc
+  protected def collectFileSplits(partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): Seq[FileSplit]
+
+  protected def listLatestBaseFiles(globPaths: Seq[Path], partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): Map[Path, Seq[FileStatus]] = {
+    if (globPaths.isEmpty) {
+      val partitionDirs = fileIndex.listFiles(partitionFilters, dataFilters)
+      partitionDirs.map(pd => (getPartitionPath(pd.files.head), pd.files)).toMap
+    } else {
+      val inMemoryFileIndex = HoodieSparkUtils.createInMemoryFileIndex(sparkSession, globPaths)
+      val partitionDirs = inMemoryFileIndex.listFiles(partitionFilters, dataFilters)
+
+      val fsView = new HoodieTableFileSystemView(metaClient, timeline, partitionDirs.flatMap(_.files).toArray)
+      val latestBaseFiles = fsView.getLatestBaseFiles.iterator().asScala.toList.map(_.getFileStatus)
+
+      latestBaseFiles.groupBy(getPartitionPath)
+    }
+  }
+
+  protected def convertToExpressions(filters: Array[Filter]): Array[Expression] = {
+    val catalystExpressions = HoodieSparkUtils.convertToCatalystExpressions(filters, tableStructSchema)
+
+    val failedExprs = catalystExpressions.zipWithIndex.filter { case (opt, _) => opt.isEmpty }
+    if (failedExprs.nonEmpty) {
+      val failedFilters = failedExprs.map(p => filters(p._2))
+      logWarning(s"Failed to convert Filters into Catalyst expressions (${failedFilters.map(_.toString)})")
+    }
+
+    catalystExpressions.filter(_.isDefined).map(_.get).toArray
+  }
+
+  /**
+   * Checks whether given expression only references partition columns
+   * (and involves no sub-query)
+   */
+  protected def isPartitionPredicate(condition: Expression): Boolean = {
+    // Validates that the provided names both resolve to the same entity
+    val resolvedNameEquals = sparkSession.sessionState.analyzer.resolver
+
+    condition.references.forall { r => partitionColumns.exists(resolvedNameEquals(r.name, _)) } &&
+      !SubqueryExpression.hasSubquery(condition)
+  }
 
   protected final def appendMandatoryColumns(requestedColumns: Array[String]): Array[String] = {
     val missing = mandatoryColumns.filter(col => !requestedColumns.contains(col))
     requestedColumns ++ missing
   }
+
+  private def getPrecombineFieldProperty: Option[String] =
+    Option(tableConfig.getPreCombineField)
+      .orElse(optParams.get(DataSourceWriteOptions.PRECOMBINE_FIELD.key)) match {
+      // NOTE: This is required to compensate for cases when empty string is used to stub
+      //       property value to avoid it being set with the default value
+      // TODO(HUDI-3456) cleanup
+      case Some(f) if !StringUtils.isNullOrEmpty(f) => Some(f)
+      case _ => None
+    }
+
+  private def imbueConfigs(sqlContext: SQLContext): Unit = {
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.filterPushdown", "true")
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.recordLevelFilter.enabled", "true")
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.enableVectorizedReader", "true")
+  }

Review comment:
       Correct. There's no reason to disable vectorization. 
   
   Confirmed this with @YannByron 
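   
   For context, a minimal standalone sketch (app name and master are illustrative) of the same session-level Parquet settings that `imbueConfigs` applies; with the vectorized reader left enabled, Parquet scans keep going through the columnar batch path:
   
   ```scala
   import org.apache.spark.sql.SparkSession
   
   object ParquetConfSketch {
     def main(args: Array[String]): Unit = {
       val spark = SparkSession.builder()
         .master("local[*]")              // illustrative; any master works
         .appName("parquet-conf-sketch")
         .getOrCreate()
   
       // Same settings imbueConfigs() applies on the session state above.
       spark.conf.set("spark.sql.parquet.filterPushdown", "true")
       spark.conf.set("spark.sql.parquet.recordLevelFilter.enabled", "true")
       spark.conf.set("spark.sql.parquet.enableVectorizedReader", "true")
   
       println(spark.conf.get("spark.sql.parquet.enableVectorizedReader")) // expected: true
       spark.stop()
     }
   }
   ```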

##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieBaseRelation.scala
##########
@@ -130,22 +158,110 @@ abstract class HoodieBaseRelation(val sqlContext: SQLContext,
    * NOTE: DO NOT OVERRIDE THIS METHOD
    */
   override final def buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row] = {
+    // NOTE: In case the list of requested columns doesn't contain the Primary Key one, we
+    //       have to add it explicitly so that
+    //          - Merging could be performed correctly
+    //          - In case 0 columns are to be fetched (for ex, when doing {@code count()} on Spark's [[Dataset]]),
+    //          Spark still fetches all the rows to execute the query correctly
+    //
+    //       It's okay to return columns that have not been requested by the caller, as those nevertheless will be
+    //       filtered out upstream
+    val fetchedColumns: Array[String] = appendMandatoryColumns(requiredColumns)
+
+    val (requiredAvroSchema, requiredStructSchema) =
+      HoodieSparkUtils.getRequiredSchema(tableAvroSchema, fetchedColumns)
+
+    val filterExpressions = convertToExpressions(filters)
+    val (partitionFilters, dataFilters) = filterExpressions.partition(isPartitionPredicate)
+
+    val fileSplits = collectFileSplits(partitionFilters, dataFilters)
+
+    val partitionSchema = StructType(Nil)
+    val tableSchema = HoodieTableSchema(tableStructSchema, tableAvroSchema.toString)
+    val requiredSchema = HoodieTableSchema(requiredStructSchema, requiredAvroSchema.toString)
+
     // Here we rely on a type erasure, to workaround inherited API restriction and pass [[RDD[InternalRow]]] back as [[RDD[Row]]]
     // Please check [[needConversion]] scala-doc for more details
-    doBuildScan(requiredColumns, filters).asInstanceOf[RDD[Row]]
+    composeRDD(fileSplits, partitionSchema, tableSchema, requiredSchema, filters).asInstanceOf[RDD[Row]]
   }
 
-  protected def doBuildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[InternalRow]
+  // TODO scala-doc
+  protected def composeRDD(fileSplits: Seq[FileSplit],
+                           partitionSchema: StructType,
+                           tableSchema: HoodieTableSchema,
+                           requiredSchema: HoodieTableSchema,
+                           filters: Array[Filter]): HoodieUnsafeRDD
+
+  // TODO scala-doc
+  protected def collectFileSplits(partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): Seq[FileSplit]
+
+  protected def listLatestBaseFiles(globPaths: Seq[Path], partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): Map[Path, Seq[FileStatus]] = {
+    if (globPaths.isEmpty) {
+      val partitionDirs = fileIndex.listFiles(partitionFilters, dataFilters)
+      partitionDirs.map(pd => (getPartitionPath(pd.files.head), pd.files)).toMap
+    } else {
+      val inMemoryFileIndex = HoodieSparkUtils.createInMemoryFileIndex(sparkSession, globPaths)
+      val partitionDirs = inMemoryFileIndex.listFiles(partitionFilters, dataFilters)
+
+      val fsView = new HoodieTableFileSystemView(metaClient, timeline, partitionDirs.flatMap(_.files).toArray)
+      val latestBaseFiles = fsView.getLatestBaseFiles.iterator().asScala.toList.map(_.getFileStatus)
+
+      latestBaseFiles.groupBy(getPartitionPath)
+    }
+  }
+
+  protected def convertToExpressions(filters: Array[Filter]): Array[Expression] = {
+    val catalystExpressions = HoodieSparkUtils.convertToCatalystExpressions(filters, tableStructSchema)
+
+    val failedExprs = catalystExpressions.zipWithIndex.filter { case (opt, _) => opt.isEmpty }
+    if (failedExprs.nonEmpty) {
+      val failedFilters = failedExprs.map(p => filters(p._2))
+      logWarning(s"Failed to convert Filters into Catalyst expressions (${failedFilters.map(_.toString)})")
+    }
+
+    catalystExpressions.filter(_.isDefined).map(_.get).toArray
+  }
+
+  /**
+   * Checks whether given expression only references partition columns
+   * (and involves no sub-query)
+   */
+  protected def isPartitionPredicate(condition: Expression): Boolean = {
+    // Validates that the provided names both resolve to the same entity
+    val resolvedNameEquals = sparkSession.sessionState.analyzer.resolver
+
+    condition.references.forall { r => partitionColumns.exists(resolvedNameEquals(r.name, _)) } &&
+      !SubqueryExpression.hasSubquery(condition)
+  }
 
   protected final def appendMandatoryColumns(requestedColumns: Array[String]): Array[String] = {
     val missing = mandatoryColumns.filter(col => !requestedColumns.contains(col))
     requestedColumns ++ missing
   }
+
+  private def getPrecombineFieldProperty: Option[String] =
+    Option(tableConfig.getPreCombineField)
+      .orElse(optParams.get(DataSourceWriteOptions.PRECOMBINE_FIELD.key)) match {
+      // NOTE: This is required to compensate for cases when empty string is used to stub
+      //       property value to avoid it being set with the default value
+      // TODO(HUDI-3456) cleanup
+      case Some(f) if !StringUtils.isNullOrEmpty(f) => Some(f)
+      case _ => None
+    }
+
+  private def imbueConfigs(sqlContext: SQLContext): Unit = {
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.filterPushdown", "true")
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.recordLevelFilter.enabled", "true")
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.enableVectorizedReader", "true")
+  }
 }
 
 object HoodieBaseRelation {
 
-  def isMetadataTable(metaClient: HoodieTableMetaClient) =
+  def getPartitionPath(fileStatus: FileStatus): Path =

Review comment:
       In general, yes, but in the context it's scoped for (the Relation implementations), the parent directory of the file is the partition path. Or did you have something else in mind?
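   
   A REPL-style sketch of that convention (not necessarily the verbatim Hudi helper): within these Relation implementations the partition path is taken to be the parent directory of the file's own path.
   
   ```scala
   import org.apache.hadoop.fs.{FileStatus, Path}
   
   // Sketch only: assumes every listed FileStatus sits directly under its partition directory.
   def getPartitionPath(fileStatus: FileStatus): Path =
     fileStatus.getPath.getParent
   ```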




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1065804713


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * c71cfab947fd81c1aa63a0b8d52f70fa194ade5b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on a change in pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#discussion_r830316472



##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/HoodieHadoopFSUtils.scala
##########
@@ -0,0 +1,370 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark
+
+import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.fs.viewfs.ViewFileSystem
+import org.apache.hadoop.fs._
+import org.apache.hadoop.hdfs.DistributedFileSystem
+import org.apache.spark.internal.Logging
+import org.apache.spark.metrics.source.HiveCatalogMetrics
+import org.apache.spark.util.SerializableConfiguration
+
+import java.io.FileNotFoundException
+import scala.collection.mutable
+
+/**
+ * NOTE: This class is a replica of HadoopFSUtils from Spark 3.2.1, with the following adjustments
+ *
+ *    - Filtering of the listed files is adjusted to retain files starting w/ "." (so that Hoodie Delta Log
+ *    files are included)
+ */
+object HoodieHadoopFSUtils extends Logging {
+  /**
+   * Lists a collection of paths recursively. Picks the listing strategy adaptively depending
+   * on the number of paths to list.
+   *
+   * This may only be called on the driver.
+   *
+   * @param sc                   Spark context used to run parallel listing.
+   * @param paths                Input paths to list
+   * @param hadoopConf           Hadoop configuration
+   * @param filter               Path filter used to exclude leaf files from result
+   * @param ignoreMissingFiles   Ignore missing files that happen during recursive listing
+   *                             (e.g., due to race conditions)
+   * @param ignoreLocality       Whether to fetch data locality info when listing leaf files. If false,
+   *                             this will return `FileStatus` without `BlockLocation` info.
+   * @param parallelismThreshold The threshold to enable parallelism. If the number of input paths
+   *                             is smaller than this value, this will fallback to use
+   *                             sequential listing.
+   * @param parallelismMax       The maximum parallelism for listing. If the number of input paths is
+   *                             larger than this value, parallelism will be throttled to this value
+   *                             to avoid generating too many tasks.
+   * @return for each input path, the set of discovered files for the path
+   */
+  def parallelListLeafFiles(sc: SparkContext,
+                            paths: Seq[Path],
+                            hadoopConf: Configuration,
+                            filter: PathFilter,
+                            ignoreMissingFiles: Boolean,
+                            ignoreLocality: Boolean,
+                            parallelismThreshold: Int,
+                            parallelismMax: Int): Seq[(Path, Seq[FileStatus])] = {
+    parallelListLeafFilesInternal(sc, paths, hadoopConf, filter, isRootLevel = true,
+      ignoreMissingFiles, ignoreLocality, parallelismThreshold, parallelismMax)
+  }
+
+  // scalastyle:off parameter.number
+  private def parallelListLeafFilesInternal(sc: SparkContext,
+                                            paths: Seq[Path],
+                                            hadoopConf: Configuration,
+                                            filter: PathFilter,
+                                            isRootLevel: Boolean,
+                                            ignoreMissingFiles: Boolean,
+                                            ignoreLocality: Boolean,
+                                            parallelismThreshold: Int,
+                                            parallelismMax: Int): Seq[(Path, Seq[FileStatus])] = {
+
+    // Short-circuits parallel listing when serial listing is likely to be faster.
+    if (paths.size <= parallelismThreshold) {
+      // scalastyle:off return
+      return paths.map { path =>
+        val leafFiles = listLeafFiles(
+          path,
+          hadoopConf,
+          filter,
+          Some(sc),
+          ignoreMissingFiles = ignoreMissingFiles,
+          ignoreLocality = ignoreLocality,
+          isRootPath = isRootLevel,
+          parallelismThreshold = parallelismThreshold,
+          parallelismMax = parallelismMax)
+        (path, leafFiles)
+      }
+      // scalastyle:on return
+    }
+
+    logInfo(s"Listing leaf files and directories in parallel under ${paths.length} paths." +
+      s" The first several paths are: ${paths.take(10).mkString(", ")}.")
+    HiveCatalogMetrics.incrementParallelListingJobCount(1)
+
+    val serializableConfiguration = new SerializableConfiguration(hadoopConf)
+    val serializedPaths = paths.map(_.toString)
+
+    // Set the number of parallelism to prevent following file listing from generating many tasks
+    // in case of large #defaultParallelism.
+    val numParallelism = Math.min(paths.size, parallelismMax)
+
+    val previousJobDescription = sc.getLocalProperty(SparkContext.SPARK_JOB_DESCRIPTION)
+    val statusMap = try {
+      val description = paths.size match {
+        case 0 =>
+          "Listing leaf files and directories 0 paths"
+        case 1 =>
+          s"Listing leaf files and directories for 1 path:<br/>${paths(0)}"
+        case s =>
+          s"Listing leaf files and directories for $s paths:<br/>${paths(0)}, ..."
+      }
+      sc.setJobDescription(description)
+      sc
+        .parallelize(serializedPaths, numParallelism)
+        .mapPartitions { pathStrings =>
+          val hadoopConf = serializableConfiguration.value
+          pathStrings.map(new Path(_)).toSeq.map { path =>
+            val leafFiles = listLeafFiles(
+              path = path,
+              hadoopConf = hadoopConf,
+              filter = filter,
+              contextOpt = None, // Can't execute parallel scans on workers
+              ignoreMissingFiles = ignoreMissingFiles,
+              ignoreLocality = ignoreLocality,
+              isRootPath = isRootLevel,
+              parallelismThreshold = Int.MaxValue,
+              parallelismMax = 0)
+            (path, leafFiles)
+          }.iterator
+        }.map { case (path, statuses) =>
+        val serializableStatuses = statuses.map { status =>
+          // Turn FileStatus into SerializableFileStatus so we can send it back to the driver
+          val blockLocations = status match {
+            case f: LocatedFileStatus =>
+              f.getBlockLocations.map { loc =>
+                SerializableBlockLocation(
+                  loc.getNames,
+                  loc.getHosts,
+                  loc.getOffset,
+                  loc.getLength)
+              }
+
+            case _ =>
+              Array.empty[SerializableBlockLocation]
+          }
+
+          SerializableFileStatus(
+            status.getPath.toString,
+            status.getLen,
+            status.isDirectory,
+            status.getReplication,
+            status.getBlockSize,
+            status.getModificationTime,
+            status.getAccessTime,
+            blockLocations)
+        }
+        (path.toString, serializableStatuses)
+      }.collect()
+    } finally {
+      sc.setJobDescription(previousJobDescription)
+    }
+
+    // turn SerializableFileStatus back to Status
+    statusMap.map { case (path, serializableStatuses) =>
+      val statuses = serializableStatuses.map { f =>
+        val blockLocations = f.blockLocations.map { loc =>
+          new BlockLocation(loc.names, loc.hosts, loc.offset, loc.length)
+        }
+        new LocatedFileStatus(
+          new FileStatus(
+            f.length, f.isDir, f.blockReplication, f.blockSize, f.modificationTime,
+            new Path(f.path)),
+          blockLocations)
+      }
+      (new Path(path), statuses)
+    }
+  }
+  // scalastyle:on parameter.number
+
+  // scalastyle:off parameter.number
+  /**
+   * Lists a single filesystem path recursively. If a `SparkContext` object is specified, this
+   * function may launch Spark jobs to parallelize listing based on `parallelismThreshold`.
+   *
+   * If sessionOpt is None, this may be called on executors.
+   *
+   * @return all children of path that match the specified filter.
+   */
+  private def listLeafFiles(path: Path,
+                            hadoopConf: Configuration,
+                            filter: PathFilter,
+                            contextOpt: Option[SparkContext],
+                            ignoreMissingFiles: Boolean,
+                            ignoreLocality: Boolean,
+                            isRootPath: Boolean,
+                            parallelismThreshold: Int,
+                            parallelismMax: Int): Seq[FileStatus] = {
+
+    logTrace(s"Listing $path")
+    val fs = path.getFileSystem(hadoopConf)
+
+    // Note that statuses only include FileStatus for the files and dirs directly under path,
+    // and does not include anything else recursively.
+    val statuses: Array[FileStatus] = try {
+      fs match {
+        // DistributedFileSystem overrides listLocatedStatus to make 1 single call to namenode
+        // to retrieve the file status with the file block location. The reason to still fallback
+        // to listStatus is because the default implementation would potentially throw a
+        // FileNotFoundException which is better handled by doing the lookups manually below.
+        case (_: DistributedFileSystem | _: ViewFileSystem) if !ignoreLocality =>
+          val remoteIter = fs.listLocatedStatus(path)
+          new Iterator[LocatedFileStatus]() {
+            def next(): LocatedFileStatus = remoteIter.next
+
+            def hasNext(): Boolean = remoteIter.hasNext
+          }.toArray
+        case _ => fs.listStatus(path)
+      }
+    } catch {
+      // If we are listing a root path for SQL (e.g. a top level directory of a table), we need to
+      // ignore FileNotFoundExceptions during this root level of the listing because
+      //
+      //  (a) certain code paths might construct an InMemoryFileIndex with root paths that
+      //      might not exist (i.e. not all callers are guaranteed to have checked
+      //      path existence prior to constructing InMemoryFileIndex) and,
+      //  (b) we need to ignore deleted root paths during REFRESH TABLE, otherwise we break
+      //      existing behavior and break the ability to drop SessionCatalog tables when tables'
+      //      root directories have been deleted (which breaks a number of Spark's own tests).
+      //
+      // If we are NOT listing a root path then a FileNotFoundException here means that the
+      // directory was present in a previous level of file listing but is absent in this
+      // listing, likely indicating a race condition (e.g. concurrent table overwrite or S3
+      // list inconsistency).
+      //
+      // The trade-off in supporting existing behaviors / use-cases is that we won't be
+      // able to detect race conditions involving root paths being deleted during
+      // InMemoryFileIndex construction. However, it's still a net improvement to detect and
+      // fail-fast on the non-root cases. For more info see the SPARK-27676 review discussion.
+      case _: FileNotFoundException if isRootPath || ignoreMissingFiles =>
+        logWarning(s"The directory $path was not found. Was it deleted very recently?")
+        Array.empty[FileStatus]
+    }
+
+    val filteredStatuses =
+      statuses.filterNot(status => shouldFilterOutPathName(status.getPath.getName))
+
+    val allLeafStatuses = {
+      val (dirs, topLevelFiles) = filteredStatuses.partition(_.isDirectory)
+      val nestedFiles: Seq[FileStatus] = contextOpt match {
+        case Some(context) if dirs.size > parallelismThreshold =>
+          parallelListLeafFilesInternal(
+            context,
+            dirs.map(_.getPath),
+            hadoopConf = hadoopConf,
+            filter = filter,
+            isRootLevel = false,
+            ignoreMissingFiles = ignoreMissingFiles,
+            ignoreLocality = ignoreLocality,
+            parallelismThreshold = parallelismThreshold,
+            parallelismMax = parallelismMax
+          ).flatMap(_._2)
+        case _ =>
+          dirs.flatMap { dir =>
+            listLeafFiles(
+              path = dir.getPath,
+              hadoopConf = hadoopConf,
+              filter = filter,
+              contextOpt = contextOpt,
+              ignoreMissingFiles = ignoreMissingFiles,
+              ignoreLocality = ignoreLocality,
+              isRootPath = false,
+              parallelismThreshold = parallelismThreshold,
+              parallelismMax = parallelismMax)
+          }
+      }
+      val allFiles = topLevelFiles ++ nestedFiles
+      if (filter != null) allFiles.filter(f => filter.accept(f.getPath)) else allFiles
+    }
+
+    val missingFiles = mutable.ArrayBuffer.empty[String]
+    val resolvedLeafStatuses = allLeafStatuses.flatMap {
+      case f: LocatedFileStatus =>
+        Some(f)
+
+      // NOTE:
+      //
+      // - Although S3/S3A/S3N file system can be quite slow for remote file metadata
+      //   operations, calling `getFileBlockLocations` does no harm here since these file system
+      //   implementations don't actually issue RPC for this method.
+      //
+      // - Here we are calling `getFileBlockLocations` in a sequential manner, but it should not
+      //   be a big deal since we always resort to `parallelListLeafFiles` when the number of
+      //   paths exceeds the threshold.
+      case f if !ignoreLocality =>
+        // The other constructor of LocatedFileStatus will call FileStatus.getPermission(),
+        // which is very slow on some file systems (RawLocalFileSystem, which launches a
+        // subprocess and parses the stdout).
+        try {
+          val locations = fs.getFileBlockLocations(f, 0, f.getLen).map { loc =>
+            // Store BlockLocation objects to consume less memory
+            if (loc.getClass == classOf[BlockLocation]) {
+              loc
+            } else {
+              new BlockLocation(loc.getNames, loc.getHosts, loc.getOffset, loc.getLength)
+            }
+          }
+          val lfs = new LocatedFileStatus(f.getLen, f.isDirectory, f.getReplication, f.getBlockSize,
+            f.getModificationTime, 0, null, null, null, null, f.getPath, locations)
+          if (f.isSymlink) {
+            lfs.setSymlink(f.getSymlink)
+          }
+          Some(lfs)
+        } catch {
+          case _: FileNotFoundException if ignoreMissingFiles =>
+            missingFiles += f.getPath.toString
+            None
+        }
+
+      case f => Some(f)
+    }
+
+    if (missingFiles.nonEmpty) {
+      logWarning(
+        s"the following files were missing during file scan:\n  ${missingFiles.mkString("\n  ")}")
+    }
+
+    resolvedLeafStatuses
+  }
+  // scalastyle:on parameter.number
+
+  /** A serializable variant of HDFS's BlockLocation. This is required by Hadoop 2.7. */
+  private case class SerializableBlockLocation(names: Array[String],
+                                               hosts: Array[String],
+                                               offset: Long,
+                                               length: Long)
+
+  /** A serializable variant of HDFS's FileStatus. This is required by Hadoop 2.7. */
+  private case class SerializableFileStatus(path: String,
+                                            length: Long,
+                                            isDir: Boolean,
+                                            blockReplication: Short,
+                                            blockSize: Long,
+                                            modificationTime: Long,
+                                            accessTime: Long,
+                                            blockLocations: Array[SerializableBlockLocation])
+
+  /** Checks if we should filter out this path name. */
+  def shouldFilterOutPathName(pathName: String): Boolean = {
+    // We filter the following path names:
+    // 1. everything that starts with _, except _common_metadata and _metadata,
+    // because Parquet needs to find those metadata files from leaf files returned by this method.
+    // We should refactor this logic to not mix metadata files with data files.
+    // 2. everything that ends with `._COPYING_`, because this is an intermediate state of a file;
+    // we should skip such files to avoid double reading.
+    val exclude = (pathName.startsWith("_") && !pathName.contains("=")) || pathName.endsWith("._COPYING_")
+    val include = pathName.startsWith("_common_metadata") || pathName.startsWith("_metadata")
+    exclude && !include

Review comment:
       Right now this is mostly about filtering out Spark-specific files. We can replace it with our own utils when there is a need for it, but for now the goal of borrowing this class was to override its behavior of filtering out files starting with ".".
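   
   To make that concrete, a small self-contained check (file names are hypothetical) of what the predicate quoted above drops versus keeps:
   
   ```scala
   object ShouldFilterOutSketch {
     // Verbatim copy of the predicate above, for a standalone check.
     def shouldFilterOutPathName(pathName: String): Boolean = {
       val exclude = (pathName.startsWith("_") && !pathName.contains("=")) || pathName.endsWith("._COPYING_")
       val include = pathName.startsWith("_common_metadata") || pathName.startsWith("_metadata")
       exclude && !include
     }
   
     def main(args: Array[String]): Unit = {
       assert(shouldFilterOutPathName("_SUCCESS"))                    // Spark marker file -> dropped
       assert(shouldFilterOutPathName("part-0001.parquet._COPYING_")) // in-flight copy    -> dropped
       assert(!shouldFilterOutPathName("_metadata"))                  // Parquet metadata  -> kept
       assert(!shouldFilterOutPathName(".f1_20220321.log.1_0-1-0"))   // dot-prefixed Hudi delta log -> kept
     }
   }
   ```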

##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/MergeOnReadIncrementalRelation.scala
##########
@@ -20,65 +20,134 @@ package org.apache.hudi
 import org.apache.hadoop.conf.Configuration
 import org.apache.hadoop.fs.{GlobPattern, Path}
 import org.apache.hudi.HoodieBaseRelation.createBaseFileReader
-import org.apache.hudi.common.model.HoodieRecord
+import org.apache.hudi.HoodieConversionUtils.toScalaOption
+import org.apache.hudi.common.fs.FSUtils.getRelativePartitionPath
+import org.apache.hudi.common.model.{FileSlice, HoodieRecord}
 import org.apache.hudi.common.table.HoodieTableMetaClient
+import org.apache.hudi.common.table.timeline.{HoodieInstant, HoodieTimeline}
 import org.apache.hudi.common.table.view.HoodieTableFileSystemView
+import org.apache.hudi.common.util.StringUtils
 import org.apache.hudi.exception.HoodieException
 import org.apache.hudi.hadoop.utils.HoodieInputFormatUtils.{getCommitMetadata, getWritePartitionPaths, listAffectedFilesForCommits}
-import org.apache.hudi.hadoop.utils.HoodieRealtimeRecordReaderUtils.getMaxCompactionMemoryInBytes
-import org.apache.spark.rdd.RDD
-import org.apache.spark.sql.{Row, SQLContext}
-import org.apache.spark.sql.catalyst.InternalRow
-import org.apache.spark.sql.execution.datasources.PartitionedFile
+import org.apache.spark.sql.SQLContext
+import org.apache.spark.sql.catalyst.expressions.Expression
 import org.apache.spark.sql.sources._
 import org.apache.spark.sql.types.StructType
 
-import scala.collection.JavaConversions._
+import scala.collection.JavaConverters._
+import scala.collection.immutable
 
 /**
- * Experimental.
- * Relation, that implements the Hoodie incremental view for Merge On Read table.
- *
+ * @Experimental
  */
 class MergeOnReadIncrementalRelation(sqlContext: SQLContext,

Review comment:
       Yes

##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/MergeOnReadSnapshotRelation.scala
##########
@@ -41,43 +43,28 @@ case class HoodieMergeOnReadFileSplit(dataFile: Option[PartitionedFile],
                                       latestCommit: String,
                                       tablePath: String,
                                       maxCompactionMemoryInBytes: Long,
-                                      mergeType: String)
+                                      mergeType: String) extends HoodieFileSplit
 
 class MergeOnReadSnapshotRelation(sqlContext: SQLContext,

Review comment:
       Correct




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on a change in pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#discussion_r830316983



##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/execution/datasources/HoodieInMemoryFileIndex.scala
##########
@@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.execution.datasources
+
+import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.fs.{FileStatus, Path, PathFilter}
+import org.apache.hadoop.mapred.{FileInputFormat, JobConf}
+import org.apache.spark.HoodieHadoopFSUtils
+import org.apache.spark.metrics.source.HiveCatalogMetrics
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.types.StructType
+
+import scala.collection.mutable
+import scala.collection.mutable.ArrayBuffer
+
+class HoodieInMemoryFileIndex(sparkSession: SparkSession,
+                              rootPathsSpecified: Seq[Path],
+                              parameters: Map[String, String],
+                              userSpecifiedSchema: Option[StructType],
+                              fileStatusCache: FileStatusCache = NoopCache)
+  extends InMemoryFileIndex(sparkSession, rootPathsSpecified, parameters, userSpecifiedSchema, fileStatusCache) {
+
+  /**
+   * List leaf files of given paths. This method will submit a Spark job to do parallel
+   * listing whenever there is a path having more files than the parallel partition discovery threshold.
+   *
+   * This is publicly visible for testing.
+   *
+   * NOTE: This method replicates the one it overrides; however, it uses a custom method to run parallel
+   *       listing that accepts files starting with "."
+   */
+  override def listLeafFiles(paths: Seq[Path]): mutable.LinkedHashSet[FileStatus] = {
+    val startTime = System.nanoTime()
+    val output = mutable.LinkedHashSet[FileStatus]()
+    val pathsToFetch = mutable.ArrayBuffer[Path]()
+    for (path <- paths) {
+      fileStatusCache.getLeafFiles(path) match {
+        case Some(files) =>
+          HiveCatalogMetrics.incrementFileCacheHits(files.length)
+          output ++= files
+        case None =>
+          pathsToFetch += path
+      }
+      () // for some reasons scalac 2.12 needs this; return type doesn't matter
+    }
+    val filter = FileInputFormat.getInputPathFilter(new JobConf(hadoopConf, this.getClass))
+    val discovered = bulkListLeafFiles(sparkSession, pathsToFetch, filter, hadoopConf)
+
+    discovered.foreach { case (path, leafFiles) =>
+      HiveCatalogMetrics.incrementFilesDiscovered(leafFiles.size)
+      fileStatusCache.putLeafFiles(path, leafFiles.toArray)
+      output ++= leafFiles
+    }
+
+    logInfo(s"It took ${(System.nanoTime() - startTime) / (1000 * 1000)} ms to list leaf files" +
+      s" for ${paths.length} paths.")
+
+    output
+  }
+
+  protected def bulkListLeafFiles(sparkSession: SparkSession, paths: ArrayBuffer[Path], filter: PathFilter, hadoopConf: Configuration): Seq[(Path, Seq[FileStatus])] = {
+    HoodieHadoopFSUtils.parallelListLeafFiles(
+      sc = sparkSession.sparkContext,
+      paths = paths,
+      hadoopConf = hadoopConf,
+      filter = new PathFilterWrapper(filter),
+      ignoreMissingFiles = sparkSession.sessionState.conf.ignoreMissingFiles,
+      // NOTE: We're disabling fetching Block Info to speed up file listing

Review comment:
       Not sure I understand your point here: what do you suggest this token be used for?
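   
   Separately, for context on the class above, a hypothetical usage sketch (the table path is made up) of constructing the index and listing leaf files through the overridden method:
   
   ```scala
   import org.apache.hadoop.fs.Path
   import org.apache.spark.execution.datasources.HoodieInMemoryFileIndex
   import org.apache.spark.sql.SparkSession
   import org.apache.spark.sql.execution.datasources.FileStatusCache
   
   object InMemoryFileIndexSketch {
     def main(args: Array[String]): Unit = {
       val spark = SparkSession.builder().master("local[*]").appName("listing-sketch").getOrCreate()
   
       val partitionPath = new Path("/tmp/hudi_table/2022/03/21") // hypothetical partition directory
   
       val index = new HoodieInMemoryFileIndex(
         sparkSession = spark,
         rootPathsSpecified = Seq(partitionPath),
         parameters = Map.empty,
         userSpecifiedSchema = None,
         fileStatusCache = FileStatusCache.getOrCreate(spark))
   
       // Goes through the overridden listLeafFiles above, so dot-prefixed Hudi log files are retained.
       val leafFiles = index.listLeafFiles(Seq(partitionPath))
       println(s"Discovered ${leafFiles.size} leaf files")
   
       spark.stop()
     }
   }
   ```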




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on a change in pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#discussion_r814302378



##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/BaseFileOnlyRelation.scala
##########
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi
+
+import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.fs.Path
+import org.apache.hudi.HoodieBaseRelation.createBaseFileReader
+import org.apache.hudi.common.table.HoodieTableMetaClient
+import org.apache.spark.sql.SQLContext
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.expressions.Expression
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.sources.{BaseRelation, Filter}
+import org.apache.spark.sql.types.StructType
+
+/**
+ * [[BaseRelation]] implementation reading only the base files of Hudi tables, essentially supporting the following
+ * querying modes:
+ * <ul>
+ * <li>For COW tables: Snapshot</li>
+ * <li>For MOR tables: Read-optimized</li>
+ * </ul>
+ *
+ * NOTE: The reason this Relation is used in lieu of Spark's default [[HadoopFsRelation]] is primarily due to the
+ * fact that the latter injects the real partition path as the value of the partition field, which Hudi ultimately
+ * persists as part of the record payload. In some cases, however, the partition path might not necessarily be equal
+ * to the verbatim value of the partition-path field (when a custom [[KeyGenerator]] is used), therefore leading to
+ * incorrect partition field values being written
+ */
+class BaseFileOnlyRelation(sqlContext: SQLContext,

Review comment:
       I don't know why GH doesn't detect this as a rename, even though git was able to recognize it as one.
   
   TL;DR: Renamed from `BaseFileOnlyViewRelation`; the common part is extracted to `BaseRelation`, and this class now only bears the extension points.
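   
   A simplified, hypothetical sketch (names and signatures are illustrative, not the exact Hudi ones) of the resulting shape: the shared base class drives the scan, while each concrete relation only supplies how file splits are collected and how the RDD is composed.
   
   ```scala
   trait FileSplit
   
   // Shared base: template-method style scan flow.
   abstract class SketchBaseRelation {
     final def buildScan(requiredColumns: Array[String]): Iterator[Any] = {
       val splits = collectFileSplits()
       composeRDD(splits, requiredColumns)
     }
   
     // Extension points for the BaseFileOnly / MOR snapshot / MOR incremental variants.
     protected def collectFileSplits(): Seq[FileSplit]
     protected def composeRDD(splits: Seq[FileSplit], requiredColumns: Array[String]): Iterator[Any]
   }
   
   // Base-file-only variant: would list the latest base files and read them directly.
   class SketchBaseFileOnlyRelation extends SketchBaseRelation {
     override protected def collectFileSplits(): Seq[FileSplit] = Seq.empty
     override protected def composeRDD(splits: Seq[FileSplit], requiredColumns: Array[String]): Iterator[Any] =
       Iterator.empty
   }
   ```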




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067617927


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955",
       "triggerID" : "1067572461",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 920d8e63ac2343edc09a55f35658343eaac613df Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1069513082


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955",
       "triggerID" : "1067572461",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980",
       "triggerID" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6988",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1069347203",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7004",
       "triggerID" : "1069347203",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "ec7e1b35d67587dce80cfb813aa1b20df82b8c65",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7005",
       "triggerID" : "ec7e1b35d67587dce80cfb813aa1b20df82b8c65",
       "triggerType" : "PUSH"
     }, {
       "hash" : "40e5a8537517a19f685367427b00f8a43c3430d8",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "40e5a8537517a19f685367427b00f8a43c3430d8",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * be2605813c0f58e6226ca75c25c2bf40574c7a5d UNKNOWN
   * ec7e1b35d67587dce80cfb813aa1b20df82b8c65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7005) 
   * 40e5a8537517a19f685367427b00f8a43c3430d8 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1065975362


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 3cfbddce843a6daffb2ecaddc1c653a2f29520e2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869) 
   * 28607fbed4e475b976e4508c00bea4a5551ca45d UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067280623


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 28607fbed4e475b976e4508c00bea4a5551ca45d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881) 
   * 5d53a0d1958010dc3f8e4fd17fc4e514b2edb406 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1064717467


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d875e412abc29bf6a0e8a6fa7bef747ded15d60b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284) 
   * 2940f46a133ca3142f7ebb26b8c6f20583d7f395 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1065804713


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * c71cfab947fd81c1aa63a0b8d52f70fa194ade5b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067416321


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 77afdd4b216c475f9a56b6d005bc507c78134a9f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1050332188


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281) 
   * d875e412abc29bf6a0e8a6fa7bef747ded15d60b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on a change in pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#discussion_r831474571



##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/execution/datasources/HoodieInMemoryFileIndex.scala
##########
@@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.execution.datasources
+
+import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.fs.{FileStatus, Path, PathFilter}
+import org.apache.hadoop.mapred.{FileInputFormat, JobConf}
+import org.apache.spark.HoodieHadoopFSUtils
+import org.apache.spark.metrics.source.HiveCatalogMetrics
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.types.StructType
+
+import scala.collection.mutable
+import scala.collection.mutable.ArrayBuffer
+
+class HoodieInMemoryFileIndex(sparkSession: SparkSession,
+                              rootPathsSpecified: Seq[Path],
+                              parameters: Map[String, String],
+                              userSpecifiedSchema: Option[StructType],
+                              fileStatusCache: FileStatusCache = NoopCache)
+  extends InMemoryFileIndex(sparkSession, rootPathsSpecified, parameters, userSpecifiedSchema, fileStatusCache) {
+
+  /**
+   * List leaf files of given paths. This method will submit a Spark job to do parallel
+   * listing whenever there is a path having more files than the parallel partition discovery threshold.
+   *
+   * This is publicly visible for testing.
+   *
+   * NOTE: This method replicates the one it overrides; however, it uses a custom method to run parallel
+   *       listing that accepts files starting with "."
+   */
+  override def listLeafFiles(paths: Seq[Path]): mutable.LinkedHashSet[FileStatus] = {
+    val startTime = System.nanoTime()
+    val output = mutable.LinkedHashSet[FileStatus]()
+    val pathsToFetch = mutable.ArrayBuffer[Path]()
+    for (path <- paths) {
+      fileStatusCache.getLeafFiles(path) match {
+        case Some(files) =>
+          HiveCatalogMetrics.incrementFileCacheHits(files.length)
+          output ++= files
+        case None =>
+          pathsToFetch += path
+      }
+      () // for some reason scalac 2.12 needs this; return type doesn't matter
+    }
+    val filter = FileInputFormat.getInputPathFilter(new JobConf(hadoopConf, this.getClass))
+    val discovered = bulkListLeafFiles(sparkSession, pathsToFetch, filter, hadoopConf)
+
+    discovered.foreach { case (path, leafFiles) =>
+      HiveCatalogMetrics.incrementFilesDiscovered(leafFiles.size)
+      fileStatusCache.putLeafFiles(path, leafFiles.toArray)
+      output ++= leafFiles
+    }
+
+    logInfo(s"It took ${(System.nanoTime() - startTime) / (1000 * 1000)} ms to list leaf files" +
+      s" for ${paths.length} paths.")
+
+    output
+  }
+
+  protected def bulkListLeafFiles(sparkSession: SparkSession, paths: ArrayBuffer[Path], filter: PathFilter, hadoopConf: Configuration): Seq[(Path, Seq[FileStatus])] = {
+    HoodieHadoopFSUtils.parallelListLeafFiles(
+      sc = sparkSession.sparkContext,
+      paths = paths,
+      hadoopConf = hadoopConf,
+      filter = new PathFilterWrapper(filter),
+      ignoreMissingFiles = sparkSession.sessionState.conf.ignoreMissingFiles,
+      // NOTE: We're disabling fetching Block Info to speed up file listing

Review comment:
       Gotcha. It's going to be tough to identify all such places with markers. Instead, I'm referencing the respective Spark release version this is borrowed from, so that we can simply diff against it and see what has changed.
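
       For illustration, the kind of provenance reference being described (a sketch, not the exact wording used in the PR) sits in the class-level comment and reads along these lines:

           // NOTE: This class is a replica of `HadoopFSUtils` from the Spark 3.2.1 release;
           //       to review the Hudi-specific changes, diff this file against the
           //       corresponding Spark 3.2.1 source.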




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] xushiyan commented on a change in pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
xushiyan commented on a change in pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#discussion_r830227161



##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java
##########
@@ -67,41 +49,6 @@ public static boolean doesBelongToIncrementalQuery(FileSplit s) {
     return false;
   }
 
-  // Return parquet file with a list of log files in the same file group.
-  public static List<Pair<Option<HoodieBaseFile>, List<HoodieLogFile>>> groupLogsByBaseFile(Configuration conf, List<Path> partitionPaths) {

Review comment:
       Is this no longer used, or was it moved elsewhere?

##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/MergeOnReadIncrementalRelation.scala
##########
@@ -20,65 +20,134 @@ package org.apache.hudi
 import org.apache.hadoop.conf.Configuration
 import org.apache.hadoop.fs.{GlobPattern, Path}
 import org.apache.hudi.HoodieBaseRelation.createBaseFileReader
-import org.apache.hudi.common.model.HoodieRecord
+import org.apache.hudi.HoodieConversionUtils.toScalaOption
+import org.apache.hudi.common.fs.FSUtils.getRelativePartitionPath
+import org.apache.hudi.common.model.{FileSlice, HoodieRecord}
 import org.apache.hudi.common.table.HoodieTableMetaClient
+import org.apache.hudi.common.table.timeline.{HoodieInstant, HoodieTimeline}
 import org.apache.hudi.common.table.view.HoodieTableFileSystemView
+import org.apache.hudi.common.util.StringUtils
 import org.apache.hudi.exception.HoodieException
 import org.apache.hudi.hadoop.utils.HoodieInputFormatUtils.{getCommitMetadata, getWritePartitionPaths, listAffectedFilesForCommits}
-import org.apache.hudi.hadoop.utils.HoodieRealtimeRecordReaderUtils.getMaxCompactionMemoryInBytes
-import org.apache.spark.rdd.RDD
-import org.apache.spark.sql.{Row, SQLContext}
-import org.apache.spark.sql.catalyst.InternalRow
-import org.apache.spark.sql.execution.datasources.PartitionedFile
+import org.apache.spark.sql.SQLContext
+import org.apache.spark.sql.catalyst.expressions.Expression
 import org.apache.spark.sql.sources._
 import org.apache.spark.sql.types.StructType
 
-import scala.collection.JavaConversions._
+import scala.collection.JavaConverters._
+import scala.collection.immutable
 
 /**
- * Experimental.
- * Relation, that implements the Hoodie incremental view for Merge On Read table.
- *
+ * @Experimental
  */
 class MergeOnReadIncrementalRelation(sqlContext: SQLContext,

Review comment:
       Are the removed lines in this class mostly covered by the common logic in HoodieBaseRelation? It's hard to tell from the diff.

##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/HoodieHadoopFSUtils.scala
##########
@@ -0,0 +1,370 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark
+
+import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.fs.viewfs.ViewFileSystem
+import org.apache.hadoop.fs._
+import org.apache.hadoop.hdfs.DistributedFileSystem
+import org.apache.spark.internal.Logging
+import org.apache.spark.metrics.source.HiveCatalogMetrics
+import org.apache.spark.util.SerializableConfiguration
+
+import java.io.FileNotFoundException
+import scala.collection.mutable
+
+/**
+ * NOTE: This class is a replica of HadoopFSUtils from Spark 3.2.1, with the following adjustments:
+ *
+ *    - Filtering out of the listed files is adjusted to include files starting w/ "." (to include Hoodie Delta Log
+ *    files)
+ */
+object HoodieHadoopFSUtils extends Logging {
+  /**
+   * Lists a collection of paths recursively. Picks the listing strategy adaptively depending
+   * on the number of paths to list.
+   *
+   * This may only be called on the driver.
+   *
+   * @param sc                   Spark context used to run parallel listing.
+   * @param paths                Input paths to list
+   * @param hadoopConf           Hadoop configuration
+   * @param filter               Path filter used to exclude leaf files from result
+   * @param ignoreMissingFiles   Ignore missing files that happen during recursive listing
+   *                             (e.g., due to race conditions)
+   * @param ignoreLocality       Whether to fetch data locality info when listing leaf files. If false,
+   *                             this will return `FileStatus` without `BlockLocation` info.
+   * @param parallelismThreshold The threshold to enable parallelism. If the number of input paths
+   *                             is smaller than this value, this will fallback to use
+   *                             sequential listing.
+   * @param parallelismMax       The maximum parallelism for listing. If the number of input paths is
+   *                             larger than this value, parallelism will be throttled to this value
+   *                             to avoid generating too many tasks.
+   * @return for each input path, the set of discovered files for the path
+   */
+  def parallelListLeafFiles(sc: SparkContext,
+                            paths: Seq[Path],
+                            hadoopConf: Configuration,
+                            filter: PathFilter,
+                            ignoreMissingFiles: Boolean,
+                            ignoreLocality: Boolean,
+                            parallelismThreshold: Int,
+                            parallelismMax: Int): Seq[(Path, Seq[FileStatus])] = {
+    parallelListLeafFilesInternal(sc, paths, hadoopConf, filter, isRootLevel = true,
+      ignoreMissingFiles, ignoreLocality, parallelismThreshold, parallelismMax)
+  }
+
+  // scalastyle:off parameter.number
+  private def parallelListLeafFilesInternal(sc: SparkContext,
+                                            paths: Seq[Path],
+                                            hadoopConf: Configuration,
+                                            filter: PathFilter,
+                                            isRootLevel: Boolean,
+                                            ignoreMissingFiles: Boolean,
+                                            ignoreLocality: Boolean,
+                                            parallelismThreshold: Int,
+                                            parallelismMax: Int): Seq[(Path, Seq[FileStatus])] = {
+
+    // Short-circuits parallel listing when serial listing is likely to be faster.
+    if (paths.size <= parallelismThreshold) {
+      // scalastyle:off return
+      return paths.map { path =>
+        val leafFiles = listLeafFiles(
+          path,
+          hadoopConf,
+          filter,
+          Some(sc),
+          ignoreMissingFiles = ignoreMissingFiles,
+          ignoreLocality = ignoreLocality,
+          isRootPath = isRootLevel,
+          parallelismThreshold = parallelismThreshold,
+          parallelismMax = parallelismMax)
+        (path, leafFiles)
+      }
+      // scalastyle:on return
+    }
+
+    logInfo(s"Listing leaf files and directories in parallel under ${paths.length} paths." +
+      s" The first several paths are: ${paths.take(10).mkString(", ")}.")
+    HiveCatalogMetrics.incrementParallelListingJobCount(1)
+
+    val serializableConfiguration = new SerializableConfiguration(hadoopConf)
+    val serializedPaths = paths.map(_.toString)
+
+    // Cap the parallelism to prevent the following file listing from generating too many tasks
+    // in case of a large #defaultParallelism.
+    val numParallelism = Math.min(paths.size, parallelismMax)
+
+    val previousJobDescription = sc.getLocalProperty(SparkContext.SPARK_JOB_DESCRIPTION)
+    val statusMap = try {
+      val description = paths.size match {
+        case 0 =>
+          "Listing leaf files and directories 0 paths"
+        case 1 =>
+          s"Listing leaf files and directories for 1 path:<br/>${paths(0)}"
+        case s =>
+          s"Listing leaf files and directories for $s paths:<br/>${paths(0)}, ..."
+      }
+      sc.setJobDescription(description)
+      sc
+        .parallelize(serializedPaths, numParallelism)
+        .mapPartitions { pathStrings =>
+          val hadoopConf = serializableConfiguration.value
+          pathStrings.map(new Path(_)).toSeq.map { path =>
+            val leafFiles = listLeafFiles(
+              path = path,
+              hadoopConf = hadoopConf,
+              filter = filter,
+              contextOpt = None, // Can't execute parallel scans on workers
+              ignoreMissingFiles = ignoreMissingFiles,
+              ignoreLocality = ignoreLocality,
+              isRootPath = isRootLevel,
+              parallelismThreshold = Int.MaxValue,
+              parallelismMax = 0)
+            (path, leafFiles)
+          }.iterator
+        }.map { case (path, statuses) =>
+        val serializableStatuses = statuses.map { status =>
+          // Turn FileStatus into SerializableFileStatus so we can send it back to the driver
+          val blockLocations = status match {
+            case f: LocatedFileStatus =>
+              f.getBlockLocations.map { loc =>
+                SerializableBlockLocation(
+                  loc.getNames,
+                  loc.getHosts,
+                  loc.getOffset,
+                  loc.getLength)
+              }
+
+            case _ =>
+              Array.empty[SerializableBlockLocation]
+          }
+
+          SerializableFileStatus(
+            status.getPath.toString,
+            status.getLen,
+            status.isDirectory,
+            status.getReplication,
+            status.getBlockSize,
+            status.getModificationTime,
+            status.getAccessTime,
+            blockLocations)
+        }
+        (path.toString, serializableStatuses)
+      }.collect()
+    } finally {
+      sc.setJobDescription(previousJobDescription)
+    }
+
+    // turn SerializableFileStatus back to Status
+    statusMap.map { case (path, serializableStatuses) =>
+      val statuses = serializableStatuses.map { f =>
+        val blockLocations = f.blockLocations.map { loc =>
+          new BlockLocation(loc.names, loc.hosts, loc.offset, loc.length)
+        }
+        new LocatedFileStatus(
+          new FileStatus(
+            f.length, f.isDir, f.blockReplication, f.blockSize, f.modificationTime,
+            new Path(f.path)),
+          blockLocations)
+      }
+      (new Path(path), statuses)
+    }
+  }
+  // scalastyle:on parameter.number
+
+  // scalastyle:off parameter.number
+  /**
+   * Lists a single filesystem path recursively. If a `SparkContext` object is specified, this
+   * function may launch Spark jobs to parallelize listing based on `parallelismThreshold`.
+   *
+   * If sessionOpt is None, this may be called on executors.
+   *
+   * @return all children of path that match the specified filter.
+   */
+  private def listLeafFiles(path: Path,
+                            hadoopConf: Configuration,
+                            filter: PathFilter,
+                            contextOpt: Option[SparkContext],
+                            ignoreMissingFiles: Boolean,
+                            ignoreLocality: Boolean,
+                            isRootPath: Boolean,
+                            parallelismThreshold: Int,
+                            parallelismMax: Int): Seq[FileStatus] = {
+
+    logTrace(s"Listing $path")
+    val fs = path.getFileSystem(hadoopConf)
+
+    // Note that statuses only include FileStatus for the files and dirs directly under path,
+    // and do not include anything else recursively.
+    val statuses: Array[FileStatus] = try {
+      fs match {
+        // DistributedFileSystem overrides listLocatedStatus to make 1 single call to namenode
+        // to retrieve the file status with the file block location. The reason to still fall back
+        // to listStatus is that the default implementation would potentially throw a
+        // FileNotFoundException which is better handled by doing the lookups manually below.
+        case (_: DistributedFileSystem | _: ViewFileSystem) if !ignoreLocality =>
+          val remoteIter = fs.listLocatedStatus(path)
+          new Iterator[LocatedFileStatus]() {
+            def next(): LocatedFileStatus = remoteIter.next
+
+            def hasNext(): Boolean = remoteIter.hasNext
+          }.toArray
+        case _ => fs.listStatus(path)
+      }
+    } catch {
+      // If we are listing a root path for SQL (e.g. a top level directory of a table), we need to
+      // ignore FileNotFoundExceptions during this root level of the listing because
+      //
+      //  (a) certain code paths might construct an InMemoryFileIndex with root paths that
+      //      might not exist (i.e. not all callers are guaranteed to have checked
+      //      path existence prior to constructing InMemoryFileIndex) and,
+      //  (b) we need to ignore deleted root paths during REFRESH TABLE, otherwise we break
+      //      existing behavior and break the ability to drop SessionCatalog tables when tables'
+      //      root directories have been deleted (which breaks a number of Spark's own tests).
+      //
+      // If we are NOT listing a root path then a FileNotFoundException here means that the
+      // directory was present in a previous level of file listing but is absent in this
+      // listing, likely indicating a race condition (e.g. concurrent table overwrite or S3
+      // list inconsistency).
+      //
+      // The trade-off in supporting existing behaviors / use-cases is that we won't be
+      // able to detect race conditions involving root paths being deleted during
+      // InMemoryFileIndex construction. However, it's still a net improvement to detect and
+      // fail-fast on the non-root cases. For more info see the SPARK-27676 review discussion.
+      case _: FileNotFoundException if isRootPath || ignoreMissingFiles =>
+        logWarning(s"The directory $path was not found. Was it deleted very recently?")
+        Array.empty[FileStatus]
+    }
+
+    val filteredStatuses =
+      statuses.filterNot(status => shouldFilterOutPathName(status.getPath.getName))
+
+    val allLeafStatuses = {
+      val (dirs, topLevelFiles) = filteredStatuses.partition(_.isDirectory)
+      val nestedFiles: Seq[FileStatus] = contextOpt match {
+        case Some(context) if dirs.size > parallelismThreshold =>
+          parallelListLeafFilesInternal(
+            context,
+            dirs.map(_.getPath),
+            hadoopConf = hadoopConf,
+            filter = filter,
+            isRootLevel = false,
+            ignoreMissingFiles = ignoreMissingFiles,
+            ignoreLocality = ignoreLocality,
+            parallelismThreshold = parallelismThreshold,
+            parallelismMax = parallelismMax
+          ).flatMap(_._2)
+        case _ =>
+          dirs.flatMap { dir =>
+            listLeafFiles(
+              path = dir.getPath,
+              hadoopConf = hadoopConf,
+              filter = filter,
+              contextOpt = contextOpt,
+              ignoreMissingFiles = ignoreMissingFiles,
+              ignoreLocality = ignoreLocality,
+              isRootPath = false,
+              parallelismThreshold = parallelismThreshold,
+              parallelismMax = parallelismMax)
+          }
+      }
+      val allFiles = topLevelFiles ++ nestedFiles
+      if (filter != null) allFiles.filter(f => filter.accept(f.getPath)) else allFiles
+    }
+
+    val missingFiles = mutable.ArrayBuffer.empty[String]
+    val resolvedLeafStatuses = allLeafStatuses.flatMap {
+      case f: LocatedFileStatus =>
+        Some(f)
+
+      // NOTE:
+      //
+      // - Although S3/S3A/S3N file system can be quite slow for remote file metadata
+      //   operations, calling `getFileBlockLocations` does no harm here since these file system
+      //   implementations don't actually issue RPC for this method.
+      //
+      // - Here we are calling `getFileBlockLocations` in a sequential manner, but it should not
+      //   be a big deal since we always use `parallelListLeafFiles` when the number of
+      //   paths exceeds the threshold.
+      case f if !ignoreLocality =>
+        // The other constructor of LocatedFileStatus will call FileStatus.getPermission(),
+        // which is very slow on some file systems (RawLocalFileSystem, for instance, launches a
+        // subprocess and parses the stdout).
+        try {
+          val locations = fs.getFileBlockLocations(f, 0, f.getLen).map { loc =>
+            // Store BlockLocation objects to consume less memory
+            if (loc.getClass == classOf[BlockLocation]) {
+              loc
+            } else {
+              new BlockLocation(loc.getNames, loc.getHosts, loc.getOffset, loc.getLength)
+            }
+          }
+          val lfs = new LocatedFileStatus(f.getLen, f.isDirectory, f.getReplication, f.getBlockSize,
+            f.getModificationTime, 0, null, null, null, null, f.getPath, locations)
+          if (f.isSymlink) {
+            lfs.setSymlink(f.getSymlink)
+          }
+          Some(lfs)
+        } catch {
+          case _: FileNotFoundException if ignoreMissingFiles =>
+            missingFiles += f.getPath.toString
+            None
+        }
+
+      case f => Some(f)
+    }
+
+    if (missingFiles.nonEmpty) {
+      logWarning(
+        s"the following files were missing during file scan:\n  ${missingFiles.mkString("\n  ")}")
+    }
+
+    resolvedLeafStatuses
+  }
+  // scalastyle:on parameter.number
+
+  /** A serializable variant of HDFS's BlockLocation. This is required by Hadoop 2.7. */
+  private case class SerializableBlockLocation(names: Array[String],
+                                               hosts: Array[String],
+                                               offset: Long,
+                                               length: Long)
+
+  /** A serializable variant of HDFS's FileStatus. This is required by Hadoop 2.7. */
+  private case class SerializableFileStatus(path: String,
+                                            length: Long,
+                                            isDir: Boolean,
+                                            blockReplication: Short,
+                                            blockSize: Long,
+                                            modificationTime: Long,
+                                            accessTime: Long,
+                                            blockLocations: Array[SerializableBlockLocation])
+
+  /** Checks if we should filter out this path name. */
+  def shouldFilterOutPathName(pathName: String): Boolean = {
+    // We filter the following paths:
+    // 1. everything that starts with _ and ., except _common_metadata and _metadata
+    // because Parquet needs to find those metadata files from leaf files returned by this method.
+    // We should refactor this logic to not mix metadata files with data files.
+    // 2. everything that ends with `._COPYING_`, because this is an intermediate state of a file; we
+    // should skip such files to avoid reading them twice.
+    val exclude = (pathName.startsWith("_") && !pathName.contains("=")) || pathName.endsWith("._COPYING_")
+    val include = pathName.startsWith("_common_metadata") || pathName.startsWith("_metadata")
+    exclude && !include

Review comment:
       Should some utils from MDT be the source of truth for these rules instead? The Spark side does not own them, and it would also avoid copying them across different Spark versions.
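
       A rough sketch of what that could look like (the `HoodiePathFilters` helper name and its placement are hypothetical, purely to illustrate the idea):

           // Hypothetical shared Hudi utility owning the path-filtering rules, so the
           // per-Spark-version copies of the listing code would only delegate to it.
           object HoodiePathFilters {
             def shouldFilterOutPathName(pathName: String): Boolean = {
               val exclude = (pathName.startsWith("_") && !pathName.contains("=")) ||
                 pathName.endsWith("._COPYING_")
               val include = pathName.startsWith("_common_metadata") || pathName.startsWith("_metadata")
               exclude && !include
             }
           }

           // HoodieHadoopFSUtils (and the Spark-version-specific variants) would then call:
           //   HoodiePathFilters.shouldFilterOutPathName(status.getPath.getName)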

##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/MergeOnReadSnapshotRelation.scala
##########
@@ -41,43 +43,28 @@ case class HoodieMergeOnReadFileSplit(dataFile: Option[PartitionedFile],
                                       latestCommit: String,
                                       tablePath: String,
                                       maxCompactionMemoryInBytes: Long,
-                                      mergeType: String)
+                                      mergeType: String) extends HoodieFileSplit
 
 class MergeOnReadSnapshotRelation(sqlContext: SQLContext,

Review comment:
       Same question as above: it's hard to examine line by line what has been extracted out, but the overall direction looks good.

##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/execution/datasources/HoodieInMemoryFileIndex.scala
##########
@@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.execution.datasources
+
+import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.fs.{FileStatus, Path, PathFilter}
+import org.apache.hadoop.mapred.{FileInputFormat, JobConf}
+import org.apache.spark.HoodieHadoopFSUtils
+import org.apache.spark.metrics.source.HiveCatalogMetrics
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.types.StructType
+
+import scala.collection.mutable
+import scala.collection.mutable.ArrayBuffer
+
+class HoodieInMemoryFileIndex(sparkSession: SparkSession,
+                              rootPathsSpecified: Seq[Path],
+                              parameters: Map[String, String],
+                              userSpecifiedSchema: Option[StructType],
+                              fileStatusCache: FileStatusCache = NoopCache)
+  extends InMemoryFileIndex(sparkSession, rootPathsSpecified, parameters, userSpecifiedSchema, fileStatusCache) {
+
+  /**
+   * List leaf files of given paths. This method will submit a Spark job to do parallel
+   * listing whenever there is a path having more files than the parallel partition discovery threshold.
+   *
+   * This is publicly visible for testing.
+   *
+   * NOTE: This method replicates the one it overrides; however, it uses a custom method to run parallel
+   *       listing that accepts files starting with "."
+   */
+  override def listLeafFiles(paths: Seq[Path]): mutable.LinkedHashSet[FileStatus] = {
+    val startTime = System.nanoTime()
+    val output = mutable.LinkedHashSet[FileStatus]()
+    val pathsToFetch = mutable.ArrayBuffer[Path]()
+    for (path <- paths) {
+      fileStatusCache.getLeafFiles(path) match {
+        case Some(files) =>
+          HiveCatalogMetrics.incrementFileCacheHits(files.length)
+          output ++= files
+        case None =>
+          pathsToFetch += path
+      }
+      () // for some reason scalac 2.12 needs this; return type doesn't matter
+    }
+    val filter = FileInputFormat.getInputPathFilter(new JobConf(hadoopConf, this.getClass))
+    val discovered = bulkListLeafFiles(sparkSession, pathsToFetch, filter, hadoopConf)
+
+    discovered.foreach { case (path, leafFiles) =>
+      HiveCatalogMetrics.incrementFilesDiscovered(leafFiles.size)
+      fileStatusCache.putLeafFiles(path, leafFiles.toArray)
+      output ++= leafFiles
+    }
+
+    logInfo(s"It took ${(System.nanoTime() - startTime) / (1000 * 1000)} ms to list leaf files" +
+      s" for ${paths.length} paths.")
+
+    output
+  }
+
+  protected def bulkListLeafFiles(sparkSession: SparkSession, paths: ArrayBuffer[Path], filter: PathFilter, hadoopConf: Configuration): Seq[(Path, Seq[FileStatus])] = {
+    HoodieHadoopFSUtils.parallelListLeafFiles(
+      sc = sparkSession.sparkContext,
+      paths = paths,
+      hadoopConf = hadoopConf,
+      filter = new PathFilterWrapper(filter),
+      ignoreMissingFiles = sparkSession.sessionState.conf.ignoreMissingFiles,
+      // NOTE: We're disabling fetching Block Info to speed up file listing

Review comment:
       We may need a special token here to indicate the parts changed in Hudi's codebase, for easier maintenance. `// NOTE:` is not special enough; what about `// HUDI NOTE:`? This could apply to any other incoming code variation as well.
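
       For instance, a marker of that shape (purely illustrative wording, assuming the proposed token were adopted) might read:

           // HUDI NOTE: diverges from the Spark 3.2.1 original; block-location fetching
           //            is disabled here to speed up file listing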




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1068541450


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955",
       "triggerID" : "1067572461",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 920d8e63ac2343edc09a55f35658343eaac613df Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955) 
   * 4c0d61041d5e7fef677af1829f0892f3e46f228e UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] alexeykudinkin commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1068702943


   @hudi-bot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] xushiyan commented on a change in pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
xushiyan commented on a change in pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#discussion_r830443165



##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/execution/datasources/HoodieInMemoryFileIndex.scala
##########
@@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.execution.datasources
+
+import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.fs.{FileStatus, Path, PathFilter}
+import org.apache.hadoop.mapred.{FileInputFormat, JobConf}
+import org.apache.spark.HoodieHadoopFSUtils
+import org.apache.spark.metrics.source.HiveCatalogMetrics
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.types.StructType
+
+import scala.collection.mutable
+import scala.collection.mutable.ArrayBuffer
+
+class HoodieInMemoryFileIndex(sparkSession: SparkSession,
+                              rootPathsSpecified: Seq[Path],
+                              parameters: Map[String, String],
+                              userSpecifiedSchema: Option[StructType],
+                              fileStatusCache: FileStatusCache = NoopCache)
+  extends InMemoryFileIndex(sparkSession, rootPathsSpecified, parameters, userSpecifiedSchema, fileStatusCache) {
+
+  /**
+   * List leaf files of given paths. This method will submit a Spark job to do parallel
+   * listing whenever there is a path having more files than the parallel partition discovery threshold.
+   *
+   * This is publicly visible for testing.
+   *
+   * NOTE: This method replicates the one it overrides; however, it uses a custom method to run parallel
+   *       listing that accepts files starting with "."
+   */
+  override def listLeafFiles(paths: Seq[Path]): mutable.LinkedHashSet[FileStatus] = {
+    val startTime = System.nanoTime()
+    val output = mutable.LinkedHashSet[FileStatus]()
+    val pathsToFetch = mutable.ArrayBuffer[Path]()
+    for (path <- paths) {
+      fileStatusCache.getLeafFiles(path) match {
+        case Some(files) =>
+          HiveCatalogMetrics.incrementFileCacheHits(files.length)
+          output ++= files
+        case None =>
+          pathsToFetch += path
+      }
+      () // for some reason scalac 2.12 needs this; return type doesn't matter
+    }
+    val filter = FileInputFormat.getInputPathFilter(new JobConf(hadoopConf, this.getClass))
+    val discovered = bulkListLeafFiles(sparkSession, pathsToFetch, filter, hadoopConf)
+
+    discovered.foreach { case (path, leafFiles) =>
+      HiveCatalogMetrics.incrementFilesDiscovered(leafFiles.size)
+      fileStatusCache.putLeafFiles(path, leafFiles.toArray)
+      output ++= leafFiles
+    }
+
+    logInfo(s"It took ${(System.nanoTime() - startTime) / (1000 * 1000)} ms to list leaf files" +
+      s" for ${paths.length} paths.")
+
+    output
+  }
+
+  protected def bulkListLeafFiles(sparkSession: SparkSession, paths: ArrayBuffer[Path], filter: PathFilter, hadoopConf: Configuration): Seq[(Path, Seq[FileStatus])] = {
+    HoodieHadoopFSUtils.parallelListLeafFiles(
+      sc = sparkSession.sparkContext,
+      paths = paths,
+      hadoopConf = hadoopConf,
+      filter = new PathFilterWrapper(filter),
+      ignoreMissingFiles = sparkSession.sessionState.conf.ignoreMissingFiles,
+      // NOTE: We're disabling fetching Block Info to speed up file listing

Review comment:
       I meant that when we want to understand which parts of the code are modified in Hudi, we can search for a special token and find the relevant code. `NOTE:` might also come from the original code base, so I wanted to make the marker distinctive.
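
A rough sketch of the convention being suggested here (the tag name below is hypothetical and not defined by this PR): each deliberate deviation from the copied Spark code carries a unique, greppable marker instead of a plain `NOTE:`, for example on the line being reviewed above:

// MODIFIED-FOR-HUDI: unlike Spark's stock listing, do not skip files whose names start with "."
filter = new PathFilterWrapper(filter),

A search such as `git grep "MODIFIED-FOR-HUDI"` would then surface every intentional divergence from the upstream Spark sources.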




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org
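
For context on the class quoted in the diff above, a minimal usage sketch follows. It assumes a local SparkSession and a placeholder table path (neither is taken from the PR); in Spark's InMemoryFileIndex the constructor performs the initial listing, so building the index exercises the overridden listLeafFiles shown above.

import org.apache.hadoop.fs.Path
import org.apache.spark.sql.SparkSession
import org.apache.spark.execution.datasources.HoodieInMemoryFileIndex

object ListingSketch {
  def main(args: Array[String]): Unit = {
    // Placeholder session and table location; point this at a real table to try it.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("hoodie-in-memory-file-index-sketch")
      .getOrCreate()

    val basePath = new Path("file:///tmp/hudi_trips_cow")

    // Constructing the index triggers the leaf-file listing; large directories are
    // listed in parallel via HoodieHadoopFSUtils.parallelListLeafFiles.
    val fileIndex = new HoodieInMemoryFileIndex(
      sparkSession = spark,
      rootPathsSpecified = Seq(basePath),
      parameters = Map.empty,
      userSpecifiedSchema = None)

    // allFiles() exposes the FileStatus entries gathered during the listing above.
    fileIndex.allFiles().foreach(status => println(status.getPath))

    spark.stop()
  }
}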



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1068661514


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955",
       "triggerID" : "1067572461",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980",
       "triggerID" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 4c0d61041d5e7fef677af1829f0892f3e46f228e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980) 
   * be2605813c0f58e6226ca75c25c2bf40574c7a5d UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1048425436


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * af2c97711d7ff871445d0a78ca4d20b4f05dbd5d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1050289363


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * af2c97711d7ff871445d0a78ca4d20b4f05dbd5d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218) 
   * 2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1048424450


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * af2c97711d7ff871445d0a78ca4d20b4f05dbd5d UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1065838775


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 3cfbddce843a6daffb2ecaddc1c653a2f29520e2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067280623


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 28607fbed4e475b976e4508c00bea4a5551ca45d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881) 
   * 5d53a0d1958010dc3f8e4fd17fc4e514b2edb406 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1050332188


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281) 
   * d875e412abc29bf6a0e8a6fa7bef747ded15d60b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1050286958


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * af2c97711d7ff871445d0a78ca4d20b4f05dbd5d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218) 
   * 2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1050286958


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * af2c97711d7ff871445d0a78ca4d20b4f05dbd5d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218) 
   * 2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1048424450


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * af2c97711d7ff871445d0a78ca4d20b4f05dbd5d UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1065984525


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 28607fbed4e475b976e4508c00bea4a5551ca45d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1065784011


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2940f46a133ca3142f7ebb26b8c6f20583d7f395 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814) 
   * c71cfab947fd81c1aa63a0b8d52f70fa194ade5b UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1065838316


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * c71cfab947fd81c1aa63a0b8d52f70fa194ade5b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860) 
   * 3cfbddce843a6daffb2ecaddc1c653a2f29520e2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067472548


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 77afdd4b216c475f9a56b6d005bc507c78134a9f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938) 
   * 920d8e63ac2343edc09a55f35658343eaac613df UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067321388


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 5d53a0d1958010dc3f8e4fd17fc4e514b2edb406 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935) 
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067285455


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 28607fbed4e475b976e4508c00bea4a5551ca45d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881) 
   * 5d53a0d1958010dc3f8e4fd17fc4e514b2edb406 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935) 
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067285455


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 28607fbed4e475b976e4508c00bea4a5551ca45d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881) 
   * 5d53a0d1958010dc3f8e4fd17fc4e514b2edb406 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935) 
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067416321


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 77afdd4b216c475f9a56b6d005bc507c78134a9f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] YannByron commented on a change in pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
YannByron commented on a change in pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#discussion_r826534174



##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieBaseRelation.scala
##########
@@ -130,22 +158,110 @@ abstract class HoodieBaseRelation(val sqlContext: SQLContext,
    * NOTE: DO NOT OVERRIDE THIS METHOD
    */
   override final def buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row] = {
+    // NOTE: In case the list of requested columns doesn't contain the primary key, we
+    //       have to add it explicitly so that
+    //          - Merging could be performed correctly
+    //          - In case 0 columns are to be fetched (for example, when doing {@code count()} on Spark's [[Dataset]]),
+    //            Spark still fetches all the rows so that the query executes correctly
+    //
+    //       It's okay to return columns that have not been requested by the caller, as those nevertheless will be
+    //       filtered out upstream
+    val fetchedColumns: Array[String] = appendMandatoryColumns(requiredColumns)
+
+    val (requiredAvroSchema, requiredStructSchema) =
+      HoodieSparkUtils.getRequiredSchema(tableAvroSchema, fetchedColumns)
+
+    val filterExpressions = convertToExpressions(filters)
+    val (partitionFilters, dataFilters) = filterExpressions.partition(isPartitionPredicate)
+
+    val fileSplits = collectFileSplits(partitionFilters, dataFilters)
+
+    val partitionSchema = StructType(Nil)
+    val tableSchema = HoodieTableSchema(tableStructSchema, tableAvroSchema.toString)
+    val requiredSchema = HoodieTableSchema(requiredStructSchema, requiredAvroSchema.toString)
+
     // Here we rely on a type erasure, to workaround inherited API restriction and pass [[RDD[InternalRow]]] back as [[RDD[Row]]]
     // Please check [[needConversion]] scala-doc for more details
-    doBuildScan(requiredColumns, filters).asInstanceOf[RDD[Row]]
+    composeRDD(fileSplits, partitionSchema, tableSchema, requiredSchema, filters).asInstanceOf[RDD[Row]]
   }
 
-  protected def doBuildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[InternalRow]
+  // TODO scala-doc
+  protected def composeRDD(fileSplits: Seq[FileSplit],
+                           partitionSchema: StructType,
+                           tableSchema: HoodieTableSchema,
+                           requiredSchema: HoodieTableSchema,
+                           filters: Array[Filter]): HoodieUnsafeRDD
+
+  // TODO scala-doc
+  protected def collectFileSplits(partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): Seq[FileSplit]
+
+  protected def listLatestBaseFiles(globPaths: Seq[Path], partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): Map[Path, Seq[FileStatus]] = {
+    if (globPaths.isEmpty) {
+      val partitionDirs = fileIndex.listFiles(partitionFilters, dataFilters)
+      partitionDirs.map(pd => (getPartitionPath(pd.files.head), pd.files)).toMap
+    } else {
+      val inMemoryFileIndex = HoodieSparkUtils.createInMemoryFileIndex(sparkSession, globPaths)
+      val partitionDirs = inMemoryFileIndex.listFiles(partitionFilters, dataFilters)
+
+      val fsView = new HoodieTableFileSystemView(metaClient, timeline, partitionDirs.flatMap(_.files).toArray)
+      val latestBaseFiles = fsView.getLatestBaseFiles.iterator().asScala.toList.map(_.getFileStatus)
+
+      latestBaseFiles.groupBy(getPartitionPath)
+    }
+  }
+
+  protected def convertToExpressions(filters: Array[Filter]): Array[Expression] = {
+    val catalystExpressions = HoodieSparkUtils.convertToCatalystExpressions(filters, tableStructSchema)
+
+    val failedExprs = catalystExpressions.zipWithIndex.filter { case (opt, _) => opt.isEmpty }
+    if (failedExprs.nonEmpty) {
+      val failedFilters = failedExprs.map(p => filters(p._2))
+      logWarning(s"Failed to convert Filters into Catalyst expressions (${failedFilters.map(_.toString)})")
+    }
+
+    catalystExpressions.filter(_.isDefined).map(_.get).toArray
+  }
+
+  /**
+   * Checks whether given expression only references partition columns
+   * (and involves no sub-query)
+   */
+  protected def isPartitionPredicate(condition: Expression): Boolean = {
+    // Validates that the provided names both resolve to the same entity
+    val resolvedNameEquals = sparkSession.sessionState.analyzer.resolver
+
+    condition.references.forall { r => partitionColumns.exists(resolvedNameEquals(r.name, _)) } &&
+      !SubqueryExpression.hasSubquery(condition)
+  }
 
   protected final def appendMandatoryColumns(requestedColumns: Array[String]): Array[String] = {
     val missing = mandatoryColumns.filter(col => !requestedColumns.contains(col))
     requestedColumns ++ missing
   }
+
+  private def getPrecombineFieldProperty: Option[String] =
+    Option(tableConfig.getPreCombineField)
+      .orElse(optParams.get(DataSourceWriteOptions.PRECOMBINE_FIELD.key)) match {
+      // NOTE: This is required to compensate for cases when empty string is used to stub
+      //       property value to avoid it being set with the default value
+      // TODO(HUDI-3456) cleanup
+      case Some(f) if !StringUtils.isNullOrEmpty(f) => Some(f)
+      case _ => None
+    }
+
+  private def imbueConfigs(sqlContext: SQLContext): Unit = {
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.filterPushdown", "true")
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.recordLevelFilter.enabled", "true")
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.enableVectorizedReader", "true")
+  }

Review comment:
       Yep, `enableVectorizedReader` was false before. But as discussed with @alexeykudinkin earlier, we need to enable it to speed up reads.
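
       For context, here is a minimal, hypothetical sketch (not part of this PR) of the same three Parquet reader settings that `imbueConfigs` applies, toggled through the public `SparkSession` configuration API; the app name and master are made up for illustration:

```scala
import org.apache.spark.sql.SparkSession

object ParquetReaderSettingsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")                 // illustrative local session
      .appName("parquet-reader-settings")
      .getOrCreate()

    // Push filters down to Parquet, evaluate them at record level,
    // and enable the vectorized Parquet reader (the setting discussed above)
    spark.conf.set("spark.sql.parquet.filterPushdown", "true")
    spark.conf.set("spark.sql.parquet.recordLevelFilter.enabled", "true")
    spark.conf.set("spark.sql.parquet.enableVectorizedReader", "true")

    // Print the effective values to confirm they took hold
    Seq(
      "spark.sql.parquet.filterPushdown",
      "spark.sql.parquet.recordLevelFilter.enabled",
      "spark.sql.parquet.enableVectorizedReader"
    ).foreach(key => println(s"$key = ${spark.conf.get(key)}"))

    spark.stop()
  }
}
```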







[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1069352442


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955",
       "triggerID" : "1067572461",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980",
       "triggerID" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6988",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1069347203",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7004",
       "triggerID" : "1069347203",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "ec7e1b35d67587dce80cfb813aa1b20df82b8c65",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "ec7e1b35d67587dce80cfb813aa1b20df82b8c65",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 4c0d61041d5e7fef677af1829f0892f3e46f228e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6988) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7004) 
   * be2605813c0f58e6226ca75c25c2bf40574c7a5d UNKNOWN
   * ec7e1b35d67587dce80cfb813aa1b20df82b8c65 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1069349900


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955",
       "triggerID" : "1067572461",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980",
       "triggerID" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6988",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1069347203",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7004",
       "triggerID" : "1069347203",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 4c0d61041d5e7fef677af1829f0892f3e46f228e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6988) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7004) 
   * be2605813c0f58e6226ca75c25c2bf40574c7a5d UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067327600


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 5d53a0d1958010dc3f8e4fd17fc4e514b2edb406 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935) 
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 77afdd4b216c475f9a56b6d005bc507c78134a9f UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067484306


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 77afdd4b216c475f9a56b6d005bc507c78134a9f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938) 
   * 920d8e63ac2343edc09a55f35658343eaac613df Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067513941


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 920d8e63ac2343edc09a55f35658343eaac613df Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1068752201


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955",
       "triggerID" : "1067572461",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980",
       "triggerID" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6988",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 4c0d61041d5e7fef677af1829f0892f3e46f228e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6988) 
   * be2605813c0f58e6226ca75c25c2bf40574c7a5d UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1064716174


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d875e412abc29bf6a0e8a6fa7bef747ded15d60b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284) 
   * 2940f46a133ca3142f7ebb26b8c6f20583d7f395 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1065784011


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2940f46a133ca3142f7ebb26b8c6f20583d7f395 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814) 
   * c71cfab947fd81c1aa63a0b8d52f70fa194ade5b UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1065975770


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 3cfbddce843a6daffb2ecaddc1c653a2f29520e2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869) 
   * 28607fbed4e475b976e4508c00bea4a5551ca45d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] xushiyan commented on a change in pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
xushiyan commented on a change in pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#discussion_r824682968



##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/BaseFileOnlyRelation.scala
##########
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi
+
+import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.fs.Path
+import org.apache.hudi.HoodieBaseRelation.createBaseFileReader
+import org.apache.hudi.common.table.HoodieTableMetaClient
+import org.apache.spark.sql.SQLContext
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.expressions.Expression
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.sources.{BaseRelation, Filter}
+import org.apache.spark.sql.types.StructType
+
+/**
+ * [[BaseRelation]] implementation that reads only the base files of Hudi tables, essentially supporting the following
+ * query modes:
+ * <ul>
+ * <li>For COW tables: Snapshot</li>
+ * <li>For MOR tables: Read-optimized</li>
+ * </ul>
+ *
+ * NOTE: The reason this Relation is used in lieu of Spark's default [[HadoopFsRelation]] is primarily due to the
+ * fact that the latter injects the real partition path as the value of the partition field, which Hudi ultimately
+ * persists as part of the record payload. In some cases, however, the partition path might not be equal to the
+ * verbatim value of the partition-path field (for example, when a custom [[KeyGenerator]] is used), therefore
+ * leading to incorrect partition field values being written.
+ */
+class BaseFileOnlyRelation(sqlContext: SQLContext,

Review comment:
       If you rename the file first and commit that, then make a second commit with the actual changes, Git should detect it as a rename.
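
       As a usage-level illustration (not part of this PR), here is a minimal, hypothetical sketch of the two query modes the scaladoc above says this relation serves, COW snapshot and MOR read-optimized; the table path is made up, and the option key and values assume the standard Hudi DataSource read options:

```scala
import org.apache.spark.sql.SparkSession

object BaseFileOnlyReadSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("hudi-base-file-only-read")
      .getOrCreate()

    val basePath = "/tmp/hudi/example_table"  // hypothetical table location

    // COW table: a snapshot query reads only base files
    val snapshotDf = spark.read.format("hudi")
      .option("hoodie.datasource.query.type", "snapshot")
      .load(basePath)

    // MOR table: a read-optimized query also reads only base files, skipping log files
    val readOptimizedDf = spark.read.format("hudi")
      .option("hoodie.datasource.query.type", "read_optimized")
      .load(basePath)

    snapshotDf.printSchema()
    println(s"read-optimized row count: ${readOptimizedDf.count()}")

    spark.stop()
  }
}
```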







[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1065838316


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * c71cfab947fd81c1aa63a0b8d52f70fa194ade5b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860) 
   * 3cfbddce843a6daffb2ecaddc1c653a2f29520e2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1065838775


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 3cfbddce843a6daffb2ecaddc1c653a2f29520e2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067288244


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 5d53a0d1958010dc3f8e4fd17fc4e514b2edb406 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935) 
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067472548


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 77afdd4b216c475f9a56b6d005bc507c78134a9f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938) 
   * 920d8e63ac2343edc09a55f35658343eaac613df UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1069354989


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955",
       "triggerID" : "1067572461",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980",
       "triggerID" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6988",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1069347203",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7004",
       "triggerID" : "1069347203",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "ec7e1b35d67587dce80cfb813aa1b20df82b8c65",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "ec7e1b35d67587dce80cfb813aa1b20df82b8c65",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 4c0d61041d5e7fef677af1829f0892f3e46f228e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6988) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7004) 
   * be2605813c0f58e6226ca75c25c2bf40574c7a5d UNKNOWN
   * ec7e1b35d67587dce80cfb813aa1b20df82b8c65 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067278341


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 28607fbed4e475b976e4508c00bea4a5551ca45d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881) 
   * 5d53a0d1958010dc3f8e4fd17fc4e514b2edb406 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067341603


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 5d53a0d1958010dc3f8e4fd17fc4e514b2edb406 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935) 
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 77afdd4b216c475f9a56b6d005bc507c78134a9f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1068703912


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955",
       "triggerID" : "1067572461",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980",
       "triggerID" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6988",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 4c0d61041d5e7fef677af1829f0892f3e46f228e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6988) 
   * be2605813c0f58e6226ca75c25c2bf40574c7a5d UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067572889


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955",
       "triggerID" : "1067572461",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 920d8e63ac2343edc09a55f35658343eaac613df Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on a change in pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#discussion_r830416222



##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieBaseRelation.scala
##########
@@ -130,22 +159,129 @@ abstract class HoodieBaseRelation(val sqlContext: SQLContext,
    * NOTE: DO NOT OVERRIDE THIS METHOD
    */
   override final def buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row] = {
+    // NOTE: In case the list of requested columns doesn't contain the Primary Key one, we
+    //       have to add it explicitly so that
+    //          - Merging could be performed correctly
+    //          - In case 0 columns are to be fetched (for ex, when doing {@code count()} on Spark's [[Dataset]]),
+    //          Spark still fetches all the rows needed to execute the query correctly
+    //
+    //       It's okay to return columns that have not been requested by the caller, as those nevertheless will be
+    //       filtered out upstream
+    val fetchedColumns: Array[String] = appendMandatoryColumns(requiredColumns)
+
+    val (requiredAvroSchema, requiredStructSchema) =
+      HoodieSparkUtils.getRequiredSchema(tableAvroSchema, fetchedColumns)
+
+    val filterExpressions = convertToExpressions(filters)
+    val (partitionFilters, dataFilters) = filterExpressions.partition(isPartitionPredicate)
+
+    val fileSplits = collectFileSplits(partitionFilters, dataFilters)
+
+    val partitionSchema = StructType(Nil)
+    val tableSchema = HoodieTableSchema(tableStructSchema, tableAvroSchema.toString)
+    val requiredSchema = HoodieTableSchema(requiredStructSchema, requiredAvroSchema.toString)
+
     // Here we rely on a type erasure, to workaround inherited API restriction and pass [[RDD[InternalRow]]] back as [[RDD[Row]]]
     // Please check [[needConversion]] scala-doc for more details
-    doBuildScan(requiredColumns, filters).asInstanceOf[RDD[Row]]
+    if (fileSplits.nonEmpty)
+      composeRDD(fileSplits, partitionSchema, tableSchema, requiredSchema, filters).asInstanceOf[RDD[Row]]
+    else
+      sparkSession.sparkContext.emptyRDD
+  }
+
+  /**
+   * Composes an RDD from the provided file splits, the table and partition schemas, and the data filters to be applied
+   *
+   * @param fileSplits      file splits to be handled by the RDD
+   * @param partitionSchema target table's partition schema
+   * @param tableSchema     target table's schema
+   * @param requiredSchema  projected schema required by the reader
+   * @param filters         data filters to be applied
+   * @return instance of RDD (implementing [[HoodieUnsafeRDD]])
+   */
+  protected def composeRDD(fileSplits: Seq[FileSplit],
+                           partitionSchema: StructType,
+                           tableSchema: HoodieTableSchema,
+                           requiredSchema: HoodieTableSchema,
+                           filters: Array[Filter]): HoodieUnsafeRDD
+
+  /**
+   * Provided with partition and data filters, collects the target file splits to read records from, while
+   * performing pruning if necessary
+   *
+   * @param partitionFilters partition filters to be applied
+   * @param dataFilters data filters to be applied
+   * @return list of [[FileSplit]] to fetch records from
+   */
+  protected def collectFileSplits(partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): Seq[FileSplit]
+
+  protected def listLatestBaseFiles(globbedPaths: Seq[Path], partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): Map[Path, Seq[FileStatus]] = {
+    val partitionDirs = if (globbedPaths.isEmpty) {
+      fileIndex.listFiles(partitionFilters, dataFilters)
+    } else {
+      val inMemoryFileIndex = HoodieInMemoryFileIndex.create(sparkSession, globbedPaths)
+      inMemoryFileIndex.listFiles(partitionFilters, dataFilters)
+    }
+
+    val fsView = new HoodieTableFileSystemView(metaClient, timeline, partitionDirs.flatMap(_.files).toArray)
+    val latestBaseFiles = fsView.getLatestBaseFiles.iterator().asScala.toList.map(_.getFileStatus)
+
+    latestBaseFiles.groupBy(getPartitionPath)
   }
 
-  protected def doBuildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[InternalRow]
+  protected def convertToExpressions(filters: Array[Filter]): Array[Expression] = {
+    val catalystExpressions = HoodieSparkUtils.convertToCatalystExpressions(filters, tableStructSchema)
+
+    val failedExprs = catalystExpressions.zipWithIndex.filter { case (opt, _) => opt.isEmpty }
+    if (failedExprs.nonEmpty) {
+      val failedFilters = failedExprs.map(p => filters(p._2))
+      logWarning(s"Failed to convert Filters into Catalyst expressions (${failedFilters.map(_.toString)})")
+    }
+
+    catalystExpressions.filter(_.isDefined).map(_.get).toArray
+  }
+
+  /**
+   * Checks whether the given expression references only partition columns
+   * (and involves no sub-query)
+   */
+  protected def isPartitionPredicate(condition: Expression): Boolean = {
+    // Validates that the provided names both resolve to the same entity
+    val resolvedNameEquals = sparkSession.sessionState.analyzer.resolver
+
+    condition.references.forall { r => partitionColumns.exists(resolvedNameEquals(r.name, _)) } &&
+      !SubqueryExpression.hasSubquery(condition)
+  }
 
   protected final def appendMandatoryColumns(requestedColumns: Array[String]): Array[String] = {
     val missing = mandatoryColumns.filter(col => !requestedColumns.contains(col))
     requestedColumns ++ missing
   }
+
+  private def getPrecombineFieldProperty: Option[String] =
+    Option(tableConfig.getPreCombineField)
+      .orElse(optParams.get(DataSourceWriteOptions.PRECOMBINE_FIELD.key)) match {
+      // NOTE: This is required to compensate for cases when an empty string is used to stub
+      //       the property value, to avoid it being set to the default value
+      // TODO(HUDI-3456) cleanup
+      case Some(f) if !StringUtils.isNullOrEmpty(f) => Some(f)
+      case _ => None
+    }
+
+  private def imbueConfigs(sqlContext: SQLContext): Unit = {
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.filterPushdown", "true")
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.recordLevelFilter.enabled", "true")
+    // TODO(HUDI-3639) vectorized reader has to be disabled to make sure MORIncrementalRelation is working properly
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.enableVectorizedReader", "false")
+  }

Review comment:
       Please take a look at the TODO note I've added to it
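
   For illustration, a minimal, self-contained sketch (plain Scala, no Spark/Hudi dependencies,
   illustrative names only) of the mandatory-column behavior described in the `buildScan` comment
   above: any mandatory column (for example the record-key or pre-combine field) missing from the
   requested projection is appended, so merging still works and a zero-column projection such as
   `count()` still reads at least the key columns.

       object MandatoryColumnsSketch {
         // Append any mandatory column the caller did not request; returning extra columns
         // is harmless since they are filtered out upstream (mirrors appendMandatoryColumns)
         def appendMandatoryColumns(requestedColumns: Array[String],
                                    mandatoryColumns: Seq[String]): Array[String] = {
           val missing = mandatoryColumns.filterNot(requestedColumns.contains)
           requestedColumns ++ missing
         }

         def main(args: Array[String]): Unit = {
           val mandatory = Seq("_hoodie_record_key", "ts")
           // Projection requested by the query: only "fare"
           println(appendMandatoryColumns(Array("fare"), mandatory).mkString(", "))
           // Zero-column projection (e.g. count()) still ends up fetching the key columns
           println(appendMandatoryColumns(Array.empty, mandatory).mkString(", "))
         }
       }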


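   The overall shape of the refactoring can also be sketched independently of Spark: the base
   relation owns the shared scan pipeline (append mandatory columns, split filters into partition
   and data filters, collect file splits), and each concrete relation only supplies
   `collectFileSplits` and `composeRDD`. The stand-in types below (FileSplit, Filter, the toy
   BaseFileOnlyRelation) are hypothetical simplifications for illustration, not Hudi's actual classes.

       object RelationTemplateSketch {
         // Hypothetical, simplified stand-ins for the FileSplit / Filter types used in the diff
         case class FileSplit(path: String)
         case class Filter(column: String)

         abstract class BaseRelation(mandatoryColumns: Seq[String]) {
           // Shared entry point, identical for all concrete relations (template method)
           final def buildScan(requiredColumns: Array[String], filters: Array[Filter]): Seq[String] = {
             val fetched = requiredColumns ++ mandatoryColumns.filterNot(requiredColumns.contains)
             val (partitionFilters, dataFilters) = filters.partition(f => isPartitionColumn(f.column))
             val splits = collectFileSplits(partitionFilters, dataFilters)
             if (splits.nonEmpty) composeRDD(splits, fetched) else Seq.empty
           }

           protected def isPartitionColumn(name: String): Boolean
           protected def collectFileSplits(partitionFilters: Seq[Filter], dataFilters: Seq[Filter]): Seq[FileSplit]
           protected def composeRDD(splits: Seq[FileSplit], columns: Array[String]): Seq[String]
         }

         // Toy concrete relation: pretends to scan only the latest base files
         class BaseFileOnlyRelation extends BaseRelation(Seq("_hoodie_record_key")) {
           protected def isPartitionColumn(name: String): Boolean = name == "partition"
           protected def collectFileSplits(pf: Seq[Filter], df: Seq[Filter]): Seq[FileSplit] =
             Seq(FileSplit("part=2022/base-file.parquet"))
           protected def composeRDD(splits: Seq[FileSplit], cols: Array[String]): Seq[String] =
             splits.map(s => s"scan ${s.path} -> [${cols.mkString(", ")}]")
         }

         def main(args: Array[String]): Unit =
           new BaseFileOnlyRelation().buildScan(Array("fare"), Array(Filter("partition"))).foreach(println)
       }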


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1048455568


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * af2c97711d7ff871445d0a78ca4d20b4f05dbd5d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1050324848


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * af2c97711d7ff871445d0a78ca4d20b4f05dbd5d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218) 
   * 2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281) 
   * d875e412abc29bf6a0e8a6fa7bef747ded15d60b UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1064717467


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d875e412abc29bf6a0e8a6fa7bef747ded15d60b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284) 
   * 2940f46a133ca3142f7ebb26b8c6f20583d7f395 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1065837783


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * c71cfab947fd81c1aa63a0b8d52f70fa194ade5b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860) 
   * 3cfbddce843a6daffb2ecaddc1c653a2f29520e2 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1064793172


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2940f46a133ca3142f7ebb26b8c6f20583d7f395 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1068616068


   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 4c0d61041d5e7fef677af1829f0892f3e46f228e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067484306


   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 77afdd4b216c475f9a56b6d005bc507c78134a9f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938) 
   * 920d8e63ac2343edc09a55f35658343eaac613df Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067341603


   ## CI report:
   
   * 5d53a0d1958010dc3f8e4fd17fc4e514b2edb406 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935) 
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 77afdd4b216c475f9a56b6d005bc507c78134a9f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1065984525


   ## CI report:
   
   * 28607fbed4e475b976e4508c00bea4a5551ca45d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067288244


   ## CI report:
   
   * 5d53a0d1958010dc3f8e4fd17fc4e514b2edb406 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935) 
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067321388


   ## CI report:
   
   * 5d53a0d1958010dc3f8e4fd17fc4e514b2edb406 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935) 
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1050324848


   ## CI report:
   
   * af2c97711d7ff871445d0a78ca4d20b4f05dbd5d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218) 
   * 2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281) 
   * d875e412abc29bf6a0e8a6fa7bef747ded15d60b UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1050326745


   ## CI report:
   
   * 2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281) 
   * d875e412abc29bf6a0e8a6fa7bef747ded15d60b UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1069404738


   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * be2605813c0f58e6226ca75c25c2bf40574c7a5d UNKNOWN
   * ec7e1b35d67587dce80cfb813aa1b20df82b8c65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7005) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] xushiyan commented on a change in pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
xushiyan commented on a change in pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#discussion_r824767545



##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieDataSourceHelper.scala
##########
@@ -65,20 +65,6 @@ object HoodieDataSourceHelper extends PredicateHelper {
     }
   }
 
-  /**
-   * Extract the required schema from [[InternalRow]]
-   */
-  def extractRequiredSchema(

Review comment:
       Is this not used anymore?
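
For context, a helper like the removed `extractRequiredSchema` would typically bind the required columns to their ordinals in the full table schema and project each `InternalRow` down to them. A minimal sketch of that pattern (not the actual removed implementation, just an illustration of the idea):

    import org.apache.spark.sql.catalyst.InternalRow
    import org.apache.spark.sql.catalyst.expressions.{BoundReference, UnsafeProjection}
    import org.apache.spark.sql.types.StructType

    // Builds a row-level projection from the full table schema down to the required columns
    def projectRows(fullSchema: StructType, requiredSchema: StructType): InternalRow => InternalRow = {
      val boundRefs = requiredSchema.fields.map { field =>
        val ordinal = fullSchema.fieldIndex(field.name)   // position of the column in the full row
        BoundReference(ordinal, field.dataType, field.nullable)
      }
      val projection = UnsafeProjection.create(boundRefs.toSeq)
      row => projection(row)
    }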

##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieBaseRelation.scala
##########
@@ -130,22 +158,110 @@ abstract class HoodieBaseRelation(val sqlContext: SQLContext,
    * NOTE: DO NOT OVERRIDE THIS METHOD
    */
   override final def buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row] = {
+    // NOTE: In case list of requested columns doesn't contain the Primary Key one, we
+    //       have to add it explicitly so that
+    //          - Merging could be performed correctly
+    //          - In case 0 columns are to be fetched (for example, when doing {@code count()} on Spark's [[Dataset]]),
+    //          Spark still fetches all the rows to execute the query correctly
+    //
+    //       It's okay to return columns that have not been requested by the caller, as those nevertheless will be
+    //       filtered out upstream
+    val fetchedColumns: Array[String] = appendMandatoryColumns(requiredColumns)
+
+    val (requiredAvroSchema, requiredStructSchema) =
+      HoodieSparkUtils.getRequiredSchema(tableAvroSchema, fetchedColumns)
+
+    val filterExpressions = convertToExpressions(filters)
+    val (partitionFilters, dataFilters) = filterExpressions.partition(isPartitionPredicate)
+
+    val fileSplits = collectFileSplits(partitionFilters, dataFilters)
+
+    val partitionSchema = StructType(Nil)
+    val tableSchema = HoodieTableSchema(tableStructSchema, tableAvroSchema.toString)
+    val requiredSchema = HoodieTableSchema(requiredStructSchema, requiredAvroSchema.toString)
+
     // Here we rely on a type erasure, to workaround inherited API restriction and pass [[RDD[InternalRow]]] back as [[RDD[Row]]]
     // Please check [[needConversion]] scala-doc for more details
-    doBuildScan(requiredColumns, filters).asInstanceOf[RDD[Row]]
+    composeRDD(fileSplits, partitionSchema, tableSchema, requiredSchema, filters).asInstanceOf[RDD[Row]]
   }
 
-  protected def doBuildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[InternalRow]
+  // TODO scala-doc
+  protected def composeRDD(fileSplits: Seq[FileSplit],
+                           partitionSchema: StructType,
+                           tableSchema: HoodieTableSchema,
+                           requiredSchema: HoodieTableSchema,
+                           filters: Array[Filter]): HoodieUnsafeRDD
+
+  // TODO scala-doc
+  protected def collectFileSplits(partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): Seq[FileSplit]
+
+  protected def listLatestBaseFiles(globPaths: Seq[Path], partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): Map[Path, Seq[FileStatus]] = {
+    if (globPaths.isEmpty) {
+      val partitionDirs = fileIndex.listFiles(partitionFilters, dataFilters)
+      partitionDirs.map(pd => (getPartitionPath(pd.files.head), pd.files)).toMap
+    } else {
+      val inMemoryFileIndex = HoodieSparkUtils.createInMemoryFileIndex(sparkSession, globPaths)
+      val partitionDirs = inMemoryFileIndex.listFiles(partitionFilters, dataFilters)
+
+      val fsView = new HoodieTableFileSystemView(metaClient, timeline, partitionDirs.flatMap(_.files).toArray)
+      val latestBaseFiles = fsView.getLatestBaseFiles.iterator().asScala.toList.map(_.getFileStatus)
+
+      latestBaseFiles.groupBy(getPartitionPath)
+    }
+  }
+
+  protected def convertToExpressions(filters: Array[Filter]): Array[Expression] = {
+    val catalystExpressions = HoodieSparkUtils.convertToCatalystExpressions(filters, tableStructSchema)
+
+    val failedExprs = catalystExpressions.zipWithIndex.filter { case (opt, _) => opt.isEmpty }
+    if (failedExprs.nonEmpty) {
+      val failedFilters = failedExprs.map(p => filters(p._2))
+      logWarning(s"Failed to convert Filters into Catalyst expressions (${failedFilters.map(_.toString)})")
+    }
+
+    catalystExpressions.filter(_.isDefined).map(_.get).toArray
+  }
+
+  /**
+   * Checks whether given expression only references partition columns
+   * (and involves no sub-query)
+   */
+  protected def isPartitionPredicate(condition: Expression): Boolean = {
+    // Validates that the provided names both resolve to the same entity
+    val resolvedNameEquals = sparkSession.sessionState.analyzer.resolver
+
+    condition.references.forall { r => partitionColumns.exists(resolvedNameEquals(r.name, _)) } &&
+      !SubqueryExpression.hasSubquery(condition)
+  }
 
   protected final def appendMandatoryColumns(requestedColumns: Array[String]): Array[String] = {
     val missing = mandatoryColumns.filter(col => !requestedColumns.contains(col))
     requestedColumns ++ missing
   }
+
+  private def getPrecombineFieldProperty: Option[String] =
+    Option(tableConfig.getPreCombineField)
+      .orElse(optParams.get(DataSourceWriteOptions.PRECOMBINE_FIELD.key)) match {
+      // NOTE: This is required to compensate for cases when empty string is used to stub
+      //       property value to avoid it being set with the default value
+      // TODO(HUDI-3456) cleanup
+      case Some(f) if !StringUtils.isNullOrEmpty(f) => Some(f)
+      case _ => None
+    }
+
+  private def imbueConfigs(sqlContext: SQLContext): Unit = {
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.filterPushdown", "true")
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.recordLevelFilter.enabled", "true")
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.enableVectorizedReader", "true")
+  }

Review comment:
       This one was previously set to `false`.
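
The NOTE in `buildScan` above is the key to this hunk: mandatory columns (such as the record key) are always fetched, even when the caller requests no columns at all, so that merging and `count()` still behave correctly. A small illustration of the effect, where the mandatory column name is assumed for the example:

    // Assumed for illustration: the record-key meta column is the only mandatory column
    val mandatoryColumns = Seq("_hoodie_record_key")

    def appendMandatoryColumns(requestedColumns: Array[String]): Array[String] = {
      val missing = mandatoryColumns.filter(col => !requestedColumns.contains(col))
      requestedColumns ++ missing
    }

    appendMandatoryColumns(Array.empty[String])                   // Array(_hoodie_record_key): count() still reads one column
    appendMandatoryColumns(Array("fare", "_hoodie_record_key"))   // unchanged, nothing is duplicated

Returning a column the caller did not ask for is safe here because Spark filters the extra column out upstream, as the comment in the hunk notes.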

##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieBaseRelation.scala
##########
@@ -130,22 +158,110 @@ abstract class HoodieBaseRelation(val sqlContext: SQLContext,
    * NOTE: DO NOT OVERRIDE THIS METHOD
    */
   override final def buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row] = {
+    // NOTE: In case list of requested columns doesn't contain the Primary Key one, we
+    //       have to add it explicitly so that
+    //          - Merging could be performed correctly
+    //          - In case 0 columns are to be fetched (for example, when doing {@code count()} on Spark's [[Dataset]]),
+    //          Spark still fetches all the rows to execute the query correctly
+    //
+    //       It's okay to return columns that have not been requested by the caller, as those nevertheless will be
+    //       filtered out upstream
+    val fetchedColumns: Array[String] = appendMandatoryColumns(requiredColumns)
+
+    val (requiredAvroSchema, requiredStructSchema) =
+      HoodieSparkUtils.getRequiredSchema(tableAvroSchema, fetchedColumns)
+
+    val filterExpressions = convertToExpressions(filters)
+    val (partitionFilters, dataFilters) = filterExpressions.partition(isPartitionPredicate)
+
+    val fileSplits = collectFileSplits(partitionFilters, dataFilters)
+
+    val partitionSchema = StructType(Nil)
+    val tableSchema = HoodieTableSchema(tableStructSchema, tableAvroSchema.toString)
+    val requiredSchema = HoodieTableSchema(requiredStructSchema, requiredAvroSchema.toString)
+
     // Here we rely on a type erasure, to workaround inherited API restriction and pass [[RDD[InternalRow]]] back as [[RDD[Row]]]
     // Please check [[needConversion]] scala-doc for more details
-    doBuildScan(requiredColumns, filters).asInstanceOf[RDD[Row]]
+    composeRDD(fileSplits, partitionSchema, tableSchema, requiredSchema, filters).asInstanceOf[RDD[Row]]
   }
 
-  protected def doBuildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[InternalRow]
+  // TODO scala-doc
+  protected def composeRDD(fileSplits: Seq[FileSplit],
+                           partitionSchema: StructType,
+                           tableSchema: HoodieTableSchema,
+                           requiredSchema: HoodieTableSchema,
+                           filters: Array[Filter]): HoodieUnsafeRDD
+
+  // TODO scala-doc

Review comment:
       Add the scala-doc now?
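
Related to this hunk, `convertToExpressions` follows a common pattern: try the conversion for every pushed-down `Filter`, log the ones that could not be converted, and keep the rest. A self-contained sketch of that pattern, with `tryConvert` standing in for the per-filter conversion done by `HoodieSparkUtils.convertToCatalystExpressions` (an assumption for the sketch, not the real signature):

    import org.apache.spark.sql.catalyst.expressions.Expression
    import org.apache.spark.sql.sources.Filter

    // tryConvert returns None when a filter has no Catalyst equivalent
    def convertToExpressions(filters: Array[Filter])
                            (tryConvert: Filter => Option[Expression]): Array[Expression] = {
      val converted = filters.map(f => f -> tryConvert(f))
      val failed = converted.collect { case (f, None) => f }
      if (failed.nonEmpty) {
        // the relation uses logWarning; println keeps the sketch dependency-free
        println(s"Failed to convert Filters into Catalyst expressions: ${failed.mkString(", ")}")
      }
      converted.collect { case (_, Some(expr)) => expr }
    }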

##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieBaseRelation.scala
##########
@@ -130,22 +158,110 @@ abstract class HoodieBaseRelation(val sqlContext: SQLContext,
    * NOTE: DO NOT OVERRIDE THIS METHOD
    */
   override final def buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row] = {
+    // NOTE: In case list of requested columns doesn't contain the Primary Key one, we
+    //       have to add it explicitly so that
+    //          - Merging could be performed correctly
+    //          - In case 0 columns are to be fetched (for example, when doing {@code count()} on Spark's [[Dataset]]),
+    //          Spark still fetches all the rows to execute the query correctly
+    //
+    //       It's okay to return columns that have not been requested by the caller, as those nevertheless will be
+    //       filtered out upstream
+    val fetchedColumns: Array[String] = appendMandatoryColumns(requiredColumns)
+
+    val (requiredAvroSchema, requiredStructSchema) =
+      HoodieSparkUtils.getRequiredSchema(tableAvroSchema, fetchedColumns)
+
+    val filterExpressions = convertToExpressions(filters)
+    val (partitionFilters, dataFilters) = filterExpressions.partition(isPartitionPredicate)
+
+    val fileSplits = collectFileSplits(partitionFilters, dataFilters)
+
+    val partitionSchema = StructType(Nil)
+    val tableSchema = HoodieTableSchema(tableStructSchema, tableAvroSchema.toString)
+    val requiredSchema = HoodieTableSchema(requiredStructSchema, requiredAvroSchema.toString)
+
     // Here we rely on a type erasure, to workaround inherited API restriction and pass [[RDD[InternalRow]]] back as [[RDD[Row]]]
     // Please check [[needConversion]] scala-doc for more details
-    doBuildScan(requiredColumns, filters).asInstanceOf[RDD[Row]]
+    composeRDD(fileSplits, partitionSchema, tableSchema, requiredSchema, filters).asInstanceOf[RDD[Row]]
   }
 
-  protected def doBuildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[InternalRow]
+  // TODO scala-doc
+  protected def composeRDD(fileSplits: Seq[FileSplit],
+                           partitionSchema: StructType,
+                           tableSchema: HoodieTableSchema,
+                           requiredSchema: HoodieTableSchema,
+                           filters: Array[Filter]): HoodieUnsafeRDD
+
+  // TODO scala-doc
+  protected def collectFileSplits(partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): Seq[FileSplit]
+
+  protected def listLatestBaseFiles(globPaths: Seq[Path], partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): Map[Path, Seq[FileStatus]] = {
+    if (globPaths.isEmpty) {
+      val partitionDirs = fileIndex.listFiles(partitionFilters, dataFilters)
+      partitionDirs.map(pd => (getPartitionPath(pd.files.head), pd.files)).toMap
+    } else {
+      val inMemoryFileIndex = HoodieSparkUtils.createInMemoryFileIndex(sparkSession, globPaths)
+      val partitionDirs = inMemoryFileIndex.listFiles(partitionFilters, dataFilters)
+
+      val fsView = new HoodieTableFileSystemView(metaClient, timeline, partitionDirs.flatMap(_.files).toArray)
+      val latestBaseFiles = fsView.getLatestBaseFiles.iterator().asScala.toList.map(_.getFileStatus)
+
+      latestBaseFiles.groupBy(getPartitionPath)
+    }
+  }
+
+  protected def convertToExpressions(filters: Array[Filter]): Array[Expression] = {
+    val catalystExpressions = HoodieSparkUtils.convertToCatalystExpressions(filters, tableStructSchema)
+
+    val failedExprs = catalystExpressions.zipWithIndex.filter { case (opt, _) => opt.isEmpty }
+    if (failedExprs.nonEmpty) {
+      val failedFilters = failedExprs.map(p => filters(p._2))
+      logWarning(s"Failed to convert Filters into Catalyst expressions (${failedFilters.map(_.toString)})")
+    }
+
+    catalystExpressions.filter(_.isDefined).map(_.get).toArray
+  }
+
+  /**
+   * Checks whether given expression only references partition columns
+   * (and involves no sub-query)
+   */
+  protected def isPartitionPredicate(condition: Expression): Boolean = {
+    // Validates that the provided names both resolve to the same entity
+    val resolvedNameEquals = sparkSession.sessionState.analyzer.resolver
+
+    condition.references.forall { r => partitionColumns.exists(resolvedNameEquals(r.name, _)) } &&
+      !SubqueryExpression.hasSubquery(condition)
+  }
 
   protected final def appendMandatoryColumns(requestedColumns: Array[String]): Array[String] = {
     val missing = mandatoryColumns.filter(col => !requestedColumns.contains(col))
     requestedColumns ++ missing
   }
+
+  private def getPrecombineFieldProperty: Option[String] =
+    Option(tableConfig.getPreCombineField)
+      .orElse(optParams.get(DataSourceWriteOptions.PRECOMBINE_FIELD.key)) match {
+      // NOTE: This is required to compensate for cases when empty string is used to stub
+      //       property value to avoid it being set with the default value
+      // TODO(HUDI-3456) cleanup
+      case Some(f) if !StringUtils.isNullOrEmpty(f) => Some(f)
+      case _ => None
+    }
+
+  private def imbueConfigs(sqlContext: SQLContext): Unit = {
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.filterPushdown", "true")
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.recordLevelFilter.enabled", "true")
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.enableVectorizedReader", "true")
+  }
 }
 
 object HoodieBaseRelation {
 
-  def isMetadataTable(metaClient: HoodieTableMetaClient) =
+  def getPartitionPath(fileStatus: FileStatus): Path =

Review comment:
       Better to call it `getParentPath`? `getParent` does not necessarily equal the partition path; it depends on the input.
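
To make the point concrete: `FileStatus.getPath.getParent` only yields the immediate parent directory, which coincides with the partition path just for the usual layout where base files sit directly under their partition directory. The paths below are made up for the example:

    import org.apache.hadoop.fs.Path

    val baseFile = new Path("s3://bucket/tbl/2022/03/01/abc123_1-0-1_20220301.parquet")
    baseFile.getParent   // s3://bucket/tbl/2022/03/01, here the parent happens to be the partition path
    // For a non-partitioned table the parent is simply the table base path,
    // so a name like getParentPath avoids implying the result is always a partition.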

##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/SparkHoodieTableFileIndex.scala
##########
@@ -105,13 +104,15 @@ class SparkHoodieTableFileIndex(spark: SparkSession,
    * Fetch list of latest base files w/ corresponding log files, after performing
    * partition pruning
    *
+   * TODO unify w/ HoodieFileIndex#listFiles
+   *
    * @param partitionFilters partition column filters
    * @return mapping from string partition paths to its base/log files
    */
   def listFileSlices(partitionFilters: Seq[Expression]): Map[String, Seq[FileSlice]] = {
     // Prune the partition path by the partition filters
     val prunedPartitions = HoodieCommonUtils.prunePartition(partitionSchema,
-      cachedAllInputFileSlices.asScala.keys.toSeq, partitionFilters)

Review comment:
       👍 
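
The pruning in `listFileSlices` boils down to filtering the cached partition-path to file-slice mapping by the partitions that survive the partition filters. A schematic version, with `prunePartitions` standing in for `HoodieCommonUtils.prunePartition` (assumed shape, not the real API):

    // cached: partition path -> file slices, as kept by the file index
    // prunePartitions: returns only the partition paths whose values satisfy the partition filters
    def listFileSlices[Slice](cached: Map[String, Seq[Slice]],
                              prunePartitions: Seq[String] => Seq[String]): Map[String, Seq[Slice]] = {
      val surviving = prunePartitions(cached.keys.toSeq).toSet
      cached.filter { case (partitionPath, _) => surviving.contains(partitionPath) }
    }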

##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieBaseRelation.scala
##########
@@ -130,22 +158,110 @@ abstract class HoodieBaseRelation(val sqlContext: SQLContext,
    * NOTE: DO NOT OVERRIDE THIS METHOD
    */
   override final def buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row] = {
+    // NOTE: In case list of requested columns doesn't contain the Primary Key one, we
+    //       have to add it explicitly so that
+    //          - Merging could be performed correctly
+    //          - In case 0 columns are to be fetched (for example, when doing {@code count()} on Spark's [[Dataset]]),
+    //          Spark still fetches all the rows to execute the query correctly
+    //
+    //       It's okay to return columns that have not been requested by the caller, as those nevertheless will be
+    //       filtered out upstream
+    val fetchedColumns: Array[String] = appendMandatoryColumns(requiredColumns)
+
+    val (requiredAvroSchema, requiredStructSchema) =
+      HoodieSparkUtils.getRequiredSchema(tableAvroSchema, fetchedColumns)
+
+    val filterExpressions = convertToExpressions(filters)
+    val (partitionFilters, dataFilters) = filterExpressions.partition(isPartitionPredicate)
+
+    val fileSplits = collectFileSplits(partitionFilters, dataFilters)
+
+    val partitionSchema = StructType(Nil)
+    val tableSchema = HoodieTableSchema(tableStructSchema, tableAvroSchema.toString)
+    val requiredSchema = HoodieTableSchema(requiredStructSchema, requiredAvroSchema.toString)
+
     // Here we rely on a type erasure, to workaround inherited API restriction and pass [[RDD[InternalRow]]] back as [[RDD[Row]]]
     // Please check [[needConversion]] scala-doc for more details
-    doBuildScan(requiredColumns, filters).asInstanceOf[RDD[Row]]
+    composeRDD(fileSplits, partitionSchema, tableSchema, requiredSchema, filters).asInstanceOf[RDD[Row]]
   }
 
-  protected def doBuildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[InternalRow]
+  // TODO scala-doc

Review comment:
       Add the scala-doc now?
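
One more pattern worth calling out from the larger hunks above is `getPrecombineFieldProperty`: prefer the table config, fall back to the write option, and treat an empty-string stub as "not set". In isolation (parameter names are made up for the sketch):

    // fromTableConfig may be null when the table property is absent;
    // fromWriteOption mirrors the precombine field passed through the options map
    def resolvePrecombineField(fromTableConfig: String,
                               fromWriteOption: Option[String]): Option[String] =
      Option(fromTableConfig).orElse(fromWriteOption) match {
        case Some(field) if field.nonEmpty => Some(field)
        case _ => None   // the empty string is used as a stub for "not set", so it is dropped
      }

    resolvePrecombineField(null, Some(""))    // None
    resolvePrecombineField(null, Some("ts"))  // Some(ts)
    resolvePrecombineField("ts", None)        // Some(ts)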







[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on a change in pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#discussion_r830416222



##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieBaseRelation.scala
##########
@@ -130,22 +159,129 @@ abstract class HoodieBaseRelation(val sqlContext: SQLContext,
    * NOTE: DO NOT OVERRIDE THIS METHOD
    */
   override final def buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row] = {
+    // NOTE: In case list of requested columns doesn't contain the Primary Key one, we
+    //       have to add it explicitly so that
+    //          - Merging could be performed correctly
+    //          - In case 0 columns are to be fetched (for example, when doing {@code count()} on Spark's [[Dataset]]),
+    //          Spark still fetches all the rows to execute the query correctly
+    //
+    //       It's okay to return columns that have not been requested by the caller, as those nevertheless will be
+    //       filtered out upstream
+    val fetchedColumns: Array[String] = appendMandatoryColumns(requiredColumns)
+
+    val (requiredAvroSchema, requiredStructSchema) =
+      HoodieSparkUtils.getRequiredSchema(tableAvroSchema, fetchedColumns)
+
+    val filterExpressions = convertToExpressions(filters)
+    val (partitionFilters, dataFilters) = filterExpressions.partition(isPartitionPredicate)
+
+    val fileSplits = collectFileSplits(partitionFilters, dataFilters)
+
+    val partitionSchema = StructType(Nil)
+    val tableSchema = HoodieTableSchema(tableStructSchema, tableAvroSchema.toString)
+    val requiredSchema = HoodieTableSchema(requiredStructSchema, requiredAvroSchema.toString)
+
     // Here we rely on a type erasure, to workaround inherited API restriction and pass [[RDD[InternalRow]]] back as [[RDD[Row]]]
     // Please check [[needConversion]] scala-doc for more details
-    doBuildScan(requiredColumns, filters).asInstanceOf[RDD[Row]]
+    if (fileSplits.nonEmpty)
+      composeRDD(fileSplits, partitionSchema, tableSchema, requiredSchema, filters).asInstanceOf[RDD[Row]]
+    else
+      sparkSession.sparkContext.emptyRDD
+  }
+
+  /**
+   * Composes an RDD from the provided file splits to read from, the table and partition schemas, and the data filters to be applied
+   *
+   * @param fileSplits      file splits to be handled by the RDD
+   * @param partitionSchema target table's partition schema
+   * @param tableSchema     target table's schema
+   * @param requiredSchema  projected schema required by the reader
+   * @param filters         data filters to be applied
+   * @return instance of RDD (implementing [[HoodieUnsafeRDD]])
+   */
+  protected def composeRDD(fileSplits: Seq[FileSplit],
+                           partitionSchema: StructType,
+                           tableSchema: HoodieTableSchema,
+                           requiredSchema: HoodieTableSchema,
+                           filters: Array[Filter]): HoodieUnsafeRDD
+
+  /**
+   * Provided with partition and data filters, collects target file splits to read records from, while
+   * performing pruning if necessary
+   *
+   * @param partitionFilters partition filters to be applied
+   * @param dataFilters data filters to be applied
+   * @return list of [[FileSplit]] to fetch records from
+   */
+  protected def collectFileSplits(partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): Seq[FileSplit]
+
+  protected def listLatestBaseFiles(globbedPaths: Seq[Path], partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): Map[Path, Seq[FileStatus]] = {
+    val partitionDirs = if (globbedPaths.isEmpty) {
+      fileIndex.listFiles(partitionFilters, dataFilters)
+    } else {
+      val inMemoryFileIndex = HoodieInMemoryFileIndex.create(sparkSession, globbedPaths)
+      inMemoryFileIndex.listFiles(partitionFilters, dataFilters)
+    }
+
+    val fsView = new HoodieTableFileSystemView(metaClient, timeline, partitionDirs.flatMap(_.files).toArray)
+    val latestBaseFiles = fsView.getLatestBaseFiles.iterator().asScala.toList.map(_.getFileStatus)
+
+    latestBaseFiles.groupBy(getPartitionPath)
   }
 
-  protected def doBuildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[InternalRow]
+  protected def convertToExpressions(filters: Array[Filter]): Array[Expression] = {
+    val catalystExpressions = HoodieSparkUtils.convertToCatalystExpressions(filters, tableStructSchema)
+
+    val failedExprs = catalystExpressions.zipWithIndex.filter { case (opt, _) => opt.isEmpty }
+    if (failedExprs.nonEmpty) {
+      val failedFilters = failedExprs.map(p => filters(p._2))
+      logWarning(s"Failed to convert Filters into Catalyst expressions (${failedFilters.map(_.toString)})")
+    }
+
+    catalystExpressions.filter(_.isDefined).map(_.get).toArray
+  }
+
+  /**
+   * Checks whether given expression only references partition columns
+   * (and involves no sub-query)
+   */
+  protected def isPartitionPredicate(condition: Expression): Boolean = {
+    // Validates that the provided names both resolve to the same entity
+    val resolvedNameEquals = sparkSession.sessionState.analyzer.resolver
+
+    condition.references.forall { r => partitionColumns.exists(resolvedNameEquals(r.name, _)) } &&
+      !SubqueryExpression.hasSubquery(condition)
+  }
 
   protected final def appendMandatoryColumns(requestedColumns: Array[String]): Array[String] = {
     val missing = mandatoryColumns.filter(col => !requestedColumns.contains(col))
     requestedColumns ++ missing
   }
+
+  private def getPrecombineFieldProperty: Option[String] =
+    Option(tableConfig.getPreCombineField)
+      .orElse(optParams.get(DataSourceWriteOptions.PRECOMBINE_FIELD.key)) match {
+      // NOTE: This is required to compensate for cases when an empty string is used to stub
+      //       the property value, to avoid it being set to the default value
+      // TODO(HUDI-3456) cleanup
+      case Some(f) if !StringUtils.isNullOrEmpty(f) => Some(f)
+      case _ => None
+    }
+
+  private def imbueConfigs(sqlContext: SQLContext): Unit = {
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.filterPushdown", "true")
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.recordLevelFilter.enabled", "true")
+    // TODO(HUDI-3639) vectorized reader has to be disabled to make sure MORIncrementalRelation is working properly
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.enableVectorizedReader", "false")
+  }

Review comment:
       Please take a look at the TODO note I've added there. We can't do that because MORIncrementalRelation relies on Parquet record-level filtering, which doesn't work with the vectorized reader.
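
       To make the dependency concrete, here is a minimal sketch (not the Hudi code itself; it only mirrors the three config keys set by `imbueConfigs` above and assumes an existing `SparkSession` named `spark`):

       ```scala
       import org.apache.spark.sql.SparkSession

       // Session-level combination enforced by imbueConfigs()
       def enableRecordLevelFiltering(spark: SparkSession): Unit = {
         // push predicates down into the Parquet reader
         spark.conf.set("spark.sql.parquet.filterPushdown", "true")
         // apply those predicates to individual records, not just to row groups
         spark.conf.set("spark.sql.parquet.recordLevelFilter.enabled", "true")
         // record-level filtering is only honored by the non-vectorized Parquet reader,
         // so the vectorized reader has to stay disabled (see TODO(HUDI-3639))
         spark.conf.set("spark.sql.parquet.enableVectorizedReader", "false")
       }
       ```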




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] XuQianJin-Stars commented on a change in pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
XuQianJin-Stars commented on a change in pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#discussion_r830415829



##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieBaseRelation.scala
##########
@@ -130,22 +159,129 @@ abstract class HoodieBaseRelation(val sqlContext: SQLContext,
    * NOTE: DO NOT OVERRIDE THIS METHOD
    */
   override final def buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row] = {
+    // NOTE: In case the list of requested columns doesn't contain the Primary Key one, we
+    //       have to add it explicitly so that
+    //          - Merging could be performed correctly
+    //          - In case 0 columns are to be fetched (for ex, when doing {@code count()} on Spark's [[Dataset]]),
+    //            Spark still fetches all the rows to execute the query correctly
+    //
+    //       It's okay to return columns that have not been requested by the caller, as those will nevertheless be
+    //       filtered out upstream
+    val fetchedColumns: Array[String] = appendMandatoryColumns(requiredColumns)
+
+    val (requiredAvroSchema, requiredStructSchema) =
+      HoodieSparkUtils.getRequiredSchema(tableAvroSchema, fetchedColumns)
+
+    val filterExpressions = convertToExpressions(filters)
+    val (partitionFilters, dataFilters) = filterExpressions.partition(isPartitionPredicate)
+
+    val fileSplits = collectFileSplits(partitionFilters, dataFilters)
+
+    val partitionSchema = StructType(Nil)
+    val tableSchema = HoodieTableSchema(tableStructSchema, tableAvroSchema.toString)
+    val requiredSchema = HoodieTableSchema(requiredStructSchema, requiredAvroSchema.toString)
+
     // Here we rely on type erasure to work around an inherited API restriction and pass [[RDD[InternalRow]]] back as [[RDD[Row]]]
     // Please check [[needConversion]] scala-doc for more details
-    doBuildScan(requiredColumns, filters).asInstanceOf[RDD[Row]]
+    if (fileSplits.nonEmpty)
+      composeRDD(fileSplits, partitionSchema, tableSchema, requiredSchema, filters).asInstanceOf[RDD[Row]]
+    else
+      sparkSession.sparkContext.emptyRDD
+  }
+
+  /**
+   * Composes an RDD from the provided file splits to read from, the table and partition schemas, and the data filters to be applied
+   *
+   * @param fileSplits      file splits to be handled by the RDD
+   * @param partitionSchema target table's partition schema
+   * @param tableSchema     target table's schema
+   * @param requiredSchema  projected schema required by the reader
+   * @param filters         data filters to be applied
+   * @return instance of RDD (implementing [[HoodieUnsafeRDD]])
+   */
+  protected def composeRDD(fileSplits: Seq[FileSplit],
+                           partitionSchema: StructType,
+                           tableSchema: HoodieTableSchema,
+                           requiredSchema: HoodieTableSchema,
+                           filters: Array[Filter]): HoodieUnsafeRDD
+
+  /**
+   * Provided with partition and data filters, collects target file splits to read records from, while
+   * performing pruning if necessary
+   *
+   * @param partitionFilters partition filters to be applied
+   * @param dataFilters data filters to be applied
+   * @return list of [[FileSplit]] to fetch records from
+   */
+  protected def collectFileSplits(partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): Seq[FileSplit]
+
+  protected def listLatestBaseFiles(globbedPaths: Seq[Path], partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): Map[Path, Seq[FileStatus]] = {
+    val partitionDirs = if (globbedPaths.isEmpty) {
+      fileIndex.listFiles(partitionFilters, dataFilters)
+    } else {
+      val inMemoryFileIndex = HoodieInMemoryFileIndex.create(sparkSession, globbedPaths)
+      inMemoryFileIndex.listFiles(partitionFilters, dataFilters)
+    }
+
+    val fsView = new HoodieTableFileSystemView(metaClient, timeline, partitionDirs.flatMap(_.files).toArray)
+    val latestBaseFiles = fsView.getLatestBaseFiles.iterator().asScala.toList.map(_.getFileStatus)
+
+    latestBaseFiles.groupBy(getPartitionPath)
   }
 
-  protected def doBuildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[InternalRow]
+  protected def convertToExpressions(filters: Array[Filter]): Array[Expression] = {
+    val catalystExpressions = HoodieSparkUtils.convertToCatalystExpressions(filters, tableStructSchema)
+
+    val failedExprs = catalystExpressions.zipWithIndex.filter { case (opt, _) => opt.isEmpty }
+    if (failedExprs.nonEmpty) {
+      val failedFilters = failedExprs.map(p => filters(p._2))
+      logWarning(s"Failed to convert Filters into Catalyst expressions (${failedFilters.map(_.toString)})")
+    }
+
+    catalystExpressions.filter(_.isDefined).map(_.get).toArray
+  }
+
+  /**
+   * Checks whether given expression only references partition columns
+   * (and involves no sub-query)
+   */
+  protected def isPartitionPredicate(condition: Expression): Boolean = {
+    // Validates that the provided names both resolve to the same entity
+    val resolvedNameEquals = sparkSession.sessionState.analyzer.resolver
+
+    condition.references.forall { r => partitionColumns.exists(resolvedNameEquals(r.name, _)) } &&
+      !SubqueryExpression.hasSubquery(condition)
+  }
 
   protected final def appendMandatoryColumns(requestedColumns: Array[String]): Array[String] = {
     val missing = mandatoryColumns.filter(col => !requestedColumns.contains(col))
     requestedColumns ++ missing
   }
+
+  private def getPrecombineFieldProperty: Option[String] =
+    Option(tableConfig.getPreCombineField)
+      .orElse(optParams.get(DataSourceWriteOptions.PRECOMBINE_FIELD.key)) match {
+      // NOTE: This is required to compensate for cases when an empty string is used to stub
+      //       the property value, to avoid it being set to the default value
+      // TODO(HUDI-3456) cleanup
+      case Some(f) if !StringUtils.isNullOrEmpty(f) => Some(f)
+      case _ => None
+    }
+
+  private def imbueConfigs(sqlContext: SQLContext): Unit = {
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.filterPushdown", "true")
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.recordLevelFilter.enabled", "true")
+    // TODO(HUDI-3639) vectorized reader has to be disabled to make sure MORIncrementalRelation is working properly
+    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.enableVectorizedReader", "false")
+  }

Review comment:
       Should `spark.sql.parquet.enableVectorizedReader` be set to `true` to enable vectorized read acceleration?
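
       For reference, a hypothetical sketch (not part of the PR, and assuming an existing `SparkSession` named `spark`) of what that suggestion would look like, together with the trade-off noted in the TODO above:

       ```scala
       import org.apache.spark.sql.SparkSession

       // Hypothetical: re-enabling vectorization session-wide
       def reenableVectorizedReader(spark: SparkSession): Unit = {
         // typically speeds up plain base-file (COW / read-optimized) scans ...
         spark.conf.set("spark.sql.parquet.enableVectorizedReader", "true")
         // ... but spark.sql.parquet.recordLevelFilter.enabled only takes effect on the
         // non-vectorized read path, so MOR incremental queries would lose the in-reader
         // record filtering they depend on (see TODO(HUDI-3639) in the diff above)
       }
       ```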




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on a change in pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#discussion_r830314442



##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java
##########
@@ -67,41 +49,6 @@ public static boolean doesBelongToIncrementalQuery(FileSplit s) {
     return false;
   }
 
-  // Return parquet file with a list of log files in the same file group.
-  public static List<Pair<Option<HoodieBaseFile>, List<HoodieLogFile>>> groupLogsByBaseFile(Configuration conf, List<Path> partitionPaths) {

Review comment:
       This is removed




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1065975362


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 3cfbddce843a6daffb2ecaddc1c653a2f29520e2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869) 
   * 28607fbed4e475b976e4508c00bea4a5551ca45d UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1065975770


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 3cfbddce843a6daffb2ecaddc1c653a2f29520e2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869) 
   * 28607fbed4e475b976e4508c00bea4a5551ca45d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067278341


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 28607fbed4e475b976e4508c00bea4a5551ca45d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881) 
   * 5d53a0d1958010dc3f8e4fd17fc4e514b2edb406 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067513941


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 920d8e63ac2343edc09a55f35658343eaac613df Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067327600


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 5d53a0d1958010dc3f8e4fd17fc4e514b2edb406 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935) 
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 77afdd4b216c475f9a56b6d005bc507c78134a9f UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1068752201


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955",
       "triggerID" : "1067572461",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980",
       "triggerID" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6988",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 4c0d61041d5e7fef677af1829f0892f3e46f228e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6988) 
   * be2605813c0f58e6226ca75c25c2bf40574c7a5d UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1069570286


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955",
       "triggerID" : "1067572461",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980",
       "triggerID" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6988",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1069347203",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7004",
       "triggerID" : "1069347203",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "ec7e1b35d67587dce80cfb813aa1b20df82b8c65",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7005",
       "triggerID" : "ec7e1b35d67587dce80cfb813aa1b20df82b8c65",
       "triggerType" : "PUSH"
     }, {
       "hash" : "40e5a8537517a19f685367427b00f8a43c3430d8",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7007",
       "triggerID" : "40e5a8537517a19f685367427b00f8a43c3430d8",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * be2605813c0f58e6226ca75c25c2bf40574c7a5d UNKNOWN
   * ec7e1b35d67587dce80cfb813aa1b20df82b8c65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7005) 
   * 40e5a8537517a19f685367427b00f8a43c3430d8 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7007) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1068616068


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955",
       "triggerID" : "1067572461",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980",
       "triggerID" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 4c0d61041d5e7fef677af1829f0892f3e46f228e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1069513082


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955",
       "triggerID" : "1067572461",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980",
       "triggerID" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6988",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1069347203",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7004",
       "triggerID" : "1069347203",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "ec7e1b35d67587dce80cfb813aa1b20df82b8c65",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7005",
       "triggerID" : "ec7e1b35d67587dce80cfb813aa1b20df82b8c65",
       "triggerType" : "PUSH"
     }, {
       "hash" : "40e5a8537517a19f685367427b00f8a43c3430d8",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "40e5a8537517a19f685367427b00f8a43c3430d8",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * be2605813c0f58e6226ca75c25c2bf40574c7a5d UNKNOWN
   * ec7e1b35d67587dce80cfb813aa1b20df82b8c65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7005) 
   * 40e5a8537517a19f685367427b00f8a43c3430d8 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1069636813


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955",
       "triggerID" : "1067572461",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980",
       "triggerID" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6988",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1069347203",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7004",
       "triggerID" : "1069347203",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "ec7e1b35d67587dce80cfb813aa1b20df82b8c65",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7005",
       "triggerID" : "ec7e1b35d67587dce80cfb813aa1b20df82b8c65",
       "triggerType" : "PUSH"
     }, {
       "hash" : "40e5a8537517a19f685367427b00f8a43c3430d8",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7007",
       "triggerID" : "40e5a8537517a19f685367427b00f8a43c3430d8",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * be2605813c0f58e6226ca75c25c2bf40574c7a5d UNKNOWN
   * 40e5a8537517a19f685367427b00f8a43c3430d8 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7007) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1064793172


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2940f46a133ca3142f7ebb26b8c6f20583d7f395 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1065837783


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * c71cfab947fd81c1aa63a0b8d52f70fa194ade5b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860) 
   * 3cfbddce843a6daffb2ecaddc1c653a2f29520e2 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1069361380


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955",
       "triggerID" : "1067572461",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980",
       "triggerID" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6988",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1069347203",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7004",
       "triggerID" : "1069347203",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "ec7e1b35d67587dce80cfb813aa1b20df82b8c65",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7005",
       "triggerID" : "ec7e1b35d67587dce80cfb813aa1b20df82b8c65",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 4c0d61041d5e7fef677af1829f0892f3e46f228e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6988) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7004) 
   * be2605813c0f58e6226ca75c25c2bf40574c7a5d UNKNOWN
   * ec7e1b35d67587dce80cfb813aa1b20df82b8c65 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7005) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] alexeykudinkin commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1069347203


   @hudi-bot run azure





[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1069404738


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955",
       "triggerID" : "1067572461",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980",
       "triggerID" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6988",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1069347203",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7004",
       "triggerID" : "1069347203",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "ec7e1b35d67587dce80cfb813aa1b20df82b8c65",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7005",
       "triggerID" : "ec7e1b35d67587dce80cfb813aa1b20df82b8c65",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * be2605813c0f58e6226ca75c25c2bf40574c7a5d UNKNOWN
   * ec7e1b35d67587dce80cfb813aa1b20df82b8c65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7005) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1068541450


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955",
       "triggerID" : "1067572461",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 920d8e63ac2343edc09a55f35658343eaac613df Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955) 
   * 4c0d61041d5e7fef677af1829f0892f3e46f228e UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067617927


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955",
       "triggerID" : "1067572461",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 920d8e63ac2343edc09a55f35658343eaac613df Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1065784582


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2940f46a133ca3142f7ebb26b8c6f20583d7f395 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814) 
   * c71cfab947fd81c1aa63a0b8d52f70fa194ade5b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1067572889


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955",
       "triggerID" : "1067572461",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 920d8e63ac2343edc09a55f35658343eaac613df Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1068567116


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955",
       "triggerID" : "1067572461",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980",
       "triggerID" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * 920d8e63ac2343edc09a55f35658343eaac613df Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955) 
   * 4c0d61041d5e7fef677af1829f0892f3e46f228e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457][Stacked on 4818] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1050289363


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * af2c97711d7ff871445d0a78ca4d20b4f05dbd5d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218) 
   * 2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4877:
URL: https://github.com/apache/hudi/pull/4877#issuecomment-1069570286


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6218",
       "triggerID" : "af2c97711d7ff871445d0a78ca4d20b4f05dbd5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6281",
       "triggerID" : "2eb193a6f8e5bc45663ab5a53e01ec10d5be99c2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6284",
       "triggerID" : "d875e412abc29bf6a0e8a6fa7bef747ded15d60b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6814",
       "triggerID" : "2940f46a133ca3142f7ebb26b8c6f20583d7f395",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6860",
       "triggerID" : "c71cfab947fd81c1aa63a0b8d52f70fa194ade5b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6869",
       "triggerID" : "3cfbddce843a6daffb2ecaddc1c653a2f29520e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6881",
       "triggerID" : "28607fbed4e475b976e4508c00bea4a5551ca45d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6935",
       "triggerID" : "5d53a0d1958010dc3f8e4fd17fc4e514b2edb406",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59",
       "triggerType" : "PUSH"
     }, {
       "hash" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "379c451648850deb79570b49b185e0a1e7449a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6938",
       "triggerID" : "77afdd4b216c475f9a56b6d005bc507c78134a9f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6948",
       "triggerID" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "920d8e63ac2343edc09a55f35658343eaac613df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6955",
       "triggerID" : "1067572461",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6980",
       "triggerID" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6988",
       "triggerID" : "1068702943",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "be2605813c0f58e6226ca75c25c2bf40574c7a5d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1069347203",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4c0d61041d5e7fef677af1829f0892f3e46f228e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7004",
       "triggerID" : "1069347203",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "ec7e1b35d67587dce80cfb813aa1b20df82b8c65",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7005",
       "triggerID" : "ec7e1b35d67587dce80cfb813aa1b20df82b8c65",
       "triggerType" : "PUSH"
     }, {
       "hash" : "40e5a8537517a19f685367427b00f8a43c3430d8",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7007",
       "triggerID" : "40e5a8537517a19f685367427b00f8a43c3430d8",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fa17c3dc6e76669fd8897c92ecbdc56eef3e3e59 UNKNOWN
   * 379c451648850deb79570b49b185e0a1e7449a9c UNKNOWN
   * be2605813c0f58e6226ca75c25c2bf40574c7a5d UNKNOWN
   * ec7e1b35d67587dce80cfb813aa1b20df82b8c65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7005) 
   * 40e5a8537517a19f685367427b00f8a43c3430d8 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7007) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] xushiyan merged pull request #4877: [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication

Posted by GitBox <gi...@apache.org>.
xushiyan merged pull request #4877:
URL: https://github.com/apache/hudi/pull/4877
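
For context on the PR title above: the Spark DataSource relations it refers to are the classes that serve reads of a Hudi table. As a rough illustration only — not code from the PR — the Scala sketch below shows the kind of snapshot and incremental queries that exercise that read path. The table path, begin instant time, and application name are hypothetical placeholders, and the snippet assumes a Spark session with the hudi-spark bundle on the classpath.

```scala
import org.apache.spark.sql.SparkSession

object HudiReadPathSketch {
  def main(args: Array[String]): Unit = {
    // Assumes the hudi-spark bundle jar is on the classpath.
    val spark = SparkSession.builder()
      .appName("hudi-read-path-sketch") // hypothetical app name
      .master("local[*]")
      .getOrCreate()

    // Hypothetical table location, used only for illustration.
    val basePath = "file:///tmp/hudi_example_table"

    // Snapshot query: reads the latest view of the table.
    val snapshotDf = spark.read.format("hudi")
      .option("hoodie.datasource.query.type", "snapshot")
      .load(basePath)

    // Incremental query: reads records committed after the given instant.
    // The begin instant time below is a placeholder commit timestamp.
    val incrementalDf = spark.read.format("hudi")
      .option("hoodie.datasource.query.type", "incremental")
      .option("hoodie.datasource.read.begin.instanttime", "20220301000000")
      .load(basePath)

    snapshotDf.printSchema()
    incrementalDf.show(10, truncate = false)
  }
}
```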


   

