Posted to commits@hudi.apache.org by "alexeykudinkin (via GitHub)" <gi...@apache.org> on 2023/01/31 07:35:24 UTC

[GitHub] [hudi] alexeykudinkin opened a new pull request, #7804: [HUDI-5656] Fixing NPE while reading `HoodieBootstrapRelation`

alexeykudinkin opened a new pull request, #7804:
URL: https://github.com/apache/hudi/pull/7804

   ### Change Logs
   
   Currently `HoodieBootstrapRelation` handles partitioned tables improperly, resulting in an NPE when reading a bootstrapped table. To address that, `HoodieBootstrapRelation` has been rebased onto `HoodieBaseRelation`, which provides the semantics common to all of Hudi's file-based Partition implementations (schema handling, file listing, etc.)
   
   ### Impact
   
   Addresses NPE in current implementation of `HoodieBootstrapRelation`
   
   ### Risk level (write none, low medium or high below)
   
   Medium
   
   ### Documentation Update
   
   TBA
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] alexeykudinkin commented on pull request #7804: [HUDI-5656] Fixing NPE while reading `HoodieBootstrapRelation`

Posted by "alexeykudinkin (via GitHub)" <gi...@apache.org>.
alexeykudinkin commented on PR #7804:
URL: https://github.com/apache/hudi/pull/7804#issuecomment-1410759021

   Yes, there will be tests added for it




[GitHub] [hudi] hudi-bot commented on pull request #7804: [HUDI-5656] Fixing NPE while reading `HoodieBootstrapRelation`

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7804:
URL: https://github.com/apache/hudi/pull/7804#issuecomment-1409944897

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "214938fa79f087400977256140ef633dace60663",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14810",
       "triggerID" : "214938fa79f087400977256140ef633dace60663",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 214938fa79f087400977256140ef633dace60663 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14810) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #7804: [HUDI-915][HUDI-5656] Rebased `HoodieBootstrapRelation` onto `HoodieBaseRelation`

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7804:
URL: https://github.com/apache/hudi/pull/7804#issuecomment-1432244590

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "214938fa79f087400977256140ef633dace60663",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14810",
       "triggerID" : "214938fa79f087400977256140ef633dace60663",
       "triggerType" : "PUSH"
     }, {
       "hash" : "983f5dced12040e1abd0dedc014a2aac13af5e37",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15051",
       "triggerID" : "983f5dced12040e1abd0dedc014a2aac13af5e37",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9aa0fe249539a8205625b3cd24c80a1aa07ccac0",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15069",
       "triggerID" : "9aa0fe249539a8205625b3cd24c80a1aa07ccac0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f18bb659d5887dff772f261ed1d01e11992a551f",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15216",
       "triggerID" : "f18bb659d5887dff772f261ed1d01e11992a551f",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 9aa0fe249539a8205625b3cd24c80a1aa07ccac0 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15069) 
   * f18bb659d5887dff772f261ed1d01e11992a551f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15216) 
   




[GitHub] [hudi] hudi-bot commented on pull request #7804: [HUDI-915][HUDI-5656] Rebased `HoodieBootstrapRelation` onto `HoodieBaseRelation`

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7804:
URL: https://github.com/apache/hudi/pull/7804#issuecomment-1423604915

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "214938fa79f087400977256140ef633dace60663",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14810",
       "triggerID" : "214938fa79f087400977256140ef633dace60663",
       "triggerType" : "PUSH"
     }, {
       "hash" : "983f5dced12040e1abd0dedc014a2aac13af5e37",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15051",
       "triggerID" : "983f5dced12040e1abd0dedc014a2aac13af5e37",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 214938fa79f087400977256140ef633dace60663 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14810) 
   * 983f5dced12040e1abd0dedc014a2aac13af5e37 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15051) 
   




[GitHub] [hudi] hudi-bot commented on pull request #7804: [HUDI-915][HUDI-5656] Rebased `HoodieBootstrapRelation` onto `HoodieBaseRelation`

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7804:
URL: https://github.com/apache/hudi/pull/7804#issuecomment-1442781983

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "214938fa79f087400977256140ef633dace60663",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14810",
       "triggerID" : "214938fa79f087400977256140ef633dace60663",
       "triggerType" : "PUSH"
     }, {
       "hash" : "983f5dced12040e1abd0dedc014a2aac13af5e37",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15051",
       "triggerID" : "983f5dced12040e1abd0dedc014a2aac13af5e37",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9aa0fe249539a8205625b3cd24c80a1aa07ccac0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15069",
       "triggerID" : "9aa0fe249539a8205625b3cd24c80a1aa07ccac0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f18bb659d5887dff772f261ed1d01e11992a551f",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15216",
       "triggerID" : "f18bb659d5887dff772f261ed1d01e11992a551f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "96daf49ab19a803bfe8ce25f1fc9945f685db473",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "96daf49ab19a803bfe8ce25f1fc9945f685db473",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f18bb659d5887dff772f261ed1d01e11992a551f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15216) 
   * 96daf49ab19a803bfe8ce25f1fc9945f685db473 UNKNOWN
   




[GitHub] [hudi] hudi-bot commented on pull request #7804: [HUDI-915][HUDI-5656] Rebased `HoodieBootstrapRelation` onto `HoodieBaseRelation`

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7804:
URL: https://github.com/apache/hudi/pull/7804#issuecomment-1442786346

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "214938fa79f087400977256140ef633dace60663",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14810",
       "triggerID" : "214938fa79f087400977256140ef633dace60663",
       "triggerType" : "PUSH"
     }, {
       "hash" : "983f5dced12040e1abd0dedc014a2aac13af5e37",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15051",
       "triggerID" : "983f5dced12040e1abd0dedc014a2aac13af5e37",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9aa0fe249539a8205625b3cd24c80a1aa07ccac0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15069",
       "triggerID" : "9aa0fe249539a8205625b3cd24c80a1aa07ccac0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f18bb659d5887dff772f261ed1d01e11992a551f",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15216",
       "triggerID" : "f18bb659d5887dff772f261ed1d01e11992a551f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "96daf49ab19a803bfe8ce25f1fc9945f685db473",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15376",
       "triggerID" : "96daf49ab19a803bfe8ce25f1fc9945f685db473",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f18bb659d5887dff772f261ed1d01e11992a551f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15216) 
   * 96daf49ab19a803bfe8ce25f1fc9945f685db473 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15376) 
   




[GitHub] [hudi] hudi-bot commented on pull request #7804: [HUDI-915][HUDI-5656] Rebased `HoodieBootstrapRelation` onto `HoodieBaseRelation`

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7804:
URL: https://github.com/apache/hudi/pull/7804#issuecomment-1424092294

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "214938fa79f087400977256140ef633dace60663",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14810",
       "triggerID" : "214938fa79f087400977256140ef633dace60663",
       "triggerType" : "PUSH"
     }, {
       "hash" : "983f5dced12040e1abd0dedc014a2aac13af5e37",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15051",
       "triggerID" : "983f5dced12040e1abd0dedc014a2aac13af5e37",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 983f5dced12040e1abd0dedc014a2aac13af5e37 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15051) 
   




[GitHub] [hudi] hudi-bot commented on pull request #7804: [HUDI-915][HUDI-5656] Rebased `HoodieBootstrapRelation` onto `HoodieBaseRelation`

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7804:
URL: https://github.com/apache/hudi/pull/7804#issuecomment-1425050270

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "214938fa79f087400977256140ef633dace60663",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14810",
       "triggerID" : "214938fa79f087400977256140ef633dace60663",
       "triggerType" : "PUSH"
     }, {
       "hash" : "983f5dced12040e1abd0dedc014a2aac13af5e37",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15051",
       "triggerID" : "983f5dced12040e1abd0dedc014a2aac13af5e37",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9aa0fe249539a8205625b3cd24c80a1aa07ccac0",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15069",
       "triggerID" : "9aa0fe249539a8205625b3cd24c80a1aa07ccac0",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 983f5dced12040e1abd0dedc014a2aac13af5e37 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15051) 
   * 9aa0fe249539a8205625b3cd24c80a1aa07ccac0 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15069) 
   




[GitHub] [hudi] hudi-bot commented on pull request #7804: [HUDI-915][HUDI-5656] Rebased `HoodieBootstrapRelation` onto `HoodieBaseRelation`

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7804:
URL: https://github.com/apache/hudi/pull/7804#issuecomment-1432236198

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "214938fa79f087400977256140ef633dace60663",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14810",
       "triggerID" : "214938fa79f087400977256140ef633dace60663",
       "triggerType" : "PUSH"
     }, {
       "hash" : "983f5dced12040e1abd0dedc014a2aac13af5e37",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15051",
       "triggerID" : "983f5dced12040e1abd0dedc014a2aac13af5e37",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9aa0fe249539a8205625b3cd24c80a1aa07ccac0",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15069",
       "triggerID" : "9aa0fe249539a8205625b3cd24c80a1aa07ccac0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f18bb659d5887dff772f261ed1d01e11992a551f",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "f18bb659d5887dff772f261ed1d01e11992a551f",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 9aa0fe249539a8205625b3cd24c80a1aa07ccac0 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15069) 
   * f18bb659d5887dff772f261ed1d01e11992a551f UNKNOWN
   




[GitHub] [hudi] hudi-bot commented on pull request #7804: [HUDI-915][HUDI-5656] Rebased `HoodieBootstrapRelation` onto `HoodieBaseRelation`

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7804:
URL: https://github.com/apache/hudi/pull/7804#issuecomment-1432452243

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "214938fa79f087400977256140ef633dace60663",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14810",
       "triggerID" : "214938fa79f087400977256140ef633dace60663",
       "triggerType" : "PUSH"
     }, {
       "hash" : "983f5dced12040e1abd0dedc014a2aac13af5e37",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15051",
       "triggerID" : "983f5dced12040e1abd0dedc014a2aac13af5e37",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9aa0fe249539a8205625b3cd24c80a1aa07ccac0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15069",
       "triggerID" : "9aa0fe249539a8205625b3cd24c80a1aa07ccac0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f18bb659d5887dff772f261ed1d01e11992a551f",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15216",
       "triggerID" : "f18bb659d5887dff772f261ed1d01e11992a551f",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f18bb659d5887dff772f261ed1d01e11992a551f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15216) 
   




[GitHub] [hudi] hudi-bot commented on pull request #7804: [HUDI-5656] Fixing NPE while reading `HoodieBootstrapRelation`

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7804:
URL: https://github.com/apache/hudi/pull/7804#issuecomment-1409936507

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "214938fa79f087400977256140ef633dace60663",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "214938fa79f087400977256140ef633dace60663",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 214938fa79f087400977256140ef633dace60663 UNKNOWN
   




[GitHub] [hudi] hudi-bot commented on pull request #7804: [HUDI-915][HUDI-5656] Rebased `HoodieBootstrapRelation` onto `HoodieBaseRelation`

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7804:
URL: https://github.com/apache/hudi/pull/7804#issuecomment-1423600291

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "214938fa79f087400977256140ef633dace60663",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14810",
       "triggerID" : "214938fa79f087400977256140ef633dace60663",
       "triggerType" : "PUSH"
     }, {
       "hash" : "983f5dced12040e1abd0dedc014a2aac13af5e37",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "983f5dced12040e1abd0dedc014a2aac13af5e37",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 214938fa79f087400977256140ef633dace60663 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14810) 
   * 983f5dced12040e1abd0dedc014a2aac13af5e37 UNKNOWN
   




[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #7804: [HUDI-915][HUDI-5656] Rebased `HoodieBootstrapRelation` onto `HoodieBaseRelation`

Posted by "alexeykudinkin (via GitHub)" <gi...@apache.org>.
alexeykudinkin commented on code in PR #7804:
URL: https://github.com/apache/hudi/pull/7804#discussion_r1106165388


##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieBootstrapRDD.scala:
##########
@@ -51,59 +50,44 @@ class HoodieBootstrapRDD(@transient spark: SparkSession,
       }
     }
 
-    var partitionedFileIterator: Iterator[InternalRow] = null
+    bootstrapPartition.split.skeletonFile match {
+      case Some(skeletonFile) =>
+        // It is a bootstrap split. Check both skeleton and data files.
+        if (bootstrapDataFileReader.schema.isEmpty) {
+          // No data column to fetch, hence fetch only from skeleton file
+          bootstrapSkeletonFileReader.read(skeletonFile)
+        } else if (bootstrapSkeletonFileReader.schema.isEmpty) {
+          // No metadata column to fetch, hence fetch only from data file
+          bootstrapDataFileReader.read(bootstrapPartition.split.dataFile)
+        } else {
+          // Fetch from both data and skeleton file, and merge
+          val dataFileIterator = bootstrapDataFileReader.read(bootstrapPartition.split.dataFile)
+          val skeletonFileIterator = bootstrapSkeletonFileReader.read(skeletonFile)
+          merge(skeletonFileIterator, dataFileIterator)
+        }
 
-    if (bootstrapPartition.split.skeletonFile.isDefined) {
-      // It is a bootstrap split. Check both skeleton and data files.
-      if (dataSchema.isEmpty) {
-        // No data column to fetch, hence fetch only from skeleton file
-        partitionedFileIterator = skeletonReadFunction(bootstrapPartition.split.skeletonFile.get)
-      } else if (skeletonSchema.isEmpty) {
-        // No metadata column to fetch, hence fetch only from data file
-        partitionedFileIterator = dataReadFunction(bootstrapPartition.split.dataFile)
-      } else {
-        // Fetch from both data and skeleton file, and merge
-        val dataFileIterator = dataReadFunction(bootstrapPartition.split.dataFile)
-        val skeletonFileIterator = skeletonReadFunction(bootstrapPartition.split.skeletonFile.get)
-        partitionedFileIterator = merge(skeletonFileIterator, dataFileIterator)
-      }
-    } else {
-      partitionedFileIterator = regularReadFunction(bootstrapPartition.split.dataFile)
+      case _ => regularFileReader.read(bootstrapPartition.split.dataFile)
     }
-    partitionedFileIterator
   }
 
-  def merge(skeletonFileIterator: Iterator[InternalRow], dataFileIterator: Iterator[InternalRow])
-  : Iterator[InternalRow] = {
+  def merge(skeletonFileIterator: Iterator[InternalRow], dataFileIterator: Iterator[InternalRow]): Iterator[InternalRow] = {
     new Iterator[InternalRow] {
-      override def hasNext: Boolean = dataFileIterator.hasNext && skeletonFileIterator.hasNext
-      override def next(): InternalRow = {
-        mergeInternalRow(skeletonFileIterator.next(), dataFileIterator.next())
-      }
-    }
-  }
+      private val combinedRow = new JoinedRow()
 
-  def mergeInternalRow(skeletonRow: InternalRow, dataRow: InternalRow): InternalRow = {
-    val skeletonArr  = skeletonRow.copy().toSeq(skeletonSchema)
-    val dataArr = dataRow.copy().toSeq(dataSchema)

Review Comment:
   Copying `InternalRow`s is costly performance-wise; a better approach is to use `UnsafeProjection` for that.
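
As a rough illustration of the point above, here is a minimal sketch of merging the two iterators without copying each row. It assumes Spark's catalyst classes (`JoinedRow`, `UnsafeProjection`) are on the classpath; the `MergeSketch` object and the `mergedSchema` parameter are hypothetical names for this example and are not necessarily what this PR ends up using.

```scala
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.{JoinedRow, UnsafeProjection}
import org.apache.spark.sql.types.StructType

object MergeSketch {
  // Hypothetical helper: `mergedSchema` must list the skeleton columns first,
  // followed by the data columns, matching the order the rows are joined below.
  def merge(skeletonRows: Iterator[InternalRow],
            dataRows: Iterator[InternalRow],
            mergedSchema: StructType): Iterator[InternalRow] =
    new Iterator[InternalRow] {
      // Reused for every row: JoinedRow is only a view over the two underlying rows
      private val combinedRow = new JoinedRow()
      // Writes the joined view into a single UnsafeRow buffer that is reused across calls
      private val toUnsafe = UnsafeProjection.create(mergedSchema)

      override def hasNext: Boolean = skeletonRows.hasNext && dataRows.hasNext

      override def next(): InternalRow =
        toUnsafe(combinedRow(skeletonRows.next(), dataRows.next()))
    }
}
```

Because the projection reuses its output buffer, a caller that needs to hold on to the returned rows (for example, to collect them into a list) still has to call `copy()` on each one; a streaming consumer does not.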



##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieBaseRelation.scala:
##########
@@ -451,10 +455,15 @@ abstract class HoodieBaseRelation(val sqlContext: SQLContext,
    * For enable hoodie.datasource.write.drop.partition.columns, need to create an InternalRow on partition values
    * and pass this reader on parquet file. So that, we can query the partition columns.
    */
-  protected def getPartitionColumnsAsInternalRow(file: FileStatus): InternalRow = {
+
+  protected def getPartitionColumnsAsInternalRow(file: FileStatus): InternalRow =

Review Comment:
   Some of the base-class methods are now parameterized so that callers can configure whether partition values should be parsed from the partition path (this is required for the bootstrapped relation, since this behavior in Spark is unconditional)
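
To make the parameterization concrete, here is a toy sketch in plain Scala. It is not Hudi's implementation: `PartitionValuesSketch` and both of its methods are hypothetical, and it returns a `Map` instead of an `InternalRow` to stay dependency-free. Only the flag name `shouldExtractPartitionValuesFromPartitionPath` is taken from the diff.

```scala
object PartitionValuesSketch {
  // Hypothetical helper: parses Hive-style "key=value" segments of a partition path
  def parsePartitionValues(partitionPath: String): Map[String, String] =
    partitionPath.split("/").iterator
      .filter(_.contains("="))
      .map { segment =>
        // Split on the first '=' only, so values may themselves contain '='
        val Array(key, value) = segment.split("=", 2)
        key -> value
      }
      .toMap

  // The flag is passed in explicitly, so a subclass (such as a bootstrap relation)
  // can force extraction regardless of the table-level setting
  def partitionValues(partitionPath: String,
                      shouldExtractPartitionValuesFromPartitionPath: Boolean): Map[String, String] =
    if (shouldExtractPartitionValuesFromPartitionPath) parsePartitionValues(partitionPath)
    else Map.empty
}
```

With the flag set, `partitionValues("year=2023/month=01", true)` yields the `year` and `month` entries; with it unset, the result is empty and the values are expected to come from the data file itself.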



##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieBaseRelation.scala:
##########
@@ -574,6 +584,12 @@ abstract class HoodieBaseRelation(val sqlContext: SQLContext,
 
   protected def tryPrunePartitionColumns(tableSchema: HoodieTableSchema,
                                          requiredSchema: HoodieTableSchema): (StructType, HoodieTableSchema, HoodieTableSchema) = {
+    tryPrunePartitionColumnsInternal(tableSchema, requiredSchema, shouldExtractPartitionValuesFromPartitionPath)

Review Comment:
   Same comment as above



##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieBootstrapRelation.scala:
##########
@@ -44,150 +44,169 @@ import scala.collection.JavaConverters._
   * bootstrapped files, because then the metadata file and data file can return different number of rows causing errors
   * merging.
   *
-  * @param _sqlContext Spark SQL Context
+  * @param sqlContext Spark SQL Context
   * @param userSchema User specified schema in the datasource query
   * @param globPaths  The global paths to query. If it not none, read from the globPaths,
   *                   else read data from tablePath using HoodiFileIndex.
   * @param metaClient Hoodie table meta client
   * @param optParams DataSource options passed by the user
   */
-class HoodieBootstrapRelation(@transient val _sqlContext: SQLContext,
-                              val userSchema: Option[StructType],
-                              val globPaths: Seq[Path],
-                              val metaClient: HoodieTableMetaClient,
-                              val optParams: Map[String, String]) extends BaseRelation
-  with PrunedFilteredScan with Logging {
+case class HoodieBootstrapRelation(override val sqlContext: SQLContext,

Review Comment:
   Crux of the change here is rebasing BootstrapRelation onto `HoodieBaseRelation`





[GitHub] [hudi] hudi-bot commented on pull request #7804: [HUDI-915][HUDI-5656] Rebased `HoodieBootstrapRelation` onto `HoodieBaseRelation`

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7804:
URL: https://github.com/apache/hudi/pull/7804#issuecomment-1443400674

   ## CI report:
   
   * 96daf49ab19a803bfe8ce25f1fc9945f685db473 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15376) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #7804: [HUDI-5656] Fixing NPE while reading `HoodieBootstrapRelation`

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7804:
URL: https://github.com/apache/hudi/pull/7804#issuecomment-1410271293

   ## CI report:
   
   * 214938fa79f087400977256140ef633dace60663 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14810) 
   




[GitHub] [hudi] hudi-bot commented on pull request #7804: [HUDI-915][HUDI-5656] Rebased `HoodieBootstrapRelation` onto `HoodieBaseRelation`

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7804:
URL: https://github.com/apache/hudi/pull/7804#issuecomment-1425045529

   ## CI report:
   
   * 983f5dced12040e1abd0dedc014a2aac13af5e37 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15051) 
   * 9aa0fe249539a8205625b3cd24c80a1aa07ccac0 UNKNOWN
   




[GitHub] [hudi] hudi-bot commented on pull request #7804: [HUDI-915][HUDI-5656] Rebased `HoodieBootstrapRelation` onto `HoodieBaseRelation`

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7804:
URL: https://github.com/apache/hudi/pull/7804#issuecomment-1425115940

   ## CI report:
   
   * 9aa0fe249539a8205625b3cd24c80a1aa07ccac0 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15069) 
   




[GitHub] [hudi] alexeykudinkin merged pull request #7804: [HUDI-915][HUDI-5656] Rebased `HoodieBootstrapRelation` onto `HoodieBaseRelation`

Posted by "alexeykudinkin (via GitHub)" <gi...@apache.org>.
alexeykudinkin merged PR #7804:
URL: https://github.com/apache/hudi/pull/7804

