Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/08/07 00:39:54 UTC

[GitHub] [hudi] zhangyue19921010 opened a new pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

zhangyue19921010 opened a new pull request #3413:
URL: https://github.com/apache/hudi/pull/3413


   https://issues.apache.org/jira/projects/HUDI/issues/HUDI-2277
   
   ## What is the purpose of the pull request
   Add a new source, ORCDFSSource, which extends RowSource.
   
   With this source, HoodieDeltaStreamer can read ORC files directly.
   
   Also add the necessary unit tests, which were run in our local environment.
   
   ## Brief change log
   
   *(for example:)*
     - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
     - *Added integration tests for end-to-end.*
     - *Added HoodieClientWriteTest to verify the change.*
     - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "893610117",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1650",
       "triggerID" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1656",
       "triggerID" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1669",
       "triggerID" : "897231142",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "83668ba8191ff87e8ac7305d1e5a1ae364b7fc95",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1672",
       "triggerID" : "83668ba8191ff87e8ac7305d1e5a1ae364b7fc95",
       "triggerType" : "PUSH"
     }, {
       "hash" : "285465473e7e6dbe13b28bb182515d3005c4d1ef",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1679",
       "triggerID" : "285465473e7e6dbe13b28bb182515d3005c4d1ef",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d1cb1fb267bb5b9ebffee5c4de56c220c79d0d68",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2099",
       "triggerID" : "d1cb1fb267bb5b9ebffee5c4de56c220c79d0d68",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7a21d39bce12b04c3663d8966e9923145b2ce234",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "7a21d39bce12b04c3663d8966e9923145b2ce234",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 285465473e7e6dbe13b28bb182515d3005c4d1ef Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1679) 
   * d1cb1fb267bb5b9ebffee5c4de56c220c79d0d68 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2099) 
   * 7a21d39bce12b04c3663d8966e9923145b2ce234 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "893610117",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1650",
       "triggerID" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1656",
       "triggerID" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1669",
       "triggerID" : "897231142",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "83668ba8191ff87e8ac7305d1e5a1ae364b7fc95",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1672",
       "triggerID" : "83668ba8191ff87e8ac7305d1e5a1ae364b7fc95",
       "triggerType" : "PUSH"
     }, {
       "hash" : "285465473e7e6dbe13b28bb182515d3005c4d1ef",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1679",
       "triggerID" : "285465473e7e6dbe13b28bb182515d3005c4d1ef",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d1cb1fb267bb5b9ebffee5c4de56c220c79d0d68",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2099",
       "triggerID" : "d1cb1fb267bb5b9ebffee5c4de56c220c79d0d68",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7a21d39bce12b04c3663d8966e9923145b2ce234",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2100",
       "triggerID" : "7a21d39bce12b04c3663d8966e9923145b2ce234",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 7a21d39bce12b04c3663d8966e9923145b2ce234 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2100) 
   



[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "893610117",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1650",
       "triggerID" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1656",
       "triggerID" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1669",
       "triggerID" : "897231142",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "83668ba8191ff87e8ac7305d1e5a1ae364b7fc95",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1672",
       "triggerID" : "83668ba8191ff87e8ac7305d1e5a1ae364b7fc95",
       "triggerType" : "PUSH"
     }, {
       "hash" : "285465473e7e6dbe13b28bb182515d3005c4d1ef",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1679",
       "triggerID" : "285465473e7e6dbe13b28bb182515d3005c4d1ef",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d1cb1fb267bb5b9ebffee5c4de56c220c79d0d68",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2099",
       "triggerID" : "d1cb1fb267bb5b9ebffee5c4de56c220c79d0d68",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7a21d39bce12b04c3663d8966e9923145b2ce234",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2100",
       "triggerID" : "7a21d39bce12b04c3663d8966e9923145b2ce234",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7a21d39bce12b04c3663d8966e9923145b2ce234",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2103",
       "triggerID" : "915063008",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "32223149bbb3d0c23e710fd338de4ed63e5f8be8",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2310",
       "triggerID" : "32223149bbb3d0c23e710fd338de4ed63e5f8be8",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a7a59556703d2ea881abee407f8fd88291d04d80",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "a7a59556703d2ea881abee407f8fd88291d04d80",
       "triggerType" : "PUSH"
     }, {
       "hash" : "99414ba1ee89c6cdd2f482425001aec2392d65e9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2312",
       "triggerID" : "99414ba1ee89c6cdd2f482425001aec2392d65e9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "48c1b574e284349887ef8378e00b11aa22d52eb5",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2449",
       "triggerID" : "48c1b574e284349887ef8378e00b11aa22d52eb5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "94129782254d77db4b224c5f6f4ae09e7213a6d7",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "94129782254d77db4b224c5f6f4ae09e7213a6d7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a7a59556703d2ea881abee407f8fd88291d04d80 UNKNOWN
   * 48c1b574e284349887ef8378e00b11aa22d52eb5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2449) 
   * 94129782254d77db4b224c5f6f4ae09e7213a6d7 UNKNOWN
   



[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "893610117",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1650",
       "triggerID" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1656",
       "triggerID" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1669",
       "triggerID" : "897231142",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * cf91f8edfc94e125302f8550d484589418f00c1f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1656) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1669) 
   



[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "893610117",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 45fbd4f73a6ccd0918e545702900351a2ed1070b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401) 
   * f2c826d5571ce18fa61586c7fcdf302cc0bcb95e UNKNOWN
   



[GitHub] [hudi] zhangyue19921010 closed pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 closed pull request #3413:
URL: https://github.com/apache/hudi/pull/3413


   





[GitHub] [hudi] zhangyue19921010 commented on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 commented on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-928881572


   Hi @nsivabalan @vinothchandar. Thanks a lot for your attention, review, and approval! Can we land it, or is there anything else I need to do? :)





[GitHub] [hudi] nsivabalan commented on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-927137018


   @zhangyue19921010 : do ping me here once you have addressed all comments. I can take a look.





[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636









[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] Let HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 45fbd4f73a6ccd0918e545702900351a2ed1070b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401) 
   



[GitHub] [hudi] zhangyue19921010 removed a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 removed a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-897231142









[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "893610117",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1650",
       "triggerID" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1656",
       "triggerID" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f2c826d5571ce18fa61586c7fcdf302cc0bcb95e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1650) 
   * cf91f8edfc94e125302f8550d484589418f00c1f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1656) 
   



[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] Let HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "893610117",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 45fbd4f73a6ccd0918e545702900351a2ed1070b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401) 
   



[GitHub] [hudi] zhangyue19921010 commented on a change in pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 commented on a change in pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#discussion_r713552896



##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -1398,6 +1399,34 @@ private void testParquetDFSSource(boolean useSchemaProvider, List<String> transf
     testNum++;
   }
 
+  private void testORCDFSSource(boolean useSchemaProvider, List<String> transformerClassNames) throws Exception {
+    // prepare ORCDFSSource
+    TypedProperties orcProps = new TypedProperties();
+
+    // Properties used for testing delta-streamer with orc source
+    orcProps.setProperty("include", "base.properties");
+    orcProps.setProperty("hoodie.embed.timeline.server", "false");
+    orcProps.setProperty("hoodie.datasource.write.recordkey.field", "_row_key");
+    orcProps.setProperty("hoodie.datasource.write.partitionpath.field", "not_there");
+    if (useSchemaProvider) {
+      orcProps.setProperty("hoodie.deltastreamer.schemaprovider.source.schema.file", dfsBasePath + "/" + "source.avsc");
+      if (transformerClassNames != null) {
+        orcProps.setProperty("hoodie.deltastreamer.schemaprovider.target.schema.file", dfsBasePath + "/" + "target.avsc");
+      }
+    }
+    orcProps.setProperty("hoodie.deltastreamer.source.dfs.root", ORC_SOURCE_ROOT);
+    UtilitiesTestBase.Helpers.savePropsToDFS(orcProps, dfs, dfsBasePath + "/" + PROPS_FILENAME_TEST_ORC);
+
+    String tableBasePath = dfsBasePath + "/test_orc_source_table" + testNum;
+    HoodieDeltaStreamer deltaStreamer = new HoodieDeltaStreamer(
+            TestHelpers.makeConfig(tableBasePath, WriteOperationType.INSERT, ORCDFSSource.class.getName(),
+                    transformerClassNames, PROPS_FILENAME_TEST_ORC, false,
+                    useSchemaProvider, 100000, false, null, null, "timestamp", null), jsc);
+    deltaStreamer.sync();
+    TestHelpers.assertRecordCount(ORC_NUM_RECORDS, tableBasePath + "/*/*.parquet", sqlContext);

Review comment:
       Hi @nsivabalan, thanks for your review. I think `.parquet` is correct here. This patch adds ORCDFSSource, which lets HoodieDeltaStreamer read ORC files into a Hudi table, while the table itself still uses Parquet as its base file format. So we need to match `.parquet` files when reading the Hudi table data back.
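
       To make the distinction concrete, here is a minimal sketch that keeps the source-side settings separate from the table-side glob used for verification. It does not depend on Hudi: the class `OrcSourceSketch` and its two helper methods are hypothetical names made up for illustration, and the paths are placeholders. The property keys and the `/*/*.parquet` pattern are taken from the test in the diff above.

```java
import java.util.Properties;

// Illustrative only: this class and its method names are hypothetical, not part of the patch.
class OrcSourceSketch {

    // Source-side settings: where the ORC input lives and how records are keyed.
    // The property keys are the ones set in testORCDFSSource.
    static Properties sourceProps(String orcSourceRoot) {
        Properties props = new Properties();
        props.setProperty("hoodie.datasource.write.recordkey.field", "_row_key");
        props.setProperty("hoodie.datasource.write.partitionpath.field", "not_there");
        props.setProperty("hoodie.deltastreamer.source.dfs.root", orcSourceRoot);
        return props;
    }

    // Table-side glob: matches base files one partition level below the table path.
    // The extension follows the table's base file format, not the source format.
    static String baseFileGlob(String tableBasePath, String baseFileExtension) {
        return tableBasePath + "/*/*." + baseFileExtension;
    }

    public static void main(String[] args) {
        // Placeholder paths, chosen only for this example.
        Properties props = sourceProps("/tmp/orc_source");
        System.out.println("source root: " + props.getProperty("hoodie.deltastreamer.source.dfs.root"));
        System.out.println("verify glob: " + baseFileGlob("/tmp/test_orc_source_table", "parquet"));
    }
}
```

       The input directory holds ORC files, but the glob used to count records in the written table ends in `.parquet`, which is why the assertion in the test reads `tableBasePath + "/*/*.parquet"`.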




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * 32223149bbb3d0c23e710fd338de4ed63e5f8be8 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2310) 
   * a7a59556703d2ea881abee407f8fd88291d04d80 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * 7a21d39bce12b04c3663d8966e9923145b2ce234 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2100) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2103) 
   * 32223149bbb3d0c23e710fd338de4ed63e5f8be8 UNKNOWN
   



[GitHub] [hudi] zhangyue19921010 commented on a change in pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 commented on a change in pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#discussion_r686610722



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/AvroOrcUtils.java
##########
@@ -796,4 +800,32 @@ private static Schema getActualSchemaType(Schema unionSchema) {
       return Schema.createUnion(nonNullMembers);
     }
   }
+
+  public static void addAvroRecord(

Review comment:
       Sure, done.







[GitHub] [hudi] zhangyue19921010 commented on pull request #3413: [HUDI-2277] Let HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 commented on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893610117


   @hudi-bot run travis








[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * d1cb1fb267bb5b9ebffee5c4de56c220c79d0d68 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2099) 
   * 7a21d39bce12b04c3663d8966e9923145b2ce234 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2100) 
   



[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * 45fbd4f73a6ccd0918e545702900351a2ed1070b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401) 
   * f2c826d5571ce18fa61586c7fcdf302cc0bcb95e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1650) 
   



[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * cf91f8edfc94e125302f8550d484589418f00c1f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1656) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1669) 
   * 83668ba8191ff87e8ac7305d1e5a1ae364b7fc95 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1672) 
   



[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * f2c826d5571ce18fa61586c7fcdf302cc0bcb95e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1650) 
   * cf91f8edfc94e125302f8550d484589418f00c1f UNKNOWN
   



[GitHub] [hudi] nsivabalan commented on a change in pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#discussion_r717168485



##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -1398,6 +1399,34 @@ private void testParquetDFSSource(boolean useSchemaProvider, List<String> transf
     testNum++;
   }
 
+  private void testORCDFSSource(boolean useSchemaProvider, List<String> transformerClassNames) throws Exception {
+    // prepare ORCDFSSource
+    TypedProperties orcProps = new TypedProperties();
+
+    // Properties used for testing delta-streamer with orc source
+    orcProps.setProperty("include", "base.properties");
+    orcProps.setProperty("hoodie.embed.timeline.server","false");
+    orcProps.setProperty("hoodie.datasource.write.recordkey.field", "_row_key");
+    orcProps.setProperty("hoodie.datasource.write.partitionpath.field", "not_there");
+    if (useSchemaProvider) {
+      orcProps.setProperty("hoodie.deltastreamer.schemaprovider.source.schema.file", dfsBasePath + "/" + "source.avsc");
+      if (transformerClassNames != null) {
+        orcProps.setProperty("hoodie.deltastreamer.schemaprovider.target.schema.file", dfsBasePath + "/" + "target.avsc");
+      }
+    }
+    orcProps.setProperty("hoodie.deltastreamer.source.dfs.root", ORC_SOURCE_ROOT);
+    UtilitiesTestBase.Helpers.savePropsToDFS(orcProps, dfs, dfsBasePath + "/" + PROPS_FILENAME_TEST_ORC);
+
+    String tableBasePath = dfsBasePath + "/test_orc_source_table" + testNum;
+    HoodieDeltaStreamer deltaStreamer = new HoodieDeltaStreamer(
+            TestHelpers.makeConfig(tableBasePath, WriteOperationType.INSERT, ORCDFSSource.class.getName(),
+                    transformerClassNames, PROPS_FILENAME_TEST_ORC, false,
+                    useSchemaProvider, 100000, false, null, null, "timestamp", null), jsc);
+    deltaStreamer.sync();
+    TestHelpers.assertRecordCount(ORC_NUM_RECORDS, tableBasePath + "/*/*.parquet", sqlContext);

Review comment:
       he he. my bad, got it. thanks.







[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * a7a59556703d2ea881abee407f8fd88291d04d80 UNKNOWN
   * 47ca8204f928b4486ae486dc4a4a37f45d6cd14d Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2458) 
   





[GitHub] [hudi] xushiyan commented on a change in pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
xushiyan commented on a change in pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#discussion_r717359649



##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -1936,4 +1972,12 @@ public Schema getTargetSchema() {
     }
   }
 
+  private static Stream<Arguments> testArguments() {

Review comment:
       `testArguments` is not specific enough. If the provider method uses the same name as the test method, `@MethodSource` does not need an argument: it looks for a method with the same name by default. That way this argument provider serves only its own test case method.
   
   ```suggestion
     private static Stream<Arguments> testORCDFSSource() {
   ```
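
To illustrate the naming convention described in the comment above, here is a minimal plain-Java sketch. It does not use JUnit; it only emulates, via reflection, the lookup of a no-argument provider that shares the test method's name. The class name `MethodSourceConventionSketch`, the `CALLS` list, and `runByConvention` are hypothetical, and the two argument rows are made up for the example. In the real test the provider would return `Stream<Arguments>`, and the test method would carry `@ParameterizedTest` and a bare `@MethodSource`.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

// Plain-Java sketch with no JUnit dependency; all names here are hypothetical.
final class MethodSourceConventionSketch {

  // Records each invocation of the "test" so the effect is observable.
  static final List<String> CALLS = new ArrayList<>();

  // Argument provider: same name as the test method, no parameters.
  // With JUnit 5 this would return Stream<Arguments> instead.
  static Stream<Object[]> testORCDFSSource() {
    return Stream.of(
        new Object[] {false, null},
        new Object[] {true, Collections.singletonList("TripsWithDistanceTransformer")});
  }

  // Test method: with JUnit 5 it would carry @ParameterizedTest and @MethodSource.
  static void testORCDFSSource(boolean useSchemaProvider, List<String> transformers) {
    CALLS.add(useSchemaProvider + ":" + transformers);
  }

  // Emulates the by-name lookup: the no-arg overload supplies the rows,
  // and each row is passed to the overload that takes the test parameters.
  @SuppressWarnings("unchecked")
  static void runByConvention(String name) {
    try {
      Class<?> self = MethodSourceConventionSketch.class;
      Method provider = self.getDeclaredMethod(name);
      Method test = self.getDeclaredMethod(name, boolean.class, List.class);
      Iterator<Object[]> rows = ((Stream<Object[]>) provider.invoke(null)).iterator();
      while (rows.hasNext()) {
        test.invoke(null, rows.next());
      }
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("lookup or invocation failed for " + name, e);
    }
  }

  public static void main(String[] args) {
    runByConvention("testORCDFSSource");
    System.out.println(CALLS);
  }
}
```

Because the provider and the test share one name, no string has to be kept in sync between the annotation and the provider, which is the point of the suggestion.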

##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -1622,6 +1652,12 @@ public void testParquetDFSSourceWithSchemaFilesAndTransformer() throws Exception
     testParquetDFSSource(true, Collections.singletonList(TripsWithDistanceTransformer.class.getName()));
   }
 
+  @ParameterizedTest
+  @MethodSource("testArguments")

Review comment:
       These two annotations should go on the `testORCDFSSource()` method itself; then this method is not needed.

##########
File path: hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestDataGenerator.java
##########
@@ -129,10 +129,12 @@
 
 
   public static final Schema AVRO_SCHEMA = new Schema.Parser().parse(TRIP_EXAMPLE_SCHEMA);
+  public static final Schema ORC_SCHEMA = new Schema.Parser().parse(TRIP_EXAMPLE_SCHEMA);
   public static final Schema AVRO_SCHEMA_WITH_METADATA_FIELDS =
       HoodieAvroUtils.addMetadataFields(AVRO_SCHEMA);
   public static final Schema AVRO_SHORT_TRIP_SCHEMA = new Schema.Parser().parse(SHORT_TRIP_SCHEMA);
   public static final Schema AVRO_TRIP_SCHEMA = new Schema.Parser().parse(TRIP_SCHEMA);
+  public static final Schema ORC_TRIP_SCHEMA = new Schema.Parser().parse(TRIP_SCHEMA);

Review comment:
       Actually, what I meant is to call `AvroOrcUtils.createOrcSchema()` here for this constant so callers can use it directly.







[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * a7a59556703d2ea881abee407f8fd88291d04d80 UNKNOWN
   * 94129782254d77db4b224c5f6f4ae09e7213a6d7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2456) 
   * 47ca8204f928b4486ae486dc4a4a37f45d6cd14d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2458) 
   





[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * 7a21d39bce12b04c3663d8966e9923145b2ce234 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2100) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2103) 
   * 32223149bbb3d0c23e710fd338de4ed63e5f8be8 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2310) 
   





[GitHub] [hudi] zhangyue19921010 commented on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 commented on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-913312467


   Or could we land it if possible? :)





[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * 285465473e7e6dbe13b28bb182515d3005c4d1ef Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1679) 
   * d1cb1fb267bb5b9ebffee5c4de56c220c79d0d68 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2099) 
   





[GitHub] [hudi] zhangyue19921010 commented on a change in pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 commented on a change in pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#discussion_r686611096



##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -1435,6 +1476,26 @@ public void testParquetDFSSourceWithSchemaFilesAndTransformer() throws Exception
     testParquetDFSSource(true, Collections.singletonList(TripsWithDistanceTransformer.class.getName()));
   }
 
+  @Test
+  public void testORCDFSSourceWithoutSchemaProviderAndNoTransformer() throws Exception {
+    testORCDFSSource(false, null);
+  }
+
+  @Test
+  public void testORCDFSSourceWithoutSchemaProviderAndTransformer() throws Exception {
+    testORCDFSSource(false, Collections.singletonList(TripsWithDistanceTransformer.class.getName()));
+  }
+
+  @Test
+  public void testORCDFSSourceWithSourceSchemaFileAndNoTransformer() throws Exception {

Review comment:
       Okay, done. Thanks a lot for your review :)







[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] Let HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * 45fbd4f73a6ccd0918e545702900351a2ed1070b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401) 
   





[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * f2c826d5571ce18fa61586c7fcdf302cc0bcb95e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1650) 
   





[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636









[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * 32223149bbb3d0c23e710fd338de4ed63e5f8be8 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2310) 
   * a7a59556703d2ea881abee407f8fd88291d04d80 UNKNOWN
   * 99414ba1ee89c6cdd2f482425001aec2392d65e9 UNKNOWN
   





[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * a7a59556703d2ea881abee407f8fd88291d04d80 UNKNOWN
   * 99414ba1ee89c6cdd2f482425001aec2392d65e9 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2312) 
   * 48c1b574e284349887ef8378e00b11aa22d52eb5 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] zhangyue19921010 commented on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 commented on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-927255399


   > @zhangyue19921010 : do ping me here once you have addressed all comments. I can take a look.
   
   Hi @nsivabalan Thanks a lot for your review. I finished all the changes. PTAL :)





[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * a7a59556703d2ea881abee407f8fd88291d04d80 UNKNOWN
   * 48c1b574e284349887ef8378e00b11aa22d52eb5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2449) 
   



[GitHub] [hudi] zhangyue19921010 removed a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 removed a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-913312467


   Or could we land it if possible? :)





[GitHub] [hudi] nsivabalan commented on a change in pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#discussion_r717168485



##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -1398,6 +1399,34 @@ private void testParquetDFSSource(boolean useSchemaProvider, List<String> transf
     testNum++;
   }
 
+  private void testORCDFSSource(boolean useSchemaProvider, List<String> transformerClassNames) throws Exception {
+    // prepare ORCDFSSource
+    TypedProperties orcProps = new TypedProperties();
+
+    // Properties used for testing delta-streamer with orc source
+    orcProps.setProperty("include", "base.properties");
+    orcProps.setProperty("hoodie.embed.timeline.server","false");
+    orcProps.setProperty("hoodie.datasource.write.recordkey.field", "_row_key");
+    orcProps.setProperty("hoodie.datasource.write.partitionpath.field", "not_there");
+    if (useSchemaProvider) {
+      orcProps.setProperty("hoodie.deltastreamer.schemaprovider.source.schema.file", dfsBasePath + "/" + "source.avsc");
+      if (transformerClassNames != null) {
+        orcProps.setProperty("hoodie.deltastreamer.schemaprovider.target.schema.file", dfsBasePath + "/" + "target.avsc");
+      }
+    }
+    orcProps.setProperty("hoodie.deltastreamer.source.dfs.root", ORC_SOURCE_ROOT);
+    UtilitiesTestBase.Helpers.savePropsToDFS(orcProps, dfs, dfsBasePath + "/" + PROPS_FILENAME_TEST_ORC);
+
+    String tableBasePath = dfsBasePath + "/test_orc_source_table" + testNum;
+    HoodieDeltaStreamer deltaStreamer = new HoodieDeltaStreamer(
+            TestHelpers.makeConfig(tableBasePath, WriteOperationType.INSERT, ORCDFSSource.class.getName(),
+                    transformerClassNames, PROPS_FILENAME_TEST_ORC, false,
+                    useSchemaProvider, 100000, false, null, null, "timestamp", null), jsc);
+    deltaStreamer.sync();
+    TestHelpers.assertRecordCount(ORC_NUM_RECORDS, tableBasePath + "/*/*.parquet", sqlContext);

Review comment:
       he he. my bad, got it. thanks.
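
The property setup in the diff hunk above can be sketched on its own, outside the test harness. This is a minimal sketch that uses plain `java.util.Properties` instead of Hudi's `TypedProperties`; the class name `OrcSourceProps`, the `build` helper, and the example paths are hypothetical, while the keys and values are copied from the diff.

```java
import java.util.Properties;

// Hypothetical helper, not part of Hudi: mirrors the properties that
// testORCDFSSource sets before saving them to DFS.
class OrcSourceProps {

    static Properties build(String sourceRoot, String schemaDir,
                            boolean useSchemaProvider, boolean hasTransformer) {
        Properties props = new Properties();
        props.setProperty("include", "base.properties");
        props.setProperty("hoodie.embed.timeline.server", "false");
        props.setProperty("hoodie.datasource.write.recordkey.field", "_row_key");
        props.setProperty("hoodie.datasource.write.partitionpath.field", "not_there");
        if (useSchemaProvider) {
            // The source schema file is only set when a schema provider is used.
            props.setProperty("hoodie.deltastreamer.schemaprovider.source.schema.file",
                    schemaDir + "/source.avsc");
            if (hasTransformer) {
                // The target schema file is only set when a transformer is also configured.
                props.setProperty("hoodie.deltastreamer.schemaprovider.target.schema.file",
                        schemaDir + "/target.avsc");
            }
        }
        props.setProperty("hoodie.deltastreamer.source.dfs.root", sourceRoot);
        return props;
    }

    public static void main(String[] args) {
        // Placeholder paths, chosen only for illustration.
        Properties props = build("/tmp/orc-source", "/tmp/schemas", true, false);
        props.stringPropertyNames().stream().sorted()
                .forEach(k -> System.out.println(k + "=" + props.getProperty(k)));
    }
}
```

In the test itself, the resulting properties are written to DFS and handed to `HoodieDeltaStreamer` with `ORCDFSSource` as the source class; the sketch stops short of that step because it would need the Hudi test utilities on the classpath.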







[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * 83668ba8191ff87e8ac7305d1e5a1ae364b7fc95 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1672) 
   * 285465473e7e6dbe13b28bb182515d3005c4d1ef Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1679) 
   



[GitHub] [hudi] zhangyue19921010 commented on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 commented on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-898039061


   Hi @vinothchandar, thanks a lot for your review. Travis and Azure have both passed now. PTAL :)





[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * 83668ba8191ff87e8ac7305d1e5a1ae364b7fc95 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1672) 
   



[GitHub] [hudi] vinothchandar commented on a change in pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#discussion_r686275274



##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -1435,6 +1476,26 @@ public void testParquetDFSSourceWithSchemaFilesAndTransformer() throws Exception
     testParquetDFSSource(true, Collections.singletonList(TripsWithDistanceTransformer.class.getName()));
   }
 
+  @Test
+  public void testORCDFSSourceWithoutSchemaProviderAndNoTransformer() throws Exception {
+    testORCDFSSource(false, null);
+  }
+
+  @Test
+  public void testORCDFSSourceWithoutSchemaProviderAndTransformer() throws Exception {
+    testORCDFSSource(false, Collections.singletonList(TripsWithDistanceTransformer.class.getName()));
+  }
+
+  @Test
+  public void testORCDFSSourceWithSourceSchemaFileAndNoTransformer() throws Exception {

Review comment:
       For the sake of test runtime, could we test just 1-2 combos here? Most of the testing done for the row source (ParquetDFS) should already cover this.

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/AvroOrcUtils.java
##########
@@ -796,4 +800,32 @@ private static Schema getActualSchemaType(Schema unionSchema) {
       return Schema.createUnion(nonNullMembers);
     }
   }
+
+  public static void addAvroRecord(

Review comment:
       Can this sit somewhere in a test utils class, given it's only used by a test?







[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * 32223149bbb3d0c23e710fd338de4ed63e5f8be8 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2310) 
   * a7a59556703d2ea881abee407f8fd88291d04d80 UNKNOWN
   * 99414ba1ee89c6cdd2f482425001aec2392d65e9 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2312) 
   



[GitHub] [hudi] zhangyue19921010 commented on a change in pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 commented on a change in pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#discussion_r713553429



##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/testutils/UtilitiesTestBase.java
##########
@@ -364,5 +385,32 @@ public static String toJsonString(HoodieRecord hr) {
     public static String[] jsonifyRecords(List<HoodieRecord> records) {
       return records.stream().map(Helpers::toJsonString).toArray(String[]::new);
     }
+
+    public static void addAvroRecord(

Review comment:
       Sure thing. changed.







[GitHub] [hudi] zhangyue19921010 commented on a change in pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 commented on a change in pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#discussion_r704389837



##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -1398,6 +1399,46 @@ private void testParquetDFSSource(boolean useSchemaProvider, List<String> transf
     testNum++;
   }
 
+  private void prepareORCDFSSource(boolean useSchemaProvider, boolean hasTransformer) throws IOException {
+    prepareORCDFSSource(useSchemaProvider, hasTransformer, "source.avsc", "target.avsc",
+            PROPS_FILENAME_TEST_ORC, ORC_SOURCE_ROOT, false);
+  }
+
+  private void prepareORCDFSSource(boolean useSchemaProvider, boolean hasTransformer, String sourceSchemaFile, String targetSchemaFile,

Review comment:
       Sure thing. Done. Thanks a lot for your review.







[GitHub] [hudi] zhangyue19921010 commented on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 commented on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-929987870


   Hi @xushiyan, thanks a lot for your attention and review. My bad for the misunderstanding :) The code is changed and I'm waiting for CI to go green.





[GitHub] [hudi] xushiyan commented on a change in pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
xushiyan commented on a change in pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#discussion_r718245885



##########
File path: hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestDataGenerator.java
##########
@@ -129,10 +129,12 @@
 
 
   public static final Schema AVRO_SCHEMA = new Schema.Parser().parse(TRIP_EXAMPLE_SCHEMA);
+  public static final Schema ORC_SCHEMA = new Schema.Parser().parse(TRIP_EXAMPLE_SCHEMA);

Review comment:
       ditto
   
   ```suggestion
     public static final TypeDescription ORC_SCHEMA = AvroOrcUtils.createOrcSchema(new Schema.Parser().parse(TRIP_EXAMPLE_SCHEMA));
   ```

##########
File path: hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestDataGenerator.java
##########
@@ -129,10 +129,12 @@
 
 
   public static final Schema AVRO_SCHEMA = new Schema.Parser().parse(TRIP_EXAMPLE_SCHEMA);
+  public static final Schema ORC_SCHEMA = new Schema.Parser().parse(TRIP_EXAMPLE_SCHEMA);
   public static final Schema AVRO_SCHEMA_WITH_METADATA_FIELDS =
       HoodieAvroUtils.addMetadataFields(AVRO_SCHEMA);
   public static final Schema AVRO_SHORT_TRIP_SCHEMA = new Schema.Parser().parse(SHORT_TRIP_SCHEMA);
   public static final Schema AVRO_TRIP_SCHEMA = new Schema.Parser().parse(TRIP_SCHEMA);
+  public static final Schema ORC_TRIP_SCHEMA = new Schema.Parser().parse(TRIP_SCHEMA);

Review comment:
       @zhangyue19921010 I think I wasn't clear about the suggestion. What I meant is, here in HoodieTestDataGenerator:
   
   ```suggestion
     public static final TypeDescription ORC_TRIP_SCHEMA = AvroOrcUtils.createOrcSchema(new Schema.Parser().parse(TRIP_SCHEMA));
   ```
   
   This constant is named `ORC_XXX_SCHEMA`, but its type is still an Avro schema, which causes confusion. That's why I suggest doing the conversion here, to make the constant less confusing and easier to use.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * a7a59556703d2ea881abee407f8fd88291d04d80 UNKNOWN
   * 94129782254d77db4b224c5f6f4ae09e7213a6d7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2456) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] zhangyue19921010 removed a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 removed a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-913312467


   Or could we land it if possible? :)





[GitHub] [hudi] xushiyan commented on a change in pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
xushiyan commented on a change in pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#discussion_r717267707



##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/testutils/UtilitiesTestBase.java
##########
@@ -314,6 +320,28 @@ public static void saveParquetToDFS(List<GenericRecord> records, Path targetFile
       }
     }
 
+    public static void saveORCToDFS(List<GenericRecord> records, Path targetFile) throws IOException {
+      TypeDescription orcSchema = AvroOrcUtils.createOrcSchema(HoodieTestDataGenerator.AVRO_SCHEMA);

Review comment:
       ditto

##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamerBase.java
##########
@@ -247,4 +254,27 @@ protected static void prepareParquetDFSFiles(int numRecords, String baseParquetP
           dataGenerator.generateInserts("000", numRecords)), new Path(path));
     }
   }
+
+  protected static void prepareORCDFSFiles(int numRecords) throws IOException {
+    prepareORCDFSFiles(numRecords, ORC_SOURCE_ROOT);
+  }
+
+  protected static void prepareORCDFSFiles(int numRecords, String baseORCPath) throws IOException {
+    prepareORCDFSFiles(numRecords, baseORCPath, FIRST_ORC_FILE_NAME, false, null, null);
+  }
+
+  protected static void prepareORCDFSFiles(int numRecords, String baseORCPath, String fileName, boolean useCustomSchema,
+                                               String schemaStr, Schema schema) throws IOException {
+    String path = baseORCPath + "/" + fileName;
+    HoodieTestDataGenerator dataGenerator = new HoodieTestDataGenerator();
+    if (useCustomSchema) {
+      Helpers.saveORCToDFS(Helpers.toGenericRecords(
+              dataGenerator.generateInsertsAsPerSchema("000", numRecords, schemaStr),
+              schema), new Path(path), AvroOrcUtils.createOrcSchema(HoodieTestDataGenerator.AVRO_TRIP_SCHEMA));

Review comment:
       Would it be better to add a `HoodieTestDataGenerator.ORC_TRIP_SCHEMA` constant to that class and do the conversion there?

##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -1622,6 +1651,16 @@ public void testParquetDFSSourceWithSchemaFilesAndTransformer() throws Exception
     testParquetDFSSource(true, Collections.singletonList(TripsWithDistanceTransformer.class.getName()));
   }
 
+  @Test
+  public void testORCDFSSourceWithoutSchemaProviderAndNoTransformer() throws Exception {
+    testORCDFSSource(false, null);
+  }
+
+  @Test
+  public void testORCDFSSourceWithSchemaFilesAndTransformer() throws Exception {
+    testORCDFSSource(true, Collections.singletonList(TripsWithDistanceTransformer.class.getName()));
+  }

Review comment:
       Can we use `@ParameterizedTest` here, with a `@MethodSource` method returning `Stream<Arguments>`, to make this cleaner?
   

##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -1936,4 +1972,12 @@ public Schema getTargetSchema() {
     }
   }
 
+  private static Stream<Arguments> testArguments() {

Review comment:
       `testArguments` is not specific enough. If the provider method has the same name as the test method, `@MethodSource` does not need an argument: it looks for a factory method with the same name by default. That way this argument provider serves only its own test method.
   
   ```suggestion
     private static Stream<Arguments> testORCDFSSource() {
   ```

##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -1622,6 +1652,12 @@ public void testParquetDFSSourceWithSchemaFilesAndTransformer() throws Exception
     testParquetDFSSource(true, Collections.singletonList(TripsWithDistanceTransformer.class.getName()));
   }
 
+  @ParameterizedTest
+  @MethodSource("testArguments")

Review comment:
       These two annotations should go on the `testORCDFSSource()` method itself; then this wrapper method is not needed.

##########
File path: hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestDataGenerator.java
##########
@@ -129,10 +129,12 @@
 
 
   public static final Schema AVRO_SCHEMA = new Schema.Parser().parse(TRIP_EXAMPLE_SCHEMA);
+  public static final Schema ORC_SCHEMA = new Schema.Parser().parse(TRIP_EXAMPLE_SCHEMA);
   public static final Schema AVRO_SCHEMA_WITH_METADATA_FIELDS =
       HoodieAvroUtils.addMetadataFields(AVRO_SCHEMA);
   public static final Schema AVRO_SHORT_TRIP_SCHEMA = new Schema.Parser().parse(SHORT_TRIP_SCHEMA);
   public static final Schema AVRO_TRIP_SCHEMA = new Schema.Parser().parse(TRIP_SCHEMA);
+  public static final Schema ORC_TRIP_SCHEMA = new Schema.Parser().parse(TRIP_SCHEMA);

Review comment:
       Actually, what I meant is to call `AvroOrcUtils.createOrcSchema()` here for this constant so callers can use it directly.







[GitHub] [hudi] xushiyan merged pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
xushiyan merged pull request #3413:
URL: https://github.com/apache/hudi/pull/3413


   





[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] Let HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * 45fbd4f73a6ccd0918e545702900351a2ed1070b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401) 
   



[GitHub] [hudi] hudi-bot commented on pull request #3413: [HUDI-2277] Let HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * 45fbd4f73a6ccd0918e545702900351a2ed1070b UNKNOWN
   



[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * a7a59556703d2ea881abee407f8fd88291d04d80 UNKNOWN
   * 99414ba1ee89c6cdd2f482425001aec2392d65e9 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2312) 
   * 48c1b574e284349887ef8378e00b11aa22d52eb5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2449) 
   



[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * 7a21d39bce12b04c3663d8966e9923145b2ce234 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2100) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2103) 
   * 32223149bbb3d0c23e710fd338de4ed63e5f8be8 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2310) 
   * a7a59556703d2ea881abee407f8fd88291d04d80 UNKNOWN
   



[GitHub] [hudi] nsivabalan commented on a change in pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#discussion_r711845020



##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -1398,6 +1399,34 @@ private void testParquetDFSSource(boolean useSchemaProvider, List<String> transf
     testNum++;
   }
 
+  private void testORCDFSSource(boolean useSchemaProvider, List<String> transformerClassNames) throws Exception {
+    // prepare ORCDFSSource
+    TypedProperties orcProps = new TypedProperties();
+
+    // Properties used for testing delta-streamer with orc source
+    orcProps.setProperty("include", "base.properties");
+    orcProps.setProperty("hoodie.embed.timeline.server","false");
+    orcProps.setProperty("hoodie.datasource.write.recordkey.field", "_row_key");
+    orcProps.setProperty("hoodie.datasource.write.partitionpath.field", "not_there");
+    if (useSchemaProvider) {
+      orcProps.setProperty("hoodie.deltastreamer.schemaprovider.source.schema.file", dfsBasePath + "/" + "source.avsc");
+      if (transformerClassNames != null) {
+        orcProps.setProperty("hoodie.deltastreamer.schemaprovider.target.schema.file", dfsBasePath + "/" + "target.avsc");
+      }
+    }
+    orcProps.setProperty("hoodie.deltastreamer.source.dfs.root", ORC_SOURCE_ROOT);
+    UtilitiesTestBase.Helpers.savePropsToDFS(orcProps, dfs, dfsBasePath + "/" + PROPS_FILENAME_TEST_ORC);
+
+    String tableBasePath = dfsBasePath + "/test_orc_source_table" + testNum;
+    HoodieDeltaStreamer deltaStreamer = new HoodieDeltaStreamer(
+            TestHelpers.makeConfig(tableBasePath, WriteOperationType.INSERT, ORCDFSSource.class.getName(),
+                    transformerClassNames, PROPS_FILENAME_TEST_ORC, false,
+                    useSchemaProvider, 100000, false, null, null, "timestamp", null), jsc);
+    deltaStreamer.sync();
+    TestHelpers.assertRecordCount(ORC_NUM_RECORDS, tableBasePath + "/*/*.parquet", sqlContext);

Review comment:
       Shouldn't this be `*.orc` instead of `*.parquet`?

##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/testutils/UtilitiesTestBase.java
##########
@@ -364,5 +385,32 @@ public static String toJsonString(HoodieRecord hr) {
     public static String[] jsonifyRecords(List<HoodieRecord> records) {
       return records.stream().map(Helpers::toJsonString).toArray(String[]::new);
     }
+
+    public static void addAvroRecord(
+            VectorizedRowBatch batch,
+            GenericRecord record,
+            TypeDescription orcSchema,
+            int orcBatchSize,
+            Writer writer
+    ) throws IOException {
+      for (int c = 0; c < batch.numCols; c++) {
+        ColumnVector colVector = batch.cols[c];
+        final String thisField = orcSchema.getFieldNames().get(c);
+        final TypeDescription type = orcSchema.getChildren().get(c);
+
+        Object fieldValue = record.get(thisField);
+        Schema.Field avroField = record.getSchema().getField(thisField);
+        AvroOrcUtils.addToVector(type, colVector, avroField.schema(), fieldValue, batch.size);
+      }
+
+      batch.size++;
+
+      if (batch.size % orcBatchSize == 0 || batch.size == batch.getMaxSize()) {

Review comment:
       Can you help me understand what this code block is doing? I see this method is called for one batch of records. Let's say there are 100 records in one batch. If I am not wrong, `batch.size` should be 100 after adding all 100 records, so `batch.size % orcBatchSize` will equal 0 only after all records have been added.
   If that's the case, shouldn't we move this block outside of this method?
   Or am I missing something?
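
As context for the question above, here is a minimal sketch of the usual add-then-flush pattern for per-record batching. It uses plain Java lists and hypothetical names (`BatchFlushSketch`, `add`, `close`, `flushedBatches`); it is not the ORC `VectorizedRowBatch`/`Writer` API and is not necessarily how this PR implements `addAvroRecord`. The point it illustrates: when the flush check lives inside the per-record method, a final flush after the loop is still needed for a partially filled batch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a row batch plus writer; not the Hudi or ORC API.
final class BatchFlushSketch {
    private final int maxSize;
    private final List<String> batch = new ArrayList<>();
    private final List<List<String>> flushed = new ArrayList<>();

    BatchFlushSketch(int maxSize) {
        this.maxSize = maxSize;
    }

    // Called once per record: append it, then flush if the batch is full.
    void add(String record) {
        batch.add(record);
        if (batch.size() == maxSize) {
            flush();
        }
    }

    // Called once after the last record to write any partially filled batch.
    void close() {
        if (!batch.isEmpty()) {
            flush();
        }
    }

    // "Writes" the current batch by copying it aside, then resets the batch.
    private void flush() {
        flushed.add(new ArrayList<>(batch));
        batch.clear();
    }

    List<List<String>> flushedBatches() {
        return flushed;
    }

    public static void main(String[] args) {
        BatchFlushSketch sketch = new BatchFlushSketch(4);
        for (int i = 0; i < 10; i++) {
            sketch.add("r" + i);
        }
        sketch.close();
        for (List<String> b : sketch.flushedBatches()) {
            System.out.println(b.size());
        }
    }
}
```

With a batch size of 4 and 10 records, this produces flushed batches of 4, 4 and 2 records; without the `close()` call the last 2 records would never be written.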

##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/testutils/UtilitiesTestBase.java
##########
@@ -364,5 +385,32 @@ public static String toJsonString(HoodieRecord hr) {
     public static String[] jsonifyRecords(List<HoodieRecord> records) {
       return records.stream().map(Helpers::toJsonString).toArray(String[]::new);
     }
+
+    public static void addAvroRecord(

Review comment:
       Does this need to be public? Can we switch it to private?










[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * a7a59556703d2ea881abee407f8fd88291d04d80 UNKNOWN
   * 99414ba1ee89c6cdd2f482425001aec2392d65e9 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2312) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * 83668ba8191ff87e8ac7305d1e5a1ae364b7fc95 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1672) 
   * 285465473e7e6dbe13b28bb182515d3005c4d1ef UNKNOWN
   



[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * 285465473e7e6dbe13b28bb182515d3005c4d1ef Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1679) 
   * d1cb1fb267bb5b9ebffee5c4de56c220c79d0d68 UNKNOWN
   



[GitHub] [hudi] zhangyue19921010 commented on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 commented on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-897231142


   @hudi-bot run azure





[GitHub] [hudi] vinothchandar commented on a change in pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#discussion_r703893680



##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -1398,6 +1399,46 @@ private void testParquetDFSSource(boolean useSchemaProvider, List<String> transf
     testNum++;
   }
 
+  private void prepareORCDFSSource(boolean useSchemaProvider, boolean hasTransformer) throws IOException {
+    prepareORCDFSSource(useSchemaProvider, hasTransformer, "source.avsc", "target.avsc",
+            PROPS_FILENAME_TEST_ORC, ORC_SOURCE_ROOT, false);
+  }
+
+  private void prepareORCDFSSource(boolean useSchemaProvider, boolean hasTransformer, String sourceSchemaFile, String targetSchemaFile,

Review comment:
       I wonder if we can simplify this test.







[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * 7a21d39bce12b04c3663d8966e9923145b2ce234 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2100) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2103) 
   



[GitHub] [hudi] zhangyue19921010 commented on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 commented on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-915063008


   @hudi-bot run azure





[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * a7a59556703d2ea881abee407f8fd88291d04d80 UNKNOWN
   * 94129782254d77db4b224c5f6f4ae09e7213a6d7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2456) 
   * 47ca8204f928b4486ae486dc4a4a37f45d6cd14d UNKNOWN
   



[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * a7a59556703d2ea881abee407f8fd88291d04d80 UNKNOWN
   * 48c1b574e284349887ef8378e00b11aa22d52eb5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2449) 
   * 94129782254d77db4b224c5f6f4ae09e7213a6d7 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2456) 
   



[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   ## CI report:
   
   * 45fbd4f73a6ccd0918e545702900351a2ed1070b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401) 
   



[GitHub] [hudi] hudi-bot commented on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 45fbd4f73a6ccd0918e545702900351a2ed1070b UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "893610117",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1650",
       "triggerID" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1656",
       "triggerID" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * cf91f8edfc94e125302f8550d484589418f00c1f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1656) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "893610117",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 45fbd4f73a6ccd0918e545702900351a2ed1070b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] xushiyan commented on a change in pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
xushiyan commented on a change in pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#discussion_r718245885



##########
File path: hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestDataGenerator.java
##########
@@ -129,10 +129,12 @@
 
 
   public static final Schema AVRO_SCHEMA = new Schema.Parser().parse(TRIP_EXAMPLE_SCHEMA);
+  public static final Schema ORC_SCHEMA = new Schema.Parser().parse(TRIP_EXAMPLE_SCHEMA);

Review comment:
       ditto
   
   ```suggestion
     public static final TypeDescription ORC_SCHEMA = AvroOrcUtils.createOrcSchema(new Schema.Parser().parse(TRIP_EXAMPLE_SCHEMA));
   ```

##########
File path: hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestDataGenerator.java
##########
@@ -129,10 +129,12 @@
 
 
   public static final Schema AVRO_SCHEMA = new Schema.Parser().parse(TRIP_EXAMPLE_SCHEMA);
+  public static final Schema ORC_SCHEMA = new Schema.Parser().parse(TRIP_EXAMPLE_SCHEMA);
   public static final Schema AVRO_SCHEMA_WITH_METADATA_FIELDS =
       HoodieAvroUtils.addMetadataFields(AVRO_SCHEMA);
   public static final Schema AVRO_SHORT_TRIP_SCHEMA = new Schema.Parser().parse(SHORT_TRIP_SCHEMA);
   public static final Schema AVRO_TRIP_SCHEMA = new Schema.Parser().parse(TRIP_SCHEMA);
+  public static final Schema ORC_TRIP_SCHEMA = new Schema.Parser().parse(TRIP_SCHEMA);

Review comment:
       @zhangyue19921010 I think my earlier suggestion wasn't clear. What I meant is that, here in `HoodieTestDataGenerator`, the constant should be declared as:
   
   ```suggestion
     public static final TypeDescription ORC_TRIP_SCHEMA = AvroOrcUtils.createOrcSchema(new Schema.Parser().parse(TRIP_SCHEMA));
   ```
   
   This constant is named `ORC_XXX_SCHEMA`, but its type is still an Avro `Schema`, which is confusing. That's why I suggest doing the conversion here: it makes the constant less confusing and easier to use.







[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "893610117",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1650",
       "triggerID" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1656",
       "triggerID" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1669",
       "triggerID" : "897231142",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * cf91f8edfc94e125302f8550d484589418f00c1f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1656) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1669) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] zhangyue19921010 commented on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 commented on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-928881572


   Hi @nsivabalan @vinothchandar. Thanks a lot for your attention, review, and approval! Could we land it, or is there anything else I need to do?  :)





[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "893610117",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1650",
       "triggerID" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1656",
       "triggerID" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1669",
       "triggerID" : "897231142",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "83668ba8191ff87e8ac7305d1e5a1ae364b7fc95",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1672",
       "triggerID" : "83668ba8191ff87e8ac7305d1e5a1ae364b7fc95",
       "triggerType" : "PUSH"
     }, {
       "hash" : "285465473e7e6dbe13b28bb182515d3005c4d1ef",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1679",
       "triggerID" : "285465473e7e6dbe13b28bb182515d3005c4d1ef",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d1cb1fb267bb5b9ebffee5c4de56c220c79d0d68",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2099",
       "triggerID" : "d1cb1fb267bb5b9ebffee5c4de56c220c79d0d68",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7a21d39bce12b04c3663d8966e9923145b2ce234",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2100",
       "triggerID" : "7a21d39bce12b04c3663d8966e9923145b2ce234",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7a21d39bce12b04c3663d8966e9923145b2ce234",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2103",
       "triggerID" : "915063008",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 7a21d39bce12b04c3663d8966e9923145b2ce234 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2100) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2103) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "893610117",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1650",
       "triggerID" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1656",
       "triggerID" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1669",
       "triggerID" : "897231142",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "83668ba8191ff87e8ac7305d1e5a1ae364b7fc95",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "83668ba8191ff87e8ac7305d1e5a1ae364b7fc95",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * cf91f8edfc94e125302f8550d484589418f00c1f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1656) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1669) 
   * 83668ba8191ff87e8ac7305d1e5a1ae364b7fc95 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "893610117",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1650",
       "triggerID" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1656",
       "triggerID" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1669",
       "triggerID" : "897231142",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "83668ba8191ff87e8ac7305d1e5a1ae364b7fc95",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1672",
       "triggerID" : "83668ba8191ff87e8ac7305d1e5a1ae364b7fc95",
       "triggerType" : "PUSH"
     }, {
       "hash" : "285465473e7e6dbe13b28bb182515d3005c4d1ef",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1679",
       "triggerID" : "285465473e7e6dbe13b28bb182515d3005c4d1ef",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d1cb1fb267bb5b9ebffee5c4de56c220c79d0d68",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2099",
       "triggerID" : "d1cb1fb267bb5b9ebffee5c4de56c220c79d0d68",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7a21d39bce12b04c3663d8966e9923145b2ce234",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "7a21d39bce12b04c3663d8966e9923145b2ce234",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d1cb1fb267bb5b9ebffee5c4de56c220c79d0d68 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2099) 
   * 7a21d39bce12b04c3663d8966e9923145b2ce234 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] zhangyue19921010 commented on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 commented on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-913297862


   Hi @vinothchandar, sorry to bother you. Since this patch has passed all UTs/ITs, could you please take a look at your convenience? Thanks a lot!





[GitHub] [hudi] zhangyue19921010 commented on a change in pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 commented on a change in pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#discussion_r713565743



##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/testutils/UtilitiesTestBase.java
##########
@@ -364,5 +385,32 @@ public static String toJsonString(HoodieRecord hr) {
     public static String[] jsonifyRecords(List<HoodieRecord> records) {
       return records.stream().map(Helpers::toJsonString).toArray(String[]::new);
     }
+
+    public static void addAvroRecord(
+            VectorizedRowBatch batch,
+            GenericRecord record,
+            TypeDescription orcSchema,
+            int orcBatchSize,
+            Writer writer
+    ) throws IOException {
+      for (int c = 0; c < batch.numCols; c++) {
+        ColumnVector colVector = batch.cols[c];
+        final String thisField = orcSchema.getFieldNames().get(c);
+        final TypeDescription type = orcSchema.getChildren().get(c);
+
+        Object fieldValue = record.get(thisField);
+        Schema.Field avroField = record.getSchema().getField(thisField);
+        AvroOrcUtils.addToVector(type, colVector, avroField.schema(), fieldValue, batch.size);
+      }
+
+      batch.size++;
+
+      if (batch.size % orcBatchSize == 0 || batch.size == batch.getMaxSize()) {

Review comment:
       Sure thing, changed :)
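
The hunk above appends one record to the batch and then tests whether the batch has filled up. A minimal, library-free sketch of that accumulate-then-flush pattern follows. It is not the PR's code: `RowBufferSketch`, `add`, `flush`, and `flushedBatches` are hypothetical stand-ins for ORC's `VectorizedRowBatch` and `Writer`, and the flush-on-full rule is an assumption chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a fixed-capacity row batch plus its writer; not ORC's API. */
class RowBufferSketch {
    private final int maxSize;
    private final List<String> rows = new ArrayList<>();
    private final List<List<String>> flushed = new ArrayList<>();

    RowBufferSketch(int maxSize) {
        this.maxSize = maxSize;
    }

    /** Buffers one row and flushes as soon as the batch reaches its capacity. */
    void add(String row) {
        rows.add(row);
        if (rows.size() == maxSize) {
            flush();
        }
    }

    /** Hands the buffered rows to the "writer" and resets the buffer; a no-op when empty. */
    void flush() {
        if (!rows.isEmpty()) {
            flushed.add(new ArrayList<>(rows));
            rows.clear();
        }
    }

    /** Batches written so far, in write order. */
    List<List<String>> flushedBatches() {
        return flushed;
    }
}
```

In this sketch the caller must invoke `flush()` once more after the last `add`, otherwise a partially filled final batch is never written.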







[GitHub] [hudi] zhangyue19921010 commented on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 commented on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-929987870


   Hi @xushiyan, thanks a lot for your attention and review. Sorry for the misunderstanding :) The code has been changed, and I'm waiting for CI to go green.





[GitHub] [hudi] xushiyan commented on a change in pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
xushiyan commented on a change in pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#discussion_r717267707



##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/testutils/UtilitiesTestBase.java
##########
@@ -314,6 +320,28 @@ public static void saveParquetToDFS(List<GenericRecord> records, Path targetFile
       }
     }
 
+    public static void saveORCToDFS(List<GenericRecord> records, Path targetFile) throws IOException {
+      TypeDescription orcSchema = AvroOrcUtils.createOrcSchema(HoodieTestDataGenerator.AVRO_SCHEMA);

Review comment:
       ditto

##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamerBase.java
##########
@@ -247,4 +254,27 @@ protected static void prepareParquetDFSFiles(int numRecords, String baseParquetP
           dataGenerator.generateInserts("000", numRecords)), new Path(path));
     }
   }
+
+  protected static void prepareORCDFSFiles(int numRecords) throws IOException {
+    prepareORCDFSFiles(numRecords, ORC_SOURCE_ROOT);
+  }
+
+  protected static void prepareORCDFSFiles(int numRecords, String baseORCPath) throws IOException {
+    prepareORCDFSFiles(numRecords, baseORCPath, FIRST_ORC_FILE_NAME, false, null, null);
+  }
+
+  protected static void prepareORCDFSFiles(int numRecords, String baseORCPath, String fileName, boolean useCustomSchema,
+                                               String schemaStr, Schema schema) throws IOException {
+    String path = baseORCPath + "/" + fileName;
+    HoodieTestDataGenerator dataGenerator = new HoodieTestDataGenerator();
+    if (useCustomSchema) {
+      Helpers.saveORCToDFS(Helpers.toGenericRecords(
+              dataGenerator.generateInsertsAsPerSchema("000", numRecords, schemaStr),
+              schema), new Path(path), AvroOrcUtils.createOrcSchema(HoodieTestDataGenerator.AVRO_TRIP_SCHEMA));

Review comment:
       Would it be better to add a `HoodieTestDataGenerator.ORC_TRIP_SCHEMA` constant to that class and do the conversion there?

##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -1622,6 +1651,16 @@ public void testParquetDFSSourceWithSchemaFilesAndTransformer() throws Exception
     testParquetDFSSource(true, Collections.singletonList(TripsWithDistanceTransformer.class.getName()));
   }
 
+  @Test
+  public void testORCDFSSourceWithoutSchemaProviderAndNoTransformer() throws Exception {
+    testORCDFSSource(false, null);
+  }
+
+  @Test
+  public void testORCDFSSourceWithSchemaFilesAndTransformer() throws Exception {
+    testORCDFSSource(true, Collections.singletonList(TripsWithDistanceTransformer.class.getName()));
+  }

Review comment:
       Can we use `@ParameterizedTest` here, with a `@MethodSource` returning `Stream<Arguments>`, to make it cleaner?
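
One way the two tests above could be folded into a single parameterized test is sketched below. It assumes JUnit 5 (`junit-jupiter-params`) is on the test classpath. The class name, the provider name `orcDfsSourceArgs`, the string stand-in for `TripsWithDistanceTransformer.class.getName()`, and the placeholder assertion are all hypothetical; the real test body would presumably call the existing `testORCDFSSource` helper instead.

```java
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.Arguments;
import org.junit.jupiter.params.provider.MethodSource;

import static org.junit.jupiter.api.Assertions.assertEquals;

class OrcDfsSourceParamsSketch {

  // Each Arguments entry drives one invocation: (useSchemaProvider, transformerClassNames).
  static Stream<Arguments> orcDfsSourceArgs() {
    return Stream.of(
        // No schema provider, no transformer.
        Arguments.of(false, null),
        // Schema files plus a transformer; the string is a stand-in for the real class name.
        Arguments.of(true, Collections.singletonList("TripsWithDistanceTransformer")));
  }

  @ParameterizedTest
  @MethodSource("orcDfsSourceArgs")
  void testORCDFSSource(boolean useSchemaProvider, List<String> transformerClassNames) {
    // Placeholder check so the sketch runs on its own; in the PR this line would
    // delegate to the existing helper with the same two arguments.
    assertEquals(useSchemaProvider, transformerClassNames != null);
  }
}
```

Adding a further combination then only needs one more `Arguments.of(...)` entry rather than another `@Test` method.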
   







[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "45fbd4f73a6ccd0918e545702900351a2ed1070b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1401",
       "triggerID" : "893610117",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1650",
       "triggerID" : "f2c826d5571ce18fa61586c7fcdf302cc0bcb95e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1656",
       "triggerID" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cf91f8edfc94e125302f8550d484589418f00c1f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1669",
       "triggerID" : "897231142",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "83668ba8191ff87e8ac7305d1e5a1ae364b7fc95",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1672",
       "triggerID" : "83668ba8191ff87e8ac7305d1e5a1ae364b7fc95",
       "triggerType" : "PUSH"
     }, {
       "hash" : "285465473e7e6dbe13b28bb182515d3005c4d1ef",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1679",
       "triggerID" : "285465473e7e6dbe13b28bb182515d3005c4d1ef",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 285465473e7e6dbe13b28bb182515d3005c4d1ef Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1679) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>

