Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/05/13 14:29:53 UTC

[GitHub] [hudi] codope opened a new pull request, #5578: [WIP][HUDI-4025] Add Presto query node to validate presto integration

codope opened a new pull request, #5578:
URL: https://github.com/apache/hudi/pull/5578

   ## What is the purpose of the pull request
   
   Just like HiveQueryNode, this PR adds query nodes for other engines. Currently, only PrestoQueryNode is added and tested in the Docker setup. TODO: Add TrinoQueryNode and verify in the EKS setup.
   
   ## Brief change log
   
   *(for example:)*
     - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
     - *Added integration tests for end-to-end.*
     - *Added HoodieClientWriteTest to verify the change.*
     - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #5578: [WIP][HUDI-4025] Add Presto query node to validate presto integration

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5578:
URL: https://github.com/apache/hudi/pull/5578#issuecomment-1126266988

   ## CI report:
   
   * 6c5038e85e9b9a0f642240f78da3619115b1ee30 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8648) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #5578: [WIP][HUDI-4025] Add Presto query node to validate presto integration

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5578:
URL: https://github.com/apache/hudi/pull/5578#issuecomment-1126155332

   ## CI report:
   
   * 6c5038e85e9b9a0f642240f78da3619115b1ee30 UNKNOWN
   




[GitHub] [hudi] hudi-bot commented on pull request #5578: [HUDI-4025] Add Presto and Trino query node to validate queries

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5578:
URL: https://github.com/apache/hudi/pull/5578#issuecomment-1201541240

   ## CI report:
   
   * 6c5038e85e9b9a0f642240f78da3619115b1ee30 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8648) 
   * 738001598a7c0cfd38b006030665ce86210146f8 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10505) 
   




[GitHub] [hudi] hudi-bot commented on pull request #5578: [HUDI-4025] Add Presto and Trino query node to validate queries

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5578:
URL: https://github.com/apache/hudi/pull/5578#issuecomment-1201689729

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "6c5038e85e9b9a0f642240f78da3619115b1ee30",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8648",
       "triggerID" : "6c5038e85e9b9a0f642240f78da3619115b1ee30",
       "triggerType" : "PUSH"
     }, {
       "hash" : "738001598a7c0cfd38b006030665ce86210146f8",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10505",
       "triggerID" : "738001598a7c0cfd38b006030665ce86210146f8",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 738001598a7c0cfd38b006030665ce86210146f8 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10505) 
   




[GitHub] [hudi] nsivabalan commented on a diff in pull request #5578: [HUDI-4025] Add Presto and Trino query node to validate queries

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on code in PR #5578:
URL: https://github.com/apache/hudi/pull/5578#discussion_r934816613


##########
hudi-integ-test/src/test/resources/unit-test-cow-dag.yaml:
##########
@@ -71,4 +71,24 @@ dag_content:
         query2: "select count(*) from testdb1.table1 group   by `_row_key` having count(*) > 1"
         result2: 0
     type: HiveQueryNode
-    deps: first_hive_sync
\ No newline at end of file
+    deps: first_hive_sync
+  first_presto_query:
+    config:
+      presto_props:
+        prop1: "SET SESSION hive.parquet_use_column_names = true"
+      presto_queries:
+        query1: "select count(*) from testdb1.table1"
+        result1: 300
+        query2: "select count(*) from testdb1.table1 group   by `_row_key` having count(*) > 1"
+        result2: 0
+    type: PrestoQueryNode
+    deps: first_hive_query
+  first_trino_query:
+    config:
+      trino_queries:

Review Comment:
   So Trino does not require `SET SESSION hive.parquet_use_column_names = true`, is that right?





[GitHub] [hudi] codope merged pull request #5578: [HUDI-4025] Add Presto and Trino query node to validate queries

Posted by GitBox <gi...@apache.org>.
codope merged PR #5578:
URL: https://github.com/apache/hudi/pull/5578




[GitHub] [hudi] hudi-bot commented on pull request #5578: [HUDI-4025] Add Presto and Trino query node to validate queries

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5578:
URL: https://github.com/apache/hudi/pull/5578#issuecomment-1201535041

   ## CI report:
   
   * 6c5038e85e9b9a0f642240f78da3619115b1ee30 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8648) 
   * 738001598a7c0cfd38b006030665ce86210146f8 UNKNOWN
   




[GitHub] [hudi] hudi-bot commented on pull request #5578: [WIP][HUDI-4025] Add Presto query node to validate presto integration

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5578:
URL: https://github.com/apache/hudi/pull/5578#issuecomment-1126158730

   ## CI report:
   
   * 6c5038e85e9b9a0f642240f78da3619115b1ee30 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8648) 
   




[GitHub] [hudi] codope commented on a diff in pull request #5578: [HUDI-4025] Add Presto and Trino query node to validate queries

Posted by GitBox <gi...@apache.org>.
codope commented on code in PR #5578:
URL: https://github.com/apache/hudi/pull/5578#discussion_r935073132


##########
hudi-integ-test/src/test/resources/unit-test-cow-dag.yaml:
##########
@@ -71,4 +71,24 @@ dag_content:
         query2: "select count(*) from testdb1.table1 group   by `_row_key` having count(*) > 1"
         result2: 0
     type: HiveQueryNode
-    deps: first_hive_sync
\ No newline at end of file
+    deps: first_hive_sync
+  first_presto_query:
+    config:
+      presto_props:
+        prop1: "SET SESSION hive.parquet_use_column_names = true"
+      presto_queries:
+        query1: "select count(*) from testdb1.table1"
+        result1: 300
+        query2: "select count(*) from testdb1.table1 group   by `_row_key` having count(*) > 1"
+        result2: 0
+    type: PrestoQueryNode
+    deps: first_hive_query
+  first_trino_query:
+    config:
+      trino_queries:

Review Comment:
   Not sure right now. In my local testing, it is not required. I'll check on EKS and, if needed, add it in a follow-up PR.
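
For readers following the thread, here is a rough sketch of how the `queryN` / `resultN` pairs and the optional session properties in the YAML above could be consumed. It is not the PrestoQueryNode implementation from this PR, which is not shown in the thread. The names `pair_queries`, `validate`, and `execute` are hypothetical, and the sketch assumes that the executor returns a single integer per statement and reports an empty result set as 0.

```python
import re
from typing import Callable, Dict, List, Tuple

# Values copied from the first_presto_query node in the diff above.
PRESTO_PROPS = {"prop1": "SET SESSION hive.parquet_use_column_names = true"}
PRESTO_QUERIES = {
    "query1": "select count(*) from testdb1.table1",
    "result1": 300,
    "query2": "select count(*) from testdb1.table1 group   by `_row_key` having count(*) > 1",
    "result2": 0,
}


def pair_queries(queries: Dict[str, object]) -> List[Tuple[str, int]]:
    """Pair each queryN entry with its resultN entry, ordered by N."""
    indexed = []
    for key, value in queries.items():
        match = re.fullmatch(r"query(\d+)", key)
        if match is None:
            continue  # resultN keys are looked up via their queryN partner
        n = int(match.group(1))
        indexed.append((n, str(value), int(queries[f"result{n}"])))
    return [(sql, expected) for _, sql, expected in sorted(indexed)]


def validate(
    props: Dict[str, str],
    queries: Dict[str, object],
    execute: Callable[[str], int],
) -> List[str]:
    """Apply session properties, run each query, and collect mismatches.

    `execute` is a hypothetical callable standing in for a real engine
    connection; it is assumed to return one integer per statement.
    """
    for statement in props.values():
        execute(statement)  # session properties go first; result is ignored
    failures = []
    for sql, expected in pair_queries(queries):
        actual = execute(sql)
        if actual != expected:
            failures.append(f"{sql!r}: expected {expected}, got {actual}")
    return failures
```

Under these assumptions, a node without session properties (as in the Trino case discussed above) would simply pass an empty `props` mapping, and an empty list from `validate` would mean every query matched its expected result.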


