Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/08/03 12:34:00 UTC

[GitHub] [hudi] pengzhiwei2018 opened a new pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

pengzhiwei2018 opened a new pull request #3393:
URL: https://github.com/apache/hudi/pull/3393


   
   ## What is the purpose of the pull request
   
   For an existing Hudi table written by the Spark datasource, we can create an external table on the table location, like this:
   `create table h0 using hudi options(primaryKey = 'id', preCombineField = 'ts') partitioned by(dt) location '/xx/xx/h0'`
   We can modify the hoodie.properties file to add the missing properties in the CreateHoodieTableCommand. After this operation, we can read and write the existing table with Spark SQL.
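   A minimal sketch of the intended workflow (hedged: it assumes a `SparkSession` named `spark` with the Hudi Spark bundle on the classpath; the table name `h0`, path `/xx/xx/h0`, and the inserted values are placeholders taken from or modeled on the example above):

   ```scala
   // Sketch only: register an external table over an existing Hudi table path.
   // The command is expected to fill any missing entries in hoodie.properties.
   spark.sql(
     """create table h0 using hudi
       |options(primaryKey = 'id', preCombineField = 'ts')
       |partitioned by (dt)
       |location '/xx/xx/h0'""".stripMargin)

   // After that, the existing table can be read and written with plain Spark SQL.
   // The column order in this insert is illustrative, not from the PR.
   spark.sql("insert into h0 values (1, 'a1', 1000, '2021-08-03')")
   spark.sql("select * from h0").show()
   ```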
   ## Brief change log
   
     - *Modify `CreateHoodieTableCommand` to add the missing properties to `hoodie.properties` for an existing table.*
     - *Fill the schema in `HoodieResolveReferences` for `CREATE TABLE` statements that do not specify one.*
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
     - *Added integration tests for end-to-end.*
     - *Added HoodieClientWriteTest to verify the change.*
     - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   ## CI report:
   
   * b87e99ee2121f013b659f09d9a1415c6e2ca3868 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361) 
   * 0af1f7f32ef4135c18336a91532b90a38a227cc2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366) 
   * b75f709b3759b283822cc55f8c3030bc57fc4122 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   ## CI report:
   
   * 3aec37c6f238f4ed40672d20f7b7c6e6314a9519 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350) 
   * b87e99ee2121f013b659f09d9a1415c6e2ca3868 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   ## CI report:
   
   * 53186a44289eb1740d15d0b35afc9d9f3c9f5904 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   ## CI report:
   
   * 56a10a9cee28560d7a80d1abc5f8e15a463ddd87 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1445) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] pengzhiwei2018 commented on a change in pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
pengzhiwei2018 commented on a change in pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#discussion_r682235533



##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/analysis/HoodieAnalysis.scala
##########
@@ -268,7 +269,25 @@ case class HoodieResolveReferences(sparkSession: SparkSession) extends Rule[Logi
       } else {
         l
       }
-
+    // Fill schema for Create Table without specify schema info
+    case c @ CreateTable(tableDesc, _, _)
+      if isHoodieTable(tableDesc) =>
+        val tablePath = getTableLocation(c.tableDesc, sparkSession)
+          .getOrElse(s"Missing location defined in table ${c.tableDesc.identifier}")

Review comment:
       > is it that for a new table that's getting created, tablePath will be set to "Missing location defined in table ..." ? In other words, if the table already existed, tablePath will be set to the right value; if not, it will be set to this string.
   > Is my understanding right? If so, why can't we return right away if the table does not exist, to maintain the same flow as before for tables that are just getting created?
   
   My mistake! I want to throw an Exception here. Will fix it soon.
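   The fix the author describes can be sketched as follows (hedged: the exception type `AnalysisException` is an assumption, any descriptive exception would serve; the surrounding names are quoted from the diff above):

   ```scala
   // Sketch of the intended fix: fail fast instead of silently using the
   // error message string as the table path.
   val tablePath = getTableLocation(c.tableDesc, sparkSession)
     .getOrElse(throw new AnalysisException(
       s"Missing location defined in table ${c.tableDesc.identifier}"))
   ```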







[GitHub] [hudi] pengzhiwei2018 commented on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
pengzhiwei2018 commented on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-894603173


   @hudi-bot run azure





[GitHub] [hudi] nsivabalan commented on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-894595802


   @hudi-bot run azure
   





[GitHub] [hudi] nsivabalan merged pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
nsivabalan merged pull request #3393:
URL: https://github.com/apache/hudi/pull/3393


   





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   ## CI report:
   
   * caa79aef468f6e82de2c84e7b35975326c42b626 UNKNOWN
   *  Unknown: [CANCELED](TBD) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   ## CI report:
   
   *  Unknown: [CANCELED](TBD) 
   * caa79aef468f6e82de2c84e7b35975326c42b626 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1457) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] nsivabalan commented on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-892329397


   > Hey peng. I did a round of testing on this patch. Here are my findings.
   > 
   > Insert into is still prefixing the col name to meta fields (3rd and 4th cols).
   > 
   > ```
   > select * from hudi_ny where tpep_pickup_datetime like '%00:04:03%';
   > 20210802105420	20210802105420_2_23	2019-01-01 00:04:03	2019-01-01	c5e6a617-dfc5-4051-8c1a-8daead3847af-0_2-37-62_20210802105420.parquet	2	2019-01-01 00:04:03	2019-01-01 00:11:48	1	3.01	1	N	137	262	1	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2019-01-01
   > 20210803162030	20210803162030_0_1	tpep_pickup_datetime:2021-01-01 00:04:03 	date_col=2021-01-01	c5c72f9e-9a63-48ca-a981-4302890f5210-0_0-27-1635_20210803162030.parquet	2	2021-01-01 00:04:03	2021-01-01 00:11:48	1	3.01	1	N	137	262	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2021-01-01
   > Time taken: 0.524 seconds, Fetched 2 row(s)
   > ```
   > 
   > 1st row was part of the table before onboarding to spark-sql.
   > 2nd row was inserted using insert into.
   > 
   > Hi @nsivabalan , I know the difference now. Spark SQL uses the `SqlKeyGenerator`, a sub-class of `ComplexKeyGenerator`, to generate the record key, which adds the column name to the record key, while the `SimpleKeyGenerator` does not. So we should keep the behavior the same for `ComplexKeyGenerator` and `SimpleKeyGenerator`.
   
   Sorry, I don't get you. I understand SqlKeyGenerator extends ComplexKeyGen, but why do we need to keep the same behavior for SimpleKeyGen? We should not add any field prefix for SimpleKeyGen; otherwise, no updates will work for an existing table.
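   The key-format difference under discussion can be illustrated as follows (a simplified sketch: the real generators live in Hudi and operate on Avro records, so the helper functions below only mirror the output formats being compared):

   ```scala
   // ComplexKeyGenerator-style keys embed the field name in the record key:
   def complexStyleKey(field: String, value: String): String = s"$field:$value"
   // SimpleKeyGenerator-style keys are the bare field value:
   def simpleStyleKey(value: String): String = value

   // For a record with id = "123":
   // complexStyleKey("id", "123") == "id:123"  -- the format SqlKeyGenerator emitted
   // simpleStyleKey("123")        == "123"     -- the format the existing table uses
   // Updates match rows by record key, so mixing the two formats means new
   // writes never update the rows written before onboarding to spark-sql.
   ```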





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   ## CI report:
   
   * 6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1404) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] pengzhiwei2018 commented on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
pengzhiwei2018 commented on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-892526491


   > > Hey peng. I did a round of testing on this patch. Here are my findings.
   > > Insert into is still prefixing the col name to meta fields. (3rd col and 4th col)
   > > ```
   > > select * from hudi_ny where tpep_pickup_datetime like '%00:04:03%';
   > > 20210802105420	20210802105420_2_23	2019-01-01 00:04:03	2019-01-01	c5e6a617-dfc5-4051-8c1a-8daead3847af-0_2-37-62_20210802105420.parquet	2	2019-01-01 00:04:03	2019-01-01 00:11:48	1	3.01	1	N	137	262	1	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2019-01-01
   > > 20210803162030	20210803162030_0_1	tpep_pickup_datetime:2021-01-01 00:04:03 	date_col=2021-01-01	c5c72f9e-9a63-48ca-a981-4302890f5210-0_0-27-1635_20210803162030.parquet	2	2021-01-01 00:04:03	2021-01-01 00:11:48	1	3.01	1	N	137	262	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2021-01-01
   > > Time taken: 0.524 seconds, Fetched 2 row(s)
   > > ```
   > > 1st row was part of the table before onboarding to spark-sql.
   > > 2nd row was inserted using insert into.
   > > Hi @nsivabalan , I see the difference now. Spark SQL uses the `SqlKeyGenerator`, which is a sub-class of `ComplexKeyGenerator`, to generate the record key, and it adds the column name to the record key, while the `SimpleKeyGenerator` does not. So we should keep the behavior consistent between `ComplexKeyGenerator` and `SimpleKeyGenerator`.
   > 
   > sorry, I don't follow. I understand `SqlKeyGenerator` extends from `ComplexKeyGenerator`, but why do we need to keep the behavior the same for `SimpleKeyGenerator`? We should not add any field prefix for `SimpleKeyGenerator`; otherwise, no updates will work for an existing table.
   
   Hi @nsivabalan , I have solved the record-key mismatch issue. Please test again~





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   ## CI report:
   
   * 53186a44289eb1740d15d0b35afc9d9f3c9f5904 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331) 
   * 3aec37c6f238f4ed40672d20f7b7c6e6314a9519 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   ## CI report:
   
   * ee788e31ffff71eb1e684427b4b10f2604385851 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1441) 
   * 56a10a9cee28560d7a80d1abc5f8e15a463ddd87 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   ## CI report:
   
   * b87e99ee2121f013b659f09d9a1415c6e2ca3868 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361) 
   * 0af1f7f32ef4135c18336a91532b90a38a227cc2 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] nsivabalan edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-892140145


   Hey peng. I did a round of testing on this patch. Here are my findings.
   
   Insert into is still prefixing the col name to meta fields. (3rd col and 4th col)
   
   ```
   select * from hudi_ny where tpep_pickup_datetime like '%00:04:03%';
   20210802105420	20210802105420_2_23	2019-01-01 00:04:03	2019-01-01	c5e6a617-dfc5-4051-8c1a-8daead3847af-0_2-37-62_20210802105420.parquet	2	2019-01-01 00:04:03	2019-01-01 00:11:48	1	3.01	1	N	137	262	1	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2019-01-01
   20210803162030	20210803162030_0_1	tpep_pickup_datetime:2021-01-01 00:04:03 	date_col=2021-01-01	c5c72f9e-9a63-48ca-a981-4302890f5210-0_0-27-1635_20210803162030.parquet	2	2021-01-01 00:04:03	2021-01-01 00:11:48	1	3.01	1	N	137	262	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2021-01-01
   Time taken: 0.524 seconds, Fetched 2 row(s)
   ```
   1st row was part of the table before onboarding to spark-sql. 
   2nd row was inserted using insert into. 
   
   
   Hi @nsivabalan , I see the difference now. Spark SQL uses the `SqlKeyGenerator`, which is a sub-class of `ComplexKeyGenerator`, to generate the record key, and it adds the column name to the record key, while the `SimpleKeyGenerator` does not. So we should keep the behavior consistent between `ComplexKeyGenerator` and `SimpleKeyGenerator`.





[GitHub] [hudi] nsivabalan commented on a change in pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#discussion_r683847551



##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestCreateTable.scala
##########
@@ -272,4 +277,154 @@ class TestCreateTable extends TestHoodieSqlBase {
       )
     }
   }
+
+  test("Test Create Table From Exist Hoodie Table") {
+    withTempDir { tmp =>
+      // Write a table by spark dataframe.
+      Seq("2021-08-02", "2021/08/02").foreach { partitionValue =>
+        val tableName = generateTableName
+        val tablePath = s"${tmp.getCanonicalPath}/$tableName"
+        import spark.implicits._
+        val df = Seq((1, "a1", 10, 1000, partitionValue)).toDF("id", "name", "value", "ts", "dt")

Review comment:
       thanks for expanding the test scope.
   In general, there are two types of partitioning: simple and multi-level. Can you please add a test for multi-level as well (there are more, but these are the most commonly used)? For now, we are manually testing these to ensure the fix looks good; we should be able to cover all of these via unit tests.
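   For context, the two partitioning shapes referred to above differ only in how many fields make up the partition path. A minimal sketch (field values are illustrative, not taken from the PR):

   ```java
   // Illustrative sketch of simple vs. multi-level partition paths.
   // Field values are made up; this is not Hudi code.
   import java.util.List;

   public class PartitionPathSketch {
       // Simple partitioning: a single field, e.g. dt -> "2021-08-02"
       static String simple(String value) {
           return value;
       }

       // Multi-level partitioning: several fields joined into a nested
       // path, e.g. year, month, day -> "2021/08/02"
       static String multiLevel(List<String> values) {
           return String.join("/", values);
       }

       public static void main(String[] args) {
           System.out.println(simple("2021-08-02"));                     // 2021-08-02
           System.out.println(multiLevel(List.of("2021", "08", "02")));  // 2021/08/02
       }
   }
   ```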







[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   ## CI report:
   
   * ee788e31ffff71eb1e684427b4b10f2604385851 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1441) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1444) 
   * 56a10a9cee28560d7a80d1abc5f8e15a463ddd87 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1445) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] codope commented on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
codope commented on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-892777427


   > > > Hey peng. I did a round of testing on this patch. Here are my findings.
   > > > Insert into is still prefixing the col name to meta fields. (3rd col and 4th col)
   > > > ```
   > > > select * from hudi_ny where tpep_pickup_datetime like '%00:04:03%';
   > > > 20210802105420	20210802105420_2_23	2019-01-01 00:04:03	2019-01-01	c5e6a617-dfc5-4051-8c1a-8daead3847af-0_2-37-62_20210802105420.parquet	2	2019-01-01 00:04:03	2019-01-01 00:11:48	1	3.01	1	N	137	262	1	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2019-01-01
   > > > 20210803162030	20210803162030_0_1	tpep_pickup_datetime:2021-01-01 00:04:03 	date_col=2021-01-01	c5c72f9e-9a63-48ca-a981-4302890f5210-0_0-27-1635_20210803162030.parquet	2	2021-01-01 00:04:03	2021-01-01 00:11:48	1	3.01	1	N	137	262	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2021-01-01
   > > > Time taken: 0.524 seconds, Fetched 2 row(s)
   > > > ```
   > > > 1st row was part of the table before onboarding to spark-sql.
   > > > 2nd row was inserted using insert into.
   > > > Hi @nsivabalan , I see the difference now. Spark SQL uses the `SqlKeyGenerator`, which is a sub-class of `ComplexKeyGenerator`, to generate the record key, and it adds the column name to the record key, while the `SimpleKeyGenerator` does not. So we should keep the behavior consistent between `ComplexKeyGenerator` and `SimpleKeyGenerator`.
   > > 
   > > 
   > > sorry, I don't follow. I understand `SqlKeyGenerator` extends from `ComplexKeyGenerator`, but why do we need to keep the behavior the same for `SimpleKeyGenerator`? We should not add any field prefix for `SimpleKeyGenerator`; otherwise, no updates will work for an existing table.
   > 
   > Hi @nsivabalan , I have solved the record-key mismatch issue. Please test again~
   
   @pengzhiwei2018 I tested the patch. I can see the column names are no longer being prefixed. Updates and deletes by record key are working fine now. However, the URI encoding of the partition path is still an issue. For example, I did an insert into an existing partition. The insert was successful, but it created a new partition as below:
   ```
   insert into hudi_trips_cow values(1.0, 2.0, "driver_2", 3.0, 4.0, 100.0, "rider_2", 12345, "765544i-e89b-12d3-a456-426655440000", "americas/united_states/san_francisco/");
   
   % ls -l /private/tmp/hudi_trips_cow
   total 0
   drwxr-xr-x  4 sagars  wheel  128 Aug  4 16:49 americas
   drwxr-xr-x  6 sagars  wheel  192 Aug  4 16:50 americas%2Funited_states%2Fsan_francisco%2F
   drwxr-xr-x  3 sagars  wheel   96 Aug  4 16:49 asia
   ```
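   The doubled directory is consistent with the partition value being URI-encoded before it is used as a path, which turns the `/` separators into `%2F` and collapses the nested path into one directory name. A standalone reproduction of that encoding effect (not Hudi's actual code path):

   ```java
   // Standalone reproduction of the directory name seen above: URL-encoding
   // a partition value turns '/' into "%2F", so a nested partition path
   // collapses into a single directory. This is only a demonstration of the
   // encoding effect, not Hudi's code.
   import java.net.URLEncoder;
   import java.nio.charset.StandardCharsets;

   public class PartitionEncodingSketch {
       static String encode(String partitionValue) {
           return URLEncoder.encode(partitionValue, StandardCharsets.UTF_8);
       }

       public static void main(String[] args) {
           System.out.println(encode("americas/united_states/san_francisco/"));
           // americas%2Funited_states%2Fsan_francisco%2F
       }
   }
   ```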





[GitHub] [hudi] pengzhiwei2018 commented on a change in pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
pengzhiwei2018 commented on a change in pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#discussion_r682236278



##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestCreateTable.scala
##########
@@ -272,4 +278,48 @@ class TestCreateTable extends TestHoodieSqlBase {
       )
     }
   }
+
+  test("Test Create Table From Exist Hoodie Table") {
+    withTempDir{tmp =>
+      // Write a table by spark dataframe.
+      val tableName = generateTableName
+      import spark.implicits._
+      val df = Seq((1, "a1", 10, 1000, "2021-08-02")).toDF("id", "name", "value", "ts", "dt")
+      df.write.format("hudi")
+        .option(HoodieWriteConfig.TABLE_NAME.key, tableName)
+        .option(TABLE_TYPE_OPT_KEY.key, COW_TABLE_TYPE_OPT_VAL)
+        .option(RECORDKEY_FIELD_OPT_KEY.key, "id")
+        .option(PRECOMBINE_FIELD_OPT_KEY.key, "ts")
+        .option(PARTITIONPATH_FIELD_OPT_KEY.key, "dt")
+        .option(KEYGENERATOR_CLASS_OPT_KEY.key, classOf[ComplexKeyGenerator].getName)
+        .option(HoodieWriteConfig.INSERT_PARALLELISM.key, "1")
+        .option(HoodieWriteConfig.UPSERT_PARALLELISM.key, "1")
+        .mode(SaveMode.Overwrite)
+        .save(tmp.getCanonicalPath)
+
+      // Create a table over the exist old table.
+      spark.sql(
+        s"""
+           |create table $tableName using hudi
+           | options (
+           | primaryKey = 'id',
+           | preCombineField = 'ts'
+           |)
+           |partitioned by (dt)
+           |location '${tmp.getCanonicalPath}'
+           |""".stripMargin)
+      checkAnswer(s"select id, name, value, ts, dt from $tableName")(
+        Seq(1, "a1", 10, 1000, "2021-08-02")
+      )
+      // Check the missing properties for spark sql
+      val metaClient = HoodieTableMetaClient.builder()
+        .setBasePath(tmp.getCanonicalPath)
+        .setConf(spark.sessionState.newHadoopConf())
+        .build()
+      val properties = metaClient.getTableConfig.getProps.asScala.toMap
+      assertResult(true)(properties.contains(HoodieTableConfig.HOODIE_TABLE_CREATE_SCHEMA.key))
+      assertResult("dt")(properties(HoodieTableConfig.HOODIE_TABLE_PARTITION_FIELDS_PROP.key))
+      assertResult("ts")(properties(HoodieTableConfig.HOODIE_TABLE_PRECOMBINE_FIELD_PROP.key))

Review comment:
       Yes, I will add more test cases for this.







[GitHub] [hudi] nsivabalan commented on a change in pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#discussion_r683847551



##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestCreateTable.scala
##########
@@ -272,4 +277,154 @@ class TestCreateTable extends TestHoodieSqlBase {
       )
     }
   }
+
+  test("Test Create Table From Exist Hoodie Table") {
+    withTempDir { tmp =>
+      // Write a table by spark dataframe.
+      Seq("2021-08-02", "2021/08/02").foreach { partitionValue =>
+        val tableName = generateTableName
+        val tablePath = s"${tmp.getCanonicalPath}/$tableName"
+        import spark.implicits._
+        val df = Seq((1, "a1", 10, 1000, partitionValue)).toDF("id", "name", "value", "ts", "dt")

Review comment:
       thanks for expanding the test scope.
   In general, there are two types of partitioning: simple and multi-level. Can you please add a test for multi-level as well? For now, we are manually testing these to ensure the fix looks good; we should be able to cover all of these via unit tests.

##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestCreateTable.scala
##########
@@ -272,4 +277,154 @@ class TestCreateTable extends TestHoodieSqlBase {
       )
     }
   }
+
+  test("Test Create Table From Exist Hoodie Table") {
+    withTempDir { tmp =>
+      // Write a table by spark dataframe.
+      Seq("2021-08-02", "2021/08/02").foreach { partitionValue =>
+        val tableName = generateTableName
+        val tablePath = s"${tmp.getCanonicalPath}/$tableName"
+        import spark.implicits._
+        val df = Seq((1, "a1", 10, 1000, partitionValue)).toDF("id", "name", "value", "ts", "dt")
+        df.write.format("hudi")
+          .option(HoodieWriteConfig.TABLE_NAME.key, tableName)
+          .option(TABLE_TYPE.key, COW_TABLE_TYPE_OPT_VAL)
+          .option(RECORDKEY_FIELD.key, "id")
+          .option(PRECOMBINE_FIELD.key, "ts")
+          .option(PARTITIONPATH_FIELD.key, "dt")
+          .option(KEYGENERATOR_CLASS.key, classOf[SimpleKeyGenerator].getName)
+          .option(HoodieWriteConfig.INSERT_PARALLELISM.key, "1")
+          .option(HoodieWriteConfig.UPSERT_PARALLELISM.key, "1")
+          .mode(SaveMode.Overwrite)
+          .save(tablePath)
+
+        // Create a table over the exist old table.
+        spark.sql(
+          s"""
+             |create table $tableName using hudi
+             | options (
+             | primaryKey = 'id',
+             | preCombineField = 'ts'
+             |)
+             |partitioned by (dt)
+             |location '$tablePath'
+             |""".stripMargin)
+        checkAnswer(s"select id, name, value, ts, dt from $tableName")(
+          Seq(1, "a1", 10, 1000, partitionValue)
+        )
+        // Check the missing properties for spark sql
+        val metaClient = HoodieTableMetaClient.builder()
+          .setBasePath(tablePath)
+          .setConf(spark.sessionState.newHadoopConf())
+          .build()
+        val properties = metaClient.getTableConfig.getProps.asScala.toMap
+        assertResult(true)(properties.contains(HoodieTableConfig.HOODIE_TABLE_CREATE_SCHEMA.key))
+        assertResult("dt")(properties(HoodieTableConfig.HOODIE_TABLE_PARTITION_FIELDS_PROP.key))
+        assertResult("ts")(properties(HoodieTableConfig.HOODIE_TABLE_PRECOMBINE_FIELD_PROP.key))
+
+        // Test insert into
+        spark.sql(s"insert into $tableName values(2, 'a2', 10, 1000, '$partitionValue')")
+        checkAnswer(s"select id, name, value, ts, dt from $tableName order by id")(
+          Seq(1, "a1", 10, 1000, partitionValue),

Review comment:
       Can we please add validations for the meta fields, i.e. `_hoodie_record_key` and `_hoodie_partition_path`? We are seeing issues around URL encoding, so it would be good to cover those here in unit tests.
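
       For context on the URL-encoding concern: a partition value such as `2021/08/02` contains a `/`, which would otherwise introduce a spurious directory level in the partition path, so it gets percent-encoded. A minimal sketch of the encoding behavior using the JDK encoder (an illustration only, not Hudi's exact code path):

```scala
import java.net.{URLDecoder, URLEncoder}

object PartitionEncodingSketch {
  // Percent-encode a partition value so '/' cannot create extra path levels.
  def encode(v: String): String = URLEncoder.encode(v, "UTF-8")

  // Decode it back when reconstructing the original partition value.
  def decode(v: String): String = URLDecoder.decode(v, "UTF-8")
}
```

       Here `encode("2021/08/02")` yields `2021%2F08%2F02`, while `2021-08-02` passes through unchanged — which is exactly why `_hoodie_partition_path` assertions should cover both partition-value shapes used in this test.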




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350",
       "triggerID" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361",
       "triggerID" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366",
       "triggerID" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1367",
       "triggerID" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * b75f709b3759b283822cc55f8c3030bc57fc4122 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1367) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350",
       "triggerID" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361",
       "triggerID" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366",
       "triggerID" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1367",
       "triggerID" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "triggerType" : "PUSH"
     }, {
       "hash" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1394",
       "triggerID" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "triggerType" : "PUSH"
     }, {
       "hash" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1399",
       "triggerID" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1404",
       "triggerID" : "6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1441",
       "triggerID" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1444",
       "triggerID" : "894356017",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "56a10a9cee28560d7a80d1abc5f8e15a463ddd87",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1445",
       "triggerID" : "56a10a9cee28560d7a80d1abc5f8e15a463ddd87",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 56a10a9cee28560d7a80d1abc5f8e15a463ddd87 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1445) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350",
       "triggerID" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361",
       "triggerID" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366",
       "triggerID" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1367",
       "triggerID" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "triggerType" : "PUSH"
     }, {
       "hash" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1394",
       "triggerID" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "triggerType" : "PUSH"
     }, {
       "hash" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1399",
       "triggerID" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1404",
       "triggerID" : "6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1441",
       "triggerID" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1444",
       "triggerID" : "894356017",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "56a10a9cee28560d7a80d1abc5f8e15a463ddd87",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1445",
       "triggerID" : "56a10a9cee28560d7a80d1abc5f8e15a463ddd87",
       "triggerType" : "PUSH"
     }, {
       "hash" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1452",
       "triggerID" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1453",
       "triggerID" : "894595802",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1452",
       "triggerID" : "894595802",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a575eba527237c43ec6b150e105005adfce07f80",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1454",
       "triggerID" : "a575eba527237c43ec6b150e105005adfce07f80",
       "triggerType" : "PUSH"
     }, {
       "hash" : "caa79aef468f6e82de2c84e7b35975326c42b626",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1457",
       "triggerID" : "894603173",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "caa79aef468f6e82de2c84e7b35975326c42b626",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1457",
       "triggerID" : "caa79aef468f6e82de2c84e7b35975326c42b626",
       "triggerType" : "PUSH"
     }, {
       "hash" : "caa79aef468f6e82de2c84e7b35975326c42b626",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1457",
       "triggerID" : "894607773",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "",
       "status" : "DELETED",
       "url" : "TBD",
       "triggerID" : "894603173",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * caa79aef468f6e82de2c84e7b35975326c42b626 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1457) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] pengzhiwei2018 removed a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
pengzhiwei2018 removed a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-894603173


   @hudi-bot run azure





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350",
       "triggerID" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361",
       "triggerID" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366",
       "triggerID" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1367",
       "triggerID" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "triggerType" : "PUSH"
     }, {
       "hash" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1394",
       "triggerID" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "triggerType" : "PUSH"
     }, {
       "hash" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1399",
       "triggerID" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1404",
       "triggerID" : "6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1441",
       "triggerID" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1444",
       "triggerID" : "894356017",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "56a10a9cee28560d7a80d1abc5f8e15a463ddd87",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1445",
       "triggerID" : "56a10a9cee28560d7a80d1abc5f8e15a463ddd87",
       "triggerType" : "PUSH"
     }, {
       "hash" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1452",
       "triggerID" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1453",
       "triggerID" : "894595802",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1452",
       "triggerID" : "894595802",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 797b288a393089f2c09f7013ac5b378d8946bee9 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1453) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1452) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] nsivabalan commented on a change in pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#discussion_r681950625



##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/analysis/HoodieAnalysis.scala
##########
@@ -268,7 +269,25 @@ case class HoodieResolveReferences(sparkSession: SparkSession) extends Rule[Logi
       } else {
         l
       }
-
+    // Fill schema for Create Table without specify schema info
+    case c @ CreateTable(tableDesc, _, _)
+      if isHoodieTable(tableDesc) =>
+        val tablePath = getTableLocation(c.tableDesc, sparkSession)
+          .getOrElse(s"Missing location defined in table ${c.tableDesc.identifier}")

Review comment:
       Is it the case that, for a new table that is getting created, tablePath will be set to "Missing location defined in table ..."? In other words, if the table already existed, tablePath will be set to the right value; if not, it will be set to this string.
   Is my understanding right? If so, why can't we return right away when the table does not exist, to maintain the same flow as before for tables that are just getting created?
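
       The concern can be reproduced in plain Scala: with a plain string fallback, `Option.getOrElse` silently returns the error message as if it were a valid path, whereas a thrown exception in the fallback fails fast. The names below are hypothetical; this is a sketch of the pattern, not the PR's code:

```scala
object GetOrElseSketch {
  // Bug pattern: the error message itself becomes the "table path".
  def lookupLocation(loc: Option[String]): String =
    loc.getOrElse("Missing location defined in table h0")

  // Fail-fast alternative: getOrElse takes a by-name argument, so a
  // throw expression (type Nothing) is a legal fallback.
  def lookupLocationStrict(loc: Option[String]): String =
    loc.getOrElse(throw new IllegalArgumentException(
      "Missing location defined in table h0"))
}
```

       With `None`, the first variant hands the message string to the downstream HoodieTableMetaClient builder as a base path, while the second surfaces the error immediately.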
    

##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/analysis/HoodieAnalysis.scala
##########
@@ -268,7 +269,25 @@ case class HoodieResolveReferences(sparkSession: SparkSession) extends Rule[Logi
       } else {
         l
       }
-
+    // Fill schema for Create Table without specify schema info
+    case c @ CreateTable(tableDesc, _, _)
+      if isHoodieTable(tableDesc) =>
+        val tablePath = getTableLocation(c.tableDesc, sparkSession)
+          .getOrElse(s"Missing location defined in table ${c.tableDesc.identifier}")
+        val metaClient = HoodieTableMetaClient.builder()
+          .setBasePath(tablePath)
+          .setConf(sparkSession.sessionState.newHadoopConf())
+          .build()
+        val tableSchema = HoodieSqlUtils.getTableSqlSchema(metaClient).map(HoodieSqlUtils.addMetaFields)
+        if (tableSchema.isDefined && tableDesc.schema.isEmpty) {
+          // Fill the schema with the schema from the table
+          c.copy(tableDesc.copy(schema = tableSchema.get))
+        } else if (tableSchema.isDefined && tableDesc.schema != tableSchema.get) {

Review comment:
       should we do "!=" or ".equals()" for schema comparison?
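
       For what it's worth, in Scala `==` is null-safe and delegates to `equals`, so for value-like schema types the two comparisons agree; the more interesting question is whether the comparison should ignore things like nullability or metadata. A stand-in sketch with plain case classes (hypothetical types, not Spark's `StructType`):

```scala
// Stand-ins for Spark's StructField/StructType; case-class equality is
// structural, and Scala's == delegates to equals (after a null check).
final case class Field(name: String, dataType: String)
final case class Schema(fields: Seq[Field])

object SchemaCompareSketch {
  val a = Schema(Seq(Field("id", "int"), Field("ts", "long")))
  val b = Schema(Seq(Field("id", "int"), Field("ts", "long")))
}
```

       Both `a == b` and `a.equals(b)` hold here, so `!=` versus `.equals()` is mostly a style choice in Scala.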

##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/CreateHoodieTableCommand.scala
##########
@@ -306,34 +299,49 @@ case class CreateHoodieTableCommand(table: CatalogTable, ignoreIfExists: Boolean
 object CreateHoodieTableCommand extends Logging {
 
   /**
-    * Init the table if it is not exists.
-    * @param sparkSession
-    * @param table
-    * @return
+    * Init the hoodie.properties.
     */
   def initTableIfNeed(sparkSession: SparkSession, table: CatalogTable): Unit = {
     val location = getTableLocation(table, sparkSession).getOrElse(
       throw new IllegalArgumentException(s"Missing location for ${table.identifier}"))
 
     val conf = sparkSession.sessionState.newHadoopConf()
     // Init the hoodie table
-    if (!tableExistsInPath(location, conf)) {
-      val tableName = table.identifier.table
-      logInfo(s"Table $tableName is not exists, start to create the hudi table")
+    val originTableConfig = if (tableExistsInPath(location, conf)) {
+      val metaClient = HoodieTableMetaClient.builder()
+        .setBasePath(location)
+        .setConf(conf)
+        .build()
+      metaClient.getTableConfig.getProps.asScala.toMap
+    } else {
+      Map.empty[String, String]
+    }
 
-      // Save all the table config to the hoodie.properties.
-      val parameters = HoodieOptionConfig.mappingSqlOptionToTableConfig(table.storage.properties)
-      val properties = new Properties()
+    val tableName = table.identifier.table
+    logInfo(s"Init hoodie.properties for $tableName")
+    val tableOptions = HoodieOptionConfig.mappingSqlOptionToTableConfig(table.storage.properties)
+    checkTableConfigEqual(originTableConfig, tableOptions, HoodieTableConfig.HOODIE_TABLE_PRECOMBINE_FIELD_PROP.key)
+    checkTableConfigEqual(originTableConfig, tableOptions, HoodieTableConfig.HOODIE_TABLE_PARTITION_FIELDS_PROP.key)

Review comment:
       May I know why the record key field is not validated?
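
       To make the suggested validation concrete, here is a minimal stand-in for a `checkTableConfigEqual`-style helper (hypothetical signature; the PR's actual helper may differ) that could equally be applied to the record key field:

```scala
object TableConfigCheckSketch {
  // Both configs must agree on `key` when it is present in both; a key
  // missing on either side is accepted (nothing to compare against).
  def checkTableConfigEqual(origin: Map[String, String],
                            incoming: Map[String, String],
                            key: String): Unit =
    (origin.get(key), incoming.get(key)) match {
      case (Some(a), Some(b)) if a != b =>
        throw new IllegalArgumentException(
          s"Table config mismatch for '$key': '$a' vs '$b'")
      case _ => ()
    }
}
```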

##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestCreateTable.scala
##########
@@ -272,4 +278,48 @@ class TestCreateTable extends TestHoodieSqlBase {
       )
     }
   }
+
+  test("Test Create Table From Exist Hoodie Table") {
+    withTempDir{tmp =>
+      // Write a table by spark dataframe.
+      val tableName = generateTableName
+      import spark.implicits._
+      val df = Seq((1, "a1", 10, 1000, "2021-08-02")).toDF("id", "name", "value", "ts", "dt")
+      df.write.format("hudi")
+        .option(HoodieWriteConfig.TABLE_NAME.key, tableName)
+        .option(TABLE_TYPE_OPT_KEY.key, COW_TABLE_TYPE_OPT_VAL)
+        .option(RECORDKEY_FIELD_OPT_KEY.key, "id")
+        .option(PRECOMBINE_FIELD_OPT_KEY.key, "ts")
+        .option(PARTITIONPATH_FIELD_OPT_KEY.key, "dt")
+        .option(KEYGENERATOR_CLASS_OPT_KEY.key, classOf[ComplexKeyGenerator].getName)
+        .option(HoodieWriteConfig.INSERT_PARALLELISM.key, "1")
+        .option(HoodieWriteConfig.UPSERT_PARALLELISM.key, "1")
+        .mode(SaveMode.Overwrite)
+        .save(tmp.getCanonicalPath)
+
+      // Create a table over the exist old table.
+      spark.sql(
+        s"""
+           |create table $tableName using hudi
+           | options (
+           | primaryKey = 'id',
+           | preCombineField = 'ts'
+           |)
+           |partitioned by (dt)
+           |location '${tmp.getCanonicalPath}'
+           |""".stripMargin)
+      checkAnswer(s"select id, name, value, ts, dt from $tableName")(
+        Seq(1, "a1", 10, 1000, "2021-08-02")
+      )
+      // Check the missing properties for spark sql
+      val metaClient = HoodieTableMetaClient.builder()
+        .setBasePath(tmp.getCanonicalPath)
+        .setConf(spark.sessionState.newHadoopConf())
+        .build()
+      val properties = metaClient.getTableConfig.getProps.asScala.toMap
+      assertResult(true)(properties.contains(HoodieTableConfig.HOODIE_TABLE_CREATE_SCHEMA.key))
+      assertResult("dt")(properties(HoodieTableConfig.HOODIE_TABLE_PARTITION_FIELDS_PROP.key))
+      assertResult("ts")(properties(HoodieTableConfig.HOODIE_TABLE_PRECOMBINE_FIELD_PROP.key))

Review comment:
       Can we do some basic table operations as well?
   1. Create a hudi table with the spark datasource and a few records (let's call these the old records).
   2. Create the same table in spark-sql:
      a. ensure the table props match
      b. INSERT INTO works
      c. UPDATE -> updates both old records and new records
      d. DELETE -> again, ensure that both old and new records that matched are deleted
      e. MERGE INTO -> ensure there is a match for both an old record and a new record; both should get updated

##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/analysis/HoodieAnalysis.scala
##########
@@ -268,7 +269,25 @@ case class HoodieResolveReferences(sparkSession: SparkSession) extends Rule[Logi
       } else {
         l
       }
-
+    // Fill schema for Create Table without specify schema info
+    case c @ CreateTable(tableDesc, _, _)
+      if isHoodieTable(tableDesc) =>
+        val tablePath = getTableLocation(c.tableDesc, sparkSession)
+          .getOrElse(s"Missing location defined in table ${c.tableDesc.identifier}")

Review comment:
       Or will this `case c @ CreateTable(tableDesc, _, _)` only be invoked if the table already exists?







[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350",
       "triggerID" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361",
       "triggerID" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366",
       "triggerID" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1367",
       "triggerID" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "triggerType" : "PUSH"
     }, {
       "hash" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1394",
       "triggerID" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * b75f709b3759b283822cc55f8c3030bc57fc4122 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1367) 
   * 314a8f66727958ac7830c9d82e8a7ea97be74900 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1394) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350",
       "triggerID" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361",
       "triggerID" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366",
       "triggerID" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1367",
       "triggerID" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "triggerType" : "PUSH"
     }, {
       "hash" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1394",
       "triggerID" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "triggerType" : "PUSH"
     }, {
       "hash" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1399",
       "triggerID" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1404",
       "triggerID" : "6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1441",
       "triggerID" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1444",
       "triggerID" : "894356017",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "56a10a9cee28560d7a80d1abc5f8e15a463ddd87",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1445",
       "triggerID" : "56a10a9cee28560d7a80d1abc5f8e15a463ddd87",
       "triggerType" : "PUSH"
     }, {
       "hash" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1452",
       "triggerID" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1453",
       "triggerID" : "894595802",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1452",
       "triggerID" : "894595802",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a575eba527237c43ec6b150e105005adfce07f80",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1454",
       "triggerID" : "a575eba527237c43ec6b150e105005adfce07f80",
       "triggerType" : "PUSH"
     }, {
       "hash" : "caa79aef468f6e82de2c84e7b35975326c42b626",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "894603173",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "",
       "status" : "CANCELED",
       "url" : "TBD",
       "triggerID" : "894603173",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "caa79aef468f6e82de2c84e7b35975326c42b626",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "caa79aef468f6e82de2c84e7b35975326c42b626",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a575eba527237c43ec6b150e105005adfce07f80 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1454) 
   *  Unknown: [CANCELED](TBD) 
   * caa79aef468f6e82de2c84e7b35975326c42b626 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] nsivabalan edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-892140145


   Hey peng. I did a round of testing on this patch. Here are my findings.
   
   Insert into is still prefixing the column name to the meta fields. (3rd col and 4th col)
   
   ```
   select * from hudi_ny where tpep_pickup_datetime like '%00:04:03%';
   20210802105420	20210802105420_2_23	2019-01-01 00:04:03	2019-01-01	c5e6a617-dfc5-4051-8c1a-8daead3847af-0_2-37-62_20210802105420.parquet	2	2019-01-01 00:04:03	2019-01-01 00:11:48	1	3.01	1	N	137	262	1	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2019-01-01
   20210803162030	20210803162030_0_1	tpep_pickup_datetime:2021-01-01 00:04:03 	date_col=2021-01-01	c5c72f9e-9a63-48ca-a981-4302890f5210-0_0-27-1635_20210803162030.parquet	2	2021-01-01 00:04:03	2021-01-01 00:11:48	1	3.01	1	N	137	262	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2021-01-01
   Time taken: 0.524 seconds, Fetched 2 row(s)
   ```
   1st row was part of the table before onboarding to spark-sql. 
   2nd row was inserted using insert into. 
   
   
   Hi @nsivabalan , I know the difference now. Spark SQL uses the `SqlKeyGenerator`, which is a subclass of `ComplexKeyGenerator`, to generate the record key, and it adds the column name to the record key, while the `SimpleKeyGenerator` does not do that. So we should keep the behavior the same for `ComplexKeyGenerator` and `SimpleKeyGenerator`.
   
   > > Hey peng. I did a round of testing on this patch. Here are my findings.
   > > Insert into is till prefixing col name to meta fields. (3rd col and 4th col)
   > > ```
   > > select * from hudi_ny where tpep_pickup_datetime like '%00:04:03%';
   > > 20210802105420	20210802105420_2_23	2019-01-01 00:04:03	2019-01-01	c5e6a617-dfc5-4051-8c1a-8daead3847af-0_2-37-62_20210802105420.parquet	2	2019-01-01 00:04:03	2019-01-01 00:11:48	1	3.01	1	N	137	262	1	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2019-01-01
   > > 20210803162030	20210803162030_0_1	tpep_pickup_datetime:2021-01-01 00:04:03 	date_col=2021-01-01	c5c72f9e-9a63-48ca-a981-4302890f5210-0_0-27-1635_20210803162030.parquet	2	2021-01-01 00:04:03	2021-01-01 00:11:48	1	3.01	1	N	137	262	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2021-01-01
   > > Time taken: 0.524 seconds, Fetched 2 row(s)
   > > ```
   > > 1st row was part of the table before onboarding to spark-sql.
   > > 2nd row was inserted using insert into.
   > > Hi @nsivabalan , I know the difference now. Spark SQL uses the `SqlKeyGenerator`, which is a subclass of `ComplexKeyGenerator`, to generate the record key, and it adds the column name to the record key, while the `SimpleKeyGenerator` does not do that. So we should keep the behavior the same for `ComplexKeyGenerator` and `SimpleKeyGenerator`.
   > 
   > Sorry, I don't follow. I understand `SqlKeyGenerator` extends `ComplexKeyGenerator`, but why do we need to keep the behavior the same as `SimpleKeyGenerator`? We should not add any field prefix for `SimpleKeyGenerator`; otherwise, no updates will work for an existing table.
   
   Yes, I understand your concern. I am trying to solve this.
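   To make the mismatch above concrete, here is a minimal sketch (not Hudi code; the helper names and the sample record are made up for illustration) of the two record-key formats being discussed: a `SimpleKeyGenerator`-style key is the raw field value, while a `ComplexKeyGenerator`-style key embeds the field name, so the two keys never match for the same row:
   
   ```python
   # Illustrative sketch of the key formats discussed above; field and
   # function names here are hypothetical, not Hudi APIs.
   
   def simple_style_key(record, key_field):
       """SimpleKeyGenerator-style key: just the raw field value."""
       return str(record[key_field])
   
   def complex_style_key(record, key_fields):
       """ComplexKeyGenerator-style key: 'field:value' pairs joined by ','."""
       return ",".join(f"{field}:{record[field]}" for field in key_fields)
   
   record = {"tpep_pickup_datetime": "2021-01-01 00:04:03"}
   
   old_key = simple_style_key(record, "tpep_pickup_datetime")
   new_key = complex_style_key(record, ["tpep_pickup_datetime"])
   
   print(old_key)  # 2021-01-01 00:04:03
   print(new_key)  # tpep_pickup_datetime:2021-01-01 00:04:03
   
   # The keys differ, so an upsert issued through spark-sql would not
   # match a record written earlier through the datasource path.
   assert old_key != new_key
   ```
   
   This is why keeping the field-name prefix for a single-field key breaks updates against an existing table written with `SimpleKeyGenerator`.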





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 53186a44289eb1740d15d0b35afc9d9f3c9f5904 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350",
       "triggerID" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 3aec37c6f238f4ed40672d20f7b7c6e6314a9519 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350",
       "triggerID" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361",
       "triggerID" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366",
       "triggerID" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1367",
       "triggerID" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0af1f7f32ef4135c18336a91532b90a38a227cc2 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366) 
   * b75f709b3759b283822cc55f8c3030bc57fc4122 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1367) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350",
       "triggerID" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361",
       "triggerID" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366",
       "triggerID" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1367",
       "triggerID" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "triggerType" : "PUSH"
     }, {
       "hash" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1394",
       "triggerID" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "triggerType" : "PUSH"
     }, {
       "hash" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1399",
       "triggerID" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1404",
       "triggerID" : "6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1441",
       "triggerID" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1444",
       "triggerID" : "894356017",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "56a10a9cee28560d7a80d1abc5f8e15a463ddd87",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1445",
       "triggerID" : "56a10a9cee28560d7a80d1abc5f8e15a463ddd87",
       "triggerType" : "PUSH"
     }, {
       "hash" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 56a10a9cee28560d7a80d1abc5f8e15a463ddd87 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1445) 
   * 797b288a393089f2c09f7013ac5b378d8946bee9 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350",
       "triggerID" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361",
       "triggerID" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366",
       "triggerID" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1367",
       "triggerID" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "triggerType" : "PUSH"
     }, {
       "hash" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1394",
       "triggerID" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "triggerType" : "PUSH"
     }, {
       "hash" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1399",
       "triggerID" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1404",
       "triggerID" : "6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1441",
       "triggerID" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "triggerType" : "PUSH"
     }, {
       "hash" : "56a10a9cee28560d7a80d1abc5f8e15a463ddd87",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "56a10a9cee28560d7a80d1abc5f8e15a463ddd87",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1444",
       "triggerID" : "894356017",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * ee788e31ffff71eb1e684427b4b10f2604385851 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1441) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1444) 
   * 56a10a9cee28560d7a80d1abc5f8e15a463ddd87 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350",
       "triggerID" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361",
       "triggerID" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366",
       "triggerID" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1367",
       "triggerID" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "triggerType" : "PUSH"
     }, {
       "hash" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1394",
       "triggerID" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "triggerType" : "PUSH"
     }, {
       "hash" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1399",
       "triggerID" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1404",
       "triggerID" : "6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1441",
       "triggerID" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1444",
       "triggerID" : "894356017",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "56a10a9cee28560d7a80d1abc5f8e15a463ddd87",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1445",
       "triggerID" : "56a10a9cee28560d7a80d1abc5f8e15a463ddd87",
       "triggerType" : "PUSH"
     }, {
       "hash" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1452",
       "triggerID" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1453",
       "triggerID" : "894595802",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 56a10a9cee28560d7a80d1abc5f8e15a463ddd87 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1445) 
   * 797b288a393089f2c09f7013ac5b378d8946bee9 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1452) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1453) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] pengzhiwei2018 commented on a change in pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
pengzhiwei2018 commented on a change in pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#discussion_r682235910



##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/analysis/HoodieAnalysis.scala
##########
@@ -268,7 +269,25 @@ case class HoodieResolveReferences(sparkSession: SparkSession) extends Rule[Logi
       } else {
         l
       }
-
+    // Fill schema for Create Table without specify schema info
+    case c @ CreateTable(tableDesc, _, _)
+      if isHoodieTable(tableDesc) =>
+        val tablePath = getTableLocation(c.tableDesc, sparkSession)
+          .getOrElse(s"Missing location defined in table ${c.tableDesc.identifier}")
+        val metaClient = HoodieTableMetaClient.builder()
+          .setBasePath(tablePath)
+          .setConf(sparkSession.sessionState.newHadoopConf())
+          .build()
+        val tableSchema = HoodieSqlUtils.getTableSqlSchema(metaClient).map(HoodieSqlUtils.addMetaFields)
+        if (tableSchema.isDefined && tableDesc.schema.isEmpty) {
+          // Fill the schema with the schema from the table
+          c.copy(tableDesc.copy(schema = tableSchema.get))
+        } else if (tableSchema.isDefined && tableDesc.schema != tableSchema.get) {

Review comment:
       Well, in Scala we usually use `==` or `!=` to compare two objects for equality.
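
   For reference, a minimal standalone Scala sketch (not Hudi code; the case classes are hypothetical stand-ins for Spark's `StructType`) showing that `==` on case classes delegates to the generated `equals` and so compares contents, which is why `tableDesc.schema != tableSchema.get` works here:

   ```scala
   // Hypothetical stand-ins for a schema type; not Spark/Hudi classes.
   case class Field(name: String, dataType: String)
   case class Schema(fields: Seq[Field])

   object EqualityDemo extends App {
     val a = Schema(Seq(Field("id", "int"), Field("ts", "long")))
     val b = Schema(Seq(Field("id", "int"), Field("ts", "long")))
     assert(a == b)      // `==` delegates to equals: structural comparison
     assert(!(a eq b))   // `eq` is reference equality: distinct instances
     println("ok")
   }
   ```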







[GitHub] [hudi] nsivabalan edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-892140145


   Hey peng. I did a round of testing on this patch. Here are my findings.
   
   `insert into` is still prefixing the column name to the meta fields.
   
   ```
   select * from hudi_ny where tpep_pickup_datetime like '%00:04:03%';
   20210802105420	20210802105420_2_23	2019-01-01 00:04:03	2019-01-01	c5e6a617-dfc5-4051-8c1a-8daead3847af-0_2-37-62_20210802105420.parquet	2	2019-01-01 00:04:03	2019-01-01 00:11:48	1	3.01	1	N	137	262	1	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2019-01-01
   20210803162030	20210803162030_0_1	tpep_pickup_datetime:2021-01-01 00:04:03 	date_col=2021-01-01	c5c72f9e-9a63-48ca-a981-4302890f5210-0_0-27-1635_20210803162030.parquet	2	2021-01-01 00:04:03	2021-01-01 00:11:48	1	3.01	1	N	137	262	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2021-01-01
   Time taken: 0.524 seconds, Fetched 2 row(s)
   ```
   The 1st row was part of the table before onboarding to spark-sql.
   The 2nd row was inserted using `insert into`.
   
   
   





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350",
       "triggerID" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361",
       "triggerID" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366",
       "triggerID" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * b87e99ee2121f013b659f09d9a1415c6e2ca3868 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361) 
   * 0af1f7f32ef4135c18336a91532b90a38a227cc2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] pengzhiwei2018 commented on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
pengzhiwei2018 commented on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-894607773


   @hudi-bot run azure
   
   





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350",
       "triggerID" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361",
       "triggerID" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366",
       "triggerID" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1367",
       "triggerID" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "triggerType" : "PUSH"
     }, {
       "hash" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1394",
       "triggerID" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 314a8f66727958ac7830c9d82e8a7ea97be74900 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1394) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] pengzhiwei2018 edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
pengzhiwei2018 edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-893180101


   > > > > Hey peng. I did a round of testing on this patch. Here are my findings.
   > > > > Insert into is till prefixing col name to meta fields. (3rd col and 4th col)
   > > > > ```
   > > > > select * from hudi_ny where tpep_pickup_datetime like '%00:04:03%';
   > > > > 20210802105420	20210802105420_2_23	2019-01-01 00:04:03	2019-01-01	c5e6a617-dfc5-4051-8c1a-8daead3847af-0_2-37-62_20210802105420.parquet	2	2019-01-01 00:04:03	2019-01-01 00:11:48	1	3.01	1	N	137	262	1	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2019-01-01
   > > > > 20210803162030	20210803162030_0_1	tpep_pickup_datetime:2021-01-01 00:04:03 	date_col=2021-01-01	c5c72f9e-9a63-48ca-a981-4302890f5210-0_0-27-1635_20210803162030.parquet	2	2021-01-01 00:04:03	2021-01-01 00:11:48	1	3.01	1	N	137	262	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2021-01-01
   > > > > Time taken: 0.524 seconds, Fetched 2 row(s)
   > > > > ```
   > > > > 
   > > > > 1st row was part of the table before onboarding to spark-sql.
   > > > > 2nd row was inserted using insert into.
   > > > > Hi @nsivabalan , I see the difference now. Spark SQL uses the `SqlKeyGenerator`, a sub-class of `ComplexKeyGenerator`, to generate the record key, and it prefixes the column name to the record key, while the `SimpleKeyGenerator` does not. So we should keep the behavior consistent between `ComplexKeyGenerator` and `SimpleKeyGenerator`.
   > > > 
   > > > 
   > > > Sorry, I don't follow. I understand `SqlKeyGenerator` extends `ComplexKeyGenerator`, but why do we need to keep the same behavior for `SimpleKeyGenerator`? We should not add any field prefix for `SimpleKeyGenerator`; otherwise, no updates will work for an existing table.
   > > 
   > > 
   > > Hi @nsivabalan , I have solved the record-key mismatch issue. Please test again~
   > 
   > @pengzhiwei2018 I tested the patch. I can see the column names are no longer being prefixed. Updates and deletes by record key are working fine now. However, the URI encoding of the partition path is still an issue. For example, I did an insert into an existing partition. The insert was successful, but it created a new partition as below:
   > 
   > ```
   > insert into hudi_trips_cow values(1.0, 2.0, "driver_2", 3.0, 4.0, 100.0, "rider_2", 12345, "765544i-e89b-12d3-a456-426655440000", "americas/united_states/san_francisco/");
   > 
   > % ls -l /private/tmp/hudi_trips_cow
   > total 0
   > drwxr-xr-x  4 sagars  wheel  128 Aug  4 16:49 americas
   > drwxr-xr-x  6 sagars  wheel  192 Aug  4 16:50 americas%2Funited_states%2Fsan_francisco%2F
   > drwxr-xr-x  3 sagars  wheel   96 Aug  4 16:49 asia
   > ```
   
   Hi @codope , can you drop the table and create it again with the latest code of this patch? I am afraid this happened because you created the table with an older version of the patch. I have fixed this issue in the latest code.
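
   A hypothetical sketch (simplified names, not the actual Hudi key generator classes) of the prefixing difference discussed above: `ComplexKeyGenerator`-style keys carry `field:value` pairs, while `SimpleKeyGenerator`-style keys are the raw value only, so keys written by the two paths do not match.

   ```scala
   object KeySketch extends App {
     // SimpleKeyGenerator style: the raw field value is the record key.
     def simpleKey(value: String): String = value

     // ComplexKeyGenerator style: each key field is prefixed with its name.
     def complexKey(fields: Seq[(String, String)]): String =
       fields.map { case (f, v) => s"$f:$v" }.mkString(",")

     // A key written by the datasource path (SimpleKeyGenerator style):
     assert(simpleKey("2021-01-01 00:04:03") == "2021-01-01 00:04:03")

     // The same record written via the SQL path before the fix gets a
     // prefixed key that no longer matches the existing one:
     assert(complexKey(Seq("tpep_pickup_datetime" -> "2021-01-01 00:04:03"))
       == "tpep_pickup_datetime:2021-01-01 00:04:03")
     println("ok")
   }
   ```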





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 53186a44289eb1740d15d0b35afc9d9f3c9f5904 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331) 
   * 3aec37c6f238f4ed40672d20f7b7c6e6314a9519 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350",
       "triggerID" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361",
       "triggerID" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366",
       "triggerID" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1367",
       "triggerID" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "triggerType" : "PUSH"
     }, {
       "hash" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1394",
       "triggerID" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "triggerType" : "PUSH"
     }, {
       "hash" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1399",
       "triggerID" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 921d21fc6732fe51296e21fd3a26fa11c16bfca3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1399) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] nsivabalan commented on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-894356017


   @hudi-bot run azure





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350",
       "triggerID" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361",
       "triggerID" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366",
       "triggerID" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1367",
       "triggerID" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "triggerType" : "PUSH"
     }, {
       "hash" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1394",
       "triggerID" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "triggerType" : "PUSH"
     }, {
       "hash" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1399",
       "triggerID" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1404",
       "triggerID" : "6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1441",
       "triggerID" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1444",
       "triggerID" : "894356017",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "56a10a9cee28560d7a80d1abc5f8e15a463ddd87",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1445",
       "triggerID" : "56a10a9cee28560d7a80d1abc5f8e15a463ddd87",
       "triggerType" : "PUSH"
     }, {
       "hash" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1452",
       "triggerID" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1453",
       "triggerID" : "894595802",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1452",
       "triggerID" : "894595802",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 797b288a393089f2c09f7013ac5b378d8946bee9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1453) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1452) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350",
       "triggerID" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361",
       "triggerID" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366",
       "triggerID" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1367",
       "triggerID" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "triggerType" : "PUSH"
     }, {
       "hash" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * b75f709b3759b283822cc55f8c3030bc57fc4122 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1367) 
   * 314a8f66727958ac7830c9d82e8a7ea97be74900 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] pengzhiwei2018 commented on a change in pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
pengzhiwei2018 commented on a change in pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#discussion_r684324260



##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestCreateTable.scala
##########
@@ -272,4 +277,154 @@ class TestCreateTable extends TestHoodieSqlBase {
       )
     }
   }
+
+  test("Test Create Table From Exist Hoodie Table") {
+    withTempDir { tmp =>
+      // Write a table by spark dataframe.
+      Seq("2021-08-02", "2021/08/02").foreach { partitionValue =>
+        val tableName = generateTableName
+        val tablePath = s"${tmp.getCanonicalPath}/$tableName"
+        import spark.implicits._
+        val df = Seq((1, "a1", 10, 1000, partitionValue)).toDF("id", "name", "value", "ts", "dt")

Review comment:
       done!

##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestCreateTable.scala
##########
@@ -272,4 +277,154 @@ class TestCreateTable extends TestHoodieSqlBase {
       )
     }
   }
+
+  test("Test Create Table From Exist Hoodie Table") {
+    withTempDir { tmp =>
+      // Write a table by spark dataframe.
+      Seq("2021-08-02", "2021/08/02").foreach { partitionValue =>
+        val tableName = generateTableName
+        val tablePath = s"${tmp.getCanonicalPath}/$tableName"
+        import spark.implicits._
+        val df = Seq((1, "a1", 10, 1000, partitionValue)).toDF("id", "name", "value", "ts", "dt")
+        df.write.format("hudi")
+          .option(HoodieWriteConfig.TABLE_NAME.key, tableName)
+          .option(TABLE_TYPE.key, COW_TABLE_TYPE_OPT_VAL)
+          .option(RECORDKEY_FIELD.key, "id")
+          .option(PRECOMBINE_FIELD.key, "ts")
+          .option(PARTITIONPATH_FIELD.key, "dt")
+          .option(KEYGENERATOR_CLASS.key, classOf[SimpleKeyGenerator].getName)
+          .option(HoodieWriteConfig.INSERT_PARALLELISM.key, "1")
+          .option(HoodieWriteConfig.UPSERT_PARALLELISM.key, "1")
+          .mode(SaveMode.Overwrite)
+          .save(tablePath)
+
+        // Create a table over the exist old table.
+        spark.sql(
+          s"""
+             |create table $tableName using hudi
+             | options (
+             | primaryKey = 'id',
+             | preCombineField = 'ts'
+             |)
+             |partitioned by (dt)
+             |location '$tablePath'
+             |""".stripMargin)
+        checkAnswer(s"select id, name, value, ts, dt from $tableName")(
+          Seq(1, "a1", 10, 1000, partitionValue)
+        )
+        // Check the missing properties for spark sql
+        val metaClient = HoodieTableMetaClient.builder()
+          .setBasePath(tablePath)
+          .setConf(spark.sessionState.newHadoopConf())
+          .build()
+        val properties = metaClient.getTableConfig.getProps.asScala.toMap
+        assertResult(true)(properties.contains(HoodieTableConfig.HOODIE_TABLE_CREATE_SCHEMA.key))
+        assertResult("dt")(properties(HoodieTableConfig.HOODIE_TABLE_PARTITION_FIELDS_PROP.key))
+        assertResult("ts")(properties(HoodieTableConfig.HOODIE_TABLE_PRECOMBINE_FIELD_PROP.key))
+
+        // Test insert into
+        spark.sql(s"insert into $tableName values(2, 'a2', 10, 1000, '$partitionValue')")
+        checkAnswer(s"select id, name, value, ts, dt from $tableName order by id")(
+          Seq(1, "a1", 10, 1000, partitionValue),

Review comment:
       done!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] pengzhiwei2018 edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
pengzhiwei2018 edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-893180101


   > > > > Hey peng. I did a round of testing on this patch. Here are my findings.
   > > > > Insert into is till prefixing col name to meta fields. (3rd col and 4th col)
   > > > > ```
   > > > > select * from hudi_ny where tpep_pickup_datetime like '%00:04:03%';
   > > > > 20210802105420	20210802105420_2_23	2019-01-01 00:04:03	2019-01-01	c5e6a617-dfc5-4051-8c1a-8daead3847af-0_2-37-62_20210802105420.parquet	2	2019-01-01 00:04:03	2019-01-01 00:11:48	1	3.01	1	N	137	262	1	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2019-01-01
   > > > > 20210803162030	20210803162030_0_1	tpep_pickup_datetime:2021-01-01 00:04:03 	date_col=2021-01-01	c5c72f9e-9a63-48ca-a981-4302890f5210-0_0-27-1635_20210803162030.parquet	2	2021-01-01 00:04:03	2021-01-01 00:11:48	1	3.01	1	N	137	262	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2021-01-01
   > > > > Time taken: 0.524 seconds, Fetched 2 row(s)
   > > > > ```
   > > > > 
   > > > > 1st row was part of the table before onboarding to spark-sql.
   > > > > 2nd row was inserted using insert into.
   > > > > Hi @nsivabalan , I know the difference now. The spark sql use the `SqlKeyGenerator` which is a sub-class of `ComplexKeyGenerator` to generated record key which will add the column name to the record key. While the `SimpleKeyGenerator` will not do that. So we should keep the behavior the same for `ComplexKeyGenerator` and `SimpleKeyGenerator`.
   > > > 
   > > > 
   > > > sorry, I don't get you. I understand SqlKeyGenerator extends from ComplexKeyGen. but why do we need to keep the same for SimpleKeyGen? We should not add any field prefix for SimpleKeyGen. If not, no updates will work for an existing table.
   > > 
   > > 
   > > Hi @nsivabalan , I have solved the record key mismatch issue. Please test again~
   > 
   > @pengzhiwei2018 I tested the patch. I can see the column names are no longer being prefixed. Updates and deletes by record key are working fine now. However, the URI encoding of partition path is still an issue. For example, I did an insert to an existing partition. The insert was successful but it created a new partition as below:
   > 
   > ```
   > insert into hudi_trips_cow values(1.0, 2.0, "driver_2", 3.0, 4.0, 100.0, "rider_2", 12345, "765544i-e89b-12d3-a456-426655440000", "americas/united_states/san_francisco/");
   > 
   > % ls -l /private/tmp/hudi_trips_cow
   > total 0
   > drwxr-xr-x  4 sagars  wheel  128 Aug  4 16:49 americas
   > drwxr-xr-x  6 sagars  wheel  192 Aug  4 16:50 americas%2Funited_states%2Fsan_francisco%2F
   > drwxr-xr-x  3 sagars  wheel   96 Aug  4 16:49 asia
   > ```
   
   Hi @codope , can you drop the table and create it again with the latest code of this patch? I am afraid this happened because you created the table with the old code of the patch. I have fixed this issue in the latest code.
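
   The fix being discussed concerns how partition values containing slashes are written to storage. As an illustration only (plain percent-encoding, not Hudi's actual partition-path code), the symptom can be reproduced with a minimal Python sketch:

   ```python
   from urllib.parse import quote

   def partition_path(value: str, url_encode: bool) -> str:
       # With URL encoding enabled, the whole value becomes a single directory
       # name: slashes are escaped to %2F instead of creating nested directories.
       return quote(value, safe="") if url_encode else value

   value = "americas/united_states/san_francisco/"
   print(partition_path(value, url_encode=True))   # americas%2Funited_states%2Fsan_francisco%2F
   print(partition_path(value, url_encode=False))  # americas/united_states/san_francisco/
   ```

   If the existing table was written without encoding, an insert that encodes the same value lands in a new `%2F` directory, which matches the `ls` output reported above.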





[GitHub] [hudi] nsivabalan edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-892140145


   Hey peng. I did a round of testing on this patch. Here are my findings.
   
   Insert into is still prefixing col name to meta fields. 
   
   ```
   select * from hudi_ny where tpep_pickup_datetime like '%00:04:03%';
   20210802105420	20210802105420_2_23	2019-01-01 00:04:03	2019-01-01	c5e6a617-dfc5-4051-8c1a-8daead3847af-0_2-37-62_20210802105420.parquet	2	2019-01-01 00:04:03	2019-01-01 00:11:48	1	3.01	1	N	137	262	1	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2019-01-01
   20210803162030	20210803162030_0_1	tpep_pickup_datetime:2021-01-01 00:04:03 	date_col=2021-01-01	c5c72f9e-9a63-48ca-a981-4302890f5210-0_0-27-1635_20210803162030.parquet	2	2021-01-01 00:04:03	2021-01-01 00:11:48	1	3.01	1	N	137	262	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2021-01-01
   Time taken: 0.524 seconds, Fetched 2 row(s)
   ```
   1st row was part of the table before onboarding to spark-sql. 
   2nd row was inserted using insert into. 
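   
   The prefixing visible above comes from how composite key generators format key values. A minimal Python sketch of the two formats under discussion (illustrative only; the real logic lives in Hudi's `SimpleKeyGenerator` and `ComplexKeyGenerator` classes):
   
   ```python
   def simple_key(record: dict, field: str) -> str:
       # SimpleKeyGenerator-style: just the bare field value.
       return str(record[field])

   def complex_key(record: dict, fields: list) -> str:
       # ComplexKeyGenerator-style: "field:value" pairs joined by commas,
       # which is why the column name shows up in _hoodie_record_key.
       return ",".join(f"{f}:{record[f]}" for f in fields)

   row = {"tpep_pickup_datetime": "2021-01-01 00:04:03"}
   print(simple_key(row, "tpep_pickup_datetime"))     # 2021-01-01 00:04:03
   print(complex_key(row, ["tpep_pickup_datetime"]))  # tpep_pickup_datetime:2021-01-01 00:04:03
   ```
   
   A table originally written with the simple format and later updated through the complex format will never match existing record keys, which is why updates fail until the two formats agree.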
   
   
   





[GitHub] [hudi] nsivabalan commented on a change in pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#discussion_r684350411



##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/CreateHoodieTableCommand.scala
##########
@@ -301,47 +318,102 @@ case class CreateHoodieTableCommand(table: CatalogTable, ignoreIfExists: Boolean
           s"'${HoodieOptionConfig.SQL_VALUE_TABLE_TYPE_MOR}'")
     }
   }
+
+  private def getAllPartitionPaths(spark: SparkSession, table: CatalogTable): Seq[String] = {
+    val sparkEngine = new HoodieSparkEngineContext(new JavaSparkContext(spark.sparkContext))
+    val metadataConfig = {
+      val properties = new Properties()
+      properties.putAll((spark.sessionState.conf.getAllConfs ++ table.storage.properties).asJava)
+      HoodieMetadataConfig.newBuilder.fromProperties(properties).build()
+    }
+    FSUtils.getAllPartitionPaths(sparkEngine, metadataConfig, getTableLocation(table, spark)).asScala
+  }
+
+  /**
+   * This method is used to stay compatible with old non-hive-styled partitioned tables.
+   * By default we enable "hoodie.datasource.write.hive_style_partitioning"
+   * when writing data to a hudi table through spark sql.
+   * If the existing table is a non-hive-styled partitioned table, we should
+   * disable "hoodie.datasource.write.hive_style_partitioning" when
+   * merging or updating the table. Otherwise, we will get an incorrect merge result
+   * due to the partition path mismatch.
+   */
+  private def isNotHiveStyledPartitionTable(partitionPaths: Seq[String], table: CatalogTable): Boolean = {
+    if (table.partitionColumnNames.nonEmpty) {
+      val isHiveStylePartitionPath = (path: String) => {
+        val fragments = path.split("/")
+        if (fragments.size != table.partitionColumnNames.size) {
+          false
+        } else {
+          fragments.zip(table.partitionColumnNames).forall {
+            case (pathFragment, partitionColumn) => pathFragment.startsWith(s"$partitionColumn=")
+          }
+        }
+      }
+      !partitionPaths.forall(isHiveStylePartitionPath)
+    } else {
+      false
+    }
+  }
+
+  /**
+   * If this table has disabled url encoding, spark sql should also disable it when writing to the table.
+   */
+  private def isUrlEncodeDisable(partitionPaths: Seq[String], table: CatalogTable): Boolean = {
+    if (table.partitionColumnNames.nonEmpty) {
+      !partitionPaths.forall(partitionPath => partitionPath.split("/").length == table.partitionColumnNames.size)
+    } else {
+      false
+    }
+  }
+
 }
 
 object CreateHoodieTableCommand extends Logging {
 
   /**
-    * Init the table if it is not exists.
-    * @param sparkSession
-    * @param table
-    * @return
+    * Init the hoodie.properties.
     */
   def initTableIfNeed(sparkSession: SparkSession, table: CatalogTable): Unit = {
-    val location = getTableLocation(table, sparkSession).getOrElse(
-      throw new IllegalArgumentException(s"Missing location for ${table.identifier}"))
+    val location = getTableLocation(table, sparkSession)
 
     val conf = sparkSession.sessionState.newHadoopConf()
     // Init the hoodie table
-    if (!tableExistsInPath(location, conf)) {
-      val tableName = table.identifier.table
-      logInfo(s"Table $tableName is not exists, start to create the hudi table")
+    val originTableConfig = if (tableExistsInPath(location, conf)) {
+      val metaClient = HoodieTableMetaClient.builder()
+        .setBasePath(location)
+        .setConf(conf)
+        .build()
+      metaClient.getTableConfig.getProps.asScala.toMap
+    } else {
+      Map.empty[String, String]
+    }
 
-      // Save all the table config to the hoodie.properties.
-      val parameters = HoodieOptionConfig.mappingSqlOptionToTableConfig(table.storage.properties)
-      val properties = new Properties()
+    val tableName = table.identifier.table
+    logInfo(s"Init hoodie.properties for $tableName")
+    val tableOptions = HoodieOptionConfig.mappingSqlOptionToTableConfig(table.storage.properties)
+    checkTableConfigEqual(originTableConfig, tableOptions, HoodieTableConfig.HOODIE_TABLE_PRECOMBINE_FIELD_PROP.key)
+    checkTableConfigEqual(originTableConfig, tableOptions, HoodieTableConfig.HOODIE_TABLE_PARTITION_FIELDS_PROP.key)
+    checkTableConfigEqual(originTableConfig, tableOptions, HoodieTableConfig.HOODIE_TABLE_RECORDKEY_FIELDS.key)
+    // Save all the table config to the hoodie.properties.
+    val parameters = originTableConfig ++ tableOptions
+    val properties = new Properties()
       properties.putAll(parameters.asJava)
       HoodieTableMetaClient.withPropertyBuilder()
-          .fromProperties(properties)
-          .setTableName(tableName)
-          .setTableCreateSchema(SchemaConverters.toAvroType(table.schema).toString())
-          .setPartitionFields(table.partitionColumnNames.mkString(","))
-          .initTable(conf, location)
-    }
+        .fromProperties(properties)
+        .setTableName(tableName)
+        .setTableCreateSchema(SchemaConverters.toAvroType(table.schema).toString())
+        .setPartitionFields(table.partitionColumnNames.mkString(","))

Review comment:
       don't we need to set record key fields here? 







[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 53186a44289eb1740d15d0b35afc9d9f3c9f5904 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350",
       "triggerID" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361",
       "triggerID" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366",
       "triggerID" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1367",
       "triggerID" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "triggerType" : "PUSH"
     }, {
       "hash" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1394",
       "triggerID" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "triggerType" : "PUSH"
     }, {
       "hash" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1399",
       "triggerID" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1404",
       "triggerID" : "6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1441",
       "triggerID" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1444",
       "triggerID" : "894356017",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "56a10a9cee28560d7a80d1abc5f8e15a463ddd87",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1445",
       "triggerID" : "56a10a9cee28560d7a80d1abc5f8e15a463ddd87",
       "triggerType" : "PUSH"
     }, {
       "hash" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1452",
       "triggerID" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1453",
       "triggerID" : "894595802",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "797b288a393089f2c09f7013ac5b378d8946bee9",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1452",
       "triggerID" : "894595802",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a575eba527237c43ec6b150e105005adfce07f80",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "a575eba527237c43ec6b150e105005adfce07f80",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 797b288a393089f2c09f7013ac5b378d8946bee9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1453) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1452) 
   * a575eba527237c43ec6b150e105005adfce07f80 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350",
       "triggerID" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361",
       "triggerID" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366",
       "triggerID" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0af1f7f32ef4135c18336a91532b90a38a227cc2 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366) 
   * b75f709b3759b283822cc55f8c3030bc57fc4122 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] nsivabalan edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-892140145


   Hey peng. I did a round of testing on this patch. Here are my findings.
   
   Insert into is still prefixing col name to meta fields. (3rd col and 4th col)
   
   ```
   select * from hudi_ny where tpep_pickup_datetime like '%00:04:03%';
   20210802105420	20210802105420_2_23	2019-01-01 00:04:03	2019-01-01	c5e6a617-dfc5-4051-8c1a-8daead3847af-0_2-37-62_20210802105420.parquet	2	2019-01-01 00:04:03	2019-01-01 00:11:48	1	3.01	1	N	137	262	1	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2019-01-01
   20210803162030	20210803162030_0_1	tpep_pickup_datetime:2021-01-01 00:04:03 	date_col=2021-01-01	c5c72f9e-9a63-48ca-a981-4302890f5210-0_0-27-1635_20210803162030.parquet	2	2021-01-01 00:04:03	2021-01-01 00:11:48	1	3.01	1	N	137	262	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2021-01-01
   Time taken: 0.524 seconds, Fetched 2 row(s)
   ```
   1st row was part of the table before onboarding to spark-sql. 
   2nd row was inserted using insert into. 
   
   
   





[GitHub] [hudi] pengzhiwei2018 commented on a change in pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
pengzhiwei2018 commented on a change in pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#discussion_r684574139



##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/CreateHoodieTableCommand.scala
##########
@@ -301,47 +318,102 @@ case class CreateHoodieTableCommand(table: CatalogTable, ignoreIfExists: Boolean
           s"'${HoodieOptionConfig.SQL_VALUE_TABLE_TYPE_MOR}'")
     }
   }
+
+  private def getAllPartitionPaths(spark: SparkSession, table: CatalogTable): Seq[String] = {
+    val sparkEngine = new HoodieSparkEngineContext(new JavaSparkContext(spark.sparkContext))
+    val metadataConfig = {
+      val properties = new Properties()
+      properties.putAll((spark.sessionState.conf.getAllConfs ++ table.storage.properties).asJava)
+      HoodieMetadataConfig.newBuilder.fromProperties(properties).build()
+    }
+    FSUtils.getAllPartitionPaths(sparkEngine, metadataConfig, getTableLocation(table, spark)).asScala
+  }
+
+  /**
+   * This method is used to stay compatible with old non-hive-styled partitioned tables.
+   * By default we enable "hoodie.datasource.write.hive_style_partitioning"
+   * when writing data to a hudi table through spark sql.
+   * If the existing table is a non-hive-styled partitioned table, we should
+   * disable "hoodie.datasource.write.hive_style_partitioning" when
+   * merging or updating the table. Otherwise, we will get an incorrect merge result
+   * due to the partition path mismatch.
+   */
+  private def isNotHiveStyledPartitionTable(partitionPaths: Seq[String], table: CatalogTable): Boolean = {
+    if (table.partitionColumnNames.nonEmpty) {
+      val isHiveStylePartitionPath = (path: String) => {
+        val fragments = path.split("/")
+        if (fragments.size != table.partitionColumnNames.size) {
+          false
+        } else {
+          fragments.zip(table.partitionColumnNames).forall {
+            case (pathFragment, partitionColumn) => pathFragment.startsWith(s"$partitionColumn=")
+          }
+        }
+      }
+      !partitionPaths.forall(isHiveStylePartitionPath)
+    } else {
+      false
+    }
+  }
+
+  /**
+   * If this table has disabled url encoding, spark sql should also disable it when writing to the table.
+   */
+  private def isUrlEncodeDisable(partitionPaths: Seq[String], table: CatalogTable): Boolean = {
+    if (table.partitionColumnNames.nonEmpty) {
+      !partitionPaths.forall(partitionPath => partitionPath.split("/").length == table.partitionColumnNames.size)
+    } else {
+      false
+    }
+  }
+
 }
 
 object CreateHoodieTableCommand extends Logging {
 
   /**
-    * Init the table if it is not exists.
-    * @param sparkSession
-    * @param table
-    * @return
+    * Init the hoodie.properties.
     */
   def initTableIfNeed(sparkSession: SparkSession, table: CatalogTable): Unit = {
-    val location = getTableLocation(table, sparkSession).getOrElse(
-      throw new IllegalArgumentException(s"Missing location for ${table.identifier}"))
+    val location = getTableLocation(table, sparkSession)
 
     val conf = sparkSession.sessionState.newHadoopConf()
     // Init the hoodie table
-    if (!tableExistsInPath(location, conf)) {
-      val tableName = table.identifier.table
-      logInfo(s"Table $tableName is not exists, start to create the hudi table")
+    val originTableConfig = if (tableExistsInPath(location, conf)) {
+      val metaClient = HoodieTableMetaClient.builder()
+        .setBasePath(location)
+        .setConf(conf)
+        .build()
+      metaClient.getTableConfig.getProps.asScala.toMap
+    } else {
+      Map.empty[String, String]
+    }
 
-      // Save all the table config to the hoodie.properties.
-      val parameters = HoodieOptionConfig.mappingSqlOptionToTableConfig(table.storage.properties)
-      val properties = new Properties()
+    val tableName = table.identifier.table
+    logInfo(s"Init hoodie.properties for $tableName")
+    val tableOptions = HoodieOptionConfig.mappingSqlOptionToTableConfig(table.storage.properties)
+    checkTableConfigEqual(originTableConfig, tableOptions, HoodieTableConfig.HOODIE_TABLE_PRECOMBINE_FIELD_PROP.key)
+    checkTableConfigEqual(originTableConfig, tableOptions, HoodieTableConfig.HOODIE_TABLE_PARTITION_FIELDS_PROP.key)
+    checkTableConfigEqual(originTableConfig, tableOptions, HoodieTableConfig.HOODIE_TABLE_RECORDKEY_FIELDS.key)
+    // Save all the table config to the hoodie.properties.
+    val parameters = originTableConfig ++ tableOptions
+    val properties = new Properties()
       properties.putAll(parameters.asJava)
       HoodieTableMetaClient.withPropertyBuilder()
-          .fromProperties(properties)
-          .setTableName(tableName)
-          .setTableCreateSchema(SchemaConverters.toAvroType(table.schema).toString())
-          .setPartitionFields(table.partitionColumnNames.mkString(","))
-          .initTable(conf, location)
-    }
+        .fromProperties(properties)
+        .setTableName(tableName)
+        .setTableCreateSchema(SchemaConverters.toAvroType(table.schema).toString())
+        .setPartitionFields(table.partitionColumnNames.mkString(","))

Review comment:
       It is set in `fromProperties`. Currently the config in the table options is set by the `fromProperties` method.
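
       The precedence implied by `originTableConfig ++ tableOptions` in the diff above is that SQL options override what is already in hoodie.properties, while untouched properties carry over. A hedged Python sketch of that merge (the property names shown are illustrative):

       ```python
       def merged_table_config(origin: dict, sql_options: dict) -> dict:
           # Mirrors Scala's `origin ++ sql_options`: on key collisions the
           # right-hand map (the SQL options) wins; everything else carries over.
           return {**origin, **sql_options}

       origin = {"hoodie.table.recordkey.fields": "id",
                 "hoodie.table.name": "old_name"}
       sql_options = {"hoodie.table.name": "h0"}
       merged = merged_table_config(origin, sql_options)
       print(merged["hoodie.table.recordkey.fields"])  # id (kept from origin)
       print(merged["hoodie.table.name"])              # h0 (SQL option wins)
       ```

       Note that in the patch, `checkTableConfigEqual` rejects conflicting values for the precombine, partition, and record-key configs before this merge runs, so collisions can only occur for other options.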







[GitHub] [hudi] codope commented on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
codope commented on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-892777427


   > > > Hey peng. I did a round of testing on this patch. Here are my findings.
   > > > Insert into is till prefixing col name to meta fields. (3rd col and 4th col)
   > > > ```
   > > > select * from hudi_ny where tpep_pickup_datetime like '%00:04:03%';
   > > > 20210802105420	20210802105420_2_23	2019-01-01 00:04:03	2019-01-01	c5e6a617-dfc5-4051-8c1a-8daead3847af-0_2-37-62_20210802105420.parquet	2	2019-01-01 00:04:03	2019-01-01 00:11:48	1	3.01	1	N	137	262	1	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2019-01-01
   > > > 20210803162030	20210803162030_0_1	tpep_pickup_datetime:2021-01-01 00:04:03 	date_col=2021-01-01	c5c72f9e-9a63-48ca-a981-4302890f5210-0_0-27-1635_20210803162030.parquet	2	2021-01-01 00:04:03	2021-01-01 00:11:48	1	3.01	1	N	137	262	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2021-01-01
   > > > Time taken: 0.524 seconds, Fetched 2 row(s)
   > > > ```
   > > > 
   > > > 1st row was part of the table before onboarding to spark-sql.
   > > > 2nd row was inserted using insert into.
   > > > Hi @nsivabalan , I know the difference now. The spark sql use the `SqlKeyGenerator` which is a sub-class of `ComplexKeyGenerator` to generated record key which will add the column name to the record key. While the `SimpleKeyGenerator` will not do that. So we should keep the behavior the same for `ComplexKeyGenerator` and `SimpleKeyGenerator`.
   > > 
   > > 
   > > sorry, I don't get you. I understand SqlKeyGenerator extends from ComplexKeyGen. but why do we need to keep the same for SimpleKeyGen? We should not add any field prefix for SimpleKeyGen. If not, no updates will work for an existing table.
   > 
   > Hi @nsivabalan , I have solved the record key mismatch issue. Please test again~
   
   @pengzhiwei2018 I tested the patch. I can see the column names are no longer being prefixed. Updates and deletes by record key are working fine now. However, the URI encoding of partition path is still an issue. For example, I did an insert to an existing partition. The insert was successful but it created a new partition as below:
   ```
   insert into hudi_trips_cow values(1.0, 2.0, "driver_2", 3.0, 4.0, 100.0, "rider_2", 12345, "765544i-e89b-12d3-a456-426655440000", "americas/united_states/san_francisco/");
   
   % ls -l /private/tmp/hudi_trips_cow
   total 0
   drwxr-xr-x  4 sagars  wheel  128 Aug  4 16:49 americas
   drwxr-xr-x  6 sagars  wheel  192 Aug  4 16:50 americas%2Funited_states%2Fsan_francisco%2F
   drwxr-xr-x  3 sagars  wheel   96 Aug  4 16:49 asia
   ```
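
   The patch guards against exactly this class of layout mismatch by inspecting existing partition paths. A simplified Python transcription of the two checks quoted earlier in the thread (`isNotHiveStyledPartitionTable` and the fragment-count test behind `isUrlEncodeDisable`); this is a sketch of the Scala logic, not the original code:

   ```python
   def is_hive_style(path: str, partition_cols: list) -> bool:
       # A hive-style path has exactly one "col=value" fragment per partition column.
       fragments = path.split("/")
       if len(fragments) != len(partition_cols):
           return False
       return all(frag.startswith(col + "=")
                  for frag, col in zip(fragments, partition_cols))

   def url_encode_disabled(paths: list, partition_cols: list) -> bool:
       # If any existing path has more fragments than partition columns, the
       # table was written with URL encoding disabled (slashes kept literal).
       if not partition_cols:
           return False
       return not all(len(p.split("/")) == len(partition_cols) for p in paths)

   print(is_hive_style("date_col=2021-01-01", ["date_col"]))  # True
   print(is_hive_style("2021-01-01", ["date_col"]))           # False
   print(url_encode_disabled(["americas/united_states/san_francisco"],
                             ["partitionpath"]))              # True
   ```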





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350",
       "triggerID" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361",
       "triggerID" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * b87e99ee2121f013b659f09d9a1415c6e2ca3868 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] nsivabalan commented on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-892140145


   Hey peng. I did a round of testing on this patch. Here are my findings.
   
   Insert into is still prefixing the column name to meta fields. 
   
   ```
   select * from hudi_ny where tpep_pickup_datetime like '%00:04:03%';
   20210802105420	20210802105420_2_23	**2019-01-01 00:04:03**	**2019-01-01**	c5e6a617-dfc5-4051-8c1a-8daead3847af-0_2-37-62_20210802105420.parquet	2	2019-01-01 00:04:03	2019-01-01 00:11:48	1	3.01	1	N	137	262	1	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2019-01-01
   20210803162030	20210803162030_0_1	**tpep_pickup_datetime:2021-01-01 00:04:03** 	**date_col=2021-01-01**	c5c72f9e-9a63-48ca-a981-4302890f5210-0_0-27-1635_20210803162030.parquet	2	2021-01-01 00:04:03	2021-01-01 00:11:48	1	3.01	1	N	137	262	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2021-01-01
   Time taken: 0.524 seconds, Fetched 2 row(s)
   ```
   1st row was part of the table before onboarding to spark-sql. 
   2nd row was inserted using insert into. 
   
   
   





[GitHub] [hudi] pengzhiwei2018 commented on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
pengzhiwei2018 commented on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-892526491









[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   ## CI report:
   
   * 3aec37c6f238f4ed40672d20f7b7c6e6314a9519 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350) 
   * b87e99ee2121f013b659f09d9a1415c6e2ca3868 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   ## CI report:
   
   * 797b288a393089f2c09f7013ac5b378d8946bee9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1453) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1452) 
   * a575eba527237c43ec6b150e105005adfce07f80 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1454) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] nsivabalan edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-892140145


   Hey peng. I did a round of testing on this patch. Here are my findings.
   
   Insert into is still prefixing the column name to meta fields (3rd and 4th columns).
   
   ```
   select * from hudi_ny where tpep_pickup_datetime like '%00:04:03%';
   20210802105420	20210802105420_2_23	2019-01-01 00:04:03	2019-01-01	c5e6a617-dfc5-4051-8c1a-8daead3847af-0_2-37-62_20210802105420.parquet	2	2019-01-01 00:04:03	2019-01-01 00:11:48	1	3.01	1	N	137	262	1	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2019-01-01
   20210803162030	20210803162030_0_1	tpep_pickup_datetime:2021-01-01 00:04:03 	date_col=2021-01-01	c5c72f9e-9a63-48ca-a981-4302890f5210-0_0-27-1635_20210803162030.parquet	2	2021-01-01 00:04:03	2021-01-01 00:11:48	1	3.01	1	N	137	262	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2021-01-01
   Time taken: 0.524 seconds, Fetched 2 row(s)
   ```
   1st row was part of the table before onboarding to spark-sql. 
   2nd row was inserted using insert into. 
   
   
   Hi @nsivabalan , I know the difference now. Spark SQL uses the `ComplexKeyGenerator` to generate the record key, and it adds the column name to the record key, while the `SimpleKeyGenerator` does not. So we should keep the behavior the same for `ComplexKeyGenerator` and `SimpleKeyGenerator`.
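   
   The mismatch described above can be sketched with a toy model of the two key formats (illustrative only; Hudi's real key generators handle multiple key fields, nulls, and encoding):
   
   ```python
   def simple_style_key(record, field):
       # SimpleKeyGenerator-style: the bare field value is the record key.
       return str(record[field])
   
   def complex_style_key(record, fields):
       # ComplexKeyGenerator-style: every key part carries a "field:" prefix,
       # even when there is only a single record key field.
       return ",".join(f"{f}:{record[f]}" for f in fields)
   
   row = {"tpep_pickup_datetime": "2021-01-01 00:04:03"}
   print(simple_style_key(row, "tpep_pickup_datetime"))
   # 2021-01-01 00:04:03
   print(complex_style_key(row, ["tpep_pickup_datetime"]))
   # tpep_pickup_datetime:2021-01-01 00:04:03
   ```
   
   An existing table written with the simple style stores the bare form, so keys generated in the prefixed form never match existing records and intended updates land as new inserts.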





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   ## CI report:
   
   * caa79aef468f6e82de2c84e7b35975326c42b626 UNKNOWN
   *  Unknown: [CANCELED](TBD) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   ## CI report:
   
   * 314a8f66727958ac7830c9d82e8a7ea97be74900 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1394) 
   * 921d21fc6732fe51296e21fd3a26fa11c16bfca3 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1399) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] pengzhiwei2018 commented on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
pengzhiwei2018 commented on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-893180101


   > > > > Hey peng. I did a round of testing on this patch. Here are my findings.
   > > > > Insert into is still prefixing the column name to meta fields. (3rd col and 4th col)
   > > > > ```
   > > > > select * from hudi_ny where tpep_pickup_datetime like '%00:04:03%';
   > > > > 20210802105420	20210802105420_2_23	2019-01-01 00:04:03	2019-01-01	c5e6a617-dfc5-4051-8c1a-8daead3847af-0_2-37-62_20210802105420.parquet	2	2019-01-01 00:04:03	2019-01-01 00:11:48	1	3.01	1	N	137	262	1	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2019-01-01
   > > > > 20210803162030	20210803162030_0_1	tpep_pickup_datetime:2021-01-01 00:04:03 	date_col=2021-01-01	c5c72f9e-9a63-48ca-a981-4302890f5210-0_0-27-1635_20210803162030.parquet	2	2021-01-01 00:04:03	2021-01-01 00:11:48	1	3.01	1	N	137	262	10.0	0.5	0.5	2.26	0.0	0.3	13.56	NULL	2021-01-01
   > > > > Time taken: 0.524 seconds, Fetched 2 row(s)
   > > > > ```
   > > > > 
   > > > > 1st row was part of the table before onboarding to spark-sql.
   > > > > 2nd row was inserted using insert into.
   > > > > Hi @nsivabalan , I know the difference now. Spark SQL uses the `SqlKeyGenerator`, a subclass of `ComplexKeyGenerator`, to generate the record key, and it adds the column name to the record key, while the `SimpleKeyGenerator` does not. So we should keep the behavior the same for `ComplexKeyGenerator` and `SimpleKeyGenerator`.
   > > > 
   > > > 
   > > > sorry, I don't follow. I understand SqlKeyGenerator extends ComplexKeyGen, but why do we need to keep the behavior the same for SimpleKeyGen? We should not add any field prefix for SimpleKeyGen; otherwise, no updates will work for an existing table.
   > > 
   > > 
   > > Hi @nsivabalan , I have fixed the record key mismatch issue. Please test again~
   > 
   > @pengzhiwei2018 I tested the patch. I can see the column names are no longer being prefixed. Updates and deletes by record key are working fine now. However, the URI encoding of the partition path is still an issue. For example, I did an insert into an existing partition. The insert was successful, but it created a new partition as below:
   > 
   > ```
   > insert into hudi_trips_cow values(1.0, 2.0, "driver_2", 3.0, 4.0, 100.0, "rider_2", 12345, "765544i-e89b-12d3-a456-426655440000", "americas/united_states/san_francisco/");
   > 
   > % ls -l /private/tmp/hudi_trips_cow
   > total 0
   > drwxr-xr-x  4 sagars  wheel  128 Aug  4 16:49 americas
   > drwxr-xr-x  6 sagars  wheel  192 Aug  4 16:50 americas%2Funited_states%2Fsan_francisco%2F
   > drwxr-xr-x  3 sagars  wheel   96 Aug  4 16:49 asia
   > ```
   
   Hi @codope , can you drop the table and create it again with the latest code of this patch? I am afraid this happened because you created the table with an older version of the patch. I have fixed this issue in the latest code.





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   ## CI report:
   
   * 6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1404) 
   * ee788e31ffff71eb1e684427b4b10f2604385851 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1441) 
   * 56a10a9cee28560d7a80d1abc5f8e15a463ddd87 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350",
       "triggerID" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361",
       "triggerID" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366",
       "triggerID" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1367",
       "triggerID" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "triggerType" : "PUSH"
     }, {
       "hash" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1394",
       "triggerID" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "triggerType" : "PUSH"
     }, {
       "hash" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1399",
       "triggerID" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1404",
       "triggerID" : "6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "ee788e31ffff71eb1e684427b4b10f2604385851",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 6952cc701cda4b7f1f7fde9896d2bce0b2a4b66b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1404) 
   * ee788e31ffff71eb1e684427b4b10f2604385851 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] pengzhiwei2018 commented on a change in pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
pengzhiwei2018 commented on a change in pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#discussion_r682236141



##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/CreateHoodieTableCommand.scala
##########
@@ -306,34 +299,49 @@ case class CreateHoodieTableCommand(table: CatalogTable, ignoreIfExists: Boolean
 object CreateHoodieTableCommand extends Logging {
 
   /**
-    * Init the table if it is not exists.
-    * @param sparkSession
-    * @param table
-    * @return
+    * Init the hoodie.properties.
     */
   def initTableIfNeed(sparkSession: SparkSession, table: CatalogTable): Unit = {
     val location = getTableLocation(table, sparkSession).getOrElse(
       throw new IllegalArgumentException(s"Missing location for ${table.identifier}"))
 
     val conf = sparkSession.sessionState.newHadoopConf()
     // Init the hoodie table
-    if (!tableExistsInPath(location, conf)) {
-      val tableName = table.identifier.table
-      logInfo(s"Table $tableName is not exists, start to create the hudi table")
+    val originTableConfig = if (tableExistsInPath(location, conf)) {
+      val metaClient = HoodieTableMetaClient.builder()
+        .setBasePath(location)
+        .setConf(conf)
+        .build()
+      metaClient.getTableConfig.getProps.asScala.toMap
+    } else {
+      Map.empty[String, String]
+    }
 
-      // Save all the table config to the hoodie.properties.
-      val parameters = HoodieOptionConfig.mappingSqlOptionToTableConfig(table.storage.properties)
-      val properties = new Properties()
+    val tableName = table.identifier.table
+    logInfo(s"Init hoodie.properties for $tableName")
+    val tableOptions = HoodieOptionConfig.mappingSqlOptionToTableConfig(table.storage.properties)
+    checkTableConfigEqual(originTableConfig, tableOptions, HoodieTableConfig.HOODIE_TABLE_PRECOMBINE_FIELD_PROP.key)
+    checkTableConfigEqual(originTableConfig, tableOptions, HoodieTableConfig.HOODIE_TABLE_PARTITION_FIELDS_PROP.key)

Review comment:
       Good catch! Will add the check for the record key as well.
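
The consistency check discussed above (comparing a property from the existing `hoodie.properties` against the value supplied in the `create table` options) can be sketched roughly as follows. This is a minimal standalone sketch, not the actual Hudi implementation: the object name `TableConfigCheck`, the error message wording, and the literal property keys are assumptions for illustration; the real code resolves keys via `HoodieTableConfig` constants.

```scala
// Hypothetical standalone sketch of the checkTableConfigEqual logic from the diff:
// a property may be absent on either side, but if both sides define it,
// the values must match, otherwise table creation should fail fast.
object TableConfigCheck {
  def checkTableConfigEqual(originTableConfig: Map[String, String],
                            newTableConfig: Map[String, String],
                            configKey: String): Unit = {
    (originTableConfig.get(configKey), newTableConfig.get(configKey)) match {
      case (Some(oldValue), Some(newValue)) if oldValue != newValue =>
        throw new IllegalArgumentException(
          s"Table config $configKey in create table is: $newValue, which differs from " +
            s"the value in hoodie.properties: $oldValue")
      case _ => // consistent, or defined on at most one side: nothing to do
    }
  }

  def main(args: Array[String]): Unit = {
    val origin = Map("hoodie.table.precombine.field" -> "ts")

    // Same value on both sides: passes silently.
    checkTableConfigEqual(origin,
      Map("hoodie.table.precombine.field" -> "ts"),
      "hoodie.table.precombine.field")

    // Conflicting value: throws IllegalArgumentException.
    try {
      checkTableConfigEqual(origin,
        Map("hoodie.table.precombine.field" -> "dt"),
        "hoodie.table.precombine.field")
    } catch {
      case e: IllegalArgumentException => println("conflict detected: " + e.getMessage)
    }
  }
}
```

Per the review exchange, the same check would be applied for the record key property in addition to the precombine field and partition fields.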







[GitHub] [hudi] hudi-bot edited a comment on pull request #3393: [HUDI-1842] Spark Sql Support For The Exists Hoodie Table

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3393:
URL: https://github.com/apache/hudi/pull/3393#issuecomment-891812944


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1331",
       "triggerID" : "53186a44289eb1740d15d0b35afc9d9f3c9f5904",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1350",
       "triggerID" : "3aec37c6f238f4ed40672d20f7b7c6e6314a9519",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1361",
       "triggerID" : "b87e99ee2121f013b659f09d9a1415c6e2ca3868",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1366",
       "triggerID" : "0af1f7f32ef4135c18336a91532b90a38a227cc2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1367",
       "triggerID" : "b75f709b3759b283822cc55f8c3030bc57fc4122",
       "triggerType" : "PUSH"
     }, {
       "hash" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1394",
       "triggerID" : "314a8f66727958ac7830c9d82e8a7ea97be74900",
       "triggerType" : "PUSH"
     }, {
       "hash" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "921d21fc6732fe51296e21fd3a26fa11c16bfca3",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 314a8f66727958ac7830c9d82e8a7ea97be74900 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1394) 
   * 921d21fc6732fe51296e21fd3a26fa11c16bfca3 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>

