Posted to commits@hudi.apache.org by "SteNicholas (via GitHub)" <gi...@apache.org> on 2023/02/16 12:05:37 UTC

[GitHub] [hudi] SteNicholas opened a new pull request, #7981: [HUDI-5058] HoodieCatalog#getTable sets primary key with hoodie.datasource.write.recordkey.field for table initialized via Spark

SteNicholas opened a new pull request, #7981:
URL: https://github.com/apache/hudi/pull/7981

   ### Change Logs
   
   `HoodieCatalog#getTable` should use `hoodie.datasource.write.recordkey.field` to set the primary key of the table initialized via Spark.
   
   ### Impact
   
   `HoodieCatalog#getTable` now returns the primary key for tables initialized via Spark, taken from `hoodie.datasource.write.recordkey.field`.
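
As a rough illustration of the idea, the sketch below derives a list of primary key columns from the `hoodie.datasource.write.recordkey.field` option. It is a minimal, self-contained sketch and not the code in this PR: the class `RecordKeyFields` and the method `primaryKeyColumns` are hypothetical names, and splitting the value on commas is an assumption about how multi-field record keys are written.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Hudi: derives primary key columns from table options.
class RecordKeyFields {
  static final String RECORD_KEY_FIELD = "hoodie.datasource.write.recordkey.field";

  /** Returns the record key fields as a list, or an empty list if the option is unset. */
  static List<String> primaryKeyColumns(Map<String, String> options) {
    String value = options.get(RECORD_KEY_FIELD);
    if (value == null || value.trim().isEmpty()) {
      return Collections.emptyList();
    }
    // Assumption: multiple record key fields are comma-separated.
    return Arrays.stream(value.split(","))
        .map(String::trim)
        .filter(field -> !field.isEmpty())
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    Map<String, String> options = Collections.singletonMap(RECORD_KEY_FIELD, "uuid");
    System.out.println(primaryKeyColumns(options)); // prints [uuid]
  }
}
```

Returning an empty list when the option is missing lets a caller skip declaring a primary key instead of failing.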
   
   ### Risk level (write none, low, medium or high below)
   
   _If medium or high, explain what verification was done to mitigate the risks._
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the default values of configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
     ticket number here and follow the [instructions](https://hudi.apache.org/contribute/developer-setup#website) to make
     changes to the website._
   
   ### Contributor's checklist
   
   - [x] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [x] Change Logs and Impact were stated clearly
   - [x] Adequate tests were added if applicable
   - [x] CI passed


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] SteNicholas commented on pull request #7981: [HUDI-5058] HoodieCatalog#getTable sets primary key with hoodie.datasource.write.recordkey.field for table initialized via Spark

Posted by "SteNicholas (via GitHub)" <gi...@apache.org>.
SteNicholas commented on PR #7981:
URL: https://github.com/apache/hudi/pull/7981#issuecomment-1433113948

   @danny0405, I have updated `TestHoodieCatalog#testGetTable` to verify that the primary key column is correct for `HoodieCatalog#getTable`. PTAL.




[GitHub] [hudi] danny0405 merged pull request #7981: [HUDI-5058] HoodieCatalog#getTable sets primary key with hoodie.datasource.write.recordkey.field for table initialized via Spark

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 merged PR #7981:
URL: https://github.com/apache/hudi/pull/7981




[GitHub] [hudi] hudi-bot commented on pull request #7981: [HUDI-5058] HoodieCatalog#getTable sets primary key with hoodie.datasource.write.recordkey.field for table initialized via Spark

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7981:
URL: https://github.com/apache/hudi/pull/7981#issuecomment-1434601055

   ## CI report:
   
   * 87ce9fadd6e8da06b1fd589b48d059ddcf8b6a5c UNKNOWN
   * 404a2996cd4a60023f8f77bd3a64b3f96efd3606 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15248) 
   * 7aa0c017fc1974e2f9ff2bf4a9b903d1f1ca5ee6 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15276) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] SteNicholas commented on pull request #7981: [HUDI-5058] HoodieCatalog#getTable sets primary key with hoodie.datasource.write.recordkey.field for table initialized via Spark

Posted by "SteNicholas (via GitHub)" <gi...@apache.org>.
SteNicholas commented on PR #7981:
URL: https://github.com/apache/hudi/pull/7981#issuecomment-1433094064

   @danny0405, I have added a test case to verify that the primary key column is correct for `HoodieCatalog#getTable`. PTAL.




[GitHub] [hudi] danny0405 commented on a diff in pull request #7981: [HUDI-5058] HoodieCatalog#getTable sets primary key with hoodie.datasource.write.recordkey.field for table initialized via Spark

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #7981:
URL: https://github.com/apache/hudi/pull/7981#discussion_r1109316185


##########
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/table/catalog/TestHoodieCatalog.java:
##########
@@ -294,7 +294,16 @@ public void testGetTable() throws Exception {
     CatalogBaseTable actualTable = catalog.getTable(tablePath);
     // validate schema
     Schema actualSchema = actualTable.getUnresolvedSchema();
-    Schema expectedSchema = Schema.newBuilder().fromResolvedSchema(EXPECTED_TABLE_SCHEMA).build();
+    List<Column> expectedColumns = Arrays.asList(
+        Column.physical("uuid", DataTypes.STRING().notNull()),
+        Column.physical("name", DataTypes.STRING()),
+        Column.physical("age", DataTypes.INT()),
+        Column.physical("tss", DataTypes.TIMESTAMP(3)),
+        Column.physical("partition", DataTypes.STRING())
+    );
+    Schema expectedSchema = Schema.newBuilder()
+        .fromResolvedSchema(new ResolvedSchema(expectedColumns, Collections.emptyList(), CONSTRAINTS))

Review Comment:
   Shouldn't we fix the `EXPECTED_TABLE_SCHEMA` though?





[GitHub] [hudi] hudi-bot commented on pull request #7981: [HUDI-5058] HoodieCatalog#getTable sets primary key with hoodie.datasource.write.recordkey.field for table initialized via Spark

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7981:
URL: https://github.com/apache/hudi/pull/7981#issuecomment-1433111283

   ## CI report:
   
   * c017b7f18efe30cc7b509f8a5512d6d2550dcac8 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15245) 
   * 87ce9fadd6e8da06b1fd589b48d059ddcf8b6a5c UNKNOWN
   


[GitHub] [hudi] hudi-bot commented on pull request #7981: [HUDI-5058] HoodieCatalog#getTable sets primary key with hoodie.datasource.write.recordkey.field for table initialized via Spark

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7981:
URL: https://github.com/apache/hudi/pull/7981#issuecomment-1433194045

   ## CI report:
   
   * c017b7f18efe30cc7b509f8a5512d6d2550dcac8 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15245) 
   * 87ce9fadd6e8da06b1fd589b48d059ddcf8b6a5c UNKNOWN
   * 404a2996cd4a60023f8f77bd3a64b3f96efd3606 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15248) 
   


[GitHub] [hudi] hudi-bot commented on pull request #7981: [HUDI-5058] HoodieCatalog#getTable sets primary key with hoodie.datasource.write.recordkey.field for table initialized via Spark

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7981:
URL: https://github.com/apache/hudi/pull/7981#issuecomment-1434249711

   ## CI report:
   
   * 87ce9fadd6e8da06b1fd589b48d059ddcf8b6a5c UNKNOWN
   * 404a2996cd4a60023f8f77bd3a64b3f96efd3606 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15248) 
   


[GitHub] [hudi] hudi-bot commented on pull request #7981: [HUDI-5058] HoodieCatalog#getTable sets primary key with hoodie.datasource.write.recordkey.field for table initialized via Spark

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7981:
URL: https://github.com/apache/hudi/pull/7981#issuecomment-1433046172

   ## CI report:
   
   * c017b7f18efe30cc7b509f8a5512d6d2550dcac8 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15245) 
   


[GitHub] [hudi] hudi-bot commented on pull request #7981: [HUDI-5058] HoodieCatalog#getTable sets primary key with hoodie.datasource.write.recordkey.field for table initialized via Spark

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7981:
URL: https://github.com/apache/hudi/pull/7981#issuecomment-1433120671

   ## CI report:
   
   * c017b7f18efe30cc7b509f8a5512d6d2550dcac8 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15245) 
   * 87ce9fadd6e8da06b1fd589b48d059ddcf8b6a5c UNKNOWN
   


[GitHub] [hudi] hudi-bot commented on pull request #7981: [HUDI-5058] HoodieCatalog#getTable sets primary key with hoodie.datasource.write.recordkey.field for table initialized via Spark

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7981:
URL: https://github.com/apache/hudi/pull/7981#issuecomment-1433182784

   ## CI report:
   
   * c017b7f18efe30cc7b509f8a5512d6d2550dcac8 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15245) 
   * 87ce9fadd6e8da06b1fd589b48d059ddcf8b6a5c UNKNOWN
   * 404a2996cd4a60023f8f77bd3a64b3f96efd3606 UNKNOWN
   


[GitHub] [hudi] hudi-bot commented on pull request #7981: [HUDI-5058] HoodieCatalog#getTable sets primary key with hoodie.datasource.write.recordkey.field for table initialized via Spark

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7981:
URL: https://github.com/apache/hudi/pull/7981#issuecomment-1434587915

   ## CI report:
   
   * 87ce9fadd6e8da06b1fd589b48d059ddcf8b6a5c UNKNOWN
   * 404a2996cd4a60023f8f77bd3a64b3f96efd3606 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15248) 
   * 7aa0c017fc1974e2f9ff2bf4a9b903d1f1ca5ee6 UNKNOWN
   


[GitHub] [hudi] danny0405 commented on a diff in pull request #7981: [HUDI-5058] HoodieCatalog#getTable sets primary key with hoodie.datasource.write.recordkey.field for table initialized via Spark

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #7981:
URL: https://github.com/apache/hudi/pull/7981#discussion_r1108415548


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/catalog/HoodieCatalog.java:
##########
@@ -250,11 +253,17 @@ public CatalogBaseTable getTable(ObjectPath tablePath) throws TableNotExistExcep
     Map<String, String> options = TableOptionProperties.loadFromProperties(path, hadoopConf);
     final Schema latestSchema = getLatestTableSchema(path);
     if (latestSchema != null) {
+      List<String> pkColumns = TableOptionProperties.getPkColumns(options);
+      // if the table is initialized from spark, the write schema is nullable for pk columns.
+      DataType tableDataType = DataTypeUtils.ensureColumnsAsNonNullable(
+          AvroSchemaConverter.convertToDataType(latestSchema), pkColumns);
       org.apache.flink.table.api.Schema.Builder builder = org.apache.flink.table.api.Schema.newBuilder()
-          .fromRowDataType(AvroSchemaConverter.convertToDataType(latestSchema));
+          .fromRowDataType(tableDataType);
       final String pkConstraintName = TableOptionProperties.getPkConstraintName(options);
-      if (pkConstraintName != null) {
-        builder.primaryKeyNamed(pkConstraintName, TableOptionProperties.getPkColumns(options));
+      if (!StringUtils.isNullOrEmpty(pkConstraintName)) {
+        builder.primaryKeyNamed(pkConstraintName, pkColumns);
+      } else if (!CollectionUtils.isNullOrEmpty(pkColumns)) {

Review Comment:
   Can we add some tests for it?
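
The branching in the hunk above can be modelled without Flink or Hudi on the classpath, which may help when deciding what a test should cover. The sketch below is illustrative only: `PrimaryKeySketch`, `Col`, `ensureNonNullable` and `describeConstraint` are hypothetical stand-ins rather than the PR's classes, and because the body of the final `else if` branch is cut off in the hunk, treating it as an unnamed primary key declaration is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Self-contained model of the branching in the hunk above.
// None of these types are Flink or Hudi classes; all names are hypothetical.
class PrimaryKeySketch {

  /** A column reduced to its name and nullability. */
  static final class Col {
    final String name;
    final boolean nullable;

    Col(String name, boolean nullable) {
      this.name = name;
      this.nullable = nullable;
    }
  }

  /** Returns a copy of the columns in which every primary key column is NOT NULL. */
  static List<Col> ensureNonNullable(List<Col> columns, List<String> pkColumns) {
    List<Col> result = new ArrayList<>();
    for (Col col : columns) {
      result.add(pkColumns.contains(col.name) ? new Col(col.name, false) : col);
    }
    return result;
  }

  /** Picks a named constraint, an unnamed one, or none, in that order. */
  static String describeConstraint(String pkConstraintName, List<String> pkColumns) {
    if (pkConstraintName != null && !pkConstraintName.isEmpty()) {
      return "CONSTRAINT " + pkConstraintName + " PRIMARY KEY " + pkColumns;
    } else if (pkColumns != null && !pkColumns.isEmpty()) {
      // Assumption: the elided branch body declares an unnamed primary key.
      return "PRIMARY KEY " + pkColumns;
    }
    return "no primary key";
  }

  public static void main(String[] args) {
    // A Spark-written schema may mark the record key column as nullable.
    List<Col> columns = Arrays.asList(new Col("uuid", true), new Col("name", true));
    List<String> pkColumns = Arrays.asList("uuid");

    for (Col col : ensureNonNullable(columns, pkColumns)) {
      System.out.println(col.name + (col.nullable ? "" : " NOT NULL"));
    }
    System.out.println(describeConstraint(null, pkColumns));
  }
}
```

A test along these lines would cover three cases: a named constraint, primary key columns without a constraint name, and no primary key at all.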





[GitHub] [hudi] hudi-bot commented on pull request #7981: [HUDI-5058] HoodieCatalog#getTable sets primary key with hoodie.datasource.write.recordkey.field for table initialized via Spark

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7981:
URL: https://github.com/apache/hudi/pull/7981#issuecomment-1433035480

   ## CI report:
   
   * c017b7f18efe30cc7b509f8a5512d6d2550dcac8 UNKNOWN
   


[GitHub] [hudi] SteNicholas closed pull request #7981: [HUDI-5058] HoodieCatalog#getTable sets primary key with hoodie.datasource.write.recordkey.field for table initialized via Spark

Posted by "SteNicholas (via GitHub)" <gi...@apache.org>.
SteNicholas closed pull request #7981: [HUDI-5058] HoodieCatalog#getTable sets primary key with hoodie.datasource.write.recordkey.field for table initialized via Spark
URL: https://github.com/apache/hudi/pull/7981




[GitHub] [hudi] hudi-bot commented on pull request #7981: [HUDI-5058] HoodieCatalog#getTable sets primary key with hoodie.datasource.write.recordkey.field for table initialized via Spark

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7981:
URL: https://github.com/apache/hudi/pull/7981#issuecomment-1435409110

   ## CI report:
   
   * 87ce9fadd6e8da06b1fd589b48d059ddcf8b6a5c UNKNOWN
   * 7aa0c017fc1974e2f9ff2bf4a9b903d1f1ca5ee6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15276) 
   