You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/09/01 10:24:03 UTC

[GitHub] [hudi] TJX2014 opened a new pull request, #6567: [HUDI-4767] Fix non partition table in hudi-flink ignore KEYGEN_CLASS…

TJX2014 opened a new pull request, #6567:
URL: https://github.com/apache/hudi/pull/6567

   ### Change Logs
   Add null or empty judge in setupHoodieKeyOptions of org.apache.hudi.table.HoodieTableFactory class
   
   ### Impact
   User configure KEYGEN_CLASS_NAME of hudi-flink module will take effect.
   
   **Risk level: none | low | medium | high**
   none
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #6567: [HUDI-4767] Fix non partition table in hudi-flink ignore KEYGEN_CLASS…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6567:
URL: https://github.com/apache/hudi/pull/6567#issuecomment-1235145859

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "880dd1f70c863174942bb7aa52d069533705af4b",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11087",
       "triggerID" : "880dd1f70c863174942bb7aa52d069533705af4b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c5671fd919dbf3ac554dc17156f596559d9e360f",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11112",
       "triggerID" : "c5671fd919dbf3ac554dc17156f596559d9e360f",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 880dd1f70c863174942bb7aa52d069533705af4b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11087) 
   * c5671fd919dbf3ac554dc17156f596559d9e360f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11112) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #6567: [HUDI-4767] Fix non partition table in hudi-flink ignore KEYGEN_CLASS…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6567:
URL: https://github.com/apache/hudi/pull/6567#issuecomment-1234231717

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "880dd1f70c863174942bb7aa52d069533705af4b",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11087",
       "triggerID" : "880dd1f70c863174942bb7aa52d069533705af4b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 880dd1f70c863174942bb7aa52d069533705af4b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11087) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] TJX2014 commented on a diff in pull request #6567: [HUDI-4767] Fix non partition table in hudi-flink ignore KEYGEN_CLASS…

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on code in PR #6567:
URL: https://github.com/apache/hudi/pull/6567#discussion_r962542029


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java:
##########
@@ -217,31 +217,33 @@ private static void setupHoodieKeyOptions(Configuration conf, CatalogTable table
       }
     }
 
-    // tweak the key gen class if possible
-    final String[] partitions = conf.getString(FlinkOptions.PARTITION_PATH_FIELD).split(",");
-    final String[] pks = conf.getString(FlinkOptions.RECORD_KEY_FIELD).split(",");
-    if (partitions.length == 1) {
-      final String partitionField = partitions[0];
-      if (partitionField.isEmpty()) {
-        conf.setString(FlinkOptions.KEYGEN_CLASS_NAME, NonpartitionedAvroKeyGenerator.class.getName());
-        LOG.info("Table option [{}] is reset to {} because this is a non-partitioned table",
-            FlinkOptions.KEYGEN_CLASS_NAME.key(), NonpartitionedAvroKeyGenerator.class.getName());
-        return;
+    if (StringUtils.isNullOrEmpty(conf.get(FlinkOptions.KEYGEN_CLASS_NAME))) {
+      // tweak the key gen class if possible

Review Comment:
   Hudi configure keygen clazz auto is great, so the option should not exists, once configured but not effect, is it strange?The code in spark has changed to follow hudi-partition way, but in historical data, if the layout of non-partitioned table with complex key by spark, the only chance for hudi-flink it to configure keygen.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] TJX2014 commented on a diff in pull request #6567: [HUDI-4767] Fix non partition table in hudi-flink ignore KEYGEN_CLASS…

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on code in PR #6567:
URL: https://github.com/apache/hudi/pull/6567#discussion_r962451609


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java:
##########
@@ -217,31 +217,33 @@ private static void setupHoodieKeyOptions(Configuration conf, CatalogTable table
       }
     }
 
-    // tweak the key gen class if possible
-    final String[] partitions = conf.getString(FlinkOptions.PARTITION_PATH_FIELD).split(",");
-    final String[] pks = conf.getString(FlinkOptions.RECORD_KEY_FIELD).split(",");
-    if (partitions.length == 1) {
-      final String partitionField = partitions[0];
-      if (partitionField.isEmpty()) {
-        conf.setString(FlinkOptions.KEYGEN_CLASS_NAME, NonpartitionedAvroKeyGenerator.class.getName());
-        LOG.info("Table option [{}] is reset to {} because this is a non-partitioned table",
-            FlinkOptions.KEYGEN_CLASS_NAME.key(), NonpartitionedAvroKeyGenerator.class.getName());
-        return;
+    if (StringUtils.isNullOrEmpty(conf.get(FlinkOptions.KEYGEN_CLASS_NAME))) {
+      // tweak the key gen class if possible

Review Comment:
   But user cannot assign keygen_class seems not friendly.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] TJX2014 commented on a diff in pull request #6567: [HUDI-4767] Fix non partition table in hudi-flink ignore KEYGEN_CLASS…

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on code in PR #6567:
URL: https://github.com/apache/hudi/pull/6567#discussion_r962542029


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java:
##########
@@ -217,31 +217,33 @@ private static void setupHoodieKeyOptions(Configuration conf, CatalogTable table
       }
     }
 
-    // tweak the key gen class if possible
-    final String[] partitions = conf.getString(FlinkOptions.PARTITION_PATH_FIELD).split(",");
-    final String[] pks = conf.getString(FlinkOptions.RECORD_KEY_FIELD).split(",");
-    if (partitions.length == 1) {
-      final String partitionField = partitions[0];
-      if (partitionField.isEmpty()) {
-        conf.setString(FlinkOptions.KEYGEN_CLASS_NAME, NonpartitionedAvroKeyGenerator.class.getName());
-        LOG.info("Table option [{}] is reset to {} because this is a non-partitioned table",
-            FlinkOptions.KEYGEN_CLASS_NAME.key(), NonpartitionedAvroKeyGenerator.class.getName());
-        return;
+    if (StringUtils.isNullOrEmpty(conf.get(FlinkOptions.KEYGEN_CLASS_NAME))) {
+      // tweak the key gen class if possible

Review Comment:
   Hudi configure keygen clazz auto is great, so the option should not exists, once configured but not effect, is it strange?The code in spark has changed to follow hudi-partition way, but in historical data, if the layout of non-partitioned table with complex key by spark, the only chance for hudi-flink it to configure keygen.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #6567: [HUDI-4767] Fix non partition table in hudi-flink ignore KEYGEN_CLASS…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6567:
URL: https://github.com/apache/hudi/pull/6567#issuecomment-1235141754

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "880dd1f70c863174942bb7aa52d069533705af4b",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11087",
       "triggerID" : "880dd1f70c863174942bb7aa52d069533705af4b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c5671fd919dbf3ac554dc17156f596559d9e360f",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "c5671fd919dbf3ac554dc17156f596559d9e360f",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 880dd1f70c863174942bb7aa52d069533705af4b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11087) 
   * c5671fd919dbf3ac554dc17156f596559d9e360f UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] danny0405 commented on a diff in pull request #6567: [HUDI-4767] Fix non partition table in hudi-flink ignore KEYGEN_CLASS…

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #6567:
URL: https://github.com/apache/hudi/pull/6567#discussion_r962567518


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java:
##########
@@ -217,31 +217,33 @@ private static void setupHoodieKeyOptions(Configuration conf, CatalogTable table
       }
     }
 
-    // tweak the key gen class if possible
-    final String[] partitions = conf.getString(FlinkOptions.PARTITION_PATH_FIELD).split(",");
-    final String[] pks = conf.getString(FlinkOptions.RECORD_KEY_FIELD).split(",");
-    if (partitions.length == 1) {
-      final String partitionField = partitions[0];
-      if (partitionField.isEmpty()) {
-        conf.setString(FlinkOptions.KEYGEN_CLASS_NAME, NonpartitionedAvroKeyGenerator.class.getName());
-        LOG.info("Table option [{}] is reset to {} because this is a non-partitioned table",
-            FlinkOptions.KEYGEN_CLASS_NAME.key(), NonpartitionedAvroKeyGenerator.class.getName());
-        return;
+    if (StringUtils.isNullOrEmpty(conf.get(FlinkOptions.KEYGEN_CLASS_NAME))) {
+      // tweak the key gen class if possible

Review Comment:
   That's true, but we better use the right key gen clazz for better performance



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #6567: [HUDI-4767] Fix non partition table in hudi-flink ignore KEYGEN_CLASS…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6567:
URL: https://github.com/apache/hudi/pull/6567#issuecomment-1234086467

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "880dd1f70c863174942bb7aa52d069533705af4b",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11087",
       "triggerID" : "880dd1f70c863174942bb7aa52d069533705af4b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 880dd1f70c863174942bb7aa52d069533705af4b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11087) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] danny0405 commented on a diff in pull request #6567: [HUDI-4767] Fix non partition table in hudi-flink ignore KEYGEN_CLASS…

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #6567:
URL: https://github.com/apache/hudi/pull/6567#discussion_r962087299


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java:
##########
@@ -217,31 +217,33 @@ private static void setupHoodieKeyOptions(Configuration conf, CatalogTable table
       }
     }
 
-    // tweak the key gen class if possible
-    final String[] partitions = conf.getString(FlinkOptions.PARTITION_PATH_FIELD).split(",");
-    final String[] pks = conf.getString(FlinkOptions.RECORD_KEY_FIELD).split(",");
-    if (partitions.length == 1) {
-      final String partitionField = partitions[0];
-      if (partitionField.isEmpty()) {
-        conf.setString(FlinkOptions.KEYGEN_CLASS_NAME, NonpartitionedAvroKeyGenerator.class.getName());
-        LOG.info("Table option [{}] is reset to {} because this is a non-partitioned table",
-            FlinkOptions.KEYGEN_CLASS_NAME.key(), NonpartitionedAvroKeyGenerator.class.getName());
-        return;
+    if (StringUtils.isNullOrEmpty(conf.get(FlinkOptions.KEYGEN_CLASS_NAME))) {
+      // tweak the key gen class if possible

Review Comment:
   In https://github.com/apache/hudi/pull/5815, we have fixed the spark sql to use `NonpartitionedKeyGenerator` for non partitioned table.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] danny0405 commented on a diff in pull request #6567: [HUDI-4767] Fix non partition table in hudi-flink ignore KEYGEN_CLASS…

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #6567:
URL: https://github.com/apache/hudi/pull/6567#discussion_r961198951


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java:
##########
@@ -217,31 +217,33 @@ private static void setupHoodieKeyOptions(Configuration conf, CatalogTable table
       }
     }
 
-    // tweak the key gen class if possible
-    final String[] partitions = conf.getString(FlinkOptions.PARTITION_PATH_FIELD).split(",");
-    final String[] pks = conf.getString(FlinkOptions.RECORD_KEY_FIELD).split(",");
-    if (partitions.length == 1) {
-      final String partitionField = partitions[0];
-      if (partitionField.isEmpty()) {
-        conf.setString(FlinkOptions.KEYGEN_CLASS_NAME, NonpartitionedAvroKeyGenerator.class.getName());
-        LOG.info("Table option [{}] is reset to {} because this is a non-partitioned table",
-            FlinkOptions.KEYGEN_CLASS_NAME.key(), NonpartitionedAvroKeyGenerator.class.getName());
-        return;
+    if (StringUtils.isNullOrEmpty(conf.get(FlinkOptions.KEYGEN_CLASS_NAME))) {
+      // tweak the key gen class if possible

Review Comment:
   Hi, what kind of keygen clazz do you want to configure for non-partitioned table then ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] TJX2014 commented on a diff in pull request #6567: [HUDI-4767] Fix non partition table in hudi-flink ignore KEYGEN_CLASS…

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on code in PR #6567:
URL: https://github.com/apache/hudi/pull/6567#discussion_r962624239


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java:
##########
@@ -217,31 +217,33 @@ private static void setupHoodieKeyOptions(Configuration conf, CatalogTable table
       }
     }
 
-    // tweak the key gen class if possible
-    final String[] partitions = conf.getString(FlinkOptions.PARTITION_PATH_FIELD).split(",");
-    final String[] pks = conf.getString(FlinkOptions.RECORD_KEY_FIELD).split(",");
-    if (partitions.length == 1) {
-      final String partitionField = partitions[0];
-      if (partitionField.isEmpty()) {
-        conf.setString(FlinkOptions.KEYGEN_CLASS_NAME, NonpartitionedAvroKeyGenerator.class.getName());
-        LOG.info("Table option [{}] is reset to {} because this is a non-partitioned table",
-            FlinkOptions.KEYGEN_CLASS_NAME.key(), NonpartitionedAvroKeyGenerator.class.getName());
-        return;
+    if (StringUtils.isNullOrEmpty(conf.get(FlinkOptions.KEYGEN_CLASS_NAME))) {
+      // tweak the key gen class if possible

Review Comment:
   Of course, shorter key will gain better performance, but this option should also take effect, right?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] TJX2014 commented on a diff in pull request #6567: [HUDI-4767] Fix non partition table in hudi-flink ignore KEYGEN_CLASS…

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on code in PR #6567:
URL: https://github.com/apache/hudi/pull/6567#discussion_r962542972


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java:
##########
@@ -217,31 +217,33 @@ private static void setupHoodieKeyOptions(Configuration conf, CatalogTable table
       }
     }
 
-    // tweak the key gen class if possible
-    final String[] partitions = conf.getString(FlinkOptions.PARTITION_PATH_FIELD).split(",");
-    final String[] pks = conf.getString(FlinkOptions.RECORD_KEY_FIELD).split(",");
-    if (partitions.length == 1) {
-      final String partitionField = partitions[0];
-      if (partitionField.isEmpty()) {
-        conf.setString(FlinkOptions.KEYGEN_CLASS_NAME, NonpartitionedAvroKeyGenerator.class.getName());
-        LOG.info("Table option [{}] is reset to {} because this is a non-partitioned table",
-            FlinkOptions.KEYGEN_CLASS_NAME.key(), NonpartitionedAvroKeyGenerator.class.getName());
-        return;
+    if (StringUtils.isNullOrEmpty(conf.get(FlinkOptions.KEYGEN_CLASS_NAME))) {
+      // tweak the key gen class if possible

Review Comment:
   Hudi configure keygen clazz auto is great, so the option should not exists, once configured but not effect, is it strange?The code in spark has changed to follow hudi-partition way, but in historical data, if the layout of non-partitioned table with complex key by spark, the only chance for hudi-flink is to configure keygen.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] TJX2014 commented on a diff in pull request #6567: [HUDI-4767] Fix non partition table in hudi-flink ignore KEYGEN_CLASS…

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on code in PR #6567:
URL: https://github.com/apache/hudi/pull/6567#discussion_r962542972


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java:
##########
@@ -217,31 +217,33 @@ private static void setupHoodieKeyOptions(Configuration conf, CatalogTable table
       }
     }
 
-    // tweak the key gen class if possible
-    final String[] partitions = conf.getString(FlinkOptions.PARTITION_PATH_FIELD).split(",");
-    final String[] pks = conf.getString(FlinkOptions.RECORD_KEY_FIELD).split(",");
-    if (partitions.length == 1) {
-      final String partitionField = partitions[0];
-      if (partitionField.isEmpty()) {
-        conf.setString(FlinkOptions.KEYGEN_CLASS_NAME, NonpartitionedAvroKeyGenerator.class.getName());
-        LOG.info("Table option [{}] is reset to {} because this is a non-partitioned table",
-            FlinkOptions.KEYGEN_CLASS_NAME.key(), NonpartitionedAvroKeyGenerator.class.getName());
-        return;
+    if (StringUtils.isNullOrEmpty(conf.get(FlinkOptions.KEYGEN_CLASS_NAME))) {
+      // tweak the key gen class if possible

Review Comment:
   Hudi configure keygen clazz auto is great, so the option should not exists, once configured but not effect, is it strange?The code in spark has changed to follow hudi-partition way, but in historical data, if the layout of non-partitioned table with complex key by spark, the only chance for hudi-flink it to configure keygen.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #6567: [HUDI-4767] Fix non partition table in hudi-flink ignore KEYGEN_CLASS…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6567:
URL: https://github.com/apache/hudi/pull/6567#issuecomment-1234080742

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "880dd1f70c863174942bb7aa52d069533705af4b",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "880dd1f70c863174942bb7aa52d069533705af4b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 880dd1f70c863174942bb7aa52d069533705af4b UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] TJX2014 commented on a diff in pull request #6567: [HUDI-4767] Fix non partition table in hudi-flink ignore KEYGEN_CLASS…

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on code in PR #6567:
URL: https://github.com/apache/hudi/pull/6567#discussion_r961345056


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java:
##########
@@ -217,31 +217,33 @@ private static void setupHoodieKeyOptions(Configuration conf, CatalogTable table
       }
     }
 
-    // tweak the key gen class if possible
-    final String[] partitions = conf.getString(FlinkOptions.PARTITION_PATH_FIELD).split(",");
-    final String[] pks = conf.getString(FlinkOptions.RECORD_KEY_FIELD).split(",");
-    if (partitions.length == 1) {
-      final String partitionField = partitions[0];
-      if (partitionField.isEmpty()) {
-        conf.setString(FlinkOptions.KEYGEN_CLASS_NAME, NonpartitionedAvroKeyGenerator.class.getName());
-        LOG.info("Table option [{}] is reset to {} because this is a non-partitioned table",
-            FlinkOptions.KEYGEN_CLASS_NAME.key(), NonpartitionedAvroKeyGenerator.class.getName());
-        return;
+    if (StringUtils.isNullOrEmpty(conf.get(FlinkOptions.KEYGEN_CLASS_NAME))) {
+      // tweak the key gen class if possible

Review Comment:
   Hi @danny0405 I want to configure org.apache.hudi.keygen.ComplexAvroKeyGenerator for non partition in hudi-flink side, which not only follow partitions, because in spark side, use complex key as default, but flink can not assign complex for non partition.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #6567: [HUDI-4767] Fix non partition table in hudi-flink ignore KEYGEN_CLASS…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6567:
URL: https://github.com/apache/hudi/pull/6567#issuecomment-1235328761

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "880dd1f70c863174942bb7aa52d069533705af4b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11087",
       "triggerID" : "880dd1f70c863174942bb7aa52d069533705af4b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c5671fd919dbf3ac554dc17156f596559d9e360f",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11112",
       "triggerID" : "c5671fd919dbf3ac554dc17156f596559d9e360f",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * c5671fd919dbf3ac554dc17156f596559d9e360f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11112) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] danny0405 commented on a diff in pull request #6567: [HUDI-4767] Fix non partition table in hudi-flink ignore KEYGEN_CLASS…

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #6567:
URL: https://github.com/apache/hudi/pull/6567#discussion_r962456453


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java:
##########
@@ -217,31 +217,33 @@ private static void setupHoodieKeyOptions(Configuration conf, CatalogTable table
       }
     }
 
-    // tweak the key gen class if possible
-    final String[] partitions = conf.getString(FlinkOptions.PARTITION_PATH_FIELD).split(",");
-    final String[] pks = conf.getString(FlinkOptions.RECORD_KEY_FIELD).split(",");
-    if (partitions.length == 1) {
-      final String partitionField = partitions[0];
-      if (partitionField.isEmpty()) {
-        conf.setString(FlinkOptions.KEYGEN_CLASS_NAME, NonpartitionedAvroKeyGenerator.class.getName());
-        LOG.info("Table option [{}] is reset to {} because this is a non-partitioned table",
-            FlinkOptions.KEYGEN_CLASS_NAME.key(), NonpartitionedAvroKeyGenerator.class.getName());
-        return;
+    if (StringUtils.isNullOrEmpty(conf.get(FlinkOptions.KEYGEN_CLASS_NAME))) {
+      // tweak the key gen class if possible

Review Comment:
   Generally that's true, but non-partitioned table is a special case and hudi configure the keygen clazz transparently for user.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org