Posted to commits@hudi.apache.org by "nsivabalan (via GitHub)" <gi...@apache.org> on 2023/03/17 21:17:05 UTC

[GitHub] [hudi] nsivabalan commented on a diff in pull request #8176: [HUDI-5929] Automatically infer key generator type

nsivabalan commented on code in PR #8176:
URL: https://github.com/apache/hudi/pull/8176#discussion_r1140709444


##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/keygen/factory/HoodieSparkKeyGeneratorFactory.java:
##########
@@ -75,40 +79,60 @@ public static KeyGenerator createKeyGenerator(TypedProperties props) throws IOEx
     }
   }
 
+  /**
+   * @param type {@link KeyGeneratorType} enum.
+   * @return The key generator class name for Spark based on the {@link KeyGeneratorType}.
+   */
+  public static String getKeyGeneratorClassNameFromType(KeyGeneratorType type) {
+    switch (type) {
+      case SIMPLE:
+        return SimpleKeyGenerator.class.getName();
+      case COMPLEX:
+        return ComplexKeyGenerator.class.getName();
+      case TIMESTAMP:
+        return TimestampBasedKeyGenerator.class.getName();
+      case CUSTOM:
+        return CustomKeyGenerator.class.getName();
+      case NON_PARTITION:
+        return NonpartitionedKeyGenerator.class.getName();
+      case GLOBAL_DELETE:
+        return GlobalDeleteKeyGenerator.class.getName();
+      default:
+        throw new HoodieKeyGeneratorException("Unsupported keyGenerator Type " + type);
+    }
+  }
+
+  /**
+   * Infers the key generator type based on the record key and partition fields.
+   * If neither of the record key and partition fields are set, the default type is returned.
+   *
+   * @param props Properties from the write config.
+   * @return Inferred key generator type.
+   */
+  public static KeyGeneratorType inferKeyGeneratorTypeFromWriteConfig(TypedProperties props) {

Review Comment:
   Is this required to be public?



##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/catalyst/catalog/HoodieCatalogTable.scala:
##########
@@ -300,7 +302,7 @@ class HoodieCatalogTable(val spark: SparkSession, var table: CatalogTable) exten
       val primaryKeys = table.properties.getOrElse(SQL_KEY_TABLE_PRIMARY_KEY.sqlKeyName, table.storage.properties.get(SQL_KEY_TABLE_PRIMARY_KEY.sqlKeyName)).toString
       val partitions = table.partitionColumnNames.mkString(",")
       extraConfig(HoodieTableConfig.KEY_GENERATOR_CLASS_NAME.key) =
-        DataSourceOptionsHelper.inferKeyGenClazz(primaryKeys, partitions)
+        getKeyGeneratorClassNameFromType(inferKeyGeneratorType(primaryKeys, partitions))

Review Comment:
   Should we introduce InferKeyGenClassfromProps(), which would call inferKeyGeneratorType() internally?
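
   A rough sketch of what such a wrapper could look like, so that callers make a single call instead of chaining two. This is illustrative only, not the PR's code: the class name `KeyGenInferenceSketch`, the helper `countFields`, the camel-cased method name `inferKeyGenClassFromProps`, the trimmed-down enum, and the inference rule itself (one record key plus one partition field gives SIMPLE, no partition field gives NON_PARTITION, anything else gives COMPLEX) are all assumptions. The "default type" fallback mentioned in the Javadoc is not modeled.

```java
import java.util.Properties;

// Illustrative stand-in, not the Hudi implementation.
final class KeyGenInferenceSketch {

  // Trimmed-down stand-in for Hudi's KeyGeneratorType enum.
  enum KeyGeneratorType { SIMPLE, COMPLEX, NON_PARTITION }

  // Write-config keys for the record key and partition path fields.
  static final String RECORD_KEY_PROP = "hoodie.datasource.write.recordkey.field";
  static final String PARTITION_PROP = "hoodie.datasource.write.partitionpath.field";

  // Counts the non-empty entries in a comma-separated field list (hypothetical helper).
  static int countFields(String csv) {
    if (csv == null || csv.trim().isEmpty()) {
      return 0;
    }
    int count = 0;
    for (String field : csv.split(",")) {
      if (!field.trim().isEmpty()) {
        count++;
      }
    }
    return count;
  }

  // Assumed inference rule; the real rule lives in the PR and may differ.
  static KeyGeneratorType inferKeyGeneratorType(String recordKeys, String partitions) {
    int numKeys = countFields(recordKeys);
    int numPartitions = countFields(partitions);
    if (numPartitions == 0) {
      return KeyGeneratorType.NON_PARTITION;
    }
    return (numKeys == 1 && numPartitions == 1)
        ? KeyGeneratorType.SIMPLE
        : KeyGeneratorType.COMPLEX;
  }

  // Maps a type to a class name; strings are used here to avoid a Hudi dependency.
  static String getKeyGeneratorClassNameFromType(KeyGeneratorType type) {
    switch (type) {
      case SIMPLE:
        return "org.apache.hudi.keygen.SimpleKeyGenerator";
      case COMPLEX:
        return "org.apache.hudi.keygen.ComplexKeyGenerator";
      case NON_PARTITION:
        return "org.apache.hudi.keygen.NonpartitionedKeyGenerator";
      default:
        throw new IllegalArgumentException("Unsupported keyGenerator Type " + type);
    }
  }

  // The suggested wrapper: read the fields from props, infer the type, return the class name.
  static String inferKeyGenClassFromProps(Properties props) {
    String recordKeys = props.getProperty(RECORD_KEY_PROP, "");
    String partitions = props.getProperty(PARTITION_PROP, "");
    return getKeyGeneratorClassNameFromType(inferKeyGeneratorType(recordKeys, partitions));
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(RECORD_KEY_PROP, "id");
    props.setProperty(PARTITION_PROP, "dt");
    System.out.println(inferKeyGenClassFromProps(props));
  }
}
```

   With a wrapper along these lines, the call site in HoodieCatalogTable would only need to pass the properties, and inferKeyGeneratorType() could potentially stay non-public.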


