You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/06/06 12:33:16 UTC

[GitHub] [hudi] codope commented on a diff in pull request #5747: [HUDI-4171] Fixing Non partitioned with virtual keys in read path

codope commented on code in PR #5747:
URL: https://github.com/apache/hudi/pull/5747#discussion_r890086526


##########
hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestUtils.java:
##########
@@ -103,10 +103,17 @@ public static HoodieTableMetaClient init(Configuration hadoopConf, String basePa
   }
 
   public static HoodieTableMetaClient init(Configuration hadoopConf, String basePath, HoodieTableType tableType,
-                                           HoodieFileFormat baseFileFormat)
+                                           HoodieFileFormat baseFileFormat) throws IOException {
+    return init(hadoopConf, basePath, tableType, baseFileFormat, "org.apache.hudi.keygen.SimpleKeyGenerator", true);
+  }
+
+  public static HoodieTableMetaClient init(Configuration hadoopConf, String basePath, HoodieTableType tableType,
+                                           HoodieFileFormat baseFileFormat, String keyGenerator, boolean populateMetaFields)
       throws IOException {
     Properties properties = new Properties();
     properties.setProperty(HoodieTableConfig.BASE_FILE_FORMAT.key(), baseFileFormat.toString());
+    properties.setProperty("hoodie.datasource.write.keygenerator.class", keyGenerator);

Review Comment:
   let's use the corresponding config keys for these props?



##########
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/RealtimeSplit.java:
##########
@@ -107,9 +107,15 @@ default void writeToOutput(DataOutput out) throws IOException {
     } else {
       InputSplitUtils.writeBoolean(true, out);
       InputSplitUtils.writeString(virtualKeyInfoOpt.get().getRecordKeyField(), out);
-      InputSplitUtils.writeString(virtualKeyInfoOpt.get().getPartitionPathField(), out);
+      InputSplitUtils.writeBoolean(virtualKeyInfoOpt.get().getPartitionPathField().isPresent(), out);
+      if (virtualKeyInfoOpt.get().getPartitionPathField().isPresent()) {
+        InputSplitUtils.writeString(virtualKeyInfoOpt.get().getPartitionPathField().get(), out);
+      }
       InputSplitUtils.writeString(String.valueOf(virtualKeyInfoOpt.get().getRecordKeyFieldIndex()), out);
-      InputSplitUtils.writeString(String.valueOf(virtualKeyInfoOpt.get().getPartitionPathFieldIndex()), out);
+      // if partition path field exists, partition path field index should also exists. So, don't need another boolean

Review Comment:
   then we should club the two within single conditional block right?
   ```
   if (virtualKeyInfoOpt.get().getPartitionPathField().isPresent()) {
     InputSplitUtils.writeString(virtualKeyInfoOpt.get().getPartitionPathField().get(), out);
     InputSplitUtils.writeString(String.valueOf(virtualKeyInfoOpt.get().getPartitionPathFieldIndex()), out);
   }
   ```



##########
hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestUtils.java:
##########
@@ -103,10 +103,17 @@ public static HoodieTableMetaClient init(Configuration hadoopConf, String basePa
   }
 
   public static HoodieTableMetaClient init(Configuration hadoopConf, String basePath, HoodieTableType tableType,
-                                           HoodieFileFormat baseFileFormat)
+                                           HoodieFileFormat baseFileFormat) throws IOException {
+    return init(hadoopConf, basePath, tableType, baseFileFormat, "org.apache.hudi.keygen.SimpleKeyGenerator", true);

Review Comment:
   Isn't `SimpleKeyGenerator` the default keygen?



##########
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java:
##########
@@ -275,16 +276,16 @@ protected static Option<HoodieVirtualKeyInfo> getHoodieVirtualKeyInfo(HoodieTabl
     if (tableConfig.populateMetaFields()) {
       return Option.empty();
     }
-
     TableSchemaResolver tableSchemaResolver = new TableSchemaResolver(metaClient);
     try {
       Schema schema = tableSchemaResolver.getTableAvroSchema();
+      boolean isNonPartitionedKeyGen = StringUtils.isNullOrEmpty(tableConfig.getPartitionFieldProp());

Review Comment:
   Would it be better to check with `tableConfig.getKeyGeneratorClassName()`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org