Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/04/17 11:47:43 UTC

[GitHub] [incubator-hudi] pratyakshsharma commented on a change in pull request #1520: [HUDI-797] Small performance improvement for rewriting records.

URL: https://github.com/apache/incubator-hudi/pull/1520#discussion_r410170533
 
 

 ##########
 File path: hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java
 ##########
 @@ -231,6 +254,36 @@ private static GenericRecord rewrite(GenericRecord record, LinkedHashSet<Field>
     return allFields;
   }
 
+  /*
+   * Given an Avro record with a given schema, rewrites it into the new schema while setting fields only from the old
+   * schema.
+   *
+   * NOTE: This function is only suitable if newSchema has fields with the same position as record's schema.
+   */
+  public static GenericRecord rewriteHoodieRecord(GenericRecord record, Schema newSchema) {
+    return rewriteHoodieRecord(record, record.getSchema(), newSchema);
+  }
+
+  /**
+   * Given an Avro record with a given schema, rewrites it into the new schema while setting fields only from the old
+   * schema.
+   *
+   * This function has better performance than rewrite() even though it provides the same functionality.
+   *
+   * NOTE: This function is only suitable if newSchema has fields with the same position as schemaWithFields.
+   */
+  public static GenericRecord rewriteHoodieRecord(GenericRecord record, Schema schemaWithFields, Schema newSchema) {
+    GenericRecord newRecord = new GenericData.Record(newSchema);
+    for (Schema.Field f : schemaWithFields.getFields()) {
+      newRecord.put(f.pos(), record.get(f.pos()));
 
 Review comment:
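   Please handle default values here. This loop only sets the fields present in schemaWithFields, so any field that exists only in newSchema is left null instead of receiving its declared default.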
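
   A minimal sketch of the default-value concern, written as a simplified plain-Java model rather than against the Avro API. The class `DefaultAwareRewrite`, its nested `Field` type, and the `rewrite` method are all hypothetical names introduced for illustration. The sketch assumes, like the NOTE in the Javadoc, that every field of the old schema sits at the same position in the new schema.

```java
import java.util.List;

// Simplified stand-in for position-based record rewriting.
// NOT the Avro API: every name in this class is hypothetical.
final class DefaultAwareRewrite {

  // Models a schema field: a name, a position, and an optional declared default.
  static final class Field {
    final String name;
    final int pos;
    final boolean hasDefault;
    final Object defaultValue;

    Field(String name, int pos, boolean hasDefault, Object defaultValue) {
      this.name = name;
      this.pos = pos;
      this.hasDefault = hasDefault;
      this.defaultValue = defaultValue;
    }
  }

  // Copies values by position for every field of the old schema, then fills
  // each remaining position of the new schema from its declared default.
  // Assumes each old field position is also a valid position in newFields.
  static Object[] rewrite(Object[] oldValues, List<Field> oldFields, List<Field> newFields) {
    Object[] newValues = new Object[newFields.size()];
    boolean[] assigned = new boolean[newFields.size()];

    // Step 1: the position-based copy that the loop in the diff performs.
    for (Field f : oldFields) {
      newValues[f.pos] = oldValues[f.pos];
      assigned[f.pos] = true;
    }

    // Step 2: the missing part. Fields that exist only in the new schema
    // take their default; fields without a default stay null.
    for (Field f : newFields) {
      if (!assigned[f.pos] && f.hasDefault) {
        newValues[f.pos] = f.defaultValue;
      }
    }
    return newValues;
  }
}
```

   In the real method the second step would read the declared default from each Avro `Schema.Field` of newSchema. Whether a null value copied from the old record should also fall back to the default is a separate design decision that this sketch leaves open.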
