Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/04/20 12:32:48 UTC

[GitHub] [hudi] xiarixiaoyao opened a new pull request, #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

xiarixiaoyao opened a new pull request, #5376:
URL: https://github.com/apache/hudi/pull/5376



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on code in PR #5376:
URL: https://github.com/apache/hudi/pull/5376#discussion_r854776825


##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -431,6 +431,18 @@ public static GenericRecord rewriteRecordWithMetadata(GenericRecord genericRecor
     return newRecord;
   }
 
+  // TODO Unify the logical of rewriteRecordWithMetadata and rewriteEvolutionRecordWithMetadata, and delete this function.
+  public static GenericRecord rewriteEvolutionRecordWithMetadata(GenericRecord genericRecord, Schema newSchema, String fileName) {
+    GenericRecord newRecord = HoodieAvroUtils.rewriteRecordWithNewSchema(genericRecord, newSchema, new HashMap<>());
+    // do not preserve FILENAME_METADATA_FIELD

Review Comment:
   These records might be used upstream in follow-up operations, so it's preferable to keep them immutable: at this level we don't control their lifecycle.
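
One way to read this suggestion, sketched with plain `java.util` maps standing in for Avro records. The class, method, and constant names are hypothetical and not Hudi APIs. The idea is to set the file name on a copy, so that any caller still holding the original record never sees it change.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a "record" is modeled as a Map, not an Avro GenericRecord.
final class ImmutableRewriteSketch {
  // Illustrative field name for the file-name metadata column (assumption).
  static final String FILENAME_FIELD = "_hoodie_file_name";

  // Returns a new read-only record carrying the given file name; the input is never mutated.
  static Map<String, Object> withFileName(Map<String, Object> record, String fileName) {
    Map<String, Object> copy = new HashMap<>(record);
    copy.put(FILENAME_FIELD, fileName);
    return Collections.unmodifiableMap(copy);
  }
}
```

The copy costs one extra allocation per record, which is the trade-off against sharing a mutable instance whose lifecycle is owned elsewhere.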





[GitHub] [hudi] hudi-bot commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1103891048

   ## CI report:
   
   * a59523f9d730bb0b869927829faee66fdfb9491a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8174) 
   * 9cc64da84133155264c4c56e4fe3341a82d6b915 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1103948594

   ## CI report:
   
   * a59523f9d730bb0b869927829faee66fdfb9491a Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8174) 
   * 9cc64da84133155264c4c56e4fe3341a82d6b915 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8176) 
   


[GitHub] [hudi] hudi-bot commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1104689683

   ## CI report:
   
   * 3b28446fbfe17d37baced7f46fd4b7373c1df539 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8185) 
   * 42dedcf6eaf051d131fc69619cc70e638e8bf5e6 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8187) 
   


[GitHub] [hudi] hudi-bot commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1104040658

   ## CI report:
   
   * 9cc64da84133155264c4c56e4fe3341a82d6b915 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8176) 
   


[GitHub] [hudi] xiarixiaoyao commented on a diff in pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on code in PR #5376:
URL: https://github.com/apache/hudi/pull/5376#discussion_r855810204


##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -719,14 +729,28 @@ public static Object getRecordColumnValues(HoodieRecord<? extends HoodieRecordPa
    *
    * @param oldRecord oldRecord to be rewritten
    * @param newSchema newSchema used to rewrite oldRecord
+   * @param renameCols a map store all rename cols, (k, v)-> (colNameFromNewSchema, colNameFromOldSchema)
    * @return newRecord for new Schema
    */
-  public static GenericRecord rewriteRecordWithNewSchema(IndexedRecord oldRecord, Schema newSchema) {
-    Object newRecord = rewriteRecordWithNewSchema(oldRecord, oldRecord.getSchema(), newSchema);
+  public static GenericRecord rewriteRecordWithNewSchema(IndexedRecord oldRecord, Schema newSchema, Map<String, String> renameCols) {
+    Object newRecord = rewriteRecordWithNewSchema(oldRecord, oldRecord.getSchema(), newSchema, renameCols, new LinkedList<>());

Review Comment:
   fixed





[GitHub] [hudi] nsivabalan commented on a diff in pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on code in PR #5376:
URL: https://github.com/apache/hudi/pull/5376#discussion_r855614094


##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -405,6 +407,18 @@ public static GenericRecord rewriteRecordWithMetadata(GenericRecord genericRecor
     return newRecord;
   }
 
+  // TODO Unify the logical of rewriteRecordWithMetadata and rewriteEvolutionRecordWithMetadata, and delete this function.
+  public static GenericRecord rewriteEvolutionRecordWithMetadata(GenericRecord genericRecord, Schema newSchema, String fileName) {
+    GenericRecord newRecord = HoodieAvroUtils.rewriteRecordWithNewSchema(genericRecord, newSchema, new HashMap<>());
+    // do not preserve FILENAME_METADATA_FIELD
+    newRecord.put(HoodieRecord.FILENAME_METADATA_FIELD_POS, fileName);
+    if (!GenericData.get().validate(newSchema, newRecord)) {

Review Comment:
   Let's revisit this after 0.11. I don't want to delay the 0.11 release any further.
   





[GitHub] [hudi] xiarixiaoyao commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1105914395

   @nsivabalan @alexeykudinkin Thanks for your review. Let me open another PR to optimize the code.




[GitHub] [hudi] hudi-bot commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1103894308

   ## CI report:
   
   * a59523f9d730bb0b869927829faee66fdfb9491a Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8174) 
   * 9cc64da84133155264c4c56e4fe3341a82d6b915 UNKNOWN
   


[GitHub] [hudi] hudi-bot commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1103882319

   ## CI report:
   
   * a59523f9d730bb0b869927829faee66fdfb9491a UNKNOWN
   


[GitHub] [hudi] xiarixiaoyao commented on a diff in pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on code in PR #5376:
URL: https://github.com/apache/hudi/pull/5376#discussion_r855809728


##########
hudi-common/src/test/java/org/apache/hudi/internal/schema/utils/TestAvroSchemaEvolutionUtils.java:
##########
@@ -284,7 +284,7 @@ public void testReWriteRecordWithTypeChanged() {
         .updateColumnType("col6", Types.StringType.get());
     InternalSchema newSchema = SchemaChangeUtils.applyTableChanges2Schema(internalSchema, updateChange);
     Schema newAvroSchema = AvroInternalSchemaConverter.convert(newSchema, avroSchema.getName());
-    GenericRecord newRecord = HoodieAvroUtils.rewriteRecordWithNewSchema(avroRecord, newAvroSchema);
+    GenericRecord newRecord = HoodieAvroUtils.rewriteRecordWithNewSchema(avroRecord, newAvroSchema, new HashMap<>());

Review Comment:
   fixed



##########
hudi-common/src/main/java/org/apache/hudi/internal/schema/action/InternalSchemaMerger.java:
##########
@@ -131,12 +150,15 @@ private List<Types.Field> buildRecordType(List<Types.Field> oldFields, List<Type
   private Types.Field dealWithRename(int fieldId, Type newType, Types.Field oldField) {
     Types.Field fieldFromFileSchema = fileSchema.findField(fieldId);
     String nameFromFileSchema = fieldFromFileSchema.name();
+    String nameFromQuerySchema = querySchema.findField(fieldId).name();
     Type typeFromFileSchema = fieldFromFileSchema.type();
     // Current design mechanism guarantees nestedType change is not allowed, so no need to consider.
     if (newType.isNestedType()) {
-      return Types.Field.get(oldField.fieldId(), oldField.isOptional(), nameFromFileSchema, newType, oldField.doc());
+      return Types.Field.get(oldField.fieldId(), oldField.isOptional(),
+          useColNameFromFileSchema ? nameFromFileSchema : nameFromQuerySchema, newType, oldField.doc());

Review Comment:
   fixed





[GitHub] [hudi] xiarixiaoyao commented on a diff in pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on code in PR #5376:
URL: https://github.com/apache/hudi/pull/5376#discussion_r855301041


##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -741,10 +769,23 @@ private static Object rewriteRecordWithNewSchema(Object oldRecord, Schema oldSch
 
         for (int i = 0; i < fields.size(); i++) {
           Schema.Field field = fields.get(i);
+          String fieldName = field.name();
+          fieldNames.push(fieldName);
           if (oldSchema.getField(field.name()) != null) {
             Schema.Field oldField = oldSchema.getField(field.name());
-            helper.put(i, rewriteRecordWithNewSchema(indexedRecord.get(oldField.pos()), oldField.schema(), fields.get(i).schema()));
+            helper.put(i, rewriteRecordWithNewSchema(indexedRecord.get(oldField.pos()), oldField.schema(), fields.get(i).schema(), renameCols, fieldNames));

Review Comment:
   Don't worry about it. This function is only used by schema evolution, so it is safe to modify it directly.





[GitHub] [hudi] xiarixiaoyao commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1106061517

   @alexeykudinkin @nsivabalan I addressed all comments in https://github.com/apache/hudi/pull/5393/files




[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on code in PR #5376:
URL: https://github.com/apache/hudi/pull/5376#discussion_r855624045


##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -741,10 +765,23 @@ private static Object rewriteRecordWithNewSchema(Object oldRecord, Schema oldSch
 
         for (int i = 0; i < fields.size(); i++) {
           Schema.Field field = fields.get(i);
+          String fieldName = field.name();
+          fieldNames.push(fieldName);
           if (oldSchema.getField(field.name()) != null) {
             Schema.Field oldField = oldSchema.getField(field.name());
-            helper.put(i, rewriteRecordWithNewSchema(indexedRecord.get(oldField.pos()), oldField.schema(), fields.get(i).schema()));
+            helper.put(i, rewriteRecordWithNewSchema(indexedRecord.get(oldField.pos()), oldField.schema(), fields.get(i).schema(), renameCols, fieldNames));
+          } else {
+            String fieldFullName = createFullName(fieldNames);
+            String[] colNamePartsFromOldSchema = renameCols.getOrDefault(fieldFullName, "").split("\\.");

Review Comment:
   We don't actually need to split; we only need the part after the last "." (this reduces memory churn).
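
A minimal sketch of the alternative described here, assuming the old column name arrives as a dotted path. `LastSegmentSketch` and `lastSegment` are hypothetical names, not part of `HoodieAvroUtils`. It uses `String.lastIndexOf` and a single `substring` instead of allocating an array of parts.

```java
// Hypothetical helper, not part of HoodieAvroUtils.
final class LastSegmentSketch {
  // Returns the text after the last '.', or the whole string if there is no '.'.
  static String lastSegment(String fullName) {
    int lastDot = fullName.lastIndexOf('.');
    return lastDot < 0 ? fullName : fullName.substring(lastDot + 1);
  }

  public static void main(String[] args) {
    System.out.println(lastSegment("members.value.n")); // prints "n"
    System.out.println(lastSegment("col6"));            // prints "col6"
  }
}
```

One behavioral difference to keep in mind: for an input ending in ".", `split("\\.")` drops the trailing empty part, whereas this helper returns an empty string.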



##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -405,6 +407,14 @@ public static GenericRecord rewriteRecordWithMetadata(GenericRecord genericRecor
     return newRecord;
   }
 
+  // TODO Unify the logical of rewriteRecordWithMetadata and rewriteEvolutionRecordWithMetadata, and delete this function.
+  public static GenericRecord rewriteEvolutionRecordWithMetadata(GenericRecord genericRecord, Schema newSchema, String fileName) {
+    GenericRecord newRecord = HoodieAvroUtils.rewriteRecordWithNewSchema(genericRecord, newSchema, new HashMap<>());

Review Comment:
   @xiarixiaoyao In general, instead of `new HashMap`, let's use `Collections.emptyMap` to avoid allocating unnecessary objects on the hot path.
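
A small sketch of the pattern, assuming the callee only reads the rename map. `EmptyMapSketch` and `resolveOldName` are hypothetical stand-ins for a rename-aware routine, not Hudi APIs.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical sketch; resolveOldName stands in for a rename-aware lookup.
final class EmptyMapSketch {
  // Resolves the old-schema name for a field: the mapped old name if present, else the name itself.
  static String resolveOldName(String newName, Map<String, String> renameCols) {
    return renameCols.getOrDefault(newName, newName);
  }

  // Overload for callers with no renames: passes the immutable empty map
  // rather than building a fresh HashMap on every call.
  static String resolveOldName(String newName) {
    return resolveOldName(newName, Collections.emptyMap());
  }
}
```

Because the map returned by `Collections.emptyMap()` is immutable, this only works when nothing downstream tries to write into it.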



##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -765,27 +802,41 @@ private static Object rewriteRecordWithNewSchema(Object oldRecord, Schema oldSch
         }
         Collection array = (Collection)oldRecord;
         List<Object> newArray = new ArrayList();
+        fieldNames.push("element");
         for (Object element : array) {
-          newArray.add(rewriteRecordWithNewSchema(element, oldSchema.getElementType(), newSchema.getElementType()));
+          newArray.add(rewriteRecordWithNewSchema(element, oldSchema.getElementType(), newSchema.getElementType(), renameCols, fieldNames));
         }
+        fieldNames.pop();
         return newArray;
       case MAP:
         if (!(oldRecord instanceof Map)) {
           throw new IllegalArgumentException("cannot rewrite record with different type");
         }
         Map<Object, Object> map = (Map<Object, Object>) oldRecord;
         Map<Object, Object> newMap = new HashMap<>();
+        fieldNames.push("value");

Review Comment:
   Please make all of these string literals (`"element"`, `"value"`) static constants.
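
As a rough illustration, the two path-segment literals could live in one place like this. The class and constant names are hypothetical; only the values `"element"` and `"value"` come from the diff.

```java
// Hypothetical constant holder; the names are illustrative, not Hudi's.
final class PathSegments {
  static final String ARRAY_ELEMENT = "element";
  static final String MAP_VALUE = "value";

  private PathSegments() {
  }

  // Builds the dotted path of an array's element type, e.g. "tags" -> "tags.element".
  static String arrayElementPath(String parent) {
    return parent + "." + ARRAY_ELEMENT;
  }

  // Builds the dotted path of a map's value type, e.g. "members" -> "members.value".
  static String mapValuePath(String parent) {
    return parent + "." + MAP_VALUE;
  }
}
```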



##########
hudi-common/src/main/java/org/apache/hudi/internal/schema/action/InternalSchemaMerger.java:
##########
@@ -131,12 +150,15 @@ private List<Types.Field> buildRecordType(List<Types.Field> oldFields, List<Type
   private Types.Field dealWithRename(int fieldId, Type newType, Types.Field oldField) {
     Types.Field fieldFromFileSchema = fileSchema.findField(fieldId);
     String nameFromFileSchema = fieldFromFileSchema.name();
+    String nameFromQuerySchema = querySchema.findField(fieldId).name();
     Type typeFromFileSchema = fieldFromFileSchema.type();
     // Current design mechanism guarantees nestedType change is not allowed, so no need to consider.
     if (newType.isNestedType()) {
-      return Types.Field.get(oldField.fieldId(), oldField.isOptional(), nameFromFileSchema, newType, oldField.doc());
+      return Types.Field.get(oldField.fieldId(), oldField.isOptional(),
+          useColNameFromFileSchema ? nameFromFileSchema : nameFromQuerySchema, newType, oldField.doc());

Review Comment:
   Please extract this into a local variable and reuse it.



##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordReader.java:
##########
@@ -379,7 +380,7 @@ private void processDataBlock(HoodieDataBlock dataBlock, Option<KeySpec> keySpec
       Option<Schema> schemaOption = getMergedSchema(dataBlock);
       while (recordIterator.hasNext()) {
         IndexedRecord currentRecord = recordIterator.next();
-        IndexedRecord record = schemaOption.isPresent() ? HoodieAvroUtils.rewriteRecordWithNewSchema(currentRecord, schemaOption.get()) : currentRecord;
+        IndexedRecord record = schemaOption.isPresent() ? HoodieAvroUtils.rewriteRecordWithNewSchema(currentRecord, schemaOption.get(), new HashMap<>()) : currentRecord;

Review Comment:
   Ditto here and everywhere else: use `Collections.emptyMap` instead of `new HashMap<>()`.



##########
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestSpark3DDL.scala:
##########
@@ -445,28 +445,19 @@ class TestSpark3DDL extends TestHoodieSqlBase {
             Seq(null),
             Seq(Map("t1" -> 10.0d))
           )
+          spark.sql(s"alter table ${tableName} rename column members to mem")

Review Comment:
   In addition to these, let's add tests for the record-rewriting utilities in `HoodieAvroUtils`.



##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -719,14 +729,28 @@ public static Object getRecordColumnValues(HoodieRecord<? extends HoodieRecordPa
    *
    * @param oldRecord oldRecord to be rewritten
    * @param newSchema newSchema used to rewrite oldRecord
+   * @param renameCols a map store all rename cols, (k, v)-> (colNameFromNewSchema, colNameFromOldSchema)
    * @return newRecord for new Schema
    */
-  public static GenericRecord rewriteRecordWithNewSchema(IndexedRecord oldRecord, Schema newSchema) {
-    Object newRecord = rewriteRecordWithNewSchema(oldRecord, oldRecord.getSchema(), newSchema);
+  public static GenericRecord rewriteRecordWithNewSchema(IndexedRecord oldRecord, Schema newSchema, Map<String, String> renameCols) {
+    Object newRecord = rewriteRecordWithNewSchema(oldRecord, oldRecord.getSchema(), newSchema, renameCols, new LinkedList<>());

Review Comment:
   I'd suggest using `ArrayDeque` instead (it's more performant than `LinkedList` under most loads).
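
A sketch of how the field-path stack might look with `ArrayDeque`. The body of `createFullName` is not shown in the diff, so `fullName` below is a hypothetical stand-in that assumes the path is the stacked names joined with ".".

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

// Hypothetical sketch of tracking the current field path with an ArrayDeque.
final class FieldPathSketch {
  // Joins the stack from outermost to innermost field with '.'.
  static String fullName(Deque<String> fieldNames) {
    StringBuilder sb = new StringBuilder();
    // push() adds at the head, so the outermost field sits at the tail: walk in reverse.
    Iterator<String> it = fieldNames.descendingIterator();
    while (it.hasNext()) {
      sb.append(it.next());
      if (it.hasNext()) {
        sb.append('.');
      }
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    Deque<String> fieldNames = new ArrayDeque<>();
    fieldNames.push("members");
    fieldNames.push("value");
    fieldNames.push("n");
    System.out.println(fullName(fieldNames)); // prints "members.value.n"
    fieldNames.pop();
    System.out.println(fullName(fieldNames)); // prints "members.value"
  }
}
```

Unlike `LinkedList`, `ArrayDeque` rejects `null` elements, which is fine here since field names are never null.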



##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -741,10 +765,23 @@ private static Object rewriteRecordWithNewSchema(Object oldRecord, Schema oldSch
 
         for (int i = 0; i < fields.size(); i++) {
           Schema.Field field = fields.get(i);
+          String fieldName = field.name();
+          fieldNames.push(fieldName);
           if (oldSchema.getField(field.name()) != null) {
             Schema.Field oldField = oldSchema.getField(field.name());
-            helper.put(i, rewriteRecordWithNewSchema(indexedRecord.get(oldField.pos()), oldField.schema(), fields.get(i).schema()));
+            helper.put(i, rewriteRecordWithNewSchema(indexedRecord.get(oldField.pos()), oldField.schema(), fields.get(i).schema(), renameCols, fieldNames));
+          } else {
+            String fieldFullName = createFullName(fieldNames);
+            String[] colNamePartsFromOldSchema = renameCols.getOrDefault(fieldFullName, "").split("\\.");
+            String lastColNameFromOldSchema = colNamePartsFromOldSchema[colNamePartsFromOldSchema.length - 1];

Review Comment:
   nit: `fieldNameFromOldSchema`



##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -741,10 +765,23 @@ private static Object rewriteRecordWithNewSchema(Object oldRecord, Schema oldSch
 
         for (int i = 0; i < fields.size(); i++) {
           Schema.Field field = fields.get(i);
+          String fieldName = field.name();
+          fieldNames.push(fieldName);
           if (oldSchema.getField(field.name()) != null) {
             Schema.Field oldField = oldSchema.getField(field.name());
-            helper.put(i, rewriteRecordWithNewSchema(indexedRecord.get(oldField.pos()), oldField.schema(), fields.get(i).schema()));
+            helper.put(i, rewriteRecordWithNewSchema(indexedRecord.get(oldField.pos()), oldField.schema(), fields.get(i).schema(), renameCols, fieldNames));

Review Comment:
   Why do we need `helper`? We can just insert into the target record right away, right?



##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -741,10 +765,23 @@ private static Object rewriteRecordWithNewSchema(Object oldRecord, Schema oldSch
 
         for (int i = 0; i < fields.size(); i++) {
           Schema.Field field = fields.get(i);
+          String fieldName = field.name();
+          fieldNames.push(fieldName);
           if (oldSchema.getField(field.name()) != null) {
             Schema.Field oldField = oldSchema.getField(field.name());
-            helper.put(i, rewriteRecordWithNewSchema(indexedRecord.get(oldField.pos()), oldField.schema(), fields.get(i).schema()));
+            helper.put(i, rewriteRecordWithNewSchema(indexedRecord.get(oldField.pos()), oldField.schema(), fields.get(i).schema(), renameCols, fieldNames));
+          } else {
+            String fieldFullName = createFullName(fieldNames);
+            String[] colNamePartsFromOldSchema = renameCols.getOrDefault(fieldFullName, "").split("\\.");
+            String lastColNameFromOldSchema = colNamePartsFromOldSchema[colNamePartsFromOldSchema.length - 1];
+            // deal with rename
+            if (oldSchema.getField(field.name()) == null && oldSchema.getField(lastColNameFromOldSchema) != null) {
+              // find rename
+              Schema.Field oldField = oldSchema.getField(lastColNameFromOldSchema);
+              helper.put(i, rewriteRecordWithNewSchema(indexedRecord.get(oldField.pos()), oldField.schema(), fields.get(i).schema(), renameCols, fieldNames));
+            }
           }
+          fieldNames.pop();
         }
         GenericData.Record newRecord = new GenericData.Record(newSchema);
         for (int i = 0; i < fields.size(); i++) {

Review Comment:
   Please see my comment above: we can do this directly while iterating over the fields, so it all happens in a single loop.



##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -765,27 +802,41 @@ private static Object rewriteRecordWithNewSchema(Object oldRecord, Schema oldSch
         }
         Collection array = (Collection)oldRecord;
         List<Object> newArray = new ArrayList();
+        fieldNames.push("element");
         for (Object element : array) {
-          newArray.add(rewriteRecordWithNewSchema(element, oldSchema.getElementType(), newSchema.getElementType()));
+          newArray.add(rewriteRecordWithNewSchema(element, oldSchema.getElementType(), newSchema.getElementType(), renameCols, fieldNames));
         }
+        fieldNames.pop();
         return newArray;
       case MAP:
         if (!(oldRecord instanceof Map)) {
           throw new IllegalArgumentException("cannot rewrite record with different type");
         }
         Map<Object, Object> map = (Map<Object, Object>) oldRecord;
         Map<Object, Object> newMap = new HashMap<>();
+        fieldNames.push("value");
         for (Map.Entry<Object, Object> entry : map.entrySet()) {
-          newMap.put(entry.getKey(), rewriteRecordWithNewSchema(entry.getValue(), oldSchema.getValueType(), newSchema.getValueType()));
+          newMap.put(entry.getKey(), rewriteRecordWithNewSchema(entry.getValue(), oldSchema.getValueType(), newSchema.getValueType(), renameCols, fieldNames));
         }
+        fieldNames.pop();
         return newMap;
       case UNION:
-        return rewriteRecordWithNewSchema(oldRecord, getActualSchemaFromUnion(oldSchema, oldRecord), getActualSchemaFromUnion(newSchema, oldRecord));
+        return rewriteRecordWithNewSchema(oldRecord, getActualSchemaFromUnion(oldSchema, oldRecord), getActualSchemaFromUnion(newSchema, oldRecord), renameCols, fieldNames);
       default:
         return rewritePrimaryType(oldRecord, oldSchema, newSchema);
     }
   }
 
+  private static String createFullName(Deque<String> fieldNames) {
+    String result = "";
+    if (!fieldNames.isEmpty()) {
+      List<String> parentNames = new ArrayList<>();
+      fieldNames.descendingIterator().forEachRemaining(parentNames::add);

Review Comment:
   You don't need an additional list; you can just iterate over the deque itself.
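
A minimal sketch of that suggestion, assuming the names are pushed with `Deque.push` as in the diff above. The class name `FullNameSketch` is hypothetical, and this is not necessarily how the final patch implements it:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.StringJoiner;

// Hypothetical holder class, for illustration only.
class FullNameSketch {
  // push() adds at the head of the deque, so the outermost field name ends up
  // at the tail. descendingIterator() walks tail-to-head, i.e. root first,
  // which lets us join the names without copying them into a list.
  static String createFullName(Deque<String> fieldNames) {
    StringJoiner joiner = new StringJoiner(".");
    Iterator<String> it = fieldNames.descendingIterator();
    while (it.hasNext()) {
      joiner.add(it.next());
    }
    // An empty deque yields "", matching the default result in the diff.
    return joiner.toString();
  }

  public static void main(String[] args) {
    Deque<String> names = new ArrayDeque<>();
    names.push("user");
    names.push("address");
    names.push("zip");
    System.out.println(createFullName(names));
  }
}
```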



##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordReader.java:
##########
@@ -379,7 +380,7 @@ private void processDataBlock(HoodieDataBlock dataBlock, Option<KeySpec> keySpec
       Option<Schema> schemaOption = getMergedSchema(dataBlock);
       while (recordIterator.hasNext()) {
         IndexedRecord currentRecord = recordIterator.next();
-        IndexedRecord record = schemaOption.isPresent() ? HoodieAvroUtils.rewriteRecordWithNewSchema(currentRecord, schemaOption.get()) : currentRecord;
+        IndexedRecord record = schemaOption.isPresent() ? HoodieAvroUtils.rewriteRecordWithNewSchema(currentRecord, schemaOption.get(), new HashMap<>()) : currentRecord;

Review Comment:
   Same comment regarding `HashMap`



##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -741,10 +765,23 @@ private static Object rewriteRecordWithNewSchema(Object oldRecord, Schema oldSch
 
         for (int i = 0; i < fields.size(); i++) {

Review Comment:
   Why not just iterate over `fields` directly?



##########
hudi-common/src/test/java/org/apache/hudi/internal/schema/utils/TestAvroSchemaEvolutionUtils.java:
##########
@@ -349,7 +349,7 @@ public void testReWriteNestRecord() {
     );
 
     Schema newAvroSchema = AvroInternalSchemaConverter.convert(newRecord, schema.getName());
-    GenericRecord newAvroRecord = HoodieAvroUtils.rewriteRecordWithNewSchema(avroRecord, newAvroSchema);
+    GenericRecord newAvroRecord = HoodieAvroUtils.rewriteRecordWithNewSchema(avroRecord, newAvroSchema, new HashMap<>());

Review Comment:
   Here as well



##########
hudi-common/src/test/java/org/apache/hudi/internal/schema/utils/TestAvroSchemaEvolutionUtils.java:
##########
@@ -284,7 +284,7 @@ public void testReWriteRecordWithTypeChanged() {
         .updateColumnType("col6", Types.StringType.get());
     InternalSchema newSchema = SchemaChangeUtils.applyTableChanges2Schema(internalSchema, updateChange);
     Schema newAvroSchema = AvroInternalSchemaConverter.convert(newSchema, avroSchema.getName());
-    GenericRecord newRecord = HoodieAvroUtils.rewriteRecordWithNewSchema(avroRecord, newAvroSchema);
+    GenericRecord newRecord = HoodieAvroUtils.rewriteRecordWithNewSchema(avroRecord, newAvroSchema, new HashMap<>());

Review Comment:
   Same comment as above



##########
hudi-common/src/main/java/org/apache/hudi/internal/schema/utils/InternalSchemaUtils.java:
##########
@@ -267,4 +267,20 @@ public static String createFullName(String name, Deque<String> fieldNames) {
     }
     return result;
   }
+
+  /**
+   * Try to find all renamed cols between oldSchema and newSchema.
+   *
+   * @param oldSchema oldSchema
+   * @param newSchema newSchema which modified from oldSchema
+   * @return renameCols Map. (k, v) -> (colNameFromNewSchema, colNameFromOldSchema)
+   */
+  public static Map<String, String> collectRenameCols(InternalSchema oldSchema, InternalSchema newSchema) {
+    List<String> colNamesFromWriteSchema = oldSchema.getAllColsFullName();
+    return colNamesFromWriteSchema.stream().filter(f -> {
+      int filedIdFromWriteSchema = oldSchema.findIdByName(f);
+      // try to find the cols which has the same id, but have different colName;
+      return newSchema.getAllIds().contains(filedIdFromWriteSchema) && !newSchema.findfullName(filedIdFromWriteSchema).equalsIgnoreCase(f);

Review Comment:
   Instead of duplicating the code, just do a `map` first, where you map the name if it's a rename and return null otherwise, then filter out all the nulls.
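
The map-then-filter shape could look roughly like the sketch below. It does not use `InternalSchema`: the two plain maps (`oldIdByName`, `newNameById`) and the class name are hypothetical stand-ins for the id and name lookups, chosen only to keep the example self-contained.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical stand-in; the real code works on InternalSchema.
class RenameColsSketch {
  // oldIdByName: full column name -> column id in the old schema.
  // newNameById: column id -> full column name in the new schema.
  // Returns (colNameFromNewSchema -> colNameFromOldSchema) for renamed columns.
  static Map<String, String> collectRenameCols(Map<String, Integer> oldIdByName,
                                               Map<Integer, String> newNameById) {
    return oldIdByName.entrySet().stream()
        .<Map.Entry<String, String>>map(e -> {
          String newName = newNameById.get(e.getValue());
          // Same id but a different name means the column was renamed.
          if (newName != null && !newName.equalsIgnoreCase(e.getKey())) {
            return new SimpleEntry<String, String>(newName, e.getKey());
          }
          // Not a rename (unchanged or dropped): map to null and filter later.
          return null;
        })
        .filter(Objects::nonNull)
        .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
  }
}
```

The id lookup happens once per column inside `map`, so the rename condition is no longer repeated between a `filter` step and the later collection step.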



##########
hudi-common/src/main/java/org/apache/hudi/internal/schema/action/InternalSchemaMerger.java:
##########
@@ -48,6 +48,25 @@ public class InternalSchemaMerger {
   // we can pass decimalType to reWriteRecordWithNewSchema directly, everything is ok.
   private boolean useColumnTypeFromFileSchema = true;
 
+  // deal with rename
+  // Whether to use column name from file schema to read files when we find some column name has changed.
+  // spark parquetReader need the original column name to read data, otherwise the parquetReader will read nothing.
+  // eg: current column name is colOldName, now we rename it to colNewName,
+  // we should not pass colNewName to parquetReader, we must pass colOldName to it; when we read out the data.
+  // for log reader
+  // since our reWriteRecordWithNewSchema function support rewrite directly, so we no need this parameter
+  // eg: current column name is colOldName, now we rename it to colNewName,
+  // we can pass colNewName to reWriteRecordWithNewSchema directly, everything is ok.
+  private boolean useColNameFromFileSchema = true;
+
+  public InternalSchemaMerger(InternalSchema fileSchema, InternalSchema querySchema, boolean ignoreRequiredAttribute, boolean useColumnTypeFromFileSchema, boolean useColNameFromFileSchema) {

Review Comment:
   Please chain the constructors (i.e., one invokes another) to avoid duplication.
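
Constructor chaining here might be sketched as follows. `SchemaMergerSketch` and its `String` schema parameters are hypothetical stand-ins for `InternalSchemaMerger` and `InternalSchema`; the default of `true` for `useColNameFromFileSchema` is taken from the field initializer in the diff above.

```java
// Hypothetical stand-in for InternalSchemaMerger, for illustration only.
class SchemaMergerSketch {
  final String fileSchema;
  final String querySchema;
  final boolean ignoreRequiredAttribute;
  final boolean useColumnTypeFromFileSchema;
  final boolean useColNameFromFileSchema;

  // The shorter constructor only supplies the default and delegates.
  SchemaMergerSketch(String fileSchema, String querySchema,
                     boolean ignoreRequiredAttribute, boolean useColumnTypeFromFileSchema) {
    this(fileSchema, querySchema, ignoreRequiredAttribute, useColumnTypeFromFileSchema, true);
  }

  // All field assignments live in exactly one place.
  SchemaMergerSketch(String fileSchema, String querySchema,
                     boolean ignoreRequiredAttribute, boolean useColumnTypeFromFileSchema,
                     boolean useColNameFromFileSchema) {
    this.fileSchema = fileSchema;
    this.querySchema = querySchema;
    this.ignoreRequiredAttribute = ignoreRequiredAttribute;
    this.useColumnTypeFromFileSchema = useColumnTypeFromFileSchema;
    this.useColNameFromFileSchema = useColNameFromFileSchema;
  }
}
```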





[GitHub] [hudi] hudi-bot commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1105775359

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a59523f9d730bb0b869927829faee66fdfb9491a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8174",
       "triggerID" : "a59523f9d730bb0b869927829faee66fdfb9491a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9cc64da84133155264c4c56e4fe3341a82d6b915",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8176",
       "triggerID" : "9cc64da84133155264c4c56e4fe3341a82d6b915",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3b28446fbfe17d37baced7f46fd4b7373c1df539",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8185",
       "triggerID" : "3b28446fbfe17d37baced7f46fd4b7373c1df539",
       "triggerType" : "PUSH"
     }, {
       "hash" : "42dedcf6eaf051d131fc69619cc70e638e8bf5e6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8187",
       "triggerID" : "42dedcf6eaf051d131fc69619cc70e638e8bf5e6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cbdc49842258501c1888e5cc247c70fadfde20b7",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8203",
       "triggerID" : "cbdc49842258501c1888e5cc247c70fadfde20b7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "192b87f1c9af9f85e98b48a6406d6b02ca4f7448",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8213",
       "triggerID" : "192b87f1c9af9f85e98b48a6406d6b02ca4f7448",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * cbdc49842258501c1888e5cc247c70fadfde20b7 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8203) 
   * 192b87f1c9af9f85e98b48a6406d6b02ca4f7448 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8213) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1103885249

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a59523f9d730bb0b869927829faee66fdfb9491a",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8174",
       "triggerID" : "a59523f9d730bb0b869927829faee66fdfb9491a",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a59523f9d730bb0b869927829faee66fdfb9491a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8174) 
   




[GitHub] [hudi] hudi-bot commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1104660472

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a59523f9d730bb0b869927829faee66fdfb9491a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8174",
       "triggerID" : "a59523f9d730bb0b869927829faee66fdfb9491a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9cc64da84133155264c4c56e4fe3341a82d6b915",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8176",
       "triggerID" : "9cc64da84133155264c4c56e4fe3341a82d6b915",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3b28446fbfe17d37baced7f46fd4b7373c1df539",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "3b28446fbfe17d37baced7f46fd4b7373c1df539",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 9cc64da84133155264c4c56e4fe3341a82d6b915 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8176) 
   * 3b28446fbfe17d37baced7f46fd4b7373c1df539 UNKNOWN
   




[GitHub] [hudi] hudi-bot commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1104665320

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a59523f9d730bb0b869927829faee66fdfb9491a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8174",
       "triggerID" : "a59523f9d730bb0b869927829faee66fdfb9491a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9cc64da84133155264c4c56e4fe3341a82d6b915",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8176",
       "triggerID" : "9cc64da84133155264c4c56e4fe3341a82d6b915",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3b28446fbfe17d37baced7f46fd4b7373c1df539",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8185",
       "triggerID" : "3b28446fbfe17d37baced7f46fd4b7373c1df539",
       "triggerType" : "PUSH"
     }, {
       "hash" : "42dedcf6eaf051d131fc69619cc70e638e8bf5e6",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "42dedcf6eaf051d131fc69619cc70e638e8bf5e6",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 3b28446fbfe17d37baced7f46fd4b7373c1df539 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8185) 
   * 42dedcf6eaf051d131fc69619cc70e638e8bf5e6 UNKNOWN
   




[GitHub] [hudi] nsivabalan commented on a diff in pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on code in PR #5376:
URL: https://github.com/apache/hudi/pull/5376#discussion_r855238436


##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -405,6 +407,18 @@ public static GenericRecord rewriteRecordWithMetadata(GenericRecord genericRecor
     return newRecord;
   }
 
+  // TODO Unify the logical of rewriteRecordWithMetadata and rewriteEvolutionRecordWithMetadata, and delete this function.
+  public static GenericRecord rewriteEvolutionRecordWithMetadata(GenericRecord genericRecord, Schema newSchema, String fileName) {
+    GenericRecord newRecord = HoodieAvroUtils.rewriteRecordWithNewSchema(genericRecord, newSchema, new HashMap<>());
+    // do not preserve FILENAME_METADATA_FIELD
+    newRecord.put(HoodieRecord.FILENAME_METADATA_FIELD_POS, fileName);
+    if (!GenericData.get().validate(newSchema, newRecord)) {

Review Comment:
   Won't this incur a perf hit if we validate schema compatibility for every record? Can't we move this outside and do it just once, for a single record?
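
One possible shape for "validate once" is sketched below. It is generic on purpose and does not touch Avro: `ValidateOnceSketch`, `rewrite`, and `validate` are hypothetical names, and the approach assumes every record passing through one instance shares the same schema, so checking the first rewritten record is representative. The sketch is not thread-safe.

```java
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

// Hypothetical helper, for illustration only.
class ValidateOnceSketch<R> {
  private final UnaryOperator<R> rewrite;
  private final Predicate<R> validate;
  private boolean validated = false;
  // Exposed only so the effect is observable in this example.
  int validationCalls = 0;

  ValidateOnceSketch(UnaryOperator<R> rewrite, Predicate<R> validate) {
    this.rewrite = rewrite;
    this.validate = validate;
  }

  R apply(R record) {
    R rewritten = rewrite.apply(record);
    // Pay the validation cost only for the first record.
    if (!validated) {
      validationCalls++;
      if (!validate.test(rewritten)) {
        throw new IllegalStateException("Rewritten record does not match the target schema");
      }
      validated = true;
    }
    return rewritten;
  }
}
```

Whether skipping validation for later records is acceptable depends on how uniform the incoming records are; that trade-off is the open question in the comment above.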



##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/commit/HoodieMergeHelper.java:
##########
@@ -109,10 +113,14 @@ public void runMerge(HoodieTable<T, HoodieData<HoodieRecord<T>>, HoodieData<Hood
                       && writeInternalSchema.findIdByName(f) == querySchema.findIdByName(f)
                       && writeInternalSchema.findIdByName(f) != -1
                       && writeInternalSchema.findType(writeInternalSchema.findIdByName(f)).equals(querySchema.findType(writeInternalSchema.findIdByName(f)))).collect(Collectors.toList());
-      readSchema = AvroInternalSchemaConverter.convert(new InternalSchemaMerger(writeInternalSchema, querySchema, true, false).mergeSchema(), readSchema.getName());
+      readSchema = AvroInternalSchemaConverter
+          .convert(new InternalSchemaMerger(writeInternalSchema, querySchema, true, false, false).mergeSchema(), readSchema.getName());
       Schema writeSchemaFromFile = AvroInternalSchemaConverter.convert(writeInternalSchema, readSchema.getName());
       needToReWriteRecord = sameCols.size() != colNamesFromWriteSchema.size()
               || SchemaCompatibility.checkReaderWriterCompatibility(writeSchemaFromFile, readSchema).getType() == org.apache.avro.SchemaCompatibility.SchemaCompatibilityType.COMPATIBLE;
+      if (needToReWriteRecord) {

Review Comment:
   Not exactly related to this patch, but in L120, are we passing the arguments in the right order? According to the docs for `SchemaCompatibility.checkReaderWriterCompatibility()`, the first arg is the reader schema and the second arg is the writer schema.
   Can you check that once, please?



##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -431,6 +431,18 @@ public static GenericRecord rewriteRecordWithMetadata(GenericRecord genericRecor
     return newRecord;
   }
 
+  // TODO Unify the logical of rewriteRecordWithMetadata and rewriteEvolutionRecordWithMetadata, and delete this function.
+  public static GenericRecord rewriteEvolutionRecordWithMetadata(GenericRecord genericRecord, Schema newSchema, String fileName) {
+    GenericRecord newRecord = HoodieAvroUtils.rewriteRecordWithNewSchema(genericRecord, newSchema, new HashMap<>());
+    // do not preserve FILENAME_METADATA_FIELD

Review Comment:
   Yes, let's revisit after 0.11 to see if we can avoid a full rewrite in some cases. I understand the intent of recreating a new record to avoid mutations, but it does incur perf hits. I should have thought about this when we fixed HoodieMergeHandle for the commit time fix in an earlier patch; I missed bringing it up.
   



##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -741,10 +769,23 @@ private static Object rewriteRecordWithNewSchema(Object oldRecord, Schema oldSch
 
         for (int i = 0; i < fields.size(); i++) {
           Schema.Field field = fields.get(i);
+          String fieldName = field.name();
+          fieldNames.push(fieldName);
           if (oldSchema.getField(field.name()) != null) {
             Schema.Field oldField = oldSchema.getField(field.name());
-            helper.put(i, rewriteRecordWithNewSchema(indexedRecord.get(oldField.pos()), oldField.schema(), fields.get(i).schema()));
+            helper.put(i, rewriteRecordWithNewSchema(indexedRecord.get(oldField.pos()), oldField.schema(), fields.get(i).schema(), renameCols, fieldNames));

Review Comment:
   I am a bit skeptical about making such changes at this stage of the release. Can we add a new method for when renameCols is not empty and not touch the existing code (used when schema evolution is not enabled)? At least I want to make sure we don't cause any unintentional regression in the non-schema-evolution code path.





[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on code in PR #5376:
URL: https://github.com/apache/hudi/pull/5376#discussion_r854359454


##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -743,7 +757,20 @@ private static Object rewriteRecordWithNewSchema(Object oldRecord, Schema oldSch
           Schema.Field field = fields.get(i);
           if (oldSchema.getField(field.name()) != null) {
             Schema.Field oldField = oldSchema.getField(field.name());
-            helper.put(i, rewriteRecordWithNewSchema(indexedRecord.get(oldField.pos()), oldField.schema(), fields.get(i).schema()));
+            helper.put(i, rewriteRecordWithNewSchema(indexedRecord.get(oldField.pos()), oldField.schema(), fields.get(i).schema(), renameCols));
+          }
+          // deal with rename
+          if (!renameCols.isEmpty() && oldSchema.getField(field.name()) == null) {
+            String fieldName = field.name();
+            for (Map.Entry<String, String> entry : renameCols.entrySet()) {
+              List<String> nameParts = Arrays.asList(entry.getKey().split("\\."));

Review Comment:
   If we're accepting the dot-path specification, why are we looking at the last element of the chain?
   Since we're doing a top-down traversal, we should look at the first element, and we also need to either keep an index of the level we are at or trim the column names as we traverse.
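
One way to track the current level during a top-down traversal is to keep a stack of field names and look up renames by the full dotted path. The sketch below illustrates that idea on nested `Map`s rather than Avro records; the class name, the map-based "schema" (a nested `Map` marks a nested record, anything else a leaf), and the method names are all assumptions made for the example.

```java
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical stand-in that uses nested Maps instead of Avro records.
class RenamePathSketch {
  // renameCols: full new column name -> full old column name.
  @SuppressWarnings("unchecked")
  static Map<String, Object> rewrite(Map<String, Object> oldRecord, Map<String, Object> newSchema,
                                     Map<String, String> renameCols, Deque<String> path) {
    Map<String, Object> out = new LinkedHashMap<>();
    for (Map.Entry<String, Object> field : newSchema.entrySet()) {
      String name = field.getKey();
      path.push(name);
      String oldName = name;
      if (!oldRecord.containsKey(name)) {
        // The parents were already resolved on the way down, so only the
        // last segment of the old full name matters at this level.
        String oldFullName = renameCols.get(fullName(path));
        if (oldFullName != null) {
          oldName = oldFullName.substring(oldFullName.lastIndexOf('.') + 1);
        }
      }
      Object oldValue = oldRecord.get(oldName);
      if (field.getValue() instanceof Map && oldValue instanceof Map) {
        out.put(name, rewrite((Map<String, Object>) oldValue,
            (Map<String, Object>) field.getValue(), renameCols, path));
      } else {
        // Leaf, or a column missing from the old record (value stays null).
        out.put(name, oldValue);
      }
      path.pop();
    }
    return out;
  }

  // Joins the stack root-first, e.g. "address.postcode".
  static String fullName(Deque<String> path) {
    StringJoiner joiner = new StringJoiner(".");
    Iterator<String> it = path.descendingIterator();
    while (it.hasNext()) {
      joiner.add(it.next());
    }
    return joiner.toString();
  }
}
```

Keying the lookup on the full path avoids matching a rename against a same-named field at a different nesting level, which a last-segment-only comparison could do.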



##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -431,6 +431,18 @@ public static GenericRecord rewriteRecordWithMetadata(GenericRecord genericRecor
     return newRecord;
   }
 
+  // TODO Unify the logical of rewriteRecordWithMetadata and rewriteEvolutionRecordWithMetadata, and delete this function.
+  public static GenericRecord rewriteEvolutionRecordWithMetadata(GenericRecord genericRecord, Schema newSchema, String fileName) {
+    GenericRecord newRecord = HoodieAvroUtils.rewriteRecordWithNewSchema(genericRecord, newSchema, new HashMap<>());
+    // do not preserve FILENAME_METADATA_FIELD

Review Comment:
   Records should be immutable by default, with only limited scopes where treating them as mutable is acceptable



##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieWriteHandle.java:
##########
@@ -98,6 +101,8 @@
   protected final String fileId;
   protected final String writeToken;
   protected final TaskContextSupplier taskContextSupplier;
+  // For full schema evolution
+  protected final boolean schemaOnReadEnable;

Review Comment:
   nit: better to suffix it w/ `Enabled`





[GitHub] [hudi] alexeykudinkin commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1104182893

   @xiarixiaoyao left some comments. 
   
   Can you please add a description for this PR? There's very little context in the PR itself, and not a lot in the Jira issue either, so it's hard to understand how exactly HUDI-3855 is related to it.




[GitHub] [hudi] xiarixiaoyao commented on a diff in pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on code in PR #5376:
URL: https://github.com/apache/hudi/pull/5376#discussion_r855810004


##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -765,27 +802,41 @@ private static Object rewriteRecordWithNewSchema(Object oldRecord, Schema oldSch
         }
         Collection array = (Collection)oldRecord;
         List<Object> newArray = new ArrayList();
+        fieldNames.push("element");
         for (Object element : array) {
-          newArray.add(rewriteRecordWithNewSchema(element, oldSchema.getElementType(), newSchema.getElementType()));
+          newArray.add(rewriteRecordWithNewSchema(element, oldSchema.getElementType(), newSchema.getElementType(), renameCols, fieldNames));
         }
+        fieldNames.pop();
         return newArray;
       case MAP:
         if (!(oldRecord instanceof Map)) {
           throw new IllegalArgumentException("cannot rewrite record with different type");
         }
         Map<Object, Object> map = (Map<Object, Object>) oldRecord;
         Map<Object, Object> newMap = new HashMap<>();
+        fieldNames.push("value");

Review Comment:
   fixed



##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -741,10 +765,23 @@ private static Object rewriteRecordWithNewSchema(Object oldRecord, Schema oldSch
 
         for (int i = 0; i < fields.size(); i++) {
           Schema.Field field = fields.get(i);
+          String fieldName = field.name();
+          fieldNames.push(fieldName);
           if (oldSchema.getField(field.name()) != null) {
             Schema.Field oldField = oldSchema.getField(field.name());
-            helper.put(i, rewriteRecordWithNewSchema(indexedRecord.get(oldField.pos()), oldField.schema(), fields.get(i).schema()));
+            helper.put(i, rewriteRecordWithNewSchema(indexedRecord.get(oldField.pos()), oldField.schema(), fields.get(i).schema(), renameCols, fieldNames));
+          } else {
+            String fieldFullName = createFullName(fieldNames);
+            String[] colNamePartsFromOldSchema = renameCols.getOrDefault(fieldFullName, "").split("\\.");
+            String lastColNameFromOldSchema = colNamePartsFromOldSchema[colNamePartsFromOldSchema.length - 1];
+            // deal with rename
+            if (oldSchema.getField(field.name()) == null && oldSchema.getField(lastColNameFromOldSchema) != null) {
+              // find rename
+              Schema.Field oldField = oldSchema.getField(lastColNameFromOldSchema);
+              helper.put(i, rewriteRecordWithNewSchema(indexedRecord.get(oldField.pos()), oldField.schema(), fields.get(i).schema(), renameCols, fieldNames));
+            }
           }
+          fieldNames.pop();
         }
         GenericData.Record newRecord = new GenericData.Record(newSchema);
         for (int i = 0; i < fields.size(); i++) {

Review Comment:
   Yes, I have already reworked that code.





[GitHub] [hudi] hudi-bot commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1105426472

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a59523f9d730bb0b869927829faee66fdfb9491a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8174",
       "triggerID" : "a59523f9d730bb0b869927829faee66fdfb9491a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9cc64da84133155264c4c56e4fe3341a82d6b915",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8176",
       "triggerID" : "9cc64da84133155264c4c56e4fe3341a82d6b915",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3b28446fbfe17d37baced7f46fd4b7373c1df539",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8185",
       "triggerID" : "3b28446fbfe17d37baced7f46fd4b7373c1df539",
       "triggerType" : "PUSH"
     }, {
       "hash" : "42dedcf6eaf051d131fc69619cc70e638e8bf5e6",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8187",
       "triggerID" : "42dedcf6eaf051d131fc69619cc70e638e8bf5e6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cbdc49842258501c1888e5cc247c70fadfde20b7",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "cbdc49842258501c1888e5cc247c70fadfde20b7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 42dedcf6eaf051d131fc69619cc70e638e8bf5e6 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8187) 
   * cbdc49842258501c1888e5cc247c70fadfde20b7 UNKNOWN
   




[GitHub] [hudi] xiarixiaoyao commented on a diff in pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on code in PR #5376:
URL: https://github.com/apache/hudi/pull/5376#discussion_r855303296


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/commit/HoodieMergeHelper.java:
##########
@@ -109,10 +113,14 @@ public void runMerge(HoodieTable<T, HoodieData<HoodieRecord<T>>, HoodieData<Hood
                       && writeInternalSchema.findIdByName(f) == querySchema.findIdByName(f)
                       && writeInternalSchema.findIdByName(f) != -1
                       && writeInternalSchema.findType(writeInternalSchema.findIdByName(f)).equals(querySchema.findType(writeInternalSchema.findIdByName(f)))).collect(Collectors.toList());
-      readSchema = AvroInternalSchemaConverter.convert(new InternalSchemaMerger(writeInternalSchema, querySchema, true, false).mergeSchema(), readSchema.getName());
+      readSchema = AvroInternalSchemaConverter
+          .convert(new InternalSchemaMerger(writeInternalSchema, querySchema, true, false, false).mergeSchema(), readSchema.getName());
       Schema writeSchemaFromFile = AvroInternalSchemaConverter.convert(writeInternalSchema, readSchema.getName());
       needToReWriteRecord = sameCols.size() != colNamesFromWriteSchema.size()
               || SchemaCompatibility.checkReaderWriterCompatibility(writeSchemaFromFile, readSchema).getType() == org.apache.avro.SchemaCompatibility.SchemaCompatibilityType.COMPATIBLE;
+      if (needToReWriteRecord) {

Review Comment:
   Thanks for the reminder, let me fix it.
   





[GitHub] [hudi] hudi-bot commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1105772333

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a59523f9d730bb0b869927829faee66fdfb9491a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8174",
       "triggerID" : "a59523f9d730bb0b869927829faee66fdfb9491a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9cc64da84133155264c4c56e4fe3341a82d6b915",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8176",
       "triggerID" : "9cc64da84133155264c4c56e4fe3341a82d6b915",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3b28446fbfe17d37baced7f46fd4b7373c1df539",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8185",
       "triggerID" : "3b28446fbfe17d37baced7f46fd4b7373c1df539",
       "triggerType" : "PUSH"
     }, {
       "hash" : "42dedcf6eaf051d131fc69619cc70e638e8bf5e6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8187",
       "triggerID" : "42dedcf6eaf051d131fc69619cc70e638e8bf5e6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cbdc49842258501c1888e5cc247c70fadfde20b7",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8203",
       "triggerID" : "cbdc49842258501c1888e5cc247c70fadfde20b7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "192b87f1c9af9f85e98b48a6406d6b02ca4f7448",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "192b87f1c9af9f85e98b48a6406d6b02ca4f7448",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * cbdc49842258501c1888e5cc247c70fadfde20b7 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8203) 
   * 192b87f1c9af9f85e98b48a6406d6b02ca4f7448 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] nsivabalan merged pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
nsivabalan merged PR #5376:
URL: https://github.com/apache/hudi/pull/5376




[GitHub] [hudi] xiarixiaoyao commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1104659123

   @alexeykudinkin 
   Thank you very much for your review. I have addressed all comments and added more tests for the nested rename operation.
   
   With HUDI-3855, we rewrite each old record before writing it to the new parquet file.
   In the schema evolution rename scenario the old parquet file still carries the old column name, so when we rewrite the old record with the new schema, the value stored under the old name is lost, which is a serious problem.
   For example:
   1) The current COW hoodie table has an old parquet file whose schema is: a int, b string.
   2) We rename a -> aa, so the new schema of the hoodie table is: aa int, b string.
   3) We insert new data into the table. During the insert operation we need to read the old records from the old parquet file.
   **Before HUDI-3855**: we read the old record and write it to the new parquet file directly; the rename operation has no effect on it.
   **After HUDI-3855**: before writing the old record we must rewrite it with the new schema. The schema of the old record is a int, b string, but the new schema is aa int, b string. If we force the rewrite, we lose the value of column a, because a does not exist in the new schema.
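
   The rename problem described above can be illustrated with a small, self-contained sketch. It is not the Hudi implementation: records are modelled as plain maps instead of Avro records, and the class name `RenameRewriteSketch`, the method `rewrite`, and the direction of the `renameCols` map (new column name -> old column name) are assumptions made only for this example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RenameRewriteSketch {

    // Rewrites oldRecord into the column layout given by newColumns.
    // renameCols (hypothetical shape) maps a NEW column name to the OLD name
    // it was renamed from. Columns found under neither name become null.
    public static Map<String, Object> rewrite(Map<String, Object> oldRecord,
                                              List<String> newColumns,
                                              Map<String, String> renameCols) {
        Map<String, Object> newRecord = new LinkedHashMap<>();
        for (String newName : newColumns) {
            if (oldRecord.containsKey(newName)) {
                // Same name in both schemas: copy the value as-is.
                newRecord.put(newName, oldRecord.get(newName));
            } else {
                // Name not present in the old record: fall back to the rename mapping.
                String oldName = renameCols.get(newName);
                newRecord.put(newName, oldName == null ? null : oldRecord.get(oldName));
            }
        }
        return newRecord;
    }

    public static void main(String[] args) {
        // Old parquet file schema: a int, b string
        Map<String, Object> oldRecord = new LinkedHashMap<>();
        oldRecord.put("a", 1);
        oldRecord.put("b", "x");
        // Table schema after renaming a -> aa
        List<String> newColumns = List.of("aa", "b");

        // Pure name-based rewrite: the value of a is lost.
        System.out.println(rewrite(oldRecord, newColumns, Map.of()));          // {aa=null, b=x}
        // Rewrite that knows about the rename: the value is carried over.
        System.out.println(rewrite(oldRecord, newColumns, Map.of("aa", "a"))); // {aa=1, b=x}
    }
}
```

   Passing an empty map reproduces the data loss; passing the rename mapping keeps the value, which is the role the extra `renameCols` argument appears to play in the diffs in this thread.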




[GitHub] [hudi] xiarixiaoyao commented on a diff in pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on code in PR #5376:
URL: https://github.com/apache/hudi/pull/5376#discussion_r855809877


##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordReader.java:
##########
@@ -379,7 +380,7 @@ private void processDataBlock(HoodieDataBlock dataBlock, Option<KeySpec> keySpec
       Option<Schema> schemaOption = getMergedSchema(dataBlock);
       while (recordIterator.hasNext()) {
         IndexedRecord currentRecord = recordIterator.next();
-        IndexedRecord record = schemaOption.isPresent() ? HoodieAvroUtils.rewriteRecordWithNewSchema(currentRecord, schemaOption.get()) : currentRecord;
+        IndexedRecord record = schemaOption.isPresent() ? HoodieAvroUtils.rewriteRecordWithNewSchema(currentRecord, schemaOption.get(), new HashMap<>()) : currentRecord;

Review Comment:
   yes, fixed



##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -765,27 +802,41 @@ private static Object rewriteRecordWithNewSchema(Object oldRecord, Schema oldSch
         }
         Collection array = (Collection)oldRecord;
         List<Object> newArray = new ArrayList();
+        fieldNames.push("element");
         for (Object element : array) {
-          newArray.add(rewriteRecordWithNewSchema(element, oldSchema.getElementType(), newSchema.getElementType()));
+          newArray.add(rewriteRecordWithNewSchema(element, oldSchema.getElementType(), newSchema.getElementType(), renameCols, fieldNames));
         }
+        fieldNames.pop();
         return newArray;
       case MAP:
         if (!(oldRecord instanceof Map)) {
           throw new IllegalArgumentException("cannot rewrite record with different type");
         }
         Map<Object, Object> map = (Map<Object, Object>) oldRecord;
         Map<Object, Object> newMap = new HashMap<>();
+        fieldNames.push("value");
         for (Map.Entry<Object, Object> entry : map.entrySet()) {
-          newMap.put(entry.getKey(), rewriteRecordWithNewSchema(entry.getValue(), oldSchema.getValueType(), newSchema.getValueType()));
+          newMap.put(entry.getKey(), rewriteRecordWithNewSchema(entry.getValue(), oldSchema.getValueType(), newSchema.getValueType(), renameCols, fieldNames));
         }
+        fieldNames.pop();
         return newMap;
       case UNION:
-        return rewriteRecordWithNewSchema(oldRecord, getActualSchemaFromUnion(oldSchema, oldRecord), getActualSchemaFromUnion(newSchema, oldRecord));
+        return rewriteRecordWithNewSchema(oldRecord, getActualSchemaFromUnion(oldSchema, oldRecord), getActualSchemaFromUnion(newSchema, oldRecord), renameCols, fieldNames);
       default:
         return rewritePrimaryType(oldRecord, oldSchema, newSchema);
     }
   }
 
+  private static String createFullName(Deque<String> fieldNames) {
+    String result = "";
+    if (!fieldNames.isEmpty()) {
+      List<String> parentNames = new ArrayList<>();
+      fieldNames.descendingIterator().forEachRemaining(parentNames::add);

Review Comment:
   fixed
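
   The field-name stack used in the diff above can be tried out in isolation. The following is a minimal sketch, not the code from the PR: the quoted `createFullName` is cut off after the iterator call, so joining the parts with "." is an assumption, and the class name `FullNameSketch` is made up for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class FullNameSketch {

    // Builds a dotted path from a stack of field names.
    // push() inserts at the head of the deque, so the head is the innermost field;
    // descendingIterator() walks from tail to head, i.e. outermost field first.
    public static String createFullName(Deque<String> fieldNames) {
        List<String> parts = new ArrayList<>();
        fieldNames.descendingIterator().forEachRemaining(parts::add);
        // Assumption: parts are joined with "."; an empty stack yields "".
        return String.join(".", parts);
    }

    public static void main(String[] args) {
        Deque<String> fieldNames = new ArrayDeque<>();
        fieldNames.push("members"); // top-level map column
        fieldNames.push("value");   // pushed while descending into the map values
        fieldNames.push("n");       // field of the nested record
        System.out.println(createFullName(fieldNames)); // members.value.n

        fieldNames.pop();           // leaving the nested field again
        System.out.println(createFullName(fieldNames)); // members.value
    }
}
```

   Pushing before the recursive call and popping afterwards, as the diff does for "element" and "value", keeps the stack in sync with the current position in the schema.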





[GitHub] [hudi] xiarixiaoyao commented on a diff in pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on code in PR #5376:
URL: https://github.com/apache/hudi/pull/5376#discussion_r855810475


##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -405,6 +407,14 @@ public static GenericRecord rewriteRecordWithMetadata(GenericRecord genericRecor
     return newRecord;
   }
 
+  // TODO Unify the logical of rewriteRecordWithMetadata and rewriteEvolutionRecordWithMetadata, and delete this function.
+  public static GenericRecord rewriteEvolutionRecordWithMetadata(GenericRecord genericRecord, Schema newSchema, String fileName) {
+    GenericRecord newRecord = HoodieAvroUtils.rewriteRecordWithNewSchema(genericRecord, newSchema, new HashMap<>());

Review Comment:
   fixed



##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -405,6 +407,14 @@ public static GenericRecord rewriteRecordWithMetadata(GenericRecord genericRecor
     return newRecord;
   }
 
+  // TODO Unify the logical of rewriteRecordWithMetadata and rewriteEvolutionRecordWithMetadata, and delete this function.
+  public static GenericRecord rewriteEvolutionRecordWithMetadata(GenericRecord genericRecord, Schema newSchema, String fileName) {
+    GenericRecord newRecord = HoodieAvroUtils.rewriteRecordWithNewSchema(genericRecord, newSchema, new HashMap<>());

Review Comment:
   fixed





[GitHub] [hudi] xiarixiaoyao commented on a diff in pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on code in PR #5376:
URL: https://github.com/apache/hudi/pull/5376#discussion_r855809616


##########
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestSpark3DDL.scala:
##########
@@ -445,28 +445,19 @@ class TestSpark3DDL extends TestHoodieSqlBase {
             Seq(null),
             Seq(Map("t1" -> 10.0d))
           )
+          spark.sql(s"alter table ${tableName} rename column members to mem")

Review Comment:
   thanks, already added



##########
hudi-common/src/test/java/org/apache/hudi/internal/schema/utils/TestAvroSchemaEvolutionUtils.java:
##########
@@ -349,7 +349,7 @@ public void testReWriteNestRecord() {
     );
 
     Schema newAvroSchema = AvroInternalSchemaConverter.convert(newRecord, schema.getName());
-    GenericRecord newAvroRecord = HoodieAvroUtils.rewriteRecordWithNewSchema(avroRecord, newAvroSchema);
+    GenericRecord newAvroRecord = HoodieAvroUtils.rewriteRecordWithNewSchema(avroRecord, newAvroSchema, new HashMap<>());

Review Comment:
   fixed





[GitHub] [hudi] xiarixiaoyao commented on a diff in pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on code in PR #5376:
URL: https://github.com/apache/hudi/pull/5376#discussion_r855310604


##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -405,6 +407,18 @@ public static GenericRecord rewriteRecordWithMetadata(GenericRecord genericRecor
     return newRecord;
   }
 
+  // TODO Unify the logical of rewriteRecordWithMetadata and rewriteEvolutionRecordWithMetadata, and delete this function.
+  public static GenericRecord rewriteEvolutionRecordWithMetadata(GenericRecord genericRecord, Schema newSchema, String fileName) {
+    GenericRecord newRecord = HoodieAvroUtils.rewriteRecordWithNewSchema(genericRecord, newSchema, new HashMap<>());
+    // do not preserve FILENAME_METADATA_FIELD
+    newRecord.put(HoodieRecord.FILENAME_METADATA_FIELD_POS, fileName);
+    if (!GenericData.get().validate(newSchema, newRecord)) {

Review Comment:
   This function is copied from rewriteRecordWithMetadata and is only used for schema evolution. Maybe we can remove this check directly; we would also need to do the same for rewriteRecordWithMetadata.





[GitHub] [hudi] hudi-bot commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1105430123

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a59523f9d730bb0b869927829faee66fdfb9491a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8174",
       "triggerID" : "a59523f9d730bb0b869927829faee66fdfb9491a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9cc64da84133155264c4c56e4fe3341a82d6b915",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8176",
       "triggerID" : "9cc64da84133155264c4c56e4fe3341a82d6b915",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3b28446fbfe17d37baced7f46fd4b7373c1df539",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8185",
       "triggerID" : "3b28446fbfe17d37baced7f46fd4b7373c1df539",
       "triggerType" : "PUSH"
     }, {
       "hash" : "42dedcf6eaf051d131fc69619cc70e638e8bf5e6",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8187",
       "triggerID" : "42dedcf6eaf051d131fc69619cc70e638e8bf5e6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cbdc49842258501c1888e5cc247c70fadfde20b7",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8203",
       "triggerID" : "cbdc49842258501c1888e5cc247c70fadfde20b7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 42dedcf6eaf051d131fc69619cc70e638e8bf5e6 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8187) 
   * cbdc49842258501c1888e5cc247c70fadfde20b7 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8203) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] xiarixiaoyao commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1103906742

   @alexeykudinkin @bvaradar @xushiyan 
   Could you please help review this PR? Thanks.




[GitHub] [hudi] hudi-bot commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1104662018

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a59523f9d730bb0b869927829faee66fdfb9491a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8174",
       "triggerID" : "a59523f9d730bb0b869927829faee66fdfb9491a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9cc64da84133155264c4c56e4fe3341a82d6b915",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8176",
       "triggerID" : "9cc64da84133155264c4c56e4fe3341a82d6b915",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3b28446fbfe17d37baced7f46fd4b7373c1df539",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8185",
       "triggerID" : "3b28446fbfe17d37baced7f46fd4b7373c1df539",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 9cc64da84133155264c4c56e4fe3341a82d6b915 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8176) 
   * 3b28446fbfe17d37baced7f46fd4b7373c1df539 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8185) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1104663698

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a59523f9d730bb0b869927829faee66fdfb9491a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8174",
       "triggerID" : "a59523f9d730bb0b869927829faee66fdfb9491a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9cc64da84133155264c4c56e4fe3341a82d6b915",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8176",
       "triggerID" : "9cc64da84133155264c4c56e4fe3341a82d6b915",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3b28446fbfe17d37baced7f46fd4b7373c1df539",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8185",
       "triggerID" : "3b28446fbfe17d37baced7f46fd4b7373c1df539",
       "triggerType" : "PUSH"
     }, {
       "hash" : "42dedcf6eaf051d131fc69619cc70e638e8bf5e6",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "42dedcf6eaf051d131fc69619cc70e638e8bf5e6",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 9cc64da84133155264c4c56e4fe3341a82d6b915 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8176) 
   * 3b28446fbfe17d37baced7f46fd4b7373c1df539 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8185) 
   * 42dedcf6eaf051d131fc69619cc70e638e8bf5e6 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] xiarixiaoyao commented on a diff in pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on code in PR #5376:
URL: https://github.com/apache/hudi/pull/5376#discussion_r854081678


##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -431,6 +431,18 @@ public static GenericRecord rewriteRecordWithMetadata(GenericRecord genericRecor
     return newRecord;
   }
 
+  // TODO Unify the logical of rewriteRecordWithMetadata and rewriteEvolutionRecordWithMetadata, and delete this function.
+  public static GenericRecord rewriteEvolutionRecordWithMetadata(GenericRecord genericRecord, Schema newSchema, String fileName) {
+    GenericRecord newRecord = HoodieAvroUtils.rewriteRecordWithNewSchema(genericRecord, newSchema, new HashMap<>());
+    // do not preserve FILENAME_METADATA_FIELD

Review Comment:
   This is just copied from rewriteRecordWithMetadata,
   but I do not know why we need to rewrite genericRecord; it costs some time.
   Can we modify genericRecord directly, for example with genericRecord.put(HoodieRecord.FILENAME_METADATA_FIELD_POS, fileName)?





[GitHub] [hudi] xiarixiaoyao commented on a diff in pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on code in PR #5376:
URL: https://github.com/apache/hudi/pull/5376#discussion_r854739036


##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -431,6 +431,18 @@ public static GenericRecord rewriteRecordWithMetadata(GenericRecord genericRecor
     return newRecord;
   }
 
+  // TODO Unify the logical of rewriteRecordWithMetadata and rewriteEvolutionRecordWithMetadata, and delete this function.
+  public static GenericRecord rewriteEvolutionRecordWithMetadata(GenericRecord genericRecord, Schema newSchema, String fileName) {
+    GenericRecord newRecord = HoodieAvroUtils.rewriteRecordWithNewSchema(genericRecord, newSchema, new HashMap<>());
+    // do not preserve FILENAME_METADATA_FIELD

Review Comment:
   Thanks. After this operation we write the parquet files directly, so the life cycle of genericRecord has come to an end. I think we can try to make these records mutable at this point. Let me try this in the next version.



##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieWriteHandle.java:
##########
@@ -98,6 +101,8 @@
   protected final String fileId;
   protected final String writeToken;
   protected final TaskContextSupplier taskContextSupplier;
+  // For full schema evolution
+  protected final boolean schemaOnReadEnable;

Review Comment:
   fixed





[GitHub] [hudi] hudi-bot commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1104740960

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a59523f9d730bb0b869927829faee66fdfb9491a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8174",
       "triggerID" : "a59523f9d730bb0b869927829faee66fdfb9491a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9cc64da84133155264c4c56e4fe3341a82d6b915",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8176",
       "triggerID" : "9cc64da84133155264c4c56e4fe3341a82d6b915",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3b28446fbfe17d37baced7f46fd4b7373c1df539",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8185",
       "triggerID" : "3b28446fbfe17d37baced7f46fd4b7373c1df539",
       "triggerType" : "PUSH"
     }, {
       "hash" : "42dedcf6eaf051d131fc69619cc70e638e8bf5e6",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8187",
       "triggerID" : "42dedcf6eaf051d131fc69619cc70e638e8bf5e6",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 42dedcf6eaf051d131fc69619cc70e638e8bf5e6 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8187) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] xiarixiaoyao commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1105433325

   @nsivabalan @alexeykudinkin could you please review again?




[GitHub] [hudi] hudi-bot commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1105502224

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a59523f9d730bb0b869927829faee66fdfb9491a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8174",
       "triggerID" : "a59523f9d730bb0b869927829faee66fdfb9491a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9cc64da84133155264c4c56e4fe3341a82d6b915",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8176",
       "triggerID" : "9cc64da84133155264c4c56e4fe3341a82d6b915",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3b28446fbfe17d37baced7f46fd4b7373c1df539",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8185",
       "triggerID" : "3b28446fbfe17d37baced7f46fd4b7373c1df539",
       "triggerType" : "PUSH"
     }, {
       "hash" : "42dedcf6eaf051d131fc69619cc70e638e8bf5e6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8187",
       "triggerID" : "42dedcf6eaf051d131fc69619cc70e638e8bf5e6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cbdc49842258501c1888e5cc247c70fadfde20b7",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8203",
       "triggerID" : "cbdc49842258501c1888e5cc247c70fadfde20b7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * cbdc49842258501c1888e5cc247c70fadfde20b7 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8203) 



[GitHub] [hudi] nsivabalan commented on a diff in pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on code in PR #5376:
URL: https://github.com/apache/hudi/pull/5376#discussion_r855613640


##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -741,10 +769,23 @@ private static Object rewriteRecordWithNewSchema(Object oldRecord, Schema oldSch
 
         for (int i = 0; i < fields.size(); i++) {
           Schema.Field field = fields.get(i);
+          String fieldName = field.name();
+          fieldNames.push(fieldName);
           if (oldSchema.getField(field.name()) != null) {
             Schema.Field oldField = oldSchema.getField(field.name());
-            helper.put(i, rewriteRecordWithNewSchema(indexedRecord.get(oldField.pos()), oldField.schema(), fields.get(i).schema()));
+            helper.put(i, rewriteRecordWithNewSchema(indexedRecord.get(oldField.pos()), oldField.schema(), fields.get(i).schema(), renameCols, fieldNames));

Review Comment:
   thanks!




[GitHub] [hudi] nsivabalan commented on pull request #5376: [HUDI-3921] Fixed schema evolution cannot work with HUDI-3855

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on PR #5376:
URL: https://github.com/apache/hudi/pull/5376#issuecomment-1105821432

   @xiarixiaoyao: please address the feedback in a follow-up PR. I am going ahead and landing this.

