Posted to commits@hudi.apache.org by "danny0405 (via GitHub)" <gi...@apache.org> on 2023/01/25 07:55:10 UTC

[GitHub] [hudi] danny0405 commented on pull request #7307: [HUDI-5271] fix issue inconsistent reader and writer schema in HoodieAvroDataBlock

danny0405 commented on PR #7307:
URL: https://github.com/apache/hudi/pull/7307#issuecomment-1403222206

   > Hi @alexeykudinkin
   > 
   > Before rebasing, there is one thing I want to check with you.
   > 
   > Last week, a similar exception involving the Avro schema namespace was reported in #7691.
   > 
   > In that ticket, @danny0405 mentioned that the Flink side uses a constant namespace, "record": [#7691 (comment)](https://github.com/apache/hudi/issues/7691#issuecomment-1386486937).
   > 
   > On the Spark side, I found that the writer schema uses the namespace pattern `"namespace": "hoodie.test_mor_tab"` (where `test_mor_tab` is the Hudi table name), while the reader schema uses a constant `"name": "Record"`: [#7284 (comment)](https://github.com/apache/hudi/issues/7284#issuecomment-1324899843).
   > 
   > May I ask which one we should follow? I think we need to keep it consistent between Spark and Flink.
   
   If the namespace check is Avro behavior and there is no way to work around it, I'm afraid we must unify the Avro schema namespaces across the reader and writer schemas. Does the `hoodie.table_name` namespace make any sense here? How about we all use the constant `record` as the namespace?
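
For illustration only: the Avro specification defines a named type's full name as its namespace joined to its name with a dot, and a name that already contains a dot is treated as a full name. The sketch below is not Avro's resolver and makes no claim about where Avro compares full names; it only shows how the namespaces discussed above yield different full names for otherwise identical records. The helper `avro_full_name` and the three record schemas (including the record name `Record` on the writer side) are hypothetical, chosen to mirror the values quoted in this thread.

```python
def avro_full_name(schema: dict) -> str:
    """Return the full name of a named Avro schema given as a dict."""
    name = schema["name"]
    # Per the Avro spec, a dotted name is already a full name and any
    # separate "namespace" attribute is ignored.
    if "." in name:
        return name
    namespace = schema.get("namespace")
    return f"{namespace}.{name}" if namespace else name


# Hypothetical schemas: same record name, different namespaces.
table_scoped_writer = {
    "type": "record",
    "name": "Record",
    "namespace": "hoodie.test_mor_tab",  # per-table namespace pattern
    "fields": [],
}
reader_without_namespace = {"type": "record", "name": "Record", "fields": []}
constant_namespace = {
    "type": "record",
    "name": "Record",
    "namespace": "record",  # the constant namespace proposed above
    "fields": [],
}

for schema in (table_scoped_writer, reader_without_namespace, constant_namespace):
    print(avro_full_name(schema))
```

With a single constant namespace, a reader and writer schema built for the same record share one full name regardless of the table name, which is the consistency the proposal is after.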

