Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/01/04 18:35:45 UTC

[GitHub] [hudi] n3nash commented on pull request #2334: [HUDI-1453] Throw Exception when input data schema is not equal to th…

n3nash commented on pull request #2334:
URL: https://github.com/apache/hudi/pull/2334#issuecomment-754142222


   @pengzhiwei2018 @nsivabalan Is there a different way to resolve this issue? The writer/reader schema is baked into many parts of the code, even on the reader side; see an example here: https://github.com/apache/hudi/blob/master/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/RealtimeCompactedRecordReader.java#L70
   
   Replacing the concept of writer with input requires more changes in the code, and it also requires us to understand what the readerSchema means in this scenario.
   
   I'm OK if you want to introduce another transient field called `tableSchema` and compare the writer schema (in your case, inputSchema) against the tableSchema to throw the exception, but keep the concepts of writer/reader intact. If you want to propose changing them, I think a one-pager on how the writer/reader schemas map to your newly suggested scheme (input/table) would help us reason about schemas going forward and avoid regressions.
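
A minimal sketch of the `tableSchema` suggestion above, under stated assumptions: schemas are modeled as plain field-name-to-type maps rather than Avro schemas, the check is strict equality (ignoring field order), and `SchemaCheckSketch`, `WriteHandleSketch`, `SchemaMismatchException`, and the `schema` helper are hypothetical names, not existing Hudi classes. It only illustrates validating the writer schema against a transient table schema while leaving the writer/reader fields untouched.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SchemaCheckSketch {

    /** Hypothetical exception type; not an existing Hudi class. */
    static final class SchemaMismatchException extends RuntimeException {
        SchemaMismatchException(String message) {
            super(message);
        }
    }

    /** Hypothetical handle that keeps writer/reader schemas and adds a transient tableSchema. */
    static final class WriteHandleSketch {
        final Map<String, String> writerSchema;           // schema of the incoming (input) records
        final Map<String, String> readerSchema;           // existing reader concept, left intact
        final transient Map<String, String> tableSchema;  // used only for validation, not serialized

        WriteHandleSketch(Map<String, String> writerSchema,
                          Map<String, String> readerSchema,
                          Map<String, String> tableSchema) {
            // Assumption: "not equal" means the field-name-to-type maps differ.
            if (!writerSchema.equals(tableSchema)) {
                throw new SchemaMismatchException(
                    "Input schema " + writerSchema + " does not match table schema " + tableSchema);
            }
            this.writerSchema = writerSchema;
            this.readerSchema = readerSchema;
            this.tableSchema = tableSchema;
        }
    }

    /** Builds a simplified schema from alternating field-name and type arguments. */
    static Map<String, String> schema(String... nameTypePairs) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i + 1 < nameTypePairs.length; i += 2) {
            fields.put(nameTypePairs[i], nameTypePairs[i + 1]);
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> table = schema("id", "long", "name", "string");

        // Matching input schema: construction succeeds.
        new WriteHandleSketch(schema("id", "long", "name", "string"), table, table);

        // Input schema missing a field: construction is rejected.
        try {
            new WriteHandleSketch(schema("id", "long"), table, table);
        } catch (SchemaMismatchException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Whether the real check should be strict equality or a compatibility check is left open here; the sketch only shows where such a comparison could sit without renaming the writer/reader concepts.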

