Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/05/08 17:00:46 UTC

[GitHub] [hudi] nsivabalan edited a comment on pull request #2012: [HUDI-1129] Deltastreamer Add support for schema evolution

nsivabalan edited a comment on pull request #2012:
URL: https://github.com/apache/hudi/pull/2012#issuecomment-835429596


   Yes, thanks for clarifying. I guess embedding the schema in every payload might be detrimental, as you have experienced. So I have thought of a different approach: regenerate records with the new schema at the Spark datasource layer. Only a batch that is ingested with the old schema after the table's schema has evolved will take a hit from this conversion.
   
   https://github.com/apache/hudi/pull/2927
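
To illustrate the idea of regenerating records against the evolved schema, here is a minimal sketch in plain Python. It is not the code from the PR above and does not use Hudi, Spark, or Avro APIs; the schema representation (a list of field dicts with an optional `default`) and the names `rewrite_record` and `rewrite_batch` are hypothetical, chosen only to show that batches already written with the table's schema skip the conversion, while batches with the old schema get missing fields filled from defaults.

```python
from typing import Any, Dict, List

# Hypothetical, simplified schemas: an ordered list of fields.
# A field that was added during evolution carries a "default".
OLD_SCHEMA = [{"name": "id"}, {"name": "name"}]
NEW_SCHEMA = [{"name": "id"}, {"name": "name"}, {"name": "email", "default": None}]


def field_names(schema: List[Dict[str, Any]]) -> List[str]:
    """Return the field names of a schema in declaration order."""
    return [field["name"] for field in schema]


def rewrite_record(record: Dict[str, Any], new_schema: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Rebuild one record so that it conforms to new_schema.

    Existing values are copied over; fields missing from the record are
    filled from the field's default. A missing field without a default
    cannot be resolved and raises an error.
    """
    rewritten = {}
    for field in new_schema:
        name = field["name"]
        if name in record:
            rewritten[name] = record[name]
        elif "default" in field:
            rewritten[name] = field["default"]
        else:
            raise ValueError(f"field '{name}' is missing and has no default")
    return rewritten


def rewrite_batch(
    records: List[Dict[str, Any]],
    batch_schema: List[Dict[str, Any]],
    table_schema: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Convert a batch only when its schema lags behind the table schema."""
    if field_names(batch_schema) == field_names(table_schema):
        # Batch already matches the table schema: no conversion cost.
        return records
    return [rewrite_record(record, table_schema) for record in records]
```

In this sketch, a batch produced with `OLD_SCHEMA` after the table moved to `NEW_SCHEMA` is the only one that pays for the per-record rewrite; a batch produced with `NEW_SCHEMA` is returned unchanged.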
   
   Also, as I mentioned earlier, if others (@n3nash, @bvaradar) confirm that the schema post processor is not required as a mandatory step with this [fix](https://github.com/apache/hudi/pull/2765) for default values, we don't need any changes in the delta streamer as such; https://github.com/apache/hudi/pull/2927 alone would suffice.
   
   @n3nash is doing more testing around this as well, so I will wait for him to comment on the patch.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org