Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/05/07 05:51:00 UTC

[GitHub] [hudi] sbernauer edited a comment on pull request #2012: [HUDI-1129] Deltastreamer Add support for schema evolution

sbernauer edited a comment on pull request #2012:
URL: https://github.com/apache/hudi/pull/2012#issuecomment-834085407


   Hi all,
   
   Sadly, we have been unable to do schema evolution for 10 months now (https://github.com/apache/hudi/issues/1845) and have had to rely on ugly workarounds.
   Many thanks for working together to find a solution!
   We tested this patch in our test systems and everything worked fine. When we rolled it out to production, however, we noticed that memory consumption increased several-fold. This caused our executors to spill to disk and crash, and we had to roll back to a previous version.
   So I would like to highlight this comment from @sathyaprakashg:
   > @n3nash I am working on fixing the build issue and will have that fix pushed soon. I would like to point out that with this new approach, we are storing the writer schema as part of the payload, which means the size of the dataframe would increase to store the same schema information with each record. Any suggestions on optimizing this?
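
To make the concern in the quoted comment concrete, here is a rough sketch of how carrying the schema with every record inflates the total size compared with storing it once. It is not Hudi code: the schema, the `_writer_schema` key, and the JSON serialization are all assumptions chosen purely for illustration, and the real overhead depends on how the payload is actually serialized.

```python
import json

# Hypothetical Avro-style writer schema; the field names are made up for illustration.
WRITER_SCHEMA = json.dumps({
    "type": "record",
    "name": "Example",
    "fields": [{"name": f"field_{i}", "type": "string"} for i in range(50)],
})


def payload_bytes(record: dict, embed_schema: bool) -> int:
    """Return the serialized size of one record, optionally carrying the schema."""
    body = dict(record)
    if embed_schema:
        # Assumed layout: the schema string travels inside every payload.
        body["_writer_schema"] = WRITER_SCHEMA
    return len(json.dumps(body).encode("utf-8"))


def total_bytes(records: list, embed_schema: bool) -> int:
    """Sum payload sizes; without embedding, the schema is counted exactly once."""
    total = sum(payload_bytes(r, embed_schema) for r in records)
    if not embed_schema:
        total += len(WRITER_SCHEMA.encode("utf-8"))
    return total


records = [{"id": i, "value": "x" * 20} for i in range(1000)]
with_schema = total_bytes(records, embed_schema=True)
shared_schema = total_bytes(records, embed_schema=False)
print(f"schema per record: {with_schema} bytes, schema stored once: {shared_schema} bytes")
```

In this toy setup the per-record overhead is at least the length of the schema string, so the gap grows linearly with the number of records, which would be consistent with the memory increase we observed.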
   
   Regards,
   Sebastian


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org