Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/09/17 04:50:55 UTC

[GitHub] [hudi] xiarixiaoyao edited a comment on pull request #3668: [RFC-33] [HUDI-2429][WIP] Full schema evolution

xiarixiaoyao edited a comment on pull request #3668:
URL: https://github.com/apache/hudi/pull/3668#issuecomment-921467046


   @codope thanks for your review. Let me answer some of your questions first. Please forgive me for being busy today; I need to modify something for the zorder PR.
   **Let's add docs for all the public classes and APIs.**
   OK, I will add them.
   **Does the merge schema action handle evolution of non-leaf fields in nested fields? For example, if a.b.c is renamed to a.d.c.**
   This is an unusual requirement. If you want to change a.b.c to a.d.c, why not use spark.sql(s"alter table ${tableName} rename column a.b to d")? We support evolution of all non-leaf fields.
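
To illustrate why renaming the non-leaf field a.b to d is enough to turn a.b.c into a.d.c, here is a minimal sketch of an id-based schema. It assumes each field carries a stable id and that readers resolve fields by id rather than by name. The types `Field`, `StructType`, and the helpers `rename` and `fullNames` are hypothetical and are not the classes in this PR.

```scala
// Illustrative model only: these types are hypothetical, not Hudi classes.
sealed trait FieldType
case object IntType extends FieldType
final case class StructType(fields: List[Field]) extends FieldType
final case class Field(id: Int, name: String, tpe: FieldType)

object RenameSketch {
  // Rename the field reached by `path`; field ids are left untouched.
  def rename(fields: List[Field], path: List[String], newName: String): List[Field] =
    path match {
      case Nil => fields
      case last :: Nil =>
        fields.map(f => if (f.name == last) f.copy(name = newName) else f)
      case head :: rest =>
        fields.map {
          case f @ Field(_, `head`, StructType(children)) =>
            f.copy(tpe = StructType(rename(children, rest, newName)))
          case other => other
        }
    }

  // Map every field id to its current full name.
  def fullNames(fields: List[Field], prefix: String = ""): Map[Int, String] =
    fields.flatMap { f =>
      val full = if (prefix.isEmpty) f.name else s"$prefix.${f.name}"
      val nested = f.tpe match {
        case StructType(children) => fullNames(children, full)
        case _                    => Map.empty[Int, String]
      }
      nested + (f.id -> full)
    }.toMap

  // Schema with the nested column a.b.c, using ids 1, 2, 3.
  val before: List[Field] = List(
    Field(1, "a", StructType(List(
      Field(2, "b", StructType(List(Field(3, "c", IntType))))))))

  // Equivalent of "rename column a.b to d" in this toy model.
  val after: List[Field] = rename(before, List("a", "b"), "d")
}
```

In this sketch, field id 3 is named a.b.c in `before` and a.d.c in `after`, while the set of ids stays the same, so data written under the old name can still be matched to the renamed column.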
   
   **IIUC, the patch has not yet handled old or existing schema compatibility as mentioned in the RFC right?**
   This situation is already handled. Please see TestSpark3DDL: we first create a table and insert some data into it; at that point no id-schema has been produced. Then we perform a schema change, and the id-schema is produced. The test result confirms this.
   
   **Since schema history is being changed not only at the write time but also at the read time, so we need to think of both writer and reader concurrency.**
   I am a little doubtful. Hudi is snapshot isolated, so maybe we only need to deal with concurrency between writers. If I am wrong, please correct me. Thanks.
   
   
   Now for the most important question: I am now inclined to use the metadata table to store the historical schema. As we know, the metadata table uses HFile to store data, and HFile has very good point-query performance. What do you suggest?
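
To make the access pattern concrete, here is a rough in-memory sketch of the kind of point query I have in mind. It assumes the history is keyed by a fixed-width commit instant string and that a reader wants the schema in effect at a given instant. `SchemaHistorySketch`, `record`, and `schemaAt` are hypothetical names, and a `java.util.TreeMap` stands in for the sorted key lookup; this is not the metadata table or HFile API.

```scala
import java.util.{TreeMap => JTreeMap}

// Illustrative only: an in-memory stand-in for a sorted, keyed schema history.
object SchemaHistorySketch {
  // Assumption: key is a fixed-width commit instant, so string order equals time order.
  private val history = new JTreeMap[String, String]()

  // Record the schema that takes effect from the given commit instant.
  def record(instant: String, schemaJson: String): Unit = {
    history.put(instant, schemaJson)
    ()
  }

  // Point query: the schema recorded at the latest instant <= the requested one.
  def schemaAt(instant: String): Option[String] =
    Option(history.floorEntry(instant)).map(_.getValue)
}
```

A reader holding a file written at some commit would call `schemaAt` with that commit's instant and get back the schema version it was written with, or `None` if the instant predates any recorded schema.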
   

