Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/04/20 15:34:27 UTC

[GitHub] [incubator-hudi] bvaradar commented on issue #1512: [HUDI-763] Add hoodie.table.base.file.format option to hoodie.properties file

bvaradar commented on issue #1512:
URL: https://github.com/apache/incubator-hudi/pull/1512#issuecomment-616631319


   @lamber-ken : Storing per-partition metadata in hoodie.properties won't work, as we are not versioning hoodie.properties. There are no atomicity guarantees for writers across different cloud storages.
   
   I think supporting different file formats within a table is not a priority, but if we have to do it, we can instead store the format in .hoodie_partition_metadata the first time we create it for each partition. Hudi Hive Sync would then need to read this for each new partition being added to Hive in order to register the correct input format for that partition. I am not sure how Spark, Presto and Impala would work in this case. We need to evaluate that before venturing into supporting this.
   
   Coming back to the original objective of this PR: since hoodie.properties is effectively write-once, we can use it to store the default file format of the hoodie table. We should set the table's file format once, and only as part of HoodieTableMetaClient.initTableType. To support existing tables, we can keep the default storage layout as "PARQUET"; if the setting is not present in hoodie.properties (for already created tables), we should use "PARQUET" as the default.
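
   The fallback described above could look roughly like the sketch below. It is written in Python only to illustrate the logic and is not Hudi code: the helper names (parse_properties, resolve_base_file_format) are hypothetical, the key name is taken from the PR title, and "orc" is used purely as an example of a non-default value.

```python
# Minimal sketch of the "default to PARQUET when the key is absent" rule.
# Nothing here is Hudi's actual API; all function names are hypothetical.

BASE_FILE_FORMAT_KEY = "hoodie.table.base.file.format"  # key name from the PR title
DEFAULT_BASE_FILE_FORMAT = "PARQUET"


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and comments."""
    props = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or line.startswith("!"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # ignore lines without a key=value separator
        props[key.strip()] = value.strip()
    return props


def resolve_base_file_format(props):
    """Return the table's base file format, falling back to PARQUET.

    Tables created before the setting existed have no such key in
    hoodie.properties, so they resolve to the default.
    """
    return props.get(BASE_FILE_FORMAT_KEY, DEFAULT_BASE_FILE_FORMAT).upper()


# An already created table: the key is missing, so the default applies.
old_table = parse_properties(
    "hoodie.table.name=trips\n"
    "hoodie.table.type=COPY_ON_WRITE\n"
)

# A newly initialized table: the format was written once at creation time.
new_table = parse_properties(
    "hoodie.table.name=trips\n"
    "hoodie.table.type=COPY_ON_WRITE\n"
    "hoodie.table.base.file.format=orc\n"
)

print(resolve_base_file_format(old_table))
print(resolve_base_file_format(new_table))
```

   Because the value is only written at table creation, readers never need to handle a format that changes mid-table; they only need to handle the key being absent.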


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org