Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/01/13 13:59:48 UTC

[GitHub] [iceberg] rymurr commented on issue #2068: Procedure for adding files to a Table

rymurr commented on issue #2068:
URL: https://github.com/apache/iceberg/issues/2068#issuecomment-759466345


   I agree with @electrum that a huge strength of Iceberg is its strong specification, and I would be reluctant to do anything that could weaken that or give users a footgun with respect to the strength of the metadata. Likewise, I agree that reading the files is probably required to do this safely.
   
   However, we are currently working on our own version of the 'make these files an Iceberg table' function. It sounds like several of these already exist in other places. Judging from the comments, this is being driven by the desire to avoid copying files around, especially on S3. Our use case is the same, and the conversion to Iceberg will be done primarily by users (as opposed to a super-user).
   
   I would be interested in helping to derive and implement a spec that defines the canonical Iceberg approach to importing a set of files without moving, copying, or renaming them (I guess this is similar to the MIGRATE Spark action?). At the very least, this will likely have to read the file footers and verify the schema and partition spec. I see a lot of value in this function working (at least partially) across engines so that all users and systems can take advantage of it.
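
As a rough illustration of the "read the footers and verify the schema and partition spec" step, here is a minimal, engine-agnostic sketch in Python. Everything in it is an assumption made for illustration: `FileFooter`, `partition_values`, and `validate_import` are hypothetical names rather than Iceberg APIs, the footer contents are passed in as plain dicts instead of being read from real Parquet/ORC/Avro files, and partition values are assumed to come from Hive-style `key=value` path segments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileFooter:
    """Hypothetical summary of what a real footer read would return."""
    path: str
    columns: dict  # column name -> type name, as recorded in the file footer


def partition_values(path):
    """Extract Hive-style key=value segments from a path (an assumed layout)."""
    values = {}
    # Skip the last segment, which is the file name itself.
    for segment in path.split("/")[:-1]:
        key, sep, value = segment.partition("=")
        if sep and key:
            values[key] = value
    return values


def validate_import(table_columns, partition_fields, footers):
    """Return a list of problems; an empty list means no mismatch was found."""
    problems = []
    for footer in footers:
        # Every column in the file must exist in the table with the same type.
        for name, type_name in footer.columns.items():
            expected = table_columns.get(name)
            if expected is None:
                problems.append(f"{footer.path}: unknown column {name!r}")
            elif expected != type_name:
                problems.append(
                    f"{footer.path}: column {name!r} is {type_name}, expected {expected}"
                )
        # Every partition field must be derivable from the file's path.
        found = partition_values(footer.path)
        for field in partition_fields:
            if field not in found:
                problems.append(f"{footer.path}: missing partition value for {field!r}")
    return problems
```

The files are never moved, copied, or renamed here; only their metadata is inspected, and a file would be registered only if `validate_import` returns no problems. A real implementation would also have to deal with schema evolution, type promotion, and partition transforms, which this sketch ignores.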
   
   cc @vvellanki

