Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/10/12 22:59:02 UTC

[GitHub] [iceberg] aokolnychyi opened a new issue #1590: Spark SQL Extensions: SNAPSHOT command

aokolnychyi opened a new issue #1590:
URL: https://github.com/apache/iceberg/issues/1590


   If you have an existing Spark or Hive table that complies with the standard Hive table format, you should be able to use the SNAPSHOT command to safely test out your workloads on top of Iceberg without affecting the original table. The SNAPSHOT command should use the definition (including the schema and partition spec) of the original table to create a new Iceberg table that contains metadata for files currently present in the original table.
   
   The SNAPSHOT command should accept an optional location for the new Iceberg table. In addition, the command must validate that the Iceberg table location, as well as the data and metadata locations, are different from the original Hive table location to ensure that the Hive table will be unaffected.
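The location check described above could look roughly like the following. This is only a sketch under stated assumptions, not the proposed implementation: the function names are hypothetical, locations are compared as normalized URIs, and a location nested inside the source table directory is rejected as well, which is slightly stricter than the "different from" wording in this issue.

```python
import posixpath
from typing import Tuple
from urllib.parse import urlparse


def _normalize(location: str) -> Tuple[str, str, str]:
    """Split a location into (scheme, authority, normalized path)."""
    parsed = urlparse(location)
    # normpath drops trailing slashes and collapses "." and ".." segments.
    return parsed.scheme, parsed.netloc, posixpath.normpath(parsed.path or "/")


def is_same_or_inside(candidate: str, source: str) -> bool:
    """True if candidate equals source or lies somewhere under it."""
    c_scheme, c_auth, c_path = _normalize(candidate)
    s_scheme, s_auth, s_path = _normalize(source)
    if (c_scheme, c_auth) != (s_scheme, s_auth):
        return False
    # Append "/" so that ".../src2" is not treated as being inside ".../src".
    return c_path == s_path or c_path.startswith(s_path.rstrip("/") + "/")


def validate_snapshot_locations(source_location: str, table_location: str,
                                data_location: str, metadata_location: str) -> None:
    """Hypothetical helper: raise if any Iceberg location touches the source table."""
    candidates = (
        ("table", table_location),
        ("data", data_location),
        ("metadata", metadata_location),
    )
    for name, location in candidates:
        if is_same_or_inside(location, source_location):
            raise ValueError(
                f"Iceberg {name} location {location!r} overlaps "
                f"the source table location {source_location!r}"
            )
```

For example, `validate_snapshot_locations("hdfs://nn/db/src", "hdfs://nn/db/tgt", "hdfs://nn/db/tgt/data", "hdfs://nn/db/tgt/metadata")` passes, while pointing any of the three locations at `hdfs://nn/db/src` or a directory below it raises `ValueError`.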
   
   You should be able to read from and write to the created Iceberg table. New files will be written to the isolated location. Subsequent changes to the original Hive table will not be propagated to the Iceberg table.
   
   ```
   SNAPSHOT TABLE source AS target
   USING iceberg
   [LOCATION 'iceberg_table_location']
   [TBLPROPERTIES ('key' 'value')]
   ```
   
   For now, SNAPSHOT can be limited to generating metadata for existing files in already supported formats (e.g. Avro, Parquet, ORC). In the future, we can also consider rewriting files in unsupported formats such as CSV or JSON. Source tables can be Iceberg tables too. For example, someone may want to snapshot a prod table and use it for testing.
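The format restriction above amounts to a simple gate before any metadata is generated. A minimal sketch, assuming a hypothetical `plan_snapshot` helper and a format name taken from the source table definition; neither name comes from this issue:

```python
# Formats named in this issue as already supported for metadata-only snapshots.
SUPPORTED_FORMATS = frozenset({"avro", "parquet", "orc"})


def plan_snapshot(source_format: str) -> str:
    """Hypothetical helper: decide how a source table could be snapshotted."""
    fmt = source_format.strip().lower()
    if fmt in SUPPORTED_FORMATS:
        # Existing data files are referenced as-is; only metadata is written.
        return "metadata-only"
    # Rewriting formats such as CSV or JSON is described as possible future work.
    raise ValueError(f"Cannot snapshot a table stored as {source_format!r}")
```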
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on issue #1590: Spark SQL Extensions: SNAPSHOT command

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on issue #1590:
URL: https://github.com/apache/iceberg/issues/1590#issuecomment-707386346


   @RussellSpitzer is working on this.


[GitHub] [iceberg] aokolnychyi closed issue #1590: Spark SQL Extensions: SNAPSHOT command

Posted by GitBox <gi...@apache.org>.
aokolnychyi closed issue #1590:
URL: https://github.com/apache/iceberg/issues/1590


   


[GitHub] [iceberg] aokolnychyi commented on issue #1590: Spark SQL Extensions: SNAPSHOT command

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on issue #1590:
URL: https://github.com/apache/iceberg/issues/1590#issuecomment-754620092


   This was done in PR #1906.

