Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/12/03 08:47:25 UTC

[GitHub] [iceberg] jackye1995 commented on pull request #1843: Support for file paths in SparkCatalogs via HadoopTables

jackye1995 commented on pull request #1843:
URL: https://github.com/apache/iceberg/pull/1843#issuecomment-737756408


   Sorry, I think I am pretty late to the whole conversation; I completely forgot to follow up on #1783 after my initial comment and am still trying to catch up...
   
   I also like the idea of a custom `PathIdentifier`, but it seems to only work for loading an existing table through `SupportsCatalogOptions`, and as @rymurr says, I don't think it handles the `TableCatalog` operations well. The Spark plan would directly use `Identifier.of(...)`, which returns Spark's default `IdentifierImpl`.
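   
   To make that limitation concrete, here is a rough sketch of what such a path-carrying identifier could look like (the class name and its shape are illustrative, not the actual code from the PR):
   
   ```java
   // Illustrative only: a Spark Identifier that carries a raw table path.
   import org.apache.spark.sql.connector.catalog.Identifier;
   
   public class PathIdentifier implements Identifier {
     private final String location;
   
     public PathIdentifier(String location) {
       this.location = location;
     }
   
     // The raw table path, e.g. "s3://bucket/path/to/table".
     public String location() {
       return location;
     }
   
     @Override
     public String[] namespace() {
       return new String[0];  // a bare path carries no namespace
     }
   
     @Override
     public String name() {
       return location;
     }
   }
   ```
   
   An identifier like this only reaches the catalog when the data source builds it itself, e.g. when it is returned from `SupportsCatalogOptions#extractIdentifier` for a path-based read. DDL plans construct their identifiers with `Identifier.of(namespace, name)`, so a `TableCatalog` would only ever see the default `IdentifierImpl` and the path information would be lost.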
   
   For these DDL statements, I actually like the approach Anton mentioned in #1306 of wrapping `HadoopTables` into a catalog, for example a `HadoopPathCatalog`, so that we can run something like `CREATE TABLE catalog_name."path/to/table" USING iceberg`. I don't think using the `LOCATION` keyword is an option for us, not because of the dummy table name, but because not all ALTER and DELETE statements support it natively in open-source Spark. That proposal was the last conversation on the issue and never got a follow-up; was there any concern with that approach that wasn't mentioned there?
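   
   For concreteness, here is a rough sketch (assumed class and method bodies, not actual Iceberg code) of what such a `HadoopPathCatalog` could look like: a Spark `TableCatalog` that treats the identifier name as a table location and delegates to `HadoopTables`. Partition transform conversion, DROP, and ALTER handling are stubbed out, and wrapping the loaded table assumes a `SparkTable` constructor that takes just the Iceberg table, which may differ between Iceberg versions:
   
   ```java
   import java.util.Map;
   
   import org.apache.hadoop.conf.Configuration;
   import org.apache.iceberg.PartitionSpec;
   import org.apache.iceberg.hadoop.HadoopTables;
   import org.apache.iceberg.spark.SparkSchemaUtil;
   import org.apache.iceberg.spark.source.SparkTable;
   import org.apache.spark.sql.catalyst.analysis.NoSuchTableException;
   import org.apache.spark.sql.connector.catalog.Identifier;
   import org.apache.spark.sql.connector.catalog.TableCatalog;
   import org.apache.spark.sql.connector.catalog.TableChange;
   import org.apache.spark.sql.connector.expressions.Transform;
   import org.apache.spark.sql.types.StructType;
   import org.apache.spark.sql.util.CaseInsensitiveStringMap;
   
   public class HadoopPathCatalog implements TableCatalog {
     private String catalogName;
     private HadoopTables tables;
   
     @Override
     public void initialize(String name, CaseInsensitiveStringMap options) {
       this.catalogName = name;
       // A real implementation would pick up the Hadoop conf from the Spark session.
       this.tables = new HadoopTables(new Configuration());
     }
   
     @Override
     public String name() {
       return catalogName;
     }
   
     // With CREATE TABLE catalog_name."path/to/table", the quoted path arrives as the
     // identifier name, so it can be used directly as the table location.
     private String location(Identifier ident) {
       return ident.name();
     }
   
     @Override
     public SparkTable loadTable(Identifier ident) throws NoSuchTableException {
       try {
         // Assumption: SparkTable can wrap an Iceberg Table; the exact constructor
         // depends on the Iceberg version.
         return new SparkTable(tables.load(location(ident)));
       } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
         throw new NoSuchTableException(ident);
       }
     }
   
     @Override
     public SparkTable createTable(Identifier ident, StructType schema, Transform[] partitions,
                                   Map<String, String> properties) {
       // Partition transform conversion is elided; an unpartitioned spec keeps the sketch short.
       return new SparkTable(
           tables.create(SparkSchemaUtil.convert(schema), PartitionSpec.unpartitioned(),
               properties, location(ident)));
     }
   
     @Override
     public SparkTable alterTable(Identifier ident, TableChange... changes) {
       // Mapping Spark TableChanges onto Iceberg metadata updates is elided in this sketch.
       throw new UnsupportedOperationException("ALTER TABLE is not implemented in this sketch");
     }
   
     @Override
     public boolean dropTable(Identifier ident) {
       // Deleting the metadata and data files under the path is elided in this sketch.
       return false;
     }
   
     @Override
     public void renameTable(Identifier oldIdent, Identifier newIdent) {
       // The path is the table's identity, so rename has no meaning for path-based tables.
       throw new UnsupportedOperationException("Cannot rename a path-based table");
     }
   
     @Override
     public Identifier[] listTables(String[] namespace) {
       // A purely path-based catalog has nothing to enumerate.
       return new Identifier[0];
     }
   }
   ```
   
   If a catalog like this were registered via `spark.sql.catalog.path_catalog` (catalog name illustrative), the quoted path in the DDL above would arrive as the identifier name and map straight onto the `HadoopTables` location.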
   

