Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/07/24 02:50:15 UTC

[GitHub] [iceberg] lcaaaat commented on issue #954: Default warehouse location of a table should be a subdirectory in database location

lcaaaat commented on issue #954:
URL: https://github.com/apache/iceberg/issues/954#issuecomment-663324572


   @pvary  Thanks for your reply!
   
   For some reason, I don't want to write data into **/user/hive/warehouse**, so I created the database with SQL:
   
   ```scala
   spark.sql("create database iceberg_test location '/user/data_transform/iceberg_test'")
   ```
   
   As a result, describing the database shows:
   
   ```scala
   spark.sql("desc database iceberg_test").show(false)
   +-------------------------+--------------------------------------------+
   |database_description_item|database_description_value                  |
   +-------------------------+--------------------------------------------+
   |Database Name            |iceberg_test                                |
   |Description              |                                            |
   |Location                 |hdfs://dev4/user/data_transform/iceberg_test|
   +-------------------------+--------------------------------------------+
   ```
   
   Then I created a table in Hive format and described it:
   
   ```scala
   spark.sql("create table iceberg_test.hive_table(name string)")
   
   spark.sql("describe FORMATTED iceberg_test.hive_table").show(false)
   +----------------------------+----------------------------------------------------------+-------+
   |col_name                    |data_type                                                 |comment|
   +----------------------------+----------------------------------------------------------+-------+
   |name                        |string                                                    |null   |
   |                            |                                                          |       |
   |# Detailed Table Information|                                                          |       |
   |Database                    |iceberg_test                                              |       |
   |Table                       |hive_table                                                |       |
   |Owner                       |data_transform/dev@TEST.HZ.NETEASE.COM                    |       |
   |Created Time                |Fri Jul 24 10:24:24 CST 2020                              |       |
   |Last Access                 |Thu Jan 01 08:00:00 CST 1970                              |       |
   |Created By                  |Spark 2.4.5                                               |       |
   |Type                        |MANAGED                                                   |       |
   |Provider                    |hive                                                      |       |
   |Table Properties            |[transient_lastDdlTime=1595557464]                        |       |
   |Location                    |hdfs://dev4/user/data_transform/iceberg_test/hive_table   |       |
   |Serde Library               |org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe        |       |
   |InputFormat                 |org.apache.hadoop.mapred.TextInputFormat                  |       |
   |OutputFormat                |org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat|       |
   |Storage Properties          |[serialization.format=1]                                  |       |
   |Partition Provider          |Catalog                                                   |       |
   +----------------------------+----------------------------------------------------------+-------+
   ```
   
   We can see that the table's location is **hdfs://dev4/user/data_transform/iceberg_test/hive_table**, a subdirectory of **iceberg_test**'s database location.
   
   However, if I create a table in Iceberg format and print its location:
   
   ```scala
   import org.apache.hadoop.hive.conf.HiveConf
   import org.apache.iceberg.Schema
   import org.apache.iceberg.catalog.TableIdentifier
   import org.apache.iceberg.hive.HiveCatalogs
   import org.apache.iceberg.types.Types.{NestedField, StringType}

   val hiveCatalog = HiveCatalogs.loadCatalog(new HiveConf())
   val tableIdentifier = TableIdentifier.parse("iceberg_test.iceberg_table")
   val schema = new Schema(NestedField.optional(2, "name", StringType.get()))
   val table = hiveCatalog.createTable(tableIdentifier, schema)
   println(table.location())
   // prints: /user/warehouse/iceberg_test.db/iceberg_table
   ```
   
   We can see that the two formats choose the table location differently: the Hive table's location is a subdirectory of the database location, while the Iceberg table's location is not; it is placed under **/user/warehouse** instead.
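   
   A possible workaround in the meantime (just a sketch built on the snippet above, assuming the `createTable` overload that takes an explicit location; the path, the table name `iceberg_table2`, and the empty property map are made-up examples):
   
   ```scala
   import java.util.Collections
   import org.apache.iceberg.PartitionSpec
   
   // Hypothetical example path that mirrors the Hive table layout shown above.
   val explicitLocation = "hdfs://dev4/user/data_transform/iceberg_test/iceberg_table2"
   
   val locatedTable = hiveCatalog.createTable(
     TableIdentifier.parse("iceberg_test.iceberg_table2"),
     schema,
     PartitionSpec.unpartitioned(),       // unpartitioned, matching the example schema
     explicitLocation,                    // explicit location instead of the default
     Collections.emptyMap[String, String]())
   
   println(locatedTable.location())
   // should print the explicit location passed above
   ```
   
   But the point of this issue is that the *default* location should already be a subdirectory of the database location, like the Hive table above.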

