Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/12/01 10:26:00 UTC

[GitHub] [iceberg] massdosage commented on a change in pull request #1837: Hive read path documentation for HadoopCatalog tables

massdosage commented on a change in pull request #1837:
URL: https://github.com/apache/iceberg/pull/1837#discussion_r533288696



##########
File path: site/docs/hive.md
##########
@@ -79,12 +79,45 @@ In order to query a Hive table created by either of the HiveCatalog methods desc
 ```sql
 SET iceberg.mr.catalog=hive;
 ```
-You should now be able to issue Hive SQL `SELECT` queries using the above table and see the results returned from the underlying Iceberg table. Both the Map Reduce and Tez query execution engines are supported.
+You should now be able to issue Hive SQL `SELECT` queries using the above table and see the results returned from the underlying Iceberg table.
 ```sql
 SELECT * from table_b;
 ```
 
+#### Using Hadoop Catalog
+Iceberg tables created using `HadoopCatalog` are stored entirely in a directory in a filesystem like HDFS. 
+
+##### Create an Iceberg table
+The first step is to create an Iceberg table using the Spark/Java/Python API and `HadoopCatalog`. For the purposes of this documentation we will assume that the fully qualified table identifier is `database_a.table_c` and that the Hadoop Catalog warehouse location is `hdfs://some_bucket/path_to_hadoop_warehouse`. Iceberg will therefore create the table at the location `hdfs://some_bucket/path_to_hadoop_warehouse/database_a/table_c`.
+
+##### Create a Hive table
+Now overlay a Hive table on top of this Iceberg table by issuing Hive DDL like so:
+```sql
+CREATE EXTERNAL TABLE database_a.table_c 
+STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler' 
+LOCATION 'hdfs://some_bucket/path_to_hadoop_warehouse/database_a/table_c'
+TBLPROPERTIES (
+  'iceberg.mr.catalog'='hadoop', 

Review comment:
       I think so. I'm not sure why we don't set that property when we automatically create those tables with the Hive exec engine enabled. I'll make a note to take a look and will update the documentation accordingly (and perhaps also add the code to set that property).
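
For reference, the `LOCATION` in the DDL quoted above follows from the rule stated in the diff: the Hadoop Catalog warehouse path, then the database, then the table name. A minimal sketch of that path arithmetic, assuming a hypothetical helper `hadoop_table_location` (it is not part of the Iceberg API and only mirrors the layout the documentation describes):

```python
def hadoop_table_location(warehouse: str, identifier: str) -> str:
    """Join a warehouse root and a dotted table identifier into a table path.

    Hypothetical helper for illustration only; it mirrors the layout described
    in the documentation under review and does not call any Iceberg code.
    """
    # Split "database_a.table_c" into its parts, ignoring empty segments.
    parts = [part for part in identifier.split(".") if part]
    # Drop any trailing slash on the warehouse so the join stays clean.
    return "/".join([warehouse.rstrip("/")] + parts)


location = hadoop_table_location(
    "hdfs://some_bucket/path_to_hadoop_warehouse", "database_a.table_c"
)
print(location)
# prints hdfs://some_bucket/path_to_hadoop_warehouse/database_a/table_c
```

The printed value is the string that would go into the `LOCATION` clause of the Hive DDL, so the Hive table and the Iceberg table point at the same directory.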



