Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/09/14 18:33:26 UTC

[GitHub] [iceberg] itachi-sharingan commented on issue #2488: Forbid setting "uri" for SparkSessionCatalog when Hive is used.

itachi-sharingan commented on issue #2488:
URL: https://github.com/apache/iceberg/issues/2488#issuecomment-919412766


   There is a weird situation here. I think I know where to put the check, but I am not sure I understand the problem correctly. Please bear with me:
   Solution:
   Put a check in the SparkSessionCatalog initialize function: if the options map contains the key "type" with value "hive", then there should not be a "uri" key in the options.
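
   A minimal standalone sketch of that check (the class name `CatalogOptionsCheck`, the helper name `validate`, and the error message are my assumptions; the real fix would live inside SparkSessionCatalog's initialize):

```java
import java.util.HashMap;
import java.util.Map;

public class CatalogOptionsCheck {

    // Hypothetical helper sketching the proposed check; in the real fix this
    // logic would run when the catalog is initialized with its options map.
    static void validate(Map<String, String> options) {
        if ("hive".equalsIgnoreCase(options.get("type")) && options.containsKey("uri")) {
            throw new IllegalArgumentException(
                "Setting 'uri' is not supported for SparkSessionCatalog with type=hive; "
                    + "configure spark.hadoop.hive.metastore.uris instead");
        }
    }

    public static void main(String[] args) {
        // type=hive without uri passes
        Map<String, String> ok = new HashMap<>();
        ok.put("type", "hive");
        validate(ok);

        // type=hive plus uri is rejected
        Map<String, String> bad = new HashMap<>(ok);
        bad.put("uri", "thrift://localhost:9083");
        try {
            validate(bad);
            System.out.println("not rejected");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected");
        }
    }
}
```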
   
   Problem (as far as I understand it):
   
   - If we configure two different thrift servers, one in the Hadoop conf (i.e. spark.hadoop.hive.metastore.uris) and one in the Iceberg conf (i.e. spark.sql.catalog.spark_catalog.uri), then when we try to create a database the first one gets used, and when we try to create a table the second one gets used.
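
   For concreteness, the conflicting setup I mean looks roughly like this in spark-defaults.conf (hostnames and ports are placeholders):

```properties
# Thrift server configured via the Hadoop conf
spark.hadoop.hive.metastore.uris=thrift://metastore-a:9083

# Iceberg session catalog with its own thrift server
spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog
spark.sql.catalog.spark_catalog.type=hive
spark.sql.catalog.spark_catalog.uri=thrift://metastore-b:9083
```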
   
   Things I think I know:
   
   - The uri configured in the Iceberg config (i.e. spark.sql.catalog.spark_catalog.uri) always overrides the one in the Hadoop config (i.e. spark.hadoop.hive.metastore.uris).
   - Spark has a default catalog which acts as a backup option (the delegate catalog) if a particular table or database in a Spark SQL DDL statement cannot be found in any newly configured catalog metastore (given we have not configured spark_catalog itself).
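
   My mental model of that delegate behavior, sketched in Spark SQL (table names are made up):

```sql
-- With spark_catalog set to SparkSessionCatalog (type=hive):
CREATE TABLE db.ice_tbl (id BIGINT) USING iceberg;  -- handled by the Iceberg catalog
CREATE TABLE db.pq_tbl (id BIGINT) USING parquet;   -- delegated to the built-in session catalog
```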
   
   Things I am confused about:
   - By configuring spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog and spark.sql.catalog.spark_catalog.uri=thrift://localhost:9083, are we actually overriding the default catalog for Spark, so that there is no backup catalog anymore?
   - Are the namespace functions listed in the SparkSessionCatalog class meant for databases? I am assuming database = namespace; is this wrong? If so, what is the difference between the two?
   - Why do "create database" and "create table" queries go to two different thrift servers when the Hadoop conf and the Iceberg conf differ, or when only spark_catalog.uri is configured?
   
   @rdblue @RussellSpitzer can you please help?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org
