Posted to issues@spark.apache.org by "suheng.cloud (Jira)" <ji...@apache.org> on 2022/03/24 08:21:00 UTC

[jira] [Created] (SPARK-38642) spark-sql can not enable isolatedClientLoader to extend dsv2 catalog when using builtin hiveMetastoreJar

suheng.cloud created SPARK-38642:
------------------------------------

             Summary: spark-sql can not enable isolatedClientLoader to extend dsv2 catalog when using builtin hiveMetastoreJar
                 Key: SPARK-38642
                 URL: https://issues.apache.org/jira/browse/SPARK-38642
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.2.1, 3.1.2
            Reporter: suheng.cloud


Hi all,

I make use of IsolatedClientLoader to enable a DataSource V2 catalog on Hive. It works well through the API and spark-shell, but fails with the spark-sql CLI.
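For illustration, this is roughly how the extra catalog is wired up; the catalog class and its option names come from our own implementation, so treat them as placeholders:

{code:scala}
// Hypothetical setup: "com.example.catalog.HiveV2Catalog" stands for our own
// TableCatalog implementation that obtains a HiveClient through
// HiveUtils/IsolatedClientLoader. Class name, catalog name and option keys
// below are placeholders, not Spark built-ins.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("dsv2-extra-hive-catalog")
  .enableHiveSupport()
  // register a second, DataSource V2 catalog named "other_hive"
  .config("spark.sql.catalog.other_hive", "com.example.catalog.HiveV2Catalog")
  // point it at the other cluster's metastore (placeholder URI)
  .config("spark.sql.catalog.other_hive.hive.metastore.uris",
    "thrift://other-metastore:9083")
  .getOrCreate()

// works from the API / spark-shell:
spark.sql("SHOW NAMESPACES IN other_hive").show()
spark.sql("SELECT * FROM other_hive.db1.t1 LIMIT 10").show()
{code}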

After digging into the source, I found that SparkSQLCLIDriver (spark-sql) initializes differently: it creates a CliSessionState which is reused throughout the CLI's lifecycle.

As a result, the IsolatedClientLoader creator in HiveUtils decides to turn isolation off, because it encounters a global SessionState of that special type. In my case, namespaces/tables from the other Hive catalog are not recognized, since the CliSessionState held by the SparkSession is always the one used for the connection.
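For reference, the check looks roughly like this in HiveUtils (a simplified sketch paraphrased from my reading of the 3.1 source, not the exact code):

{code:scala}
// Simplified sketch of the behaviour described above: when the current Hive
// SessionState is a CliSessionState -- which is what SparkSQLCLIDriver
// installs -- isolation is switched off for the "builtin" metastore jars.
private def isCliSessionState(): Boolean = {
  val state = org.apache.hadoop.hive.ql.session.SessionState.get
  var clazz: Class[_] = if (state != null) state.getClass else null
  var found = false
  while (clazz != null && !found) {
    found = clazz.getName == "org.apache.hadoop.hive.cli.CliSessionState"
    clazz = clazz.getSuperclass
  }
  found
}

// ... and later, when building the client for the "builtin" jars, roughly:
// new IsolatedClientLoader(..., isolationOn = !isCliSessionState(), ...)
{code}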

I noticed [SPARK-21428|https://issues.apache.org/jira/browse/SPARK-21428], but since the DataSource V2 API should become more popular, shouldn't SparkSQLCLIDriver be adjusted for this as well?

My environment:

spark-3.1.2
hadoop-cdh5.13.0
hive-2.3.6
spark.sql.hive.metastore.jars=builtin is set for each v2 catalog (we have no permission to deploy jars on the target clusters)

For now, to work around this, we have to deploy the jars on HDFS and use the 'path' option, which causes a significant delay in catalog initialization.
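For example, the workaround configuration looks like this (the HDFS location is a placeholder):

{code:scala}
// Workaround sketch: ship the Hive 2.3.6 client jars to HDFS and switch from
// "builtin" to "path", so that IsolatedClientLoader keeps isolation on.
// The HDFS glob below is a placeholder for our actual jar location.
val spark = org.apache.spark.sql.SparkSession.builder()
  .enableHiveSupport()
  .config("spark.sql.hive.metastore.version", "2.3.6")
  .config("spark.sql.hive.metastore.jars", "path")
  .config("spark.sql.hive.metastore.jars.path", "hdfs:///libs/hive-2.3.6/*.jar")
  .getOrCreate()
{code}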

Any help is appreciated, thanks.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org