Posted to issues@spark.apache.org by "Pratik Malani (Jira)" <ji...@apache.org> on 2022/10/12 12:53:00 UTC

[jira] [Commented] (SPARK-40736) Spark 3.3.0 doesn't work with Hive 3.1.2

    [ https://issues.apache.org/jira/browse/SPARK-40736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17616399#comment-17616399 ] 

Pratik Malani commented on SPARK-40736:
---------------------------------------

Hi All,

I removed the hive-service jar from the classpath.

The Spark Thrift Server now starts, but I am facing another issue while querying the database through the Thrift Server.
{noformat}
java.lang.NullPointerException
        at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:809)
        at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:702)
        at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:650)
        at org.apache.hadoop.hive.metastore.ObjectStore.unCacheDataNucleusClassLoaders(ObjectStore.java:9708)
        at org.apache.hadoop.hive.ql.session.SessionState.unCacheDataNucleusClassLoaders(SessionState.java:1802)
        at org.apache.hadoop.hive.ql.session.SessionState.close(SessionState.java:1777)
        at org.apache.hive.service.cli.session.HiveSessionImpl.close(HiveSessionImpl.java:669)
        at org.apache.hive.service.cli.session.SessionManager.closeSession(SessionManager.java:295)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLSessionManager.closeSession(SparkSQLSessionManager.scala:91)
        at org.apache.hive.service.cli.CLIService.closeSession(CLIService.java:238)
        at org.apache.hive.service.cli.thrift.ThriftCLIService$1.deleteContext(ThriftCLIService.java:107)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:325)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
Exception in thread "HiveServer2-Handler-Pool: Thread-68" java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.QueryState.<init>(Lorg/apache/hadoop/hive/conf/HiveConf;Ljava/util/Map;Z)V
        at org.apache.hive.service.cli.operation.Operation.<init>(Operation.java:89)
        at org.apache.hive.service.cli.operation.ExecuteStatementOperation.<init>(ExecuteStatementOperation.java:34)
        at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.<init>(SparkExecuteStatementOperation.scala:50)
        at org.apache.spark.sql.hive.thriftserver.server.SparkSQLOperationManager.newExecuteStatementOperation(SparkSQLOperationManager.scala:55)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:481)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:472)
        at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:310)
        at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:455)
        at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557)
        at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:313)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750){noformat}
After further investigation, I found that Spark 3.3.0 creates the QueryState using the old three-argument constructor in its bundled Operation.java:

[https://jar-download.com/artifacts/org.apache.spark/spark-hive-thriftserver_2.12/3.3.0/source-code/org/apache/hive/service/cli/operation/Operation.java]

!image-2022-10-12-18-19-24-455.png|width=581,height=183!

That constructor has been removed from Hive 3.1.2's QueryState.java:

[https://jar-download.com/artifacts/org.apache.hive/hive-exec/3.1.2/source-code/org/apache/hadoop/hive/ql/QueryState.java]

Hive 3.1.2 now uses the Builder pattern when creating the QueryState object (see the sketch below).
Can Spark update this code to be compatible with Hive 3.1.2?
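
For reference, here is a minimal Java sketch of the mismatch, based on the linked sources above. The commented-out constructor call matches the signature reported in the NoSuchMethodError; the Builder-based call is roughly how Hive 3.1.2 expects QueryState to be created. The class name QueryStateCompatSketch and the exact builder method names are illustrative assumptions and should be verified against the actual QueryState.java.
{noformat}
import java.util.Map;

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.ql.QueryState;

class QueryStateCompatSketch {
  // Hypothetical helper, only to show the two API shapes side by side.
  static QueryState create(HiveConf hiveConf, Map<String, String> confOverlay, boolean runAsync) {
    // Spark 3.3.0's bundled Operation.java calls the old Hive 2.3.x constructor,
    // whose signature matches the NoSuchMethodError above:
    //   new QueryState(hiveConf, confOverlay, runAsync);
    // That constructor no longer exists in Hive 3.1.2.

    // Hive 3.1.2 builds QueryState through a nested Builder instead. The builder
    // method names below are an assumption based on the linked QueryState.java.
    return new QueryState.Builder()
        .withHiveConf(hiveConf)
        .withConfOverlay(confOverlay)
        .withGenerateNewQueryId(runAsync) // mapping of the old runAsync flag is an assumption
        .build();
  }
}
{noformat}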

 

> Spark 3.3.0 doesn't work with Hive 3.1.2
> ----------------------------------------
>
>                 Key: SPARK-40736
>                 URL: https://issues.apache.org/jira/browse/SPARK-40736
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, SQL
>    Affects Versions: 3.3.0
>            Reporter: Pratik Malani
>            Priority: Major
>              Labels: Hive, spark
>         Attachments: image-2022-10-12-18-19-24-455.png
>
>
> Hive 2.3.9 is impacted by CVE-2021-34538, so we are trying to use Hive 3.1.2.
> When using Spark 3.3.0 with Hadoop 3.3.4 and Hive 3.1.2, we get the error below when starting the Thrift Server:
>  
> {noformat}
> Exception in thread "main" java.lang.IllegalAccessError: tried to access class org.apache.hive.service.server.HiveServer2$ServerOptionsProcessor from class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$
>         at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:92)
>         at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
>         at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
>         at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>         at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>         at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>         at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala){noformat}
> We use the command below to start the Thrift Server:
>  
> *spark-class org.apache.spark.deploy.SparkSubmit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 spark-internal*
>  
> SPARK_HOME is set correctly.
>  
> The same works well with Hive 2.3.9, but fails when we upgrade to Hive 3.1.2.


