Posted to issues@spark.apache.org by "Tavis Barr (JIRA)" <ji...@apache.org> on 2018/04/23 17:58:00 UTC

[jira] [Commented] (SPARK-18112) Spark2.x does not support read data from Hive 2.x metastore

    [ https://issues.apache.org/jira/browse/SPARK-18112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448587#comment-16448587 ] 

Tavis Barr commented on SPARK-18112:
------------------------------------

It looks to me like this issue has actually not been fixed.  As seen in the stack trace, the offending code is in 

/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala

line 205, where the method attempts to fetch the value of the configuration parameter HIVE_STATS_JDBC_TIMEOUT, which was defined in org.apache.hadoop.hive.conf.HiveConf (part of hive-common). However, this parameter was removed in Hive 2, so the code above throws an exception when run against hive-common 2.x. There may be other configuration parameters requested in HiveUtils.scala that have been removed as well; I haven't checked. In any event, line 205 is still present in the master branch as of today, so Spark still does not work with Hive 2.x.
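To illustrate the failure mode: code compiled against Hive 1.x refers to the enum constant ConfVars.HIVE_STATS_JDBC_TIMEOUT by name, and when the JVM links that reference against a Hive 2.x jar where the constant no longer exists, it raises java.lang.NoSuchFieldError. The following is a self-contained sketch, not Spark or Hive source; the ConfVars enum here is a hypothetical stand-in for Hive 2.x's pruned enum, and the lookup is done via reflection (which surfaces the checked NoSuchFieldException rather than the JVM linkage error, but demonstrates the same absence):

```java
import java.lang.reflect.Field;

public class ConfVarsDemo {
    // Hypothetical stand-in for Hive 2.x's HiveConf.ConfVars, from which
    // HIVE_STATS_JDBC_TIMEOUT has been removed.
    enum ConfVars { HIVE_METASTORE_VERSION, HIVE_EXEC_SCRATCHDIR }

    // Reflectively check whether a named enum constant exists in this
    // version of the enum. A direct static reference compiled against an
    // older version would instead fail at link time with NoSuchFieldError.
    static boolean hasConstant(String name) {
        try {
            Field f = ConfVars.class.getField(name); // enum constants are public static final fields
            return f.isEnumConstant();
        } catch (NoSuchFieldException e) {
            return false; // constant absent in this (simulated) Hive version
        }
    }

    public static void main(String[] args) {
        System.out.println(hasConstant("HIVE_METASTORE_VERSION")); // true
        System.out.println(hasConstant("HIVE_STATS_JDBC_TIMEOUT")); // false
    }
}
```

This is why the error surfaces only at runtime: the compile-time reference in HiveUtils.scala is legal against the Hive 1.x jars Spark builds with, and the failure appears only when a Hive 2.x hive-common jar is on the classpath.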

> Spark2.x does not support read data from Hive 2.x metastore
> -----------------------------------------------------------
>
>                 Key: SPARK-18112
>                 URL: https://issues.apache.org/jira/browse/SPARK-18112
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0, 2.0.1
>            Reporter: KaiXu
>            Assignee: Xiao Li
>            Priority: Critical
>             Fix For: 2.2.0
>
>
> Hive 2.0 was released in February 2016, and Hive 2.0.1 and Hive 2.1.0 have since been released as well, but Spark still only supports reading Hive metastore data from Hive 1.2.1 and older versions. Since Hive 2.x includes many bug fixes and performance improvements, it is better and urgent to upgrade to support Hive 2.x.
> Failed to load data from a Hive 2.x metastore:
> Exception in thread "main" java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT
>         at org.apache.spark.sql.hive.HiveUtils$.hiveClientConfigurations(HiveUtils.scala:197)
>         at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:262)
>         at org.apache.spark.sql.hive.HiveSharedState.metadataHive$lzycompute(HiveSharedState.scala:39)
>         at org.apache.spark.sql.hive.HiveSharedState.metadataHive(HiveSharedState.scala:38)
>         at org.apache.spark.sql.hive.HiveSharedState.externalCatalog$lzycompute(HiveSharedState.scala:4
>         at org.apache.spark.sql.hive.HiveSharedState.externalCatalog(HiveSharedState.scala:45)
>         at org.apache.spark.sql.hive.HiveSessionState.catalog$lzycompute(HiveSessionState.scala:50)
>         at org.apache.spark.sql.hive.HiveSessionState.catalog(HiveSessionState.scala:48)
>         at org.apache.spark.sql.hive.HiveSessionState.catalog(HiveSessionState.scala:31)
>         at org.apache.spark.sql.SparkSession.table(SparkSession.scala:568)
>         at org.apache.spark.sql.SparkSession.table(SparkSession.scala:564)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
