Posted to issues@spark.apache.org by "Vadim Panov (Jira)" <ji...@apache.org> on 2019/09/04 11:08:00 UTC
[jira] [Commented] (SPARK-13446) Spark need to support reading data from Hive 2.0.0 metastore

    [ https://issues.apache.org/jira/browse/SPARK-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922382#comment-16922382 ] 

Vadim Panov commented on SPARK-13446:
-------------------------------------

I did some research on the issue:
 * The offending field (HIVE_STATS_JDBC_TIMEOUT) is still referenced in Spark 2.4.4: [https://github.com/apache/spark/blob/branch-2.4/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala]
 * ...but it is no longer referenced in the master branch (as of 1 Sept 2019): [https://github.com/apache/spark/blob/d5688dc732890923c326f272b0c18c329a69459a/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala]

So I'm hoping this fix will make it into the next Spark release (2.4.5 or 3.0, I don't know which).
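
For anyone who wants to check which situation their own classpath is in: a NoSuchFieldError like the one in the quoted trace is raised when code compiled against a class that had a field runs against a version of that class without it. Below is a minimal Java sketch of a reflective probe for that. FieldProbe and hasPublicField are hypothetical names made up for illustration and are not part of Spark or Hive. The class name org.apache.hadoop.hive.conf.HiveConf$ConfVars is my assumption about where Hive defines the constant, and the probe is only meaningful when the Hive jars are actually on the classpath.

```java
// Hypothetical helper for illustration; not part of Spark or Hive.
final class FieldProbe {

    // Returns true if the named class can be loaded and exposes a public
    // field with the given name; returns false if either is missing.
    static boolean hasPublicField(String className, String fieldName) {
        try {
            // initialize=false: load the class without running its static initializers.
            Class<?> cls = Class.forName(className, false, FieldProbe.class.getClassLoader());
            cls.getField(fieldName); // throws NoSuchFieldException if absent
            return true;
        } catch (ClassNotFoundException | NoSuchFieldException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: Hive's configuration enum is the nested class HiveConf.ConfVars.
        String confVars = "org.apache.hadoop.hive.conf.HiveConf$ConfVars";
        boolean present = hasPublicField(confVars, "HIVE_STATS_JDBC_TIMEOUT");
        System.out.println("HIVE_STATS_JDBC_TIMEOUT present: " + present);
    }
}
```

Run it with the same classpath as the Spark driver. Note that the probe also returns false when the Hive class itself cannot be found, so a false result on a classpath without Hive says nothing about the Hive version.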

> Spark need to support reading data from Hive 2.0.0 metastore
> ------------------------------------------------------------
>
>                 Key: SPARK-13446
>                 URL: https://issues.apache.org/jira/browse/SPARK-13446
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.6.0
>            Reporter: Lifeng Wang
>            Assignee: Xiao Li
>            Priority: Major
>             Fix For: 2.2.0
>
>
> Spark provides the HiveContext class to read data from the Hive metastore directly, but it only supports Hive 1.2.1 and older. Now that Hive 2.0.0 has been released, Spark should be upgraded to support it.
> {noformat}
> 16/02/23 02:35:02 INFO metastore: Trying to connect to metastore with URI thrift://hsw-node13:9083
> 16/02/23 02:35:02 INFO metastore: Opened a connection to metastore, current connections: 1
> 16/02/23 02:35:02 INFO metastore: Connected to metastore.
> Exception in thread "main" java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT
>         at org.apache.spark.sql.hive.HiveContext.configure(HiveContext.scala:473)
>         at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:192)
>         at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:185)
>         at org.apache.spark.sql.hive.HiveContext$$anon$1.<init>(HiveContext.scala:422)
>         at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:422)
>         at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:421)
>         at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:72)
>         at org.apache.spark.sql.SQLContext.table(SQLContext.scala:739)
>         at org.apache.spark.sql.SQLContext.table(SQLContext.scala:735)
> {noformat}


