Posted to issues@spark.apache.org by "JP Bordenave (Jira)" <ji...@apache.org> on 2019/10/01 19:58:00 UTC

[jira] [Comment Edited] (SPARK-13446) Spark need to support reading data from Hive 2.0.0 metastore

    [ https://issues.apache.org/jira/browse/SPARK-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942275#comment-16942275 ] 

JP Bordenave edited comment on SPARK-13446 at 10/1/19 7:57 PM:
---------------------------------------------------------------

1) OK, I am back after some internet issues. I restored all the Hive 1.2.1 jars that ship with Spark 2.4.4.

2) Hive 2.3.3 conflicts with Spark 2.4.4 over the MySQL metastore schema: the metastore is at schema version 2.3.0, while Spark's built-in Hive client expects the 1.2.1 schema.

3) I copied hive-site.xml into spark/conf and disabled schema verification; it now works fine under spark-shell:

{noformat}
<property>
  <name>hive.metastore.schema.verification</name>
  <value>false</value>
</property>
{noformat}

But I don't understand why Spark 2.4.4 uses the old Hive 1.2.1 schema (it is not really clear to me).
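
For what it's worth: Spark 2.4.4 ships with a built-in Hive 1.2.1 client ({{spark.sql.hive.metastore.jars}} defaults to {{builtin}}), which is why it expects the 1.2.1 metastore schema even when talking to a Hive 2.3.3 metastore. A minimal sketch of the alternative for a standalone app: point Spark at the real 2.3.3 client jars and keep schema verification on. The jar path below is a hypothetical location, not something verified here.

{noformat}
import org.apache.spark.sql.SparkSession

// Sketch: use a real Hive 2.3.3 client instead of Spark's builtin 1.2.1 one.
// "/opt/hive/lib/*" is a hypothetical classpath for the Hive 2.3.3 jars;
// it must also include the matching Hadoop dependencies.
val spark = SparkSession.builder()
  .appName("hive-2.3.3-metastore")
  .config("spark.sql.hive.metastore.version", "2.3.3")
  .config("spark.sql.hive.metastore.jars", "/opt/hive/lib/*")
  .enableHiveSupport()
  .getOrCreate()

spark.sql("show databases").show()
{noformat}

With the builtin client and verification disabled instead, the spark-shell session below works as expected: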

{noformat}
+------------+
|databaseName|
+------------+
|     default|
+------------+
scala> select * from employee;
<console>:24: error: not found: value select
       select * from employee;
       ^
<console>:24: error: not found: value from
       select * from employee;
                ^
scala> spark.sql("show databases").show
+------------+
|databaseName|
+------------+
|     default|
+------------+
scala> spark.sql("show tables").show
19/10/01 21:44:07 WARN metastore.ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
+--------+---------+-----------+
|database|tableName|isTemporary|
+--------+---------+-----------+
| default| employee|      false|
+--------+---------+-----------+
scala> spark.sql("select * from employee").show
+---+-----------+---------+
| id|       name|     dept|
+---+-----------+---------+
|  1|      Allen|       IT|
|  2|        Mag|    Sales|
| 14|     Pierre|      xXx|
|  1|      Allen|       IT|
|  3|        Rob|    Sales|
|  4|       Dana|       IT|
|  7|     Pierre|      xXx|
| 11|     Pierre|      xXx|
| 10|     Pierre|      xXx|
| 12|     Pierre|      xXx|
| 13|     Pierre|      xXx|
+---+-----------+---------+
 {noformat}
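
(Note: the {{not found: value select}} errors above are just the Scala REPL rejecting bare SQL; statements have to go through {{spark.sql(...)}}. For example, with the {{employee}} table from this session:)

{noformat}
scala> val employees = spark.sql("select * from employee")  // returns a DataFrame
scala> employees.where("dept = 'IT'").show()                // filter with a SQL expression
{noformat}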

> Spark need to support reading data from Hive 2.0.0 metastore
> ------------------------------------------------------------
>
>                 Key: SPARK-13446
>                 URL: https://issues.apache.org/jira/browse/SPARK-13446
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.6.0
>            Reporter: Lifeng Wang
>            Assignee: Xiao Li
>            Priority: Major
>             Fix For: 2.2.0
>
>
> Spark provides the HiveContext class to read data from the Hive metastore directly, but it only supports Hive 1.2.1 and older. Since Hive 2.0.0 has been released, it would be better to upgrade to support Hive 2.0.0.
> {noformat}
> 16/02/23 02:35:02 INFO metastore: Trying to connect to metastore with URI thrift://hsw-node13:9083
> 16/02/23 02:35:02 INFO metastore: Opened a connection to metastore, current connections: 1
> 16/02/23 02:35:02 INFO metastore: Connected to metastore.
> Exception in thread "main" java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT
>         at org.apache.spark.sql.hive.HiveContext.configure(HiveContext.scala:473)
>         at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:192)
>         at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:185)
>         at org.apache.spark.sql.hive.HiveContext$$anon$1.<init>(HiveContext.scala:422)
>         at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:422)
>         at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:421)
>         at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:72)
>         at org.apache.spark.sql.SQLContext.table(SQLContext.scala:739)
>         at org.apache.spark.sql.SQLContext.table(SQLContext.scala:735)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
