Posted to issues@spark.apache.org by "Piotr Milanowski (JIRA)" <ji...@apache.org> on 2016/06/27 07:32:52 UTC
[jira] [Created] (SPARK-16224) Hive context created by HiveContext can't access Hive databases when used in a script launched by spark-submit
Piotr Milanowski created SPARK-16224:
----------------------------------------
Summary: Hive context created by HiveContext can't access Hive databases when used in a script launched by spark-submit
Key: SPARK-16224
URL: https://issues.apache.org/jira/browse/SPARK-16224
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 2.0.0
Environment: branch-2.0
Reporter: Piotr Milanowski
Hi,
This is a continuation of a resolved bug [SPARK-15345|https://issues.apache.org/jira/browse/SPARK-15345]
I can access databases when using the new methodology, i.e.:
{code}
from pyspark.sql import SparkSession
from pyspark import SparkConf

if __name__ == "__main__":
    conf = SparkConf()
    hc = SparkSession.builder.config(conf=conf).enableHiveSupport().getOrCreate()
    print(hc.sql("show databases").collect())
{code}
This shows all databases in Hive.
However, using HiveContext, i.e.:
{code}
from pyspark.sql import HiveContext
from pyspark import SparkContext, SparkConf

if __name__ == "__main__":
    conf = SparkConf()
    sc = SparkContext(conf=conf)
    hive_context = HiveContext(sc)
    print(hive_context.sql("show databases").collect())
    # The result is
    # [Row(result='default')]
{code}
prints only the default database.
I have a {{hive-site.xml}} file configured.
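For reference, whether Spark sees the real Hive metastore usually depends on {{hive-site.xml}} (on Spark's classpath, e.g. in {{conf/}}) pointing at the metastore service. A minimal sketch of such a file follows; the thrift host and port are placeholders, not values from this report:

{code}
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <!-- placeholder address; replace with the actual metastore host:port -->
    <value>thrift://metastore-host:9083</value>
  </property>
</configuration>
{code}

If this setting is not picked up, Spark silently falls back to a local Derby-backed metastore, which contains only the {{default}} database, matching the symptom above.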
These snippets are from scripts launched with the {{spark-submit}} command. In the interactive {{pyspark}} shell, both code fragments work fine, displaying all the databases.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org