Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2017/05/22 15:01:04 UTC
[jira] [Resolved] (SPARK-20808) External Table unnecessarily not created in Hive-compatible way
[ https://issues.apache.org/jira/browse/SPARK-20808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-20808.
-------------------------------
Resolution: Not A Problem
Fix Version/s: 2.2.0
> External Table unnecessarily not created in Hive-compatible way
> ---------------------------------------------------------------
>
> Key: SPARK-20808
> URL: https://issues.apache.org/jira/browse/SPARK-20808
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.1.0, 2.1.1
> Reporter: Joachim Hereth
> Priority: Minor
> Fix For: 2.2.0
>
>
> In Spark 2.1.0 and 2.1.1, {{spark.catalog.createExternalTable}} unnecessarily creates tables in a Hive-incompatible way.
> For instance executing in a spark shell
> {code}
> val database = "default"
> val table = "table_name"
> val path = "/user/daki/" + database + "/" + table
> val data = Array(("Alice", 23), ("Laura", 33), ("Peter", 54))
> val df = sc.parallelize(data).toDF("name", "age")
> df.write.mode(org.apache.spark.sql.SaveMode.Overwrite).parquet(path)
> spark.sql("DROP TABLE IF EXISTS " + database + "." + table)
> spark.catalog.createExternalTable(database + "." + table, path)
> {code}
> issues the warning
> {code}
> Search Subject for Kerberos V5 INIT cred (<<DEF>>, sun.security.jgss.krb5.Krb5InitCredential)
> 17/05/19 11:01:17 WARN hive.HiveExternalCatalog: Could not persist `default`.`table_name` in a Hive compatible way. Persisting it into Hive metastore in Spark SQL specific format.
> org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:User daki does not have privileges for CREATETABLE)
> at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:720)
> ...
> {code}
> The exception (user does not have privileges for CREATETABLE) is misleading; I do have the CREATE TABLE privilege.
> Querying the table from Hive returns no results, while Spark can read the data without problems.
> The following code creates the table correctly (workaround):
> {code}
> // Build a Hive-compatible CREATE EXTERNAL TABLE statement from the DataFrame schema.
> def sqlStatement(df: org.apache.spark.sql.DataFrame, database: String, table: String, path: String): String = {
>   val cols = df.schema
>     .map(col => "`" + col.name + "` " + col.dataType.simpleString)
>     .mkString(",\n")
>   ("CREATE EXTERNAL TABLE `%s`.`%s` (%s) " +
>     "STORED AS PARQUET " +
>     "LOCATION 'hdfs://nameservice1%s'").format(database, table, cols, path)
> }
> spark.sql("DROP TABLE IF EXISTS " + database + "." + table)
> spark.sql(sqlStatement(df, database, table, path))
> {code}
> The code is executed via YARN against a Cloudera CDH 5.7.5 cluster with Sentry enabled (in case this matters regarding the privilege warning). Spark was built against the CDH libraries.
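> The schema-to-DDL step of the workaround can also be isolated as a plain function. The following sketch takes (column name, Hive type) pairs instead of a DataFrame so it runs without a Spark session; the column list, path, and the {{hdfs://nameservice1}} prefix are illustrative assumptions mirroring the cluster setup above:
> {code}
> // Sketch: build a CREATE EXTERNAL TABLE statement from (column name, Hive type) pairs.
> // The hdfs://nameservice1 prefix is cluster-specific, as in the workaround above.
> def createExternalTableDdl(database: String, table: String,
>                            columns: Seq[(String, String)], path: String): String = {
>   val cols = columns.map { case (name, tpe) => "`" + name + "` " + tpe }.mkString(",\n")
>   ("CREATE EXTERNAL TABLE `%s`.`%s` (%s) " +
>     "STORED AS PARQUET " +
>     "LOCATION 'hdfs://nameservice1%s'").format(database, table, cols, path)
> }
>
> val ddl = createExternalTableDdl("default", "table_name",
>   Seq(("name", "string"), ("age", "int")), "/user/daki/default/table_name")
> {code}
> The resulting string can then be passed to {{spark.sql(...)}} exactly as in the workaround.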
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org