Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2017/05/22 15:01:04 UTC

[jira] [Resolved] (SPARK-20808) External Table unnecessarily not created in Hive-compatible way

     [ https://issues.apache.org/jira/browse/SPARK-20808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-20808.
-------------------------------
       Resolution: Not A Problem
    Fix Version/s: 2.2.0

> External Table unnecessarily not created in Hive-compatible way
> ---------------------------------------------------------------
>
>                 Key: SPARK-20808
>                 URL: https://issues.apache.org/jira/browse/SPARK-20808
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.0, 2.1.1
>            Reporter: Joachim Hereth
>            Priority: Minor
>             Fix For: 2.2.0
>
>
> In Spark 2.1.0 and 2.1.1, {{spark.catalog.createExternalTable}} unnecessarily creates tables in a Hive-incompatible way.
> For instance, executing the following in a Spark shell
> {code}
> val database = "default"
> val table = "table_name"
> val path = "/user/daki/" + database + "/" + table
> // write example data as Parquet, then register the directory as an external table
> val data = Array(("Alice", 23), ("Laura", 33), ("Peter", 54))
> val df = sc.parallelize(data).toDF("name", "age")
> df.write.mode(org.apache.spark.sql.SaveMode.Overwrite).parquet(path)
> spark.sql("DROP TABLE IF EXISTS " + database + "." + table)
> spark.catalog.createExternalTable(database + "." + table, path)
> {code}
> issues the warning
> {code}
> Search Subject for Kerberos V5 INIT cred (<<DEF>>, sun.security.jgss.krb5.Krb5InitCredential)
> 17/05/19 11:01:17 WARN hive.HiveExternalCatalog: Could not persist `default`.`table_name` in a Hive compatible way. Persisting it into Hive metastore in Spark SQL specific format.
> org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:User daki does not have privileges for CREATETABLE)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:720)
> ...
> {code}
> The exception (user does not have privileges for CREATETABLE) is misleading: I do have the CREATE TABLE privilege.
> Querying the table from Hive returns no results; from Spark the data can be accessed.
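> One way to check how the table was actually persisted is to inspect its metadata; a minimal sketch, reusing the {{database}} and {{table}} values from the example above:
> {code}
> // A table persisted in the Spark SQL specific format typically records the data source
> // provider in the table properties rather than a Hive-readable Parquet SerDe.
> spark.sql("DESCRIBE FORMATTED " + database + "." + table).show(100, false)
> {code}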
> The following code creates the table correctly (workaround):
> {code}
> // Build a Hive-compatible CREATE EXTERNAL TABLE statement from the DataFrame's schema
> def sqlStatement(df: org.apache.spark.sql.DataFrame, database: String, table: String, path: String): String = {
>   val columns = df.schema
>     .map(field => "`" + field.name + "` " + field.dataType.simpleString)
>     .mkString(",\n")
>   ("CREATE EXTERNAL TABLE `%s`.`%s` (%s) " +
>     "STORED AS PARQUET " +
>     "LOCATION 'hdfs://nameservice1%s'").format(database, table, columns, path)
> }
>
> spark.sql("DROP TABLE IF EXISTS " + database + "." + table)
> spark.sql(sqlStatement(df, database, table, path))
> {code}
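> For the example DataFrame above (columns {{name}}: string and {{age}}: int), the helper produces a plain Hive DDL statement; a quick sanity check, reusing the names defined earlier, might look like:
> {code}
> // Prints roughly:
> //   CREATE EXTERNAL TABLE `default`.`table_name` (`name` string,
> //   `age` int) STORED AS PARQUET LOCATION 'hdfs://nameservice1/user/daki/default/table_name'
> println(sqlStatement(df, database, table, path))
> // After the workaround, the table should also be readable through the metastore from Spark
> spark.table(database + "." + table).show()
> {code}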
> The code is executed via YARN against a Cloudera CDH 5.7.5 cluster with Sentry enabled (in case this matters regarding the privilege warning). Spark was built against the CDH libraries.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
