Posted to issues@spark.apache.org by "Laszlo Rigo (JIRA)" <ji...@apache.org> on 2019/02/05 09:29:00 UTC

[jira] [Comment Edited] (SPARK-26727) CREATE OR REPLACE VIEW query fails with TableAlreadyExistsException

    [ https://issues.apache.org/jira/browse/SPARK-26727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755846#comment-16755846 ] 

Laszlo Rigo edited comment on SPARK-26727 at 2/5/19 9:28 AM:
-------------------------------------------------------------

[~dongjoon], We have this hive-metastore rpm installed:

rpm -qi hive-metastore
 Name : hive-metastore Relocations: (not relocatable)
 Version : 1.1.0+cdh5.4.8+277 Vendor: (none)
 Release : 1.cdh5.4.8.p1373.1769.el6 Build Date: Fri 25 Mar 2016 05:46:42 PM CET
 Install Date: Tue 29 Jan 2019 07:48:54 PM CET Build Host: ec2-pkg-centos-6-1226.vpc.cloudera.com
 Group : System/Daemons Source RPM: hive-1.1.0+cdh5.4.8+277-1.cdh5.4.8.p1373.1769.el6.src.rpm
 Size : 5458 License: ASL 2.0
 Signature : DSA/SHA1, Fri 25 Mar 2016 06:12:22 PM CET, Key ID f90c0d8fe8f86acd
 URL : http://hive.apache.org/
 Summary : Shared metadata repository for Hive.
 Description :
 This optional package hosts a metadata server for Hive clients across a network to use.

 

# hive --version
Hive 1.1.0-cdh5.4.8
Subversion file:///data/jenkins/workspace/generic-package-rhel64-6-0/topdir/BUILD/hive-1.1.0-cdh5.4.8 -r Unknown
Compiled by jenkins on Fri Mar 25 09:38:39 PDT 2016
From source with checksum e4569745d1b7e0b2785263766d99cffc



> CREATE OR REPLACE VIEW query fails with TableAlreadyExistsException
> -------------------------------------------------------------------
>
>                 Key: SPARK-26727
>                 URL: https://issues.apache.org/jira/browse/SPARK-26727
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Srinivas Yarra
>            Priority: Major
>
> We found that the Hive query "CREATE OR REPLACE VIEW <view name> AS SELECT <columns> FROM <table>" sometimes fails with the following exception:
> {code:java}
> org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException: Table or view '<view name>' already exists in database 'default';
>   at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:314)
>   at org.apache.spark.sql.execution.command.CreateViewCommand.run(views.scala:165)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:195)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:195)
>   at org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3365)
>   at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
>   at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
>   at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
>   at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3364)
>   at org.apache.spark.sql.Dataset.<init>(Dataset.scala:195)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:80)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
>   ... 49 elided
> {code}
> {code}
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy FROM ae_dual")
> res1: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy FROM ae_dual")
> res2: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy FROM ae_dual")
> res3: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy FROM ae_dual")
> res4: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy FROM ae_dual")
> res5: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy FROM ae_dual")
> res6: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy FROM ae_dual")
> res7: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy FROM ae_dual")
> res8: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy FROM ae_dual")
> res9: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy FROM ae_dual")
> res10: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy FROM ae_dual")
> res11: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy FROM ae_dual")
> org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException: Table or view 'testsparkreplace' already exists in database 'default';
>   at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createTable$1.apply$mcV$sp(HiveExternalCatalog.scala:246)
>   at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createTable$1.apply(HiveExternalCatalog.scala:236)
>   at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createTable$1.apply(HiveExternalCatalog.scala:236)
>   at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
>   at org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:236)
>   at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.createTable(ExternalCatalogWithListener.scala:94)
>   at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:319)
>   at org.apache.spark.sql.execution.command.CreateViewCommand.run(views.scala:165)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:195)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:195)
>   at org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3365)
>   at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
>   at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
>   at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
>   at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3364)
>   at org.apache.spark.sql.Dataset.<init>(Dataset.scala:195)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:80)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
>   ... 49 elided
> scala> spark.catalog.tableExists("testSparkReplace")
> res13: Boolean = false
> scala>
> {code}
>  
> As we can see, the first 11 executions succeed and the 12th throws the exception. Afterwards spark.catalog.tableExists returns false, so the view was dropped but not created again.
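
Because the failure is intermittent, it may help to script the repetition rather than retype the statement. The following is a minimal sketch, not part of Spark: `repeatUntilFailure` and `ReproResult` are hypothetical names introduced here for illustration. It runs a statement up to a given number of times, stops at the first exception, and then reports whether the object still exists. The usage shown in the trailing comment assumes a spark-shell session with the same view and table names as in the session above.

```scala
import scala.util.{Failure, Try}

// Outcome of a repeated run: the 1-based attempt that failed (if any),
// the exception it raised, and whether the object existed afterwards.
final case class ReproResult(
    failedAttempt: Option[Int],
    error: Option[Throwable],
    existsAfter: Boolean)

// Hypothetical helper (not a Spark API): runs `statement` up to `maxAttempts`
// times, stops at the first non-fatal exception, then evaluates `exists` once.
def repeatUntilFailure(maxAttempts: Int)(
    statement: () => Any,
    exists: () => Boolean): ReproResult = {
  // The iterator is lazy, so no further attempts run after the first failure.
  val firstFailure = (1 to maxAttempts).iterator
    .map(i => (i, Try(statement())))
    .collectFirst { case (i, Failure(e)) => (i, e) }
  ReproResult(firstFailure.map(_._1), firstFailure.map(_._2), exists())
}

// Assumed usage in spark-shell (paste with :paste):
//   val r = repeatUntilFailure(50)(
//     () => spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy FROM ae_dual"),
//     () => spark.catalog.tableExists("testSparkReplace"))
//   println(r)
```

A result with `failedAttempt` defined and `existsAfter` equal to false would match the behavior reported above: the statement failed and the view was left missing.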


