Posted to issues@spark.apache.org by "David Winters (JIRA)" <ji...@apache.org> on 2016/07/29 15:09:20 UTC

[jira] [Commented] (SPARK-9761) Inconsistent metadata handling with ALTER TABLE

    [ https://issues.apache.org/jira/browse/SPARK-9761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399468#comment-15399468 ] 

David Winters commented on SPARK-9761:
--------------------------------------

Hi [~xwu0226] and [~simeons],

Is there any plan to resolve this issue anytime soon?  I see that this bug hasn't been assigned to anyone and no fix version has been set.  I'm seeing the same behavior, and I also get an exception when attempting to append to the altered table.  See below...

{noformat}
java.lang.RuntimeException: Relation[snip... snip...] AvroRelation[file:/snip...  snip...]
 requires that the query in the SELECT clause of the INSERT INTO/OVERWRITE statement generates the same number of columns as its schema.
	at scala.sys.package$.error(package.scala:27)
	at org.apache.spark.sql.execution.datasources.PreInsertCastAndRename$$anonfun$apply$1.applyOrElse(rules.scala:44)
	at org.apache.spark.sql.execution.datasources.PreInsertCastAndRename$$anonfun$apply$1.applyOrElse(rules.scala:34)
	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:227)
	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:227)
	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:51)
	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:226)
	at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:217)
	at org.apache.spark.sql.execution.datasources.PreInsertCastAndRename$.apply(rules.scala:34)
	at org.apache.spark.sql.execution.datasources.PreInsertCastAndRename$.apply(rules.scala:33)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:83)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:80)
	at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)
	at scala.collection.immutable.List.foldLeft(List.scala:84)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:80)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:72)
	at scala.collection.immutable.List.foreach(List.scala:318)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:72)
	at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:916)
	at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:916)
	at org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:914)
	at org.apache.spark.sql.SQLContext$QueryExecution.withCachedData$lzycompute(SQLContext.scala:918)
	at org.apache.spark.sql.SQLContext$QueryExecution.withCachedData(SQLContext.scala:917)
	at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan$lzycompute(SQLContext.scala:921)
	at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan(SQLContext.scala:921)
	at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:926)
	at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:924)
	at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:930)
	at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:930)
	at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:933)
	at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:933)
	at org.apache.spark.sql.DataFrameWriter.insertInto(DataFrameWriter.scala:187)
	at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:237)
	at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:219)
{noformat}
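
For reference, this is roughly the kind of write that triggers the exception above. This is a minimal sketch run from spark-shell (so {{sc}} already exists); the table name {{events}} and the paths are hypothetical placeholders, not the actual ones from my job:

{noformat}
// Spark 1.4.x-era API; "events" and the Avro path are placeholders.
import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.hive.HiveContext

val sqlContext = new HiveContext(sc)

// The table was originally created from Avro data (spark-avro) and
// later widened with: ALTER TABLE events ADD COLUMNS (z string);
// DESCRIBE still reports the old schema, and the data source relation
// keeps the old column count, so appending data that includes the new
// column fails in PreInsertCastAndRename with the error above.
val df = sqlContext.read
  .format("com.databricks.spark.avro")
  .load("file:/path/to/new/avro")   // new data includes column z

df.write.mode(SaveMode.Append).saveAsTable("events")
{noformat}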


> Inconsistent metadata handling with ALTER TABLE
> -----------------------------------------------
>
>                 Key: SPARK-9761
>                 URL: https://issues.apache.org/jira/browse/SPARK-9761
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.4.1
>         Environment: Ubuntu on AWS
>            Reporter: Simeon Simeonov
>              Labels: hive, sql
>
> Schema changes made with {{ALTER TABLE}} are not shown in {{DESCRIBE TABLE}}. The table in question was created with {{HiveContext.read.json()}}.
> Steps:
> # {{alter table dimension_components add columns (z string);}} succeeds.
> # {{describe dimension_components;}} does not show the new column, even after restarting spark-sql.
> # A second {{alter table dimension_components add columns (z string);}} fails with {{ERROR exec.DDLTask: org.apache.hadoop.hive.ql.metadata.HiveException: Duplicate column name: z}}
> Full spark-sql output [here|https://gist.github.com/ssimeonov/d9af4b8bb76b9d7befde].


