You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2017/10/11 03:55:00 UTC

[jira] [Assigned] (SPARK-18355) Spark SQL fails to read data from a ORC hive table that has a new column added to it

     [ https://issues.apache.org/jira/browse/SPARK-18355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-18355:
------------------------------------

    Assignee: Apache Spark

> Spark SQL fails to read data from a ORC hive table that has a new column added to it
> ------------------------------------------------------------------------------------
>
>                 Key: SPARK-18355
>                 URL: https://issues.apache.org/jira/browse/SPARK-18355
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.2, 2.1.0, 2.2.0
>         Environment: Centos6
>            Reporter: Sandeep Nemuri
>            Assignee: Apache Spark
>
> *PROBLEM*:
> Spark SQL fails to read data from a ORC hive table that has a new column added to it.
> Below is the exception:
> {code}
> scala> sqlContext.sql("select click_id,search_id from testorc").show
> 16/11/03 22:17:53 INFO ParseDriver: Parsing command: select click_id,search_id from testorc
> 16/11/03 22:17:54 INFO ParseDriver: Parse Completed
> java.lang.AssertionError: assertion failed
> 	at scala.Predef$.assert(Predef.scala:165)
> 	at org.apache.spark.sql.execution.datasources.LogicalRelation$$anonfun$1.apply(LogicalRelation.scala:39)
> 	at org.apache.spark.sql.execution.datasources.LogicalRelation$$anonfun$1.apply(LogicalRelation.scala:38)
> 	at scala.Option.map(Option.scala:145)
> 	at org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:38)
> 	at org.apache.spark.sql.execution.datasources.LogicalRelation.copy(LogicalRelation.scala:31)
> 	at org.apache.spark.sql.hive.HiveMetastoreCatalog.org$apache$spark$sql$hive$HiveMetastoreCatalog$$convertToOrcRelation(HiveMetastoreCatalog.scala:588)
> 	at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$$anonfun$apply$2.applyOrElse(HiveMetastoreCatalog.scala:647)
> 	at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$$anonfun$apply$2.applyOrElse(HiveMetastoreCatalog.scala:643)
> 	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
> 	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
> 	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
> 	at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:334)
> 	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
> 	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
> 	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:281)
> 	at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> {code}
> *STEPS TO SIMULATE THIS ISSUE*:
> 1) Create table in hive.
> {code}
> CREATE TABLE `testorc`( 
> `click_id` string, 
> `search_id` string, 
> `uid` bigint)
> PARTITIONED BY ( 
> `ts` string, 
> `hour` string) 
> STORED AS ORC; 
> {code}
> 2) Load data into table :
> {code}
> INSERT INTO TABLE testorc PARTITION (ts = '98765',hour = '01' ) VALUES (12,2,12345);
> {code}
> 3) Select through spark shell (This works)
> {code}
> sqlContext.sql("select click_id,search_id from testorc").show
> {code}
> 4) Now add column to hive table
> {code}
> ALTER TABLE testorc ADD COLUMNS (dummy string);
> {code}
> 5) Now again select from spark shell
> {code}
> scala> sqlContext.sql("select click_id,search_id from testorc").show
> 16/11/03 22:17:53 INFO ParseDriver: Parsing command: select click_id,search_id from testorc
> 16/11/03 22:17:54 INFO ParseDriver: Parse Completed
> java.lang.AssertionError: assertion failed
> 	at scala.Predef$.assert(Predef.scala:165)
> 	at org.apache.spark.sql.execution.datasources.LogicalRelation$$anonfun$1.apply(LogicalRelation.scala:39)
> 	at org.apache.spark.sql.execution.datasources.LogicalRelation$$anonfun$1.apply(LogicalRelation.scala:38)
> 	at scala.Option.map(Option.scala:145)
> 	at org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:38)
> 	at org.apache.spark.sql.execution.datasources.LogicalRelation.copy(LogicalRelation.scala:31)
> 	at org.apache.spark.sql.hive.HiveMetastoreCatalog.org$apache$spark$sql$hive$HiveMetastoreCatalog$$convertToOrcRelation(HiveMetastoreCatalog.scala:588)
> 	at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$$anonfun$apply$2.applyOrElse(HiveMetastoreCatalog.scala:647)
> 	at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$$anonfun$apply$2.applyOrElse(HiveMetastoreCatalog.scala:643)
> 	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
> 	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
> 	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
> 	at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:334)
> 	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
> 	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
> 	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:281)
> 	at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> 	at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> 	at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> 	at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
> 	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org