You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Sandeep Nemuri (JIRA)" <ji...@apache.org> on 2016/11/08 09:46:59 UTC
[jira] [Created] (SPARK-18355) Spark SQL fails to read data from a
ORC hive table that has a new column added to it
Sandeep Nemuri created SPARK-18355:
--------------------------------------
Summary: Spark SQL fails to read data from a ORC hive table that has a new column added to it
Key: SPARK-18355
URL: https://issues.apache.org/jira/browse/SPARK-18355
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.6.2
Environment: Centos6
Reporter: Sandeep Nemuri
*PROBLEM*:
Spark SQL fails to read data from a ORC hive table that has a new column added to it.
Below is the exception:
{code}
scala> sqlContext.sql("select click_id,search_id from testorc").show
16/11/03 22:17:53 INFO ParseDriver: Parsing command: select click_id,search_id from testorc
16/11/03 22:17:54 INFO ParseDriver: Parse Completed
java.lang.AssertionError: assertion failed
at scala.Predef$.assert(Predef.scala:165)
at org.apache.spark.sql.execution.datasources.LogicalRelation$$anonfun$1.apply(LogicalRelation.scala:39)
at org.apache.spark.sql.execution.datasources.LogicalRelation$$anonfun$1.apply(LogicalRelation.scala:38)
at scala.Option.map(Option.scala:145)
at org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:38)
at org.apache.spark.sql.execution.datasources.LogicalRelation.copy(LogicalRelation.scala:31)
at org.apache.spark.sql.hive.HiveMetastoreCatalog.org$apache$spark$sql$hive$HiveMetastoreCatalog$$convertToOrcRelation(HiveMetastoreCatalog.scala:588)
at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$$anonfun$apply$2.applyOrElse(HiveMetastoreCatalog.scala:647)
at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$$anonfun$apply$2.applyOrElse(HiveMetastoreCatalog.scala:643)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:334)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:281)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
{code}
*STEPS TO SIMULATE THIS ISSUE*:
1) Create table in hive.
{code}
CREATE TABLE `testorc`(
`click_id` string,
`search_id` string,
`uid` bigint)
PARTITIONED BY (
`ts` string,
`hour` string)
STORED AS ORC;
{code}
2) Load data into table :
{code}
INSERT INTO TABLE testorc PARTITION (ts = '98765',hour = '01' ) VALUES (12,2,12345);
{code}
3) Select through spark shell (This works)
{code}
sqlContext.sql("select click_id,search_id from testorc").show
{code}
4) Now add column to hive table
{code}
ALTER TABLE testorc ADD COLUMNS (dummy string);
{code}
5) Now again select from spark shell
{code}
scala> sqlContext.sql("select click_id,search_id from testorc").show
16/11/03 22:17:53 INFO ParseDriver: Parsing command: select click_id,search_id from testorc
16/11/03 22:17:54 INFO ParseDriver: Parse Completed
java.lang.AssertionError: assertion failed
at scala.Predef$.assert(Predef.scala:165)
at org.apache.spark.sql.execution.datasources.LogicalRelation$$anonfun$1.apply(LogicalRelation.scala:39)
at org.apache.spark.sql.execution.datasources.LogicalRelation$$anonfun$1.apply(LogicalRelation.scala:38)
at scala.Option.map(Option.scala:145)
at org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:38)
at org.apache.spark.sql.execution.datasources.LogicalRelation.copy(LogicalRelation.scala:31)
at org.apache.spark.sql.hive.HiveMetastoreCatalog.org$apache$spark$sql$hive$HiveMetastoreCatalog$$convertToOrcRelation(HiveMetastoreCatalog.scala:588)
at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$$anonfun$apply$2.applyOrElse(HiveMetastoreCatalog.scala:647)
at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$$anonfun$apply$2.applyOrElse(HiveMetastoreCatalog.scala:643)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:334)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:281)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
{code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org