You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "dongxu (JIRA)" <ji...@apache.org> on 2015/04/01 07:15:53 UTC
[jira] [Created] (SPARK-6644) [SPARK-SQL]when the partition schema
does not match table schema(ADD COLUMN), new column is NULL
dongxu created SPARK-6644:
-----------------------------
Summary: [SPARK-SQL]when the partition schema does not match table schema(ADD COLUMN), new column is NULL
Key: SPARK-6644
URL: https://issues.apache.org/jira/browse/SPARK-6644
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.3.0
Reporter: dongxu
In hive,the schema of partition may be difference from the table schema. For example, we add new column. When we use spark-sql to query the data of partition which schema is difference from the table schema.
some problems is solved(https://github.com/apache/spark/pull/4289),
but if you add a new column,put new data into the old partition,new column value is NULL
[According to the following steps]:
case class TestData(key: Int, value: String)
val testData = TestHive.sparkContext.parallelize(
(1 to 10).map(i => TestData(i, i.toString))).toDF()
testData.registerTempTable("testData")
sql("DROP TABLE IF EXISTS table_with_partition ")
sql(s"CREATE TABLE IF NOT EXISTS table_with_partition(key int,value string) PARTITIONED by (ds string) location '${tmpDir.toURI.toString}' ")
sql("INSERT OVERWRITE TABLE table_with_partition partition (ds='1') SELECT key,value FROM testData")
// add column to table
sql("ALTER TABLE table_with_partition ADD COLUMNS(key1 string)")
sql("ALTER TABLE table_with_partition ADD COLUMNS(destlng double)")
sql("INSERT OVERWRITE TABLE table_with_partition partition (ds='1') SELECT key,value,'test',1.11 FROM testData")
sql("select * from table_with_partition where ds='1' ").collect().foreach(println)
result :
[1,1,null,null,1]
[2,2,null,null,1]
result we expect:
[1,1,test,1.11,1]
[2,2,test,1.11,1]
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org