Posted to issues@spark.apache.org by "dongxu (JIRA)" <ji...@apache.org> on 2015/04/01 07:15:53 UTC

[jira] [Created] (SPARK-6644) [SPARK-SQL]when the partition schema does not match table schema(ADD COLUMN), new column is NULL

dongxu created SPARK-6644:
-----------------------------

             Summary: [SPARK-SQL]when the partition schema does not match table schema(ADD COLUMN), new column is NULL
                 Key: SPARK-6644
                 URL: https://issues.apache.org/jira/browse/SPARK-6644
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.3.0
            Reporter: dongxu


In Hive, the schema of a partition may differ from the table schema, for example after a new column is added to the table. When spark-sql queries a partition whose schema differs from the table schema, some problems have already been solved (https://github.com/apache/spark/pull/4289),
but if you add a new column and then write new data into the old partition, the new column's value is read back as NULL.

Steps to reproduce:

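// Assumed imports (not in the original report): TestHive supplies sql(...) and toDF().
import org.apache.spark.sql.hive.test.TestHive
import TestHive._
import TestHive.implicits._

// tmpDir is not defined in this report; it is assumed to be a java.io.File
// pointing at an empty temporary directory.
case class TestData(key: Int, value: String)
val testData = TestHive.sparkContext.parallelize(
  (1 to 10).map(i => TestData(i, i.toString))).toDF()
testData.registerTempTable("testData")

sql("DROP TABLE IF EXISTS table_with_partition")
sql(s"CREATE TABLE IF NOT EXISTS table_with_partition(key int, value string) PARTITIONED BY (ds string) LOCATION '${tmpDir.toURI.toString}'")
sql("INSERT OVERWRITE TABLE table_with_partition PARTITION (ds='1') SELECT key, value FROM testData")

// Add two columns to the table after partition ds='1' already exists
sql("ALTER TABLE table_with_partition ADD COLUMNS (key1 string)")
sql("ALTER TABLE table_with_partition ADD COLUMNS (destlng double)")

// Overwrite the old partition with values for the new columns, then read it back
sql("INSERT OVERWRITE TABLE table_with_partition PARTITION (ds='1') SELECT key, value, 'test', 1.11 FROM testData")
sql("SELECT * FROM table_with_partition WHERE ds='1'").collect().foreach(println)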
 
Actual result:
[1,1,null,null,1]
[2,2,null,null,1]
 
Expected result:
[1,1,test,1.11,1]
[2,2,test,1.11,1]
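
One way to picture the difference between the two results is the small model below. It is only a sketch in plain Scala, not Spark or Hive code, and it makes no claim about where the bug actually lives: it assumes that a row is decoded with some column list and then widened to the table schema, so any table column missing from that list comes back as null. The names PartitionSchemaSketch, readRow, and stalePartitionSchema are hypothetical, and the partition column ds is left out for brevity.

```scala
// Hypothetical model of the symptom; none of these names exist in Spark or Hive.
object PartitionSchemaSketch {
  type Row = Seq[Any]

  // Widen a stored record to the table schema. Only columns listed in
  // `readSchema` are decoded; every other table column becomes null.
  def readRow(
      stored: Map[String, Any],
      readSchema: Seq[String],
      tableSchema: Seq[String]): Row =
    tableSchema.map { col =>
      if (readSchema.contains(col)) stored.getOrElse(col, null) else null
    }

  // Table schema after the two ALTER TABLE ... ADD COLUMNS statements.
  val tableSchema: Seq[String] = Seq("key", "value", "key1", "destlng")

  // Assumption: the old partition still carries the pre-ALTER column list.
  val stalePartitionSchema: Seq[String] = Seq("key", "value")

  // One record as written by the second INSERT OVERWRITE.
  val stored: Map[String, Any] =
    Map("key" -> 1, "value" -> "1", "key1" -> "test", "destlng" -> 1.11)

  def main(args: Array[String]): Unit = {
    // Decoded with the stale column list: the new columns are null.
    println(readRow(stored, stalePartitionSchema, tableSchema))
    // Decoded with the full table schema: the new columns carry their values.
    println(readRow(stored, tableSchema, tableSchema))
  }
}
```

In this model the first call corresponds to the actual result and the second to the expected one, which suggests checking which column list is used when the old partition is read.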



