Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/04/04 12:48:53 UTC

[GitHub] [spark] cloud-fan commented on a change in pull request #24284: [SPARK-27356][SQL] File source V2: Fix the case that data columns overlap with partition schema

URL: https://github.com/apache/spark/pull/24284#discussion_r272161233
 
 

 ##########
 File path: docs/sql-migration-guide-upgrade.md
 ##########
 @@ -50,6 +50,8 @@ license: |
 
   - In Spark version 2.4 and earlier, JSON datasource and JSON functions like `from_json` convert a bad JSON record to a row with all `null`s in the PERMISSIVE mode when specified schema is `StructType`. Since Spark 3.0, the returned row can contain non-`null` fields if some of JSON column values were parsed and converted to desired types successfully.
 
+  - In Spark version 2.4 and earlier, if data columns overlap with partition columns, the output schema of file scan respects the ordering of data columns, and adopts the data type of partition columns. Since Spark 3.0, the output schema of file scan puts all the partition columns at the end. For example, if the data schema is `[a: String, b: String, c: String]` and the partition schema is `[b: Int, d: Int]`, the result schema is `[a: String, b: Int, c: String, d: Int]` in Spark 2.4 and earlier, and `[a: String, c: String, b: Int, d: Int]` since Spark 3.0.
 
 Review comment:
   Do we need a migration guide entry for this? It's a behavior change for file source V2, which is new in Spark 3.0.
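
   For reference, the two orderings described in the proposed guide entry can be sketched in plain Python. This is only an illustration of the example in the diff, not Spark's implementation: schemas are modeled as ordered lists of `(name, type)` pairs, the function names `merge_schema_v24` and `merge_schema_v30` are hypothetical, and column names are assumed to match exactly (case sensitivity is ignored).

```python
# Illustrative model only: a schema is an ordered list of (name, type) pairs.
# These helpers are hypothetical and do not correspond to Spark APIs.

def merge_schema_v24(data_schema, partition_schema):
    """Spark 2.4 and earlier, per the guide entry: keep the data column order,
    but take the type from the partition schema for overlapping columns."""
    partition_types = dict(partition_schema)
    data_names = {name for name, _ in data_schema}
    merged = [(name, partition_types.get(name, dtype)) for name, dtype in data_schema]
    # Partition columns that are not in the data schema are appended.
    merged += [(name, dtype) for name, dtype in partition_schema if name not in data_names]
    return merged


def merge_schema_v30(data_schema, partition_schema):
    """Spark 3.0, per the guide entry: all partition columns go at the end."""
    partition_names = {name for name, _ in partition_schema}
    merged = [(name, dtype) for name, dtype in data_schema if name not in partition_names]
    merged += list(partition_schema)
    return merged


data_schema = [("a", "String"), ("b", "String"), ("c", "String")]
partition_schema = [("b", "Int"), ("d", "Int")]

print(merge_schema_v24(data_schema, partition_schema))
print(merge_schema_v30(data_schema, partition_schema))
```

   With the schemas from the diff, the first call yields the order `a, b, c, d` and the second yields `a, c, b, d`, with `b` typed as `Int` in both.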
