Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/02/20 10:31:58 UTC

[GitHub] eatoncys opened a new pull request #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types

URL: https://github.com/apache/spark/pull/23010
 
 
   ## What changes were proposed in this pull request?
   
   A dynamic partition write fails when '' and null both occur as values of a string partition column in the same write.
   For example, the test below fails before this PR:
   
     test("Null and '' values should not cause dynamic partition failure of string types") {
       withTable("t1", "t2") {
         spark.range(3).write.saveAsTable("t1")
         spark.sql("select id, cast(case when id = 1 then '' else null end as string) as p" +
           " from t1").write.partitionBy("p").saveAsTable("t2")
         checkAnswer(spark.table("t2").sort("id"), Seq(Row(0, null), Row(1, null), Row(2, null)))
       }
     }
   
   The error is: 'org.apache.hadoop.fs.FileAlreadyExistsException: File already exists'.
   This PR guards against such file conflicts: when the target file already exists, the file is renamed instead of the write failing.
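
   To make the conflict concrete, here is a minimal, self-contained sketch of why the two values can collide. It assumes that null and '' are both written to the `__HIVE_DEFAULT_PARTITION__` directory, which is consistent with the test above expecting `Row(1, null)` for the row written with ''. The helpers `partitionDir` and `uniquePath`, the file name `part-00000`, and the numeric-suffix renaming scheme are hypothetical illustrations, not the code in this PR or Spark's writer.

   ```scala
   // Illustrative model only: these helpers are hypothetical, not Spark's writer code.
   object PartitionConflictSketch {
     // Directory name used for null (and, by assumption here, empty) partition values.
     val DefaultPartitionName = "__HIVE_DEFAULT_PARTITION__"

     // Assumption: null and '' both map to the default partition directory.
     def partitionDir(column: String, value: String): String = {
       val v = if (value == null || value.isEmpty) DefaultPartitionName else value
       s"$column=$v"
     }

     // Rename on conflict: append a numeric suffix until the path is unused.
     def uniquePath(existing: Set[String], path: String): String =
       if (!existing.contains(path)) path
       else Iterator.from(1).map(i => s"$path.$i").find(p => !existing.contains(p)).get

     def main(args: Array[String]): Unit = {
       val fileName = "part-00000" // hypothetical output file name
       val forNull = s"${partitionDir("p", null)}/$fileName"
       val forEmpty = s"${partitionDir("p", "")}/$fileName"
       println(forNull == forEmpty)                // the two distinct values target one path
       println(uniquePath(Set(forNull), forEmpty)) // a renamed path avoids the collision
     }
   }
   ```

   Under these assumptions, a writer that treats null and '' as two different partition values would try to create the same file twice, which matches the FileAlreadyExistsException reported above.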
   
   
   
   ## How was this patch tested?
   With the newly added test shown above.
