Posted to commits@spark.apache.org by sr...@apache.org on 2020/05/10 19:23:57 UTC

[spark] branch branch-3.0 updated: [SPARK-31669][SQL][TESTS] Fix RowEncoderSuite failures on non-existing dates/timestamps

This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 6f7c719  [SPARK-31669][SQL][TESTS] Fix RowEncoderSuite failures on non-existing dates/timestamps
6f7c719 is described below

commit 6f7c71947073f147bc35da196139d5ceb6fbdf45
Author: Max Gekk <ma...@gmail.com>
AuthorDate: Sun May 10 14:22:12 2020 -0500

    [SPARK-31669][SQL][TESTS] Fix RowEncoderSuite failures on non-existing dates/timestamps
    
    ### What changes were proposed in this pull request?
    Shift dates that do not exist in the Proleptic Gregorian calendar by 1 day. `RowEncoderSuite` generates random dates/timestamps in the hybrid calendar, and some of them, such as 1000-02-29, don't exist in the Proleptic Gregorian calendar because 1000 is not a leap year in that calendar.
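
    To illustrate which years are affected, here is a minimal sketch that is not part of the patch. It assumes that `java.util.GregorianCalendar` with its default cutover stands in for the hybrid calendar and that `java.time.Year` follows the Proleptic Gregorian rules. The object and method names (`LeapYearGap`, `missingFeb29Years`) are hypothetical and chosen only for this example.

    ```scala
    import java.time.Year
    import java.util.GregorianCalendar

    object LeapYearGap {
      // Hybrid Julian + Gregorian calendar with the default 1582 cutover (assumption:
      // this matches the "hybrid calendar" the commit message refers to).
      private val hybrid = new GregorianCalendar()

      // Leap-year rule of the hybrid calendar: Julian before the cutover.
      def leapInHybrid(year: Int): Boolean = hybrid.isLeapYear(year)

      // Leap-year rule of the Proleptic Gregorian calendar used by java.time.
      def leapInProleptic(year: Int): Boolean = Year.isLeap(year.toLong)

      // Years whose 29 February exists in the hybrid calendar only.
      def missingFeb29Years(from: Int, to: Int): Seq[Int] =
        (from to to).filter(y => leapInHybrid(y) && !leapInProleptic(y))
    }
    ```

    Under these assumptions, `LeapYearGap.missingFeb29Years(1, 2000)` should contain 1000 but not 1200 or 1600, since only century years not divisible by 400 differ between the two rules.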
    
    ### Why are the changes needed?
    This makes RowEncoderSuite much more stable.
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    ### How was this patch tested?
    By running RowEncoderSuite and setting a non-existing date manually:
    ```scala
    val date = new java.sql.Date(1000 - 1900, 1, 29)
    Try { date.toLocalDate; date }.getOrElse(new Date(date.getTime + MILLIS_PER_DAY))
    ```
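
    The snippet above relies on imports from `RandomDataGenerator.scala`. A self-contained sketch of the same check, runnable without Spark, might look like the following. The names `ShiftNonExistingDate`, `toValidDate` and `MillisPerDay` are hypothetical; `MillisPerDay` is assumed to equal Spark's `MILLIS_PER_DAY`.

    ```scala
    import java.sql.Date
    import java.time.LocalDate
    import scala.util.Try

    object ShiftNonExistingDate {
      // Local stand-in for DateTimeConstants.MILLIS_PER_DAY (assumed to be 86400000).
      val MillisPerDay: Long = 24L * 60 * 60 * 1000

      // Returns `date` unchanged when it converts to a Proleptic Gregorian LocalDate;
      // otherwise returns a new Date shifted forward by one day.
      def toValidDate(date: Date): Date =
        Try { date.toLocalDate; date }.getOrElse(new Date(date.getTime + MillisPerDay))
    }

    object ShiftNonExistingDateExample {
      // 29 February 1000 in the hybrid calendar (deprecated constructor, as in the
      // commit message: year is offset by 1900 and month is zero-based).
      val nonExisting: Date = new Date(1000 - 1900, 1, 29)
      val shifted: Date = ShiftNonExistingDate.toValidDate(nonExisting)
      val localDate: LocalDate = shifted.toLocalDate
    }
    ```

    Here `nonExisting.toLocalDate` throws because the Proleptic Gregorian calendar has no 29 February in year 1000, so the fallback branch builds the shifted date. A date that already converts cleanly is returned as the same instance.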
    
    Closes #28486 from MaxGekk/fix-RowEncoderSuite.
    
    Authored-by: Max Gekk <ma...@gmail.com>
    Signed-off-by: Sean Owen <sr...@gmail.com>
    (cherry picked from commit 9f768fa9916dec3cc695e3f28ec77148d81d335f)
    Signed-off-by: Sean Owen <sr...@gmail.com>
---
 .../org/apache/spark/sql/RandomDataGenerator.scala | 23 +++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
index a7c20c3..5a4d23d 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
@@ -18,9 +18,10 @@
 package org.apache.spark.sql
 
 import java.math.MathContext
+import java.sql.{Date, Timestamp}
 
 import scala.collection.mutable
-import scala.util.Random
+import scala.util.{Random, Try}
 
 import org.apache.spark.sql.catalyst.CatalystTypeConverters
 import org.apache.spark.sql.catalyst.util.DateTimeConstants.MILLIS_PER_DAY
@@ -172,7 +173,15 @@ object RandomDataGenerator {
               // January 1, 1970, 00:00:00 GMT for "9999-12-31 23:59:59.999999".
               milliseconds = rand.nextLong() % 253402329599999L
             }
-            DateTimeUtils.toJavaDate((milliseconds / MILLIS_PER_DAY).toInt)
+            val date = DateTimeUtils.toJavaDate((milliseconds / MILLIS_PER_DAY).toInt)
+            // The generated `date` is based on the hybrid calendar Julian + Gregorian since
+            // 1582-10-15 but it should be valid in Proleptic Gregorian calendar too which is used
+            // by Spark SQL since version 3.0 (see SPARK-26651). We try to convert `date` to
+            // a local date in Proleptic Gregorian calendar to satisfy this requirement.
+            // Some years are leap years in Julian calendar but not in Proleptic Gregorian calendar.
+            // As the consequence of that, 29 February of such years might not exist in Proleptic
+            // Gregorian calendar. When this happens, we shift the date by one day.
+            Try { date.toLocalDate; date }.getOrElse(new Date(date.getTime + MILLIS_PER_DAY))
           }
         Some(generator)
       case TimestampType =>
@@ -188,7 +197,15 @@ object RandomDataGenerator {
               milliseconds = rand.nextLong() % 253402329599999L
             }
             // DateTimeUtils.toJavaTimestamp takes microsecond.
-            DateTimeUtils.toJavaTimestamp(milliseconds * 1000)
+            val ts = DateTimeUtils.toJavaTimestamp(milliseconds * 1000)
+            // The generated `ts` is based on the hybrid calendar Julian + Gregorian since
+            // 1582-10-15 but it should be valid in Proleptic Gregorian calendar too which is used
+            // by Spark SQL since version 3.0 (see SPARK-26651). We try to convert `ts` to
+            // a local timestamp in Proleptic Gregorian calendar to satisfy this requirement.
+            // Some years are leap years in Julian calendar but not in Proleptic Gregorian calendar.
+            // As the consequence of that, 29 February of such years might not exist in Proleptic
+            // Gregorian calendar. When this happens, we shift the timestamp `ts` by one day.
+            Try { ts.toLocalDateTime; ts }.getOrElse(new Timestamp(ts.getTime + MILLIS_PER_DAY))
           }
         Some(generator)
       case CalendarIntervalType => Some(() => {

