Posted to commits@spark.apache.org by sr...@apache.org on 2020/05/10 19:23:57 UTC
[spark] branch branch-3.0 updated: [SPARK-31669][SQL][TESTS] Fix
RowEncoderSuite failures on non-existing dates/timestamps
This is an automated email from the ASF dual-hosted git repository.
srowen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.0 by this push:
new 6f7c719 [SPARK-31669][SQL][TESTS] Fix RowEncoderSuite failures on non-existing dates/timestamps
6f7c719 is described below
commit 6f7c71947073f147bc35da196139d5ceb6fbdf45
Author: Max Gekk <ma...@gmail.com>
AuthorDate: Sun May 10 14:22:12 2020 -0500
[SPARK-31669][SQL][TESTS] Fix RowEncoderSuite failures on non-existing dates/timestamps
### What changes were proposed in this pull request?
Shift dates that do not exist in the Proleptic Gregorian calendar by 1 day. The reason is that `RowEncoderSuite` generates random dates/timestamps in the hybrid calendar, and some of them do not exist in the Proleptic Gregorian calendar. For example, 1000-02-29 does not exist there because 1000 is not a leap year in the Proleptic Gregorian calendar.
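The calendar mismatch described above can be reproduced with JDK classes alone. The following is a minimal sketch, not part of the patch: the object name `LeapYearGap` and its helper methods are hypothetical, and it assumes `java.util.GregorianCalendar` with its default cutover stands in for the hybrid calendar while `java.time` stands in for the Proleptic Gregorian calendar.

```scala
import java.time.{LocalDate, Year}
import java.util.GregorianCalendar
import scala.util.Try

// Hypothetical helper object, for illustration only.
object LeapYearGap {
  // Hybrid calendar: Julian rules before the 1582-10-15 cutover,
  // Gregorian rules afterwards (the java.util / java.sql behaviour).
  def isLeapInHybrid(year: Int): Boolean =
    new GregorianCalendar().isLeapYear(year)

  // Proleptic Gregorian (ISO) calendar, as implemented by java.time.
  def isLeapInProleptic(year: Int): Boolean =
    Year.isLeap(year.toLong)

  // LocalDate.of throws DateTimeException for a day that does not exist,
  // so a successful Try means the date is valid in Proleptic Gregorian.
  def existsInProleptic(year: Int, month: Int, day: Int): Boolean =
    Try(LocalDate.of(year, month, day)).isSuccess
}
```

Under these assumptions, year 1000 is a leap year only under the Julian rules, so `existsInProleptic(1000, 2, 29)` is false, whereas a year such as 1200 (divisible by 400) is a leap year in both calendars.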
### Why are the changes needed?
This makes `RowEncoderSuite` much more stable.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
By running `RowEncoderSuite` and setting a non-existing date manually:
```scala
val date = new java.sql.Date(1000 - 1900, 1, 29)
Try { date.toLocalDate; date }.getOrElse(new Date(date.getTime + MILLIS_PER_DAY))
```
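For readers who want to try the fallback outside Spark, here is a self-contained sketch of the same `Try`/`getOrElse` pattern. The object name `ShiftNonExistingDate`, the method `toValidDate`, and the local constant `MillisPerDay` are assumptions introduced for illustration; the constant is a stand-in for Spark's `DateTimeConstants.MILLIS_PER_DAY`.

```scala
import java.sql.Date
import scala.util.Try

// Hypothetical wrapper around the expression used in the patch.
object ShiftNonExistingDate {
  // Local stand-in for Spark's DateTimeConstants.MILLIS_PER_DAY.
  val MillisPerDay: Long = 24L * 60 * 60 * 1000

  // Returns `date` unchanged when it converts to a java.time.LocalDate,
  // that is, when it exists in the Proleptic Gregorian calendar.
  // Otherwise returns a new Date shifted forward by one day.
  def toValidDate(date: Date): Date =
    Try { date.toLocalDate; date }
      .getOrElse(new Date(date.getTime + MillisPerDay))
}
```

With this sketch, `ShiftNonExistingDate.toValidDate(new Date(1000 - 1900, 1, 29))` should yield a date that converts to 1000-03-01, while a date that already exists in both calendars is returned as the same instance.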
Closes #28486 from MaxGekk/fix-RowEncoderSuite.
Authored-by: Max Gekk <ma...@gmail.com>
Signed-off-by: Sean Owen <sr...@gmail.com>
(cherry picked from commit 9f768fa9916dec3cc695e3f28ec77148d81d335f)
Signed-off-by: Sean Owen <sr...@gmail.com>
---
.../org/apache/spark/sql/RandomDataGenerator.scala | 23 +++++++++++++++++++---
1 file changed, 20 insertions(+), 3 deletions(-)
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
index a7c20c3..5a4d23d 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
@@ -18,9 +18,10 @@
package org.apache.spark.sql
import java.math.MathContext
+import java.sql.{Date, Timestamp}
import scala.collection.mutable
-import scala.util.Random
+import scala.util.{Random, Try}
import org.apache.spark.sql.catalyst.CatalystTypeConverters
import org.apache.spark.sql.catalyst.util.DateTimeConstants.MILLIS_PER_DAY
@@ -172,7 +173,15 @@ object RandomDataGenerator {
// January 1, 1970, 00:00:00 GMT for "9999-12-31 23:59:59.999999".
milliseconds = rand.nextLong() % 253402329599999L
}
- DateTimeUtils.toJavaDate((milliseconds / MILLIS_PER_DAY).toInt)
+ val date = DateTimeUtils.toJavaDate((milliseconds / MILLIS_PER_DAY).toInt)
+ // The generated `date` is based on the hybrid calendar Julian + Gregorian since
+ // 1582-10-15 but it should be valid in Proleptic Gregorian calendar too which is used
+ // by Spark SQL since version 3.0 (see SPARK-26651). We try to convert `date` to
+ // a local date in Proleptic Gregorian calendar to satisfy this requirement.
+ // Some years are leap years in Julian calendar but not in Proleptic Gregorian calendar.
+ // As the consequence of that, 29 February of such years might not exist in Proleptic
+ // Gregorian calendar. When this happens, we shift the date by one day.
+ Try { date.toLocalDate; date }.getOrElse(new Date(date.getTime + MILLIS_PER_DAY))
}
Some(generator)
case TimestampType =>
@@ -188,7 +197,15 @@ object RandomDataGenerator {
milliseconds = rand.nextLong() % 253402329599999L
}
// DateTimeUtils.toJavaTimestamp takes microsecond.
- DateTimeUtils.toJavaTimestamp(milliseconds * 1000)
+ val ts = DateTimeUtils.toJavaTimestamp(milliseconds * 1000)
+ // The generated `ts` is based on the hybrid calendar Julian + Gregorian since
+ // 1582-10-15 but it should be valid in Proleptic Gregorian calendar too which is used
+ // by Spark SQL since version 3.0 (see SPARK-26651). We try to convert `ts` to
+ // a local timestamp in Proleptic Gregorian calendar to satisfy this requirement.
+ // Some years are leap years in Julian calendar but not in Proleptic Gregorian calendar.
+ // As the consequence of that, 29 February of such years might not exist in Proleptic
+ // Gregorian calendar. When this happens, we shift the timestamp `ts` by one day.
+ Try { ts.toLocalDateTime; ts }.getOrElse(new Timestamp(ts.getTime + MILLIS_PER_DAY))
}
Some(generator)
case CalendarIntervalType => Some(() => {