Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/05/30 14:36:44 UTC

[GitHub] [spark] cxzl25 commented on a diff in pull request #32959: [SPARK-35780][SQL] Support DATE/TIMESTAMP literals across the full range

cxzl25 commented on code in PR #32959:
URL: https://github.com/apache/spark/pull/32959#discussion_r884894581


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala:
##########
@@ -518,44 +516,55 @@ object DateTimeUtils {
    * The return type is [[Option]] in order to distinguish between 0 and null. The following
    * formats are allowed:
    *
-   * `yyyy`
-   * `yyyy-[m]m`
-   * `yyyy-[m]m-[d]d`
-   * `yyyy-[m]m-[d]d `
-   * `yyyy-[m]m-[d]d *`
-   * `yyyy-[m]m-[d]dT*`
+   * `[+-]yyyy*`
+   * `[+-]yyyy*-[m]m`
+   * `[+-]yyyy*-[m]m-[d]d`
+   * `[+-]yyyy*-[m]m-[d]d `
+   * `[+-]yyyy*-[m]m-[d]d *`
+   * `[+-]yyyy*-[m]m-[d]dT*`
    */
   def stringToDate(s: UTF8String): Option[Int] = {
-    if (s == null) {
+    def isValidDigits(segment: Int, digits: Int): Boolean = {
+      // An integer is able to represent a date within [+-]5 million years.
+      var maxDigitsYear = 7

Review Comment:
   Can I add a configuration item that controls how many digits are allowed for the year?
   
   I found that writing the same value to tables in different storage formats produces different results.
   
   ```sql
   create table t(c1 date) stored as textfile;
   insert overwrite table t select cast( '22022-05-01' as date);
   select * from t; -- output null
   ```
   ```sql
   create table t(c1 date) stored as orcfile;
   insert overwrite table t select cast( '22022-05-01' as date);
   select * from t; -- output +22022-05-01
   ```
   This is because ORC and Parquet store dates as integers, whereas textfile and sequencefile store them as text.
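
   To illustrate why an integer-backed format keeps the value, here is a minimal sketch that stores a date as days since 1970-01-01 and formats it back. It uses plain `java.time` and is not the Spark or ORC code path; the class name `EpochDayRoundTrip` and its helper methods are hypothetical.

   ```java
   import java.time.LocalDate;

   public class EpochDayRoundTrip {
       // Days since 1970-01-01, the kind of integer a columnar DATE column holds.
       static int toEpochDay(int year, int month, int day) {
           return Math.toIntExact(LocalDate.of(year, month, day).toEpochDay());
       }

       // Rebuild the ISO-8601 text form from the stored integer.
       static String fromEpochDay(int days) {
           return LocalDate.ofEpochDay(days).toString();
       }

       public static void main(String[] args) {
           int days = toEpochDay(22022, 5, 1);
           // Years above 9999 are printed with a leading '+'.
           System.out.println(fromEpochDay(days)); // prints +22022-05-01
       }
   }
   ```

   The integer carries no fixed year width, so the 5-digit year survives the round trip. A text format has to parse the string again on read, and that is where a stricter reader can return null.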
   
   ![image](https://user-images.githubusercontent.com/3898450/171014098-734545c9-cbfd-4a5d-9f91-bef402a42197.png)
   
   
   
   However, querying through Hive JDBC fails, because `java.sql.Date.valueOf` only accepts 4-digit years.
   
   ```
   Caused by: java.lang.IllegalArgumentException
     at java.sql.Date.valueOf(Date.java:143)
     at org.apache.hive.jdbc.HiveBaseResultSet.evaluate(HiveBaseResultSet.java:447)
   ```
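
   The failure can be reproduced without Hive. The sketch below calls `java.sql.Date.valueOf` directly on a few strings; the class name `SqlDateYearLimit` and the `accepts` helper are hypothetical and only wrap the exception into a boolean.

   ```java
   import java.sql.Date;

   public class SqlDateYearLimit {
       // Returns true if java.sql.Date.valueOf accepts the text,
       // false if it throws IllegalArgumentException.
       static boolean accepts(String text) {
           try {
               Date.valueOf(text);
               return true;
           } catch (IllegalArgumentException e) {
               return false;
           }
       }

       public static void main(String[] args) {
           System.out.println(accepts("2022-05-01"));   // prints true
           System.out.println(accepts("22022-05-01"));  // prints false
           System.out.println(accepts("+22022-05-01")); // prints false
       }
   }
   ```

   Both the plain 5-digit form and the `+`-prefixed form are rejected, so any client that converts the returned string with `Date.valueOf` would hit the same exception.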



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

