Posted to issues@spark.apache.org by "Wenchen Fan (Jira)" <ji...@apache.org> on 2020/05/31 12:46:00 UTC

[jira] [Resolved] (SPARK-31867) Fix silent data change for datetime formatting

     [ https://issues.apache.org/jira/browse/SPARK-31867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-31867.
---------------------------------
    Fix Version/s: 3.0.0
       Resolution: Fixed

Issue resolved by pull request 28684
[https://github.com/apache/spark/pull/28684]

> Fix silent data change for datetime formatting 
> -----------------------------------------------
>
>                 Key: SPARK-31867
>                 URL: https://issues.apache.org/jira/browse/SPARK-31867
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0, 3.1.0
>            Reporter: Kent Yao
>            Assignee: Kent Yao
>            Priority: Blocker
>             Fix For: 3.0.0
>
>
> {code:java}
> spark-sql> select from_unixtime(1, 'yyyyyyyyyyy-MM-dd');
> NULL
> spark-sql> set spark.sql.legacy.timeParserPolicy=legacy;
> spark.sql.legacy.timeParserPolicy	legacy
> spark-sql> select from_unixtime(1, 'yyyyyyyyyyy-MM-dd');
> 00000001970-01-01
> spark-sql>
> {code}
> For patterns that support `SignStyle.EXCEEDS_PAD`, e.g. `y..y` (length >= 4), `NumberPrinterParser` formats the value with the following check:
> {code:java}
> switch (signStyle) {
>   case EXCEEDS_PAD:
>     if (minWidth < 19 && value >= EXCEED_POINTS[minWidth]) {
>       buf.append(decimalStyle.getPositiveSign());
>     }
>     break;
>    
>            ....
> {code}
> Here `minWidth` equals `len(y..y)`, the number of pattern letters, and `EXCEED_POINTS` is defined as:
> {code:java}
> /**
>  * Array of 10 to the power of n.
>  */
> static final long[] EXCEED_POINTS = new long[] {
>     0L,
>     10L,
>     100L,
>     1000L,
>     10000L,
>     100000L,
>     1000000L,
>     10000000L,
>     100000000L,
>     1000000000L,
>     10000000000L,
> };
> {code}
> So when `len(y..y)` is greater than 10 (and less than 19, so that the array is still indexed), an `ArrayIndexOutOfBoundsException` is raised, because the array has only 11 entries (indices 0 to 10).
> On the caller side, `from_unixtime` suppresses the exception and returns NULL, so the data changes silently; for `date_format`, the `ArrayIndexOutOfBoundsException` propagates to the user.
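The failure mode described above can be reproduced in isolation. The following is only a minimal sketch, not Spark or JDK code: the class `ExceedsPadSketch` and the methods `needsPositiveSign`, `formatYear` and `formatOrNull` are hypothetical names introduced for illustration. It copies the `EXCEED_POINTS` array and the `EXCEEDS_PAD` condition quoted in the description, then contrasts a caller that swallows the exception with one that lets it propagate.

```java
final class ExceedsPadSketch {
    // Same values as the array quoted in the issue: 11 entries, valid indices 0..10.
    static final long[] EXCEED_POINTS = {
        0L, 10L, 100L, 1000L, 10000L, 100000L, 1000000L,
        10000000L, 100000000L, 1000000000L, 10000000000L,
    };

    // Mirrors the quoted EXCEEDS_PAD condition. For minWidth in 11..18 the
    // array access is out of bounds and throws ArrayIndexOutOfBoundsException.
    static boolean needsPositiveSign(int minWidth, long value) {
        return minWidth < 19 && value >= EXCEED_POINTS[minWidth];
    }

    // Hypothetical formatter: zero-pads the year to minWidth digits and adds
    // a '+' sign when the value exceeds the padded width.
    static String formatYear(int minWidth, long year) {
        String sign = needsPositiveSign(minWidth, year) ? "+" : "";
        return sign + String.format("%0" + minWidth + "d", year);
    }

    // Hypothetical caller that suppresses any failure and yields null,
    // standing in for the behaviour described for from_unixtime.
    static String formatOrNull(int minWidth, long year) {
        try {
            return formatYear(minWidth, year);
        } catch (RuntimeException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(formatYear(4, 1970L));    // prints "1970"
        System.out.println(formatOrNull(11, 1970L)); // prints "null": the error is hidden
        try {
            formatYear(11, 1970L);                   // the unguarded caller sees the exception
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("ArrayIndexOutOfBoundsException for minWidth = 11");
        }
    }
}
```

With a width of 11, the guarded caller returns null without any indication that the pattern was the problem, while the unguarded caller fails with an exception that says nothing about the pattern length.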



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
