Posted to issues@spark.apache.org by "JinxinTang (Jira)" <ji...@apache.org> on 2020/05/01 07:00:00 UTC
[jira] [Commented] (SPARK-31598) LegacySimpleTimestampFormatter
incorrectly interprets pre-Gregorian timestamps
[ https://issues.apache.org/jira/browse/SPARK-31598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17097202#comment-17097202 ]
JinxinTang commented on SPARK-31598:
------------------------------------
Already fixed: https://issues.apache.org/jira/browse/SPARK-31557
> LegacySimpleTimestampFormatter incorrectly interprets pre-Gregorian timestamps
> ------------------------------------------------------------------------------
>
> Key: SPARK-31598
> URL: https://issues.apache.org/jira/browse/SPARK-31598
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.0.0, 3.1.0
> Reporter: Bruce Robbins
> Priority: Major
>
> As per discussion with [~maxgekk]:
> {{LegacySimpleTimestampFormatter#parse}} misinterprets pre-Gregorian timestamps:
> {noformat}
> scala> sql("set spark.sql.legacy.timeParserPolicy=LEGACY")
> res0: org.apache.spark.sql.DataFrame = [key: string, value: string]
> scala> val df1 = Seq("0002-01-01 00:00:00", "1000-01-01 00:00:00", "1800-01-01 00:00:00").toDF("expected")
> df1: org.apache.spark.sql.DataFrame = [expected: string]
> scala> val df2 = df1.select('expected, to_timestamp('expected, "yyyy-MM-dd HH:mm:ss").as("actual"))
> df2: org.apache.spark.sql.DataFrame = [expected: string, actual: timestamp]
> scala> df2.show(truncate=false)
> +-------------------+-------------------+
> |expected           |actual             |
> +-------------------+-------------------+
> |0002-01-01 00:00:00|0001-12-30 00:00:00|
> |1000-01-01 00:00:00|1000-01-06 00:00:00|
> |1800-01-01 00:00:00|1800-01-01 00:00:00|
> +-------------------+-------------------+
> scala>
> {noformat}
> Legacy timestamp parsing with JSON and CSV files is correct, so apparently {{LegacyFastTimestampFormatter}} does not have this issue (need to double check).
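The shifts in the table above (2 days earlier for year 0002, 5 days later for year 1000, none for 1800) match the known offset between the hybrid Julian/Gregorian calendar used by java.text.SimpleDateFormat and the proleptic Gregorian calendar used by java.time. The sketch below reproduces that offset outside Spark. It is an illustration of the calendar mismatch only, not Spark's actual code path: the object name CalendarShiftSketch and the helper reinterpret are hypothetical, and the UTC time zone is an assumption made to keep the result independent of the local zone.

```scala
import java.text.SimpleDateFormat
import java.time.{Instant, LocalDateTime, ZoneOffset}
import java.util.{Locale, TimeZone}

// Hypothetical helper, not part of Spark.
object CalendarShiftSketch {
  // Parse the text with SimpleDateFormat (hybrid Julian/Gregorian calendar),
  // then render the resulting instant with java.time (proleptic Gregorian).
  def reinterpret(text: String): LocalDateTime = {
    // Locale.US ensures the formatter is backed by java.util.GregorianCalendar.
    val fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.US)
    fmt.setTimeZone(TimeZone.getTimeZone("UTC")) // assumption: fixed UTC zone
    val millis = fmt.parse(text).getTime
    LocalDateTime.ofInstant(Instant.ofEpochMilli(millis), ZoneOffset.UTC)
  }

  def main(args: Array[String]): Unit = {
    Seq("0002-01-01 00:00:00", "1000-01-01 00:00:00", "1800-01-01 00:00:00")
      .foreach(s => println(s"$s -> ${reinterpret(s)}"))
  }
}
```

Dates after the 1582 cutover, such as 1800-01-01, come back unchanged because both calendars agree there; only earlier dates pick up the Julian offset.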