Posted to issues@spark.apache.org by "Maxim Gekk (Jira)" <ji...@apache.org> on 2020/01/24 20:48:00 UTC

[jira] [Commented] (SPARK-30632) to_timestamp() doesn't work with certain timezones

    [ https://issues.apache.org/jira/browse/SPARK-30632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17023263#comment-17023263 ] 

Maxim Gekk commented on SPARK-30632:
------------------------------------

Spark 2.4 and earlier versions use SimpleDateFormat to parse timestamp strings. Unfortunately, that class does not support region-based time zone IDs such as "America/Los_Angeles"; see [https://stackoverflow.com/questions/23242211/java-simpledateformat-parse-timezone-like-america-los-angeles]. Spark 3.0 has migrated to DateTimeFormatter, which does not have this issue. Porting the changes back to Spark 2.4 is risky and would destabilize it, IMHO.
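
To illustrate the difference outside of Spark, here is a minimal sketch that feeds the two inputs from the issue to both JDK parsers using the same pattern. It is not Spark's internal code: the object name TimeZoneParsingSketch and the helpers parseLegacy and parseModern are made up for this example, and Locale.US is an assumption chosen so that the zone names resolve the same way on every machine.

```scala
import java.text.SimpleDateFormat
import java.time.ZonedDateTime
import java.time.format.DateTimeFormatter
import java.util.{Date, Locale}
import scala.util.Try

// Hypothetical helper object, for illustration only (not part of Spark).
object TimeZoneParsingSketch {
  // Same pattern as in the issue description.
  val pattern = "yyyy-MM-dd HH:mm:ss.SSS z"

  // Legacy JDK parser, the class Spark 2.4 and earlier rely on.
  // A new instance is created per call because SimpleDateFormat is not thread-safe.
  def parseLegacy(s: String): Try[Date] =
    Try(new SimpleDateFormat(pattern, Locale.US).parse(s))

  // java.time parser, the class Spark 3.0 has migrated to.
  def parseModern(s: String): Try[ZonedDateTime] =
    Try(ZonedDateTime.parse(s, DateTimeFormatter.ofPattern(pattern, Locale.US)))

  def main(args: Array[String]): Unit = {
    val inputs = Seq(
      "2019-01-24 11:30:00.123 America/Los_Angeles",
      "2020-01-01 01:30:00.123 PST"
    )
    // Print Success(...) or Failure(...) for each parser and input.
    for (in <- inputs) {
      println(in)
      println(s"  SimpleDateFormat:  ${parseLegacy(in)}")
      println(s"  DateTimeFormatter: ${parseModern(in)}")
    }
  }
}
```

Each result is wrapped in Try so that a parse failure shows up as a Failure value instead of aborting the run, which makes the two parsers easy to compare side by side.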

> to_timestamp() doesn't work with certain timezones
> --------------------------------------------------
>
>                 Key: SPARK-30632
>                 URL: https://issues.apache.org/jira/browse/SPARK-30632
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.3.0, 2.4.4
>            Reporter: Anton Daitche
>            Priority: Major
>
> It seems that to_timestamp() doesn't work with timezones of the type <Country>/<City>, e.g. America/Los_Angeles.
> The code
> {code:scala}
> val df = Seq(
>     ("2019-01-24 11:30:00.123", "America/Los_Angeles"), 
>     ("2020-01-01 01:30:00.123", "PST")
> ).toDF("ts_str", "tz_name")
> val ts_parsed = to_timestamp(
>     concat_ws(" ", $"ts_str", $"tz_name"), "yyyy-MM-dd HH:mm:ss.SSS z"
> ).as("timestamp")
> df.select(ts_parsed).show(false)
> {code}
> prints
> {code}
> +-------------------+
> |timestamp          |
> +-------------------+
> |null               |
> |2020-01-01 10:30:00|
> +-------------------+
> {code}
> So, the datetime string with timezone PST is parsed properly, whereas the one with America/Los_Angeles is converted to null. According to [this|https://github.com/apache/spark/pull/24195#issuecomment-578055146] response on GitHub, this code works when run on a recent master version.
> See also the discussion in [this|https://github.com/apache/spark/pull/24195#issue] issue for more context.


