Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/15 08:46:02 UTC
[jira] [Updated] (SPARK-16239) SQL issues with cast from date to string around daylight savings time
[ https://issues.apache.org/jira/browse/SPARK-16239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon updated SPARK-16239:
---------------------------------
Priority: Major (was: Critical)
> SQL issues with cast from date to string around daylight savings time
> ---------------------------------------------------------------------
>
> Key: SPARK-16239
> URL: https://issues.apache.org/jira/browse/SPARK-16239
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.6.1
> Reporter: Glen Maisey
> Priority: Major
>
> Hi all,
> I have a DataFrame with a date column. When I cast it to a string using the Spark SQL cast function, the result is the wrong date on certain days. Looking into it, this occurs once a year, on the day summer daylight savings starts.
> I've tried to show this issue in the code below. The toString() function works correctly whereas the cast does not.
> Unfortunately my users write SQL rather than Scala DataFrame code, so this workaround does not apply. This was actually picked up when a user wrote something like "SELECT date1 UNION ALL SELECT date2", where date1 was a string and date2 was a date type. The date must be implicitly converted to a string, which produces this error.
> I'm in the Australia/Sydney timezone (see the time changes here: http://www.timeanddate.com/time/zone/australia/sydney).
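> // Assumes a Spark shell session where sc and sqlContext.implicits._ are in scope.
> import org.apache.spark.sql.functions.col
>
> val dates = Array("2014-10-03", "2014-10-04", "2014-10-05", "2014-10-06", "2015-10-02", "2015-10-03", "2015-10-04", "2015-10-05")
> // Parse the strings into a date column.
> val df = sc.parallelize(dates)
>   .toDF("txn_date")
>   .select(col("txn_date").cast("Date"))
> df.select(
>     col("txn_date"),
>     col("txn_date").cast("Timestamp").alias("txn_date_timestamp"),
>     // Spark SQL cast to string: off by one day when daylight savings starts.
>     col("txn_date").cast("String").alias("txn_date_str_cast"),
>     // Note: toString() is applied to the string literal "txn_date" here,
>     // so this selects the original date column again.
>     col("txn_date".toString()).alias("txn_date_str_toString")
>   )
>   .show()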
> +----------+--------------------+-----------------+---------------------+
> | txn_date| txn_date_timestamp|txn_date_str_cast|txn_date_str_toString|
> +----------+--------------------+-----------------+---------------------+
> |2014-10-03|2014-10-02 14:00:...| 2014-10-03| 2014-10-03|
> |2014-10-04|2014-10-03 14:00:...| 2014-10-04| 2014-10-04|
> |2014-10-05|2014-10-04 13:00:...| 2014-10-04| 2014-10-05|
> |2014-10-06|2014-10-05 13:00:...| 2014-10-06| 2014-10-06|
> |2015-10-02|2015-10-01 14:00:...| 2015-10-02| 2015-10-02|
> |2015-10-03|2015-10-02 14:00:...| 2015-10-03| 2015-10-03|
> |2015-10-04|2015-10-03 13:00:...| 2015-10-03| 2015-10-04|
> |2015-10-05|2015-10-04 13:00:...| 2015-10-05| 2015-10-05|
> +----------+--------------------+-----------------+---------------------+
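The off-by-one rows in the table are consistent with a local-midnight calculation that looks up the zone offset at the wrong instant. The sketch below is only an illustration of that idea in plain Scala, without Spark. It is an assumption about the cause, not something the report confirms, and the names DstOffsetSketch, correctMidnight and naiveMidnight are hypothetical, not Spark APIs.

```scala
import java.time.{Instant, LocalDate, ZoneId}
import java.util.TimeZone

object DstOffsetSketch {
  val zone: ZoneId = ZoneId.of("Australia/Sydney")
  val tz: TimeZone = TimeZone.getTimeZone("Australia/Sydney")
  val millisPerDay: Long = 24L * 60 * 60 * 1000

  // Local midnight computed with the offset that is valid at local midnight.
  def correctMidnight(d: LocalDate): Instant =
    d.atStartOfDay(zone).toInstant

  // Hypothetical naive variant: take UTC midnight of the date, then subtract
  // the zone offset looked up at that UTC instant. On the day daylight
  // savings starts, UTC midnight already falls after the switch, so the
  // offset is one hour too large.
  def naiveMidnight(d: LocalDate): Instant = {
    val millisUtc = d.toEpochDay * millisPerDay
    Instant.ofEpochMilli(millisUtc - tz.getOffset(millisUtc))
  }

  // Calendar date in Sydney for a given instant.
  def localDate(i: Instant): LocalDate = i.atZone(zone).toLocalDate

  def main(args: Array[String]): Unit = {
    val days = Seq("2014-10-04", "2014-10-05", "2014-10-06").map(s => LocalDate.parse(s))
    for (d <- days) {
      println(s"$d correct=${localDate(correctMidnight(d))} naive=${localDate(naiveMidnight(d))}")
    }
  }
}
```

Under this assumption, the naive instant for 2014-10-05 is 2014-10-04 13:00 UTC, which is 23:00 on 2014-10-04 in Sydney, so formatting it as a local date yields the previous day, as in the txn_date_str_cast column.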
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)