Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2019/10/08 05:42:14 UTC
[jira] [Resolved] (SPARK-24969) SQL: to_date function can't parse date strings in different locales.
[ https://issues.apache.org/jira/browse/SPARK-24969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-24969.
----------------------------------
Resolution: Incomplete
> SQL: to_date function can't parse date strings in different locales.
> --------------------------------------------------------------------
>
> Key: SPARK-24969
> URL: https://issues.apache.org/jira/browse/SPARK-24969
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.1
> Environment: Bare Spark 2.2.1 installation, on RHEL 6.
> Reporter: Valentino Pinna
> Priority: Major
> Labels: bulk-closed
>
> The locale for {{org.apache.spark.sql.catalyst.util.DateTimeUtils}}, which is used internally by the {{to_date}} SQL function, is hard-coded to {{Locale.US}}.
> This causes problems when parsing a dataset whose dates are written in another language (Italian, in this case).
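> As a minimal sketch of the underlying JVM behaviour (plain {{java.text.SimpleDateFormat}}, outside Spark, which is what Spark 2.x uses for date parsing): month-name matching is locale-sensitive, so Italian abbreviations fail under {{Locale.US}}:
> {code:scala}
> import java.text.SimpleDateFormat
> import java.util.Locale
>
> val us = new SimpleDateFormat("yyyy MMM", Locale.US)
> val it = new SimpleDateFormat("yyyy MMM", Locale.ITALY)
>
> it.parse("2018 giu")  // OK: "giu" is the Italian abbreviation for June
> us.parse("2018 giu")  // throws java.text.ParseException under Locale.US
> {code}
> The reproduction below reads a CSV whose {{DATA}} column holds such strings: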
> {code:scala}
> import org.apache.spark.sql.functions.{col, to_date}
>
> spark.read
>   .option("sep", ";")
>   .csv(logFile)
>   .toDF("DATA", ...)  // remaining column names elided
>   .withColumn("DATA2", to_date(col("DATA"), "yyyy MMM"))
>   .show(10)
> {code}
> Results from example dataset:
> |*DATA*|*DATA2*|
> |2018 giu|null|
> |2018 mag|null|
> |2018 apr|2018-04-01|
> |2018 mar|2018-03-01|
> |2018 feb|2018-02-01|
> |2018 gen|null|
> |2017 dic|null|
> |2017 nov|2017-11-01|
> |2017 ott|null|
> |2017 set|null|
> Expected results: all values converted. (Rows such as {{2018 feb}}, {{2018 mar}}, {{2018 apr}} and {{2017 nov}} parse only because the Italian abbreviations happen to coincide with the English ones.)
> TEMPORARY WORKAROUND:
> In object {{org.apache.spark.sql.catalyst.util.DateTimeUtils}}, replace all instances of {{Locale.US}} with {{Locale.<your locale>}} and rebuild Spark.
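> A user-side alternative that avoids patching Spark is to parse with an explicit locale in a UDF and return a SQL {{DATE}}. A sketch, assuming a DataFrame {{df}} with the string column {{DATA}} (names are illustrative):
> {code:scala}
> import java.text.SimpleDateFormat
> import java.util.Locale
> import org.apache.spark.sql.functions.{col, udf}
>
> // Parse "yyyy MMM" with Italian month names; malformed inputs would
> // throw ParseException and fail the job, so guard as needed.
> val parseItalianDate = udf { s: String =>
>   if (s == null) null
>   else new java.sql.Date(
>     new SimpleDateFormat("yyyy MMM", Locale.ITALY).parse(s).getTime)
> }
>
> df.withColumn("DATA2", parseItalianDate(col("DATA"))).show(10)
> {code}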
> ADDITIONAL NOTES:
> I can make a pull request available on GitHub.