Posted to issues@spark.apache.org by "Rajkumar Singh (Jira)" <ji...@apache.org> on 2020/01/30 23:05:00 UTC
[jira] [Created] (SPARK-30688) Spark SQL Unix Timestamp produces incorrect result with unix_timestamp UDF
Rajkumar Singh created SPARK-30688:
--------------------------------------
Summary: Spark SQL Unix Timestamp produces incorrect result with unix_timestamp UDF
Key: SPARK-30688
URL: https://issues.apache.org/jira/browse/SPARK-30688
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.3.2
Reporter: Rajkumar Singh
{code:java}
scala> spark.sql("select unix_timestamp('20201', 'yyyyww')").show();
+-----------------------------+
|unix_timestamp(20201, yyyyww)|
+-----------------------------+
| null|
+-----------------------------+
scala> spark.sql("select unix_timestamp('20202', 'yyyyww')").show();
+-----------------------------+
|unix_timestamp(20202, yyyyww)|
+-----------------------------+
| 1578182400|
+-----------------------------+
{code}
This seems to happen for leap years only. I dug deeper into it, and it seems that Spark uses java.text.SimpleDateFormat and tries to parse the expression here:
[org.apache.spark.sql.catalyst.expressions.UnixTime#eval|https://github.com/hortonworks/spark2/blob/49ec35bbb40ec6220282d932c9411773228725be/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala#L652]
{code:java}
formatter.parse(
  t.asInstanceOf[UTF8String].toString).getTime / 1000L{code}
but the parse fails: SimpleDateFormat is unable to parse the date and throws an "Unparseable date" ParseException, which Spark handles silently by returning NULL.
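The failure can be reproduced outside Spark with plain SimpleDateFormat. The sketch below assumes the same setup Spark 2.x applies in DateTimeUtils.newDateFormat (US locale and setLenient(false)), plus a UTC time zone so the epoch value matches the query output above. Under US week conventions (first day of week is Sunday, minimal days in first week is 1), the default day for week 1 of 2020 is Sunday 2019-12-29, so the recomputed calendar year (2019) contradicts the parsed year (2020) and the non-lenient calendar rejects the parse:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

public class WeekParseRepro {
    public static void main(String[] args) throws ParseException {
        // Same configuration Spark 2.x uses in DateTimeUtils.newDateFormat:
        // US locale (firstDayOfWeek = Sunday, minimalDaysInFirstWeek = 1)
        // and strict (non-lenient) parsing. UTC matches the session zone.
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyww", Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        fmt.setLenient(false);

        // Week 2 of 2020: the default day-of-week is Sunday -> 2020-01-05 UTC.
        long ok = fmt.parse("20202").getTime() / 1000L;
        System.out.println("20202 -> " + ok);   // 1578182400, as in the query above

        // Week 1 of 2020: the default Sunday falls on 2019-12-29, so the
        // recomputed YEAR no longer equals the parsed YEAR and the
        // non-lenient calendar throws, surfaced as a ParseException.
        try {
            fmt.parse("20201");
            System.out.println("20201 -> parsed");
        } catch (ParseException e) {
            System.out.println("20201 -> " + e.getMessage());
        }
    }
}
```

With setLenient(true) the week-1 input would instead resolve to a date in late December 2019, which is why Spark's strict formatter sees an exception rather than a wrong answer.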
--
This message was sent by Atlassian Jira
(v8.3.4#803005)