You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2017/02/08 06:12:42 UTC
[jira] [Comment Edited] (SPARK-19496) to_date with format has weird
behavior
[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15857459#comment-15857459 ]
Hyukjin Kwon edited comment on SPARK-19496 at 2/8/17 6:12 AM:
--------------------------------------------------------------
- Hive
{code}
hive> SELECT to_date('2014-31-12');
2016-07-12
...
hive> SELECT to_date('2014-12-32');
2015-01-01
...
hive> SELECT to_date('2014-12-31');
2014-12-31
{code}
{code}
hive> SELECT to_date('2015-07-22', 'yyyy-dd-MM')
org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:7 Arguments length mismatch ''yyyy-dd-MM'': to_date() requires 1 argument, got 2
...
{code}
- Postgres
{code}
postgres=# SELECT to_date('2014-12-31');
ERROR: function to_date(unknown) does not exist
LINE 1: SELECT to_date('2014-12-31');
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
postgres=# SELECT to_date('2014-12-31', 'yyyy-MM-dd');
to_date
------------
2014-12-31
(1 row)
postgres=# SELECT to_date('2014-13-31', 'yyyy-MM-dd');
to_date
------------
2015-01-31
(1 row)
{code}
- Spark
{code}
spark-sql> SELECT to_date('2014-31-12');
NULL
...
spark-sql> SELECT to_date('2014-12-32');
NULL
...
spark-sql> SELECT to_date('2014-12-31');
2014-12-31
...
{code}
{code}
spark-sql> SELECT to_date('2015-07-22', 'yyyy-dd-MM')
2016-10-07
...
{code}
- MySQL
{code}
mysql> SELECT str_to_date('2014-12-31', '%Y-%m-%d');
+---------------------------------------+
| str_to_date('2014-12-31', '%Y-%m-%d') |
+---------------------------------------+
| 2014-12-31 |
+---------------------------------------+
1 row in set (0.00 sec)
mysql> SELECT str_to_date('2014-13-31', '%Y-%m-%d');
+---------------------------------------+
| str_to_date('2014-13-31', '%Y-%m-%d') |
+---------------------------------------+
| NULL |
+---------------------------------------+
1 row in set, 1 warning (0.00 sec)
{code}
MySQL/SparkSQL - it seems returning {{NULL}}. MySQL seems not supporting omitting the format.
Hive/Postgres - it seems returning calculated dates. Hive seems not supporting the format. Postgres seems not supporting omitting the format.
was (Author: hyukjin.kwon):
- Hive
{code}
hive> SELECT to_date('2014-31-12');
2016-07-12
...
hive> SELECT to_date('2014-12-32');
2015-01-01
...
hive> SELECT to_date('2014-12-31');
2014-12-31
{code}
- Postgres
{code}
postgres=# SELECT to_date('2014-12-31');
ERROR: function to_date(unknown) does not exist
LINE 1: SELECT to_date('2014-12-31');
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
postgres=# SELECT to_date('2014-12-31', 'yyyy-MM-dd');
to_date
------------
2014-12-31
(1 row)
postgres=# SELECT to_date('2014-13-31', 'yyyy-MM-dd');
to_date
------------
2015-01-31
(1 row)
{code}
- Spark
{code}
spark-sql> SELECT to_date('2014-31-12');
NULL
...
spark-sql> SELECT to_date('2014-12-32');
NULL
...
spark-sql> SELECT to_date('2014-12-31');
2014-12-31
...
{code}
- MySQL
{code}
mysql> SELECT str_to_date('2014-12-31', '%Y-%m-%d');
+---------------------------------------+
| str_to_date('2014-12-31', '%Y-%m-%d') |
+---------------------------------------+
| 2014-12-31 |
+---------------------------------------+
1 row in set (0.00 sec)
mysql> SELECT str_to_date('2014-13-31', '%Y-%m-%d');
+---------------------------------------+
| str_to_date('2014-13-31', '%Y-%m-%d') |
+---------------------------------------+
| NULL |
+---------------------------------------+
1 row in set, 1 warning (0.00 sec)
{code}
MySQL/SparkSQL - it seems returning {{NULL}}.
Hive/Postgres - it seems returning calculated dates.
> to_date with format has weird behavior
> --------------------------------------
>
> Key: SPARK-19496
> URL: https://issues.apache.org/jira/browse/SPARK-19496
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.1.0
> Reporter: Wenchen Fan
>
> Today, if we run
> {code}
> SELECT to_date('2015-07-22', 'yyyy-dd-MM')
> {code}
> will result to `2016-10-07`, while running
> {code}
> SELECT to_date('2014-31-12') # default format
> {code}
> will return null.
> this behavior is weird and we should check other systems like hive to see if this is expected.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org