You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Vincent Tran (JIRA)" <ji...@apache.org> on 2018/07/12 18:37:00 UTC

[jira] [Created] (HIVE-20154) Improve unix_timestamp(args) to handle automatic-DST switching timezones

Vincent Tran created HIVE-20154:
-----------------------------------

             Summary: Improve unix_timestamp(args) to handle automatic-DST switching timezones
                 Key: HIVE-20154
                 URL: https://issues.apache.org/jira/browse/HIVE-20154
             Project: Hive
          Issue Type: Improvement
            Reporter: Vincent Tran


Currently unix_timestamp(args) UDF will only handle static timezone specifiers. It does not recognize SystemV specifiers such as EST5EDT or PST8PDT.

Based on this experiment, when z is used to parse a TZ string like UTC4PDT (obviously not a valid SystemV specifier) - it will parse the time as UTC.
When z<n>z is used to parse a TZ string like UTC4PDT, it will take parse the timestamp as the TZ of the final z position. This is demonstrated by my final query when the format string z4z1z is used to parse UTC4PDT1EDT.

{noformat}
0: jdbc:hive2://localhost:10000/default>; select from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT", "yyyy-MM-dd HH:mm:ss z"), "yyyy-MM-dd HH:mm:ss zzzz");
+--------------------------------------------+--+
|                    _c0                     |
+--------------------------------------------+--+
| 2018-01-31 16:00:00 Pacific Standard Time  |
+--------------------------------------------+--+
1 row selected (0.041 seconds)
0: jdbc:hive2://localhost:10000/default>; select from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC", "yyyy-MM-dd HH:mm:ss z"), "yyyy-MM-dd HH:mm:ss zzzz");
+--------------------------------------------+--+
|                    _c0                     |
+--------------------------------------------+--+
| 2018-01-31 16:00:00 Pacific Standard Time  |
+--------------------------------------------+--+
1 row selected (0.041 seconds)
0: jdbc:hive2://localhost:10000/default>; select from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT", "yyyy-MM-dd HH:mm:ss z4z"), "yyyy-MM-dd HH:mm:ss zzzz");
+--------------------------------------------+--+
|                    _c0                     |
+--------------------------------------------+--+
| 2018-01-31 23:00:00 Pacific Standard Time  |
+--------------------------------------------+--+
1 row selected (0.047 seconds)
0: jdbc:hive2://localhost:10000/default>; select from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT1EDT", "yyyy-MM-dd HH:mm:ss z4z1z"), "yyyy-MM-dd HH:mm:ss zzzz");
+--------------------------------------------+--+
|                    _c0                     |
+--------------------------------------------+--+
| 2018-01-31 20:00:00 Pacific Standard Time  |
+--------------------------------------------+--+
1 row selected (0.055 seconds)
0: jdbc:hive2://localhost:10000/default>;
{noformat}



So all in all, I don't think the SystemV specifier EST5EDT or PST8PDT are valid to unix_timestamp(args) at all. And that those when parsed with the z<n>z format string, will be read as whatever valid timezone at the final position (effectively EDT and PDT respectively in when those valid SystemV TZ specifiers above are used).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)