Posted to issues@hive.apache.org by "Jesus Camacho Rodriguez (JIRA)" <ji...@apache.org> on 2017/10/27 00:29:00 UTC

[jira] [Comment Edited] (HIVE-12192) Hive should carry out timestamp computations in UTC

    [ https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16221507#comment-16221507 ] 

Jesus Camacho Rodriguez edited comment on HIVE-12192 at 10/27/17 12:28 AM:
---------------------------------------------------------------------------

This is still very much a work in progress, but since I had been working on it, I am attaching a draft patch to see what ptest reports. The idea is to store timestamps in ORC as UTC too. However, a couple of constructors for the reader/writer of timestamp values need to be extended so that we can specify the time zone ourselves instead of having them pick up the system time zone automatically.


was (Author: jcamachorodriguez):
Very much wip but I had been working on this so I attach a draft to see what ptest gives. The idea would be to store the timestamp in ORC as UTC too, but a couple of constructors for the reader/writer of timestamp values need to be extended so we can specify the timezone ourselves instead of taking the system timezone.

> Hive should carry out timestamp computations in UTC
> ---------------------------------------------------
>
>                 Key: HIVE-12192
>                 URL: https://issues.apache.org/jira/browse/HIVE-12192
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Hive
>            Reporter: Ryan Blue
>            Assignee: Jesus Camacho Rodriguez
>              Labels: timestamp
>         Attachments: HIVE-12192.patch
>
>
> Hive currently uses the "local" time of a java.sql.Timestamp to represent the SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use {{Timestamp#getYear()}} and similar methods to implement SQL functions like {{year}}.
> When the SQL session's time zone is a DST zone, such as America/Los_Angeles that alternates between PST and PDT, there are times that cannot be represented because the effective zone skips them.
> {code}
> hive> select TIMESTAMP '2015-03-08 02:10:00.101';
> 2015-03-08 03:10:00.101
> {code}
> Using UTC instead of the SQL session time zone as the underlying zone for a java.sql.Timestamp avoids this bug, while still returning correct values for {{getYear}} etc. Using UTC as the convenience representation (timestamp without time zone has no real zone) would make timestamp calculations more consistent and avoid similar problems in the future.
> Notably, this would break the {{unix_timestamp}} UDF, which is documented as returning a result with respect to ["the default timezone and default locale"|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions]. That function would need to be updated to use the zone given by {{System.getProperty("user.timezone")}}.
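
The DST gap described in the issue above can be reproduced outside Hive. The following is a minimal sketch, not taken from the attached patch: it uses {{java.time}} rather than the {{java.sql.Timestamp}} that Hive uses, and the class name {{DstGapDemo}} and the helpers {{roundTrip}} and {{epochSecondsIn}} are hypothetical names chosen for illustration. It shows that a wall-clock value inside the gap is shifted when interpreted in America/Los_Angeles but preserved in UTC, and that the epoch value of the same wall-clock time depends on the zone, which is why {{unix_timestamp}} would need an explicit zone.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class DstGapDemo {
    // Interpret a zone-less wall-clock value in the given zone, then read the
    // wall-clock fields back. Inside a DST gap, java.time moves the value
    // later by the length of the gap.
    static LocalDateTime roundTrip(LocalDateTime wallClock, ZoneId zone) {
        return wallClock.atZone(zone).toLocalDateTime();
    }

    // Epoch seconds of a wall-clock value with respect to an explicit zone.
    static long epochSecondsIn(LocalDateTime wallClock, ZoneId zone) {
        return wallClock.atZone(zone).toEpochSecond();
    }

    public static void main(String[] args) {
        ZoneId losAngeles = ZoneId.of("America/Los_Angeles");

        // The literal from the issue: 2015-03-08 02:10:00.101
        LocalDateTime literal = LocalDateTime.of(2015, 3, 8, 2, 10, 0, 101_000_000);

        // 02:10 does not exist in Los Angeles on this date, so it is shifted.
        System.out.println(roundTrip(literal, losAngeles));     // 2015-03-08T03:10:00.101
        // UTC has no DST transitions, so the value is preserved.
        System.out.println(roundTrip(literal, ZoneOffset.UTC)); // 2015-03-08T02:10:00.101

        // The same wall-clock time maps to different instants per zone.
        LocalDateTime newYear = LocalDateTime.of(2015, 1, 1, 0, 0);
        long diff = epochSecondsIn(newYear, losAngeles)
                  - epochSecondsIn(newYear, ZoneOffset.UTC);
        System.out.println(diff); // 28800 (PST is UTC-8)
    }
}
```

Passing the zone as a parameter, instead of reading the JVM default, mirrors the point made in the comment about extending the ORC reader/writer constructors so that the caller chooses the zone.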


