Posted to issues@hive.apache.org by "Stamatis Zampetakis (Jira)" <ji...@apache.org> on 2021/05/27 21:46:00 UTC

[jira] [Commented] (HIVE-25129) Wrong results when timestamps stored in Avro/Parquet fall into the DST shift

    [ https://issues.apache.org/jira/browse/HIVE-25129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352774#comment-17352774 ] 

Stamatis Zampetakis commented on HIVE-25129:
--------------------------------------------

The problem is actually the same as the one described in HIVE-20007 and HIVE-12192, and it had been solved for some time. The regression reported here is due to HIVE-21290 ([commit|https://github.com/apache/hive/commit/10dfb151e9f2dfbdb4de254a99866261a922c479]).

> Wrong results when timestamps stored in Avro/Parquet fall into the DST shift
> ----------------------------------------------------------------------------
>
>                 Key: HIVE-25129
>                 URL: https://issues.apache.org/jira/browse/HIVE-25129
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 3.1.0
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>         Attachments: parquet_timestamp_dst.q
>
>
> Timestamp values that fall into the daylight saving time (DST) transition of the system timezone cannot be retrieved as stored when they are kept in Parquet/Avro tables. The respective SELECT query shifts those timestamps by +1 hour, reflecting the DST shift.
> +Example+
> {code:sql}
> --! qt:timezone:US/Pacific
> create table employee (eid int, birthdate timestamp) stored as parquet;
> insert into employee values (0, '2019-03-10 02:00:00');
> insert into employee values (1, '2020-03-08 02:00:00');
> insert into employee values (2, '2021-03-14 02:00:00');
> select eid, birthdate from employee order by eid;{code}
> +Actual results+
> |0|2019-03-10 03:00:00|
> |1|2020-03-08 03:00:00|
> |2|2021-03-14 03:00:00|
> +Expected results+
> |0|2019-03-10 02:00:00|
> |1|2020-03-08 02:00:00|
> |2|2021-03-14 02:00:00|
> Storing and retrieving values in columns using the [timestamp data type|https://cwiki.apache.org/confluence/display/Hive/Different+TIMESTAMP+types] (equivalent to the Java LocalDateTime API) should not alter in any way the value that the user sees. The results are correct for {{TEXTFILE}} and {{ORC}} tables.
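
To illustrate why a value such as 2019-03-10 02:00:00 is fragile in US/Pacific, here is a minimal java.time sketch. It is not Hive's read or write path, and it makes no claim about what HIVE-21290 changed; the class and method names (DstGapDemo, roundTripThroughZone, roundTripThroughUtc) are hypothetical. It only shows that a wall-clock value inside the DST gap does not survive a round trip through an instant in a zone with DST, while a round trip through UTC leaves it untouched.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

public class DstGapDemo {

    // Hypothetical helper: interpret the wall-clock value in the given zone,
    // convert it to an instant, then read it back in the same zone.
    // For a value inside a DST gap, atZone() moves it later by the gap length.
    static LocalDateTime roundTripThroughZone(LocalDateTime value, ZoneId zone) {
        ZonedDateTime zoned = value.atZone(zone);
        return LocalDateTime.ofInstant(zoned.toInstant(), zone);
    }

    // Hypothetical helper: the same round trip through UTC, which has no
    // DST transitions, so every wall-clock value maps to itself.
    static LocalDateTime roundTripThroughUtc(LocalDateTime value) {
        return LocalDateTime.ofInstant(value.toInstant(ZoneOffset.UTC), ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        ZoneId pacific = ZoneId.of("US/Pacific");
        // The three values from the example in the issue description.
        LocalDateTime[] inputs = {
            LocalDateTime.of(2019, 3, 10, 2, 0, 0),
            LocalDateTime.of(2020, 3, 8, 2, 0, 0),
            LocalDateTime.of(2021, 3, 14, 2, 0, 0)
        };
        for (LocalDateTime in : inputs) {
            System.out.println(in
                + " via US/Pacific: " + roundTripThroughZone(in, pacific)
                + ", via UTC: " + roundTripThroughUtc(in));
        }
    }
}
```

With these inputs the round trip through US/Pacific returns 03:00 on each date, matching the shifted values in the actual results above, whereas the round trip through UTC returns the original 02:00 values.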


