You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Zoltán Rátkai (Jira)" <ji...@apache.org> on 2024/01/08 16:44:00 UTC

[jira] [Work started] (HIVE-27739) Multiple issues with timestamps with timezone - can lead to data inconsistency

     [ https://issues.apache.org/jira/browse/HIVE-27739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HIVE-27739 started by Zoltán Rátkai.
--------------------------------------------
> Multiple issues with timestamps with timezone - can lead to data inconsistency
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-27739
>                 URL: https://issues.apache.org/jira/browse/HIVE-27739
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 4.0.0-beta-1
>            Reporter: Janos Kovacs
>            Assignee: Zoltán Rátkai
>            Priority: Major
>
> The following issues were found testing timestamps with timezonse:
>  * CREATE TABLE fails with SemanticException when hive.local.time.zone is set to different valid value in the session
>  * Invalid timezone values (e.g. with typo) treated as UTC which can lead to data consistency / loss issues
>  * LOCAL is an invalid timezone value and treated as UTC instead of treating as system's timezone
> The issues are tracked as sub-tasks.
> in general, base tests are:
> {noformat}
> SELECT
>   '\${system:user.timezone}' as os,
>   '\${hiveconf:hive.local.time.zone}' as hive,
>   'TZ'  as branch,
>   tz    as orig,
>   to_utc_timestamp(tz,'\${hiveconf:hive.local.time.zone}') as to_utc, 
>   from_utc_timestamp(to_utc_timestamp(tz,'\${hiveconf:hive.local.time.zone}'),'Europe/Budapest')  as to_bp,
>   from_utc_timestamp(to_utc_timestamp(tz,'\${hiveconf:hive.local.time.zone}'),'America/Los_Angeles') as to_la
> FROM timestamptest;
> "
> {noformat}
>  
> The results are:
> {noformat}
> +------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
> |        os        |       hive       | branch  |                  orig                  |         to_utc         |         to_bp          |         to_la          |
> +------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
> | Europe/Budapest  | Europe/Budapest  | TZ      | 2016-01-03 21:26:34.0 Europe/Budapest  | 2016-01-03 20:26:34.0  | 2016-01-03 21:26:34.0  | 2016-01-03 12:26:34.0  |
> +------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
> | Europe/Budapest  | UTC              | TZ      | 2016-01-03 20:26:34.0 UTC              | 2016-01-03 20:26:34.0  | 2016-01-03 21:26:34.0  | 2016-01-03 12:26:34.0  |
> +------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
> | Europe/Budapest  | LOCAL            | TZ      | 2016-01-03 21:26:34.0 Europe/Budapest  | 2016-01-03 21:26:34.0  | 2016-01-03 22:26:34.0  | 2016-01-03 13:26:34.0  | !!!
> +------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
> | UTC              | Europe/Budapest  | TZ      | 2016-01-03 21:26:34.0 Europe/Budapest  | 2016-01-03 20:26:34.0  | 2016-01-03 21:26:34.0  | 2016-01-03 12:26:34.0  |
> +------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
> | UTC              | UTC              | TZ      | 2016-01-03 20:26:34.0 UTC              | 2016-01-03 20:26:34.0  | 2016-01-03 21:26:34.0  | 2016-01-03 12:26:34.0  |
> +------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
> | UTC              | LOCAL            | TZ      | 2016-01-03 20:26:34.0 UTC              | 2016-01-03 20:26:34.0  | 2016-01-03 21:26:34.0  | 2016-01-03 12:26:34.0  | !!!
> +------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
> {noformat}
> The problematic cases:
>  * the "Europe/Budapest | LOCAL" case is wrong, LOCAL is treated as UTC instead of system's TZ which makes 1h offset when converted
>  * the "UTC | LOCAL" case is only good because LOCAL is treated as UTC all the time
> Repro code and more details are in each of the subtask tickets



--
This message was sent by Atlassian Jira
(v8.20.10#820010)