Posted to dev@hive.apache.org by "Stamatis Zampetakis (Jira)" <ji...@apache.org> on 2023/03/30 15:01:00 UTC

[jira] [Created] (HIVE-27199) Read TIMESTAMP WITH LOCAL TIME ZONE columns from text files using custom formats

Stamatis Zampetakis created HIVE-27199:
------------------------------------------

             Summary: Read TIMESTAMP WITH LOCAL TIME ZONE columns from text files using custom formats
                 Key: HIVE-27199
                 URL: https://issues.apache.org/jira/browse/HIVE-27199
             Project: Hive
          Issue Type: Improvement
          Components: Serializers/Deserializers
    Affects Versions: 4.0.0-alpha-2
            Reporter: Stamatis Zampetakis
            Assignee: Stamatis Zampetakis


Timestamp values come in many flavors and formats, and no single representation can satisfy everyone, especially when such values are stored in plain text/CSV files.

HIVE-9298 added a special SERDE property, {{timestamp.formats}}, that allows users to provide custom timestamp patterns so that TIMESTAMP values coming from files are parsed correctly.
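
For reference, a minimal sketch of how the property is used today with a plain TIMESTAMP column. The table name {{ts_plain}} is hypothetical and only serves as an illustration; the pattern follows the Java date-format syntax.
{code:sql}
-- Hypothetical table with a plain TIMESTAMP column, stored as text.
-- The custom pattern lets values such as 2016-05-03T12:26:34 be parsed.
CREATE TABLE ts_plain (ts TIMESTAMP)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES ("timestamp.formats"="yyyy-MM-dd'T'HH:mm:ss")
STORED AS TEXTFILE;
{code}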

However, when the column type is TIMESTAMP WITH LOCAL TIME ZONE (LTZ), it is not possible to use a custom pattern; thus, when the built-in Hive parser does not match the expected format, a NULL value is returned.

Consider a text file, F1, with the following values:
{noformat}
2016-05-03 12:26:34
2016-05-03T12:26:34
{noformat}
and a table with a column declared as LTZ.
{code:sql}
CREATE TABLE ts_table (ts TIMESTAMP WITH LOCAL TIME ZONE);
LOAD DATA LOCAL INPATH './F1' INTO TABLE ts_table;

SELECT * FROM ts_table;
-- Result:
-- 2016-05-03 12:26:34.0 US/Pacific
-- NULL
{code}
To give more flexibility to users relying on the TIMESTAMP WITH LOCAL TIME ZONE datatype, and to align its behavior with the TIMESTAMP type, this JIRA aims to reuse the {{timestamp.formats}} property for both TIMESTAMP types.
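
As a rough sketch of the intended usage (assuming the property is applied to LTZ columns exactly as it is to TIMESTAMP columns today), the table from the example could be configured as follows. Once the change is in place, the second row of F1 would be expected to parse instead of yielding NULL; the exact output is not shown here since it depends on the implementation.
{code:sql}
-- Sketch only: relies on the behavior proposed in this JIRA.
-- Register the ISO-8601 style pattern for the LTZ column of ts_table.
ALTER TABLE ts_table
SET SERDEPROPERTIES ("timestamp.formats"="yyyy-MM-dd'T'HH:mm:ss");

SELECT * FROM ts_table;
{code}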

The work here focuses exclusively on simple text files, but the same could be done for other SerDes, such as JSON.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)