Posted to issues-all@impala.apache.org by "Csaba Ringhofer (JIRA)" <ji...@apache.org> on 2018/06/28 12:54:00 UTC

[jira] [Comment Edited] (IMPALA-5050) Add support to read TIMESTAMP_MILLIS and TIMESTAMP_MICROS to the parquet scanner

    [ https://issues.apache.org/jira/browse/IMPALA-5050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526270#comment-16526270 ] 

Csaba Ringhofer edited comment on IMPALA-5050 at 6/28/18 12:53 PM:
-------------------------------------------------------------------

[~lv] [~zi]
I have started a prototype implementation for reading int64 TIMESTAMP_MICROS, which converts int64 to TimestampValue in ParquetColumnReader.

The main motivation is to have a timestamp type in Parquet that records whether UTC->local conversion is needed. The new logical types store this information in TimestampType::isAdjustedToUTC. For the old TIMESTAMP_MICROS, the decision would be based on the flag convert_legacy_hive_parquet_utc_timestamps.
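
A minimal sketch of that decision, in Python rather than the backend's C++. The class TimestampAnnotation and the function needs_utc_to_local are hypothetical names used only for illustration; they are not Impala or Parquet APIs. It assumes that a legacy TIMESTAMP_MICROS column is modelled as having no isAdjustedToUTC value (None), and that isAdjustedToUTC = true means conversion is needed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimestampAnnotation:
    """Hypothetical stand-in for a column's Parquet timestamp annotation."""
    # True/False for the new logical type; None models the legacy
    # TIMESTAMP_MICROS annotation, which carries no such field.
    is_adjusted_to_utc: Optional[bool]


def needs_utc_to_local(annotation: TimestampAnnotation,
                       convert_legacy_flag: bool) -> bool:
    """Return True if values should be converted from UTC to local time."""
    if annotation.is_adjusted_to_utc is not None:
        # New logical type: the file itself says whether values are UTC.
        return annotation.is_adjusted_to_utc
    # Legacy annotation: fall back to the startup flag
    # (convert_legacy_hive_parquet_utc_timestamps in the comment above).
    return convert_legacy_flag
```

With this shape the flag only matters for files written with the old annotation; files using the new logical type are self-describing.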

INT64 TIMESTAMP_MICROS columns could be read either as bigint or as timestamp, depending on the SQL column type. This means that no existing query would be broken. Treating these columns as integers will probably also be much faster than treating them as timestamps while there is no Int64Value in the Impala backend.
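
As a rough illustration of the two read paths (Python, not the actual scanner code): read_micros and the string type names are hypothetical, and the sketch assumes the stored value is microseconds since the Unix epoch and applies no time zone conversion.

```python
from datetime import datetime, timedelta

# Unix epoch as a naive datetime; no time zone conversion is applied here.
EPOCH = datetime(1970, 1, 1)


def read_micros(raw: int, sql_type: str):
    """Interpret one stored int64 value according to the SQL column type."""
    if sql_type == "BIGINT":
        # Integer path: the stored value is returned unchanged.
        return raw
    if sql_type == "TIMESTAMP":
        # Timestamp path: microseconds since the epoch become a datetime.
        return EPOCH + timedelta(microseconds=raw)
    raise ValueError(f"unsupported SQL type for INT64 TIMESTAMP_MICROS: {sql_type}")
```

The integer path does no per-value work, which matches the expectation that existing bigint queries keep working and stay cheap.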



was (Author: csringhofer):
[~lv] [~zi]
I have started a prototype implementation for reading int64 TIMESTAMP_MICROS, which converts int64 to TimestampValue in ParquetColumnReader.

The main motivation is to have a timestamp type in Parquet that records whether UTC->local conversion is needed. The new logical types store this information in TimestampType::isAdjustedToUTC. For the old TIMESTAMP_MICROS, the decision would be based on the flag convert_legacy_hive_parquet_utc_timestamps.

INT64 TIMESTAMP_MICROS columns could be read either as integer or as timestamp, depending on the SQL column type. This means that no existing query would be broken. Treating these columns as integers will probably also be much faster than treating them as timestamps while there is no Int64Value in the Impala backend.


> Add support to read TIMESTAMP_MILLIS and TIMESTAMP_MICROS to the parquet scanner
> --------------------------------------------------------------------------------
>
>                 Key: IMPALA-5050
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5050
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Backend
>    Affects Versions: Impala 2.9.0
>            Reporter: Lars Volker
>            Assignee: Csaba Ringhofer
>            Priority: Major
>
> This requires updating {{parquet.thrift}} to a version that includes the {{TIMESTAMP_MICROS}} logical type.


