Posted to issues-all@impala.apache.org by "Csaba Ringhofer (JIRA)" <ji...@apache.org> on 2018/10/18 21:01:00 UTC

[jira] [Created] (IMPALA-7723) Recognize int64 timestamps in CREATE TABLE LIKE PARQUET

Csaba Ringhofer created IMPALA-7723:
---------------------------------------

             Summary: Recognize int64 timestamps in CREATE TABLE LIKE PARQUET
                 Key: IMPALA-7723
                 URL: https://issues.apache.org/jira/browse/IMPALA-7723
             Project: IMPALA
          Issue Type: Improvement
          Components: Frontend
            Reporter: Csaba Ringhofer


IMPALA-5050 adds support for reading int64-encoded Parquet timestamps. These columns have int64 physical type, so the converted/logical type has to be used to differentiate them from BIGINTs. They can be read either as BIGINT or as TIMESTAMP, depending on the table's schema.
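To illustrate the dual interpretation, here is a minimal Python sketch. It is not Impala's reader code: the function decode_int64 and its string arguments are hypothetical, only the TIMESTAMP_MILLIS and TIMESTAMP_MICROS annotation names come from the Parquet format, and time zone handling is ignored.

```python
from datetime import datetime, timedelta

# Naive epoch; this sketch deliberately ignores time zones.
EPOCH = datetime(1970, 1, 1)


def decode_int64(value, logical_type, table_type):
    """Interpret one stored int64 value according to the table's column type.

    Hypothetical helper: 'table_type' stands for the column type in the
    table schema, 'logical_type' for the Parquet annotation on the column.
    """
    if table_type == "BIGINT":
        # The stored integer is returned unchanged.
        return value
    if table_type == "TIMESTAMP":
        if logical_type == "TIMESTAMP_MILLIS":
            return EPOCH + timedelta(milliseconds=value)
        if logical_type == "TIMESTAMP_MICROS":
            return EPOCH + timedelta(microseconds=value)
        raise ValueError(f"int64 column is not annotated as a timestamp: {logical_type}")
    raise ValueError(f"unsupported table type for int64: {table_type}")


stored = 1539896460000  # milliseconds since the epoch
as_bigint = decode_int64(stored, "TIMESTAMP_MILLIS", "BIGINT")
as_timestamp = decode_int64(stored, "TIMESTAMP_MILLIS", "TIMESTAMP")
```

The same stored value yields either the raw integer or a point in time, so the choice made by CREATE TABLE LIKE PARQUET decides what existing queries see.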

CREATE TABLE LIKE PARQUET could also convert these columns to TIMESTAMP instead of BIGINT, but I decided to postpone adding this feature for two reasons:

1. It could break the following possible workflow:
- generate Parquet files (that contain int64 timestamps) with some tool
- use Impala's CREATE TABLE LIKE PARQUET + LOAD DATA to make them accessible as a table
- run some queries that rely on interpreting these columns as integers

A CAST(col AS BIGINT) in the query would make this even worse: it would silently convert the timestamp to Unix time in seconds instead of returning the stored micros/millis, without any warning.
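The size of the discrepancy can be shown with a small Python sketch. The helper cast_timestamp_to_bigint is hypothetical and only models the behavior described above (a timestamp cast to BIGINT yields whole seconds since the epoch); it is not Impala code.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def cast_timestamp_to_bigint(ts):
    """Model of CAST(ts AS BIGINT) as described above: whole seconds since the epoch."""
    return (ts - EPOCH) // timedelta(seconds=1)


# Value as written to the Parquet file: microseconds since the epoch.
stored_micros = 1539896460000000

# Column created as BIGINT: the query sees the stored integer.
seen_as_bigint = stored_micros

# Column created as TIMESTAMP, then cast back to BIGINT in the query.
ts = EPOCH + timedelta(microseconds=stored_micros)
seen_after_cast = cast_timestamp_to_bigint(ts)

# The two results differ by a factor of one million, and nothing signals it.
ratio = seen_as_bigint // seen_after_cast
```

A query that compares the column against a micros-based constant would keep running after the type change but return different rows.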

2. Adding support for int64 timestamps with nanosecond precision will require bumping Impala's parquet-hadoop-bundle dependency to a new major version, which may contain incompatible API changes.

Note that parquet-hadoop-bundle is only used in CREATE TABLE LIKE PARQUET. The C++ parts of Impala only rely on parquet.thrift, which can be updated more easily.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
