You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues-all@impala.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2018/11/14 20:24:01 UTC
[jira] [Commented] (IMPALA-5050) Add support to read
TIMESTAMP_MILLIS and TIMESTAMP_MICROS to the parquet scanner
[ https://issues.apache.org/jira/browse/IMPALA-5050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687081#comment-16687081 ]
ASF subversion and git services commented on IMPALA-5050:
---------------------------------------------------------
Commit 60095a4c6bebc412a040d5b4a723e528ba0b2278 in impala's branch refs/heads/master from [~csringhofer]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=60095a4 ]
IMPALA-5050: Add support to read TIMESTAMP_MILLIS and TIMESTAMP_MICROS from Parquet
Changes:
- parquet.thrift is updated to a newer version which contains the
timestamp logical type.
- INT64 columns with converted types TIMESTAMP_MILLIS and
TIMESTAMP_MICROS can be read as TIMESTAMP.
- If the logical type is timestamp, then the type will contain the
information whether the UTC->local conversion is necessary. This
feature is only supported for the new timestamp types, so INT96
timestamps must still use flag
convert_legacy_hive_parquet_utc_timestamps.
- Min/max stat filtering is enabled again for columns that need
UTC->local conversion. This was disabled in IMPALA-7559 because
it could incorrectly drop column chunks.
- CREATE TABLE LIKE PARQUET converts these columns to
TIMESTAMP - before the change, an error was returned instead.
- Bulk of the Parquet column stat logic was moved to a new class
called "ColumnStatsReader".
Testing:
- Added unit tests for timezone conversion (this needed a new public
function in timezone_db.h and adding CET to tzdb_tiny).
- Added parquet files (created with parquet-mr) with int64 timestamp
columns.
Change-Id: I4c7c01fffa31b3d2ca3480adf6ff851137dadac3
Reviewed-on: http://gerrit.cloudera.org:8080/11057
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
> Add support to read TIMESTAMP_MILLIS and TIMESTAMP_MICROS to the parquet scanner
> --------------------------------------------------------------------------------
>
> Key: IMPALA-5050
> URL: https://issues.apache.org/jira/browse/IMPALA-5050
> Project: IMPALA
> Issue Type: New Feature
> Components: Backend
> Affects Versions: Impala 2.9.0
> Reporter: Lars Volker
> Assignee: Csaba Ringhofer
> Priority: Major
>
> This requires updating {{parquet.thrift}} to a version that includes the {{TIMESTAMP_MICROS}} logical type.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org