Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2022/04/21 12:51:00 UTC

[jira] [Commented] (IMPALA-10850) Interpret timestamp predicates in local timezone in IcebergScanNode

    [ https://issues.apache.org/jira/browse/IMPALA-10850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17525682#comment-17525682 ] 

ASF subversion and git services commented on IMPALA-10850:
----------------------------------------------------------

Commit e91c7810f088245e9c21d591f63c56781e261572 in impala's branch refs/heads/master from Zoltan Borok-Nagy
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=e91c7810f ]

IMPALA-10850: Interpret timestamp predicates in local timezone in IcebergScanNode

IcebergScanNode interprets timestamp literals as UTC timestamps
during predicate pushdown to Iceberg. This causes problems when the
Iceberg table uses TIMESTAMPTZ (which corresponds to TIMESTAMP WITH
LOCAL TIME ZONE in SQL), because the scanners assume that the
timestamp literals in a query are in the local timezone.

Hence, if the Iceberg table is partitioned by HOUR(ts) and Impala is
running in a timezone other than UTC, the following query doesn't
return any rows:

 SELECT * FROM t
 WHERE ts = <some ts>;

This happens because during predicate pushdown the timestamp is
interpreted as a UTC timestamp (no conversion from local to UTC), but
during query execution the timestamp data in the files is converted
to the local timezone and then compared to <some ts>. That is, the
scanner assumes that <some ts> is in the local timezone.
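
As a rough illustration of the mismatch (a Python sketch, not Impala
code; the fixed +02:00 session offset, the sample literal, and all
function names are assumptions made for this example), the hour
partition selected during pushdown can be compared with the partition
that actually holds the row:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
# Hypothetical session timezone: a fixed UTC+02:00 offset.
LOCAL_TZ = timezone(timedelta(hours=2))


def hour_partition(instant):
    """Whole hours since the Unix epoch, used here as the HOUR(ts) value."""
    return (instant - EPOCH) // timedelta(hours=1)


def literal_as_utc(literal):
    """Old behaviour: treat the literal's wall-clock value as UTC."""
    return datetime.fromisoformat(literal).replace(tzinfo=timezone.utc)


def literal_as_local(literal):
    """Scanner assumption: the literal is in the local timezone."""
    return datetime.fromisoformat(literal).replace(tzinfo=LOCAL_TZ)


literal = "2022-04-21 12:30:00"

# The row the user is looking for reads as 12:30 in local time.
actual_partition = hour_partition(literal_as_local(literal))

# Partition chosen during pushdown, before and after the change.
partition_before_fix = hour_partition(literal_as_utc(literal))
partition_after_fix = hour_partition(literal_as_local(literal))

print("before fix matches:", partition_before_fix == actual_partition)
print("after fix matches: ", partition_after_fix == actual_partition)
```

With a UTC literal the pushed-down predicate points at a partition two
hours away from the row, so the matching file is pruned before the
scanner ever sees it.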

On the other hand, when the Iceberg type TIMESTAMP (which corresponds
to TIMESTAMP WITHOUT TIME ZONE in SQL) is used, we should just push
down the timestamp values without any conversion. In this case there
is no conversion in the scanners either.
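
The per-type rule described above could be sketched as follows. This
is a minimal Python sketch and not the actual IcebergScanNode code:
the function name to_pushdown_micros, the string type tags, and the
fixed-offset session timezone are all hypothetical, and it assumes the
pushed-down value is expressed in microseconds since the epoch.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_pushdown_micros(literal, iceberg_type, session_tz):
    """Return the value to push down for a timestamp literal.

    TIMESTAMPTZ: the literal is local time, so convert it to UTC.
    TIMESTAMP:   the literal is pushed down unchanged.
    """
    naive = datetime.fromisoformat(literal)
    if iceberg_type == "TIMESTAMPTZ":
        # Attach the session timezone; subtracting EPOCH converts to UTC.
        instant = naive.replace(tzinfo=session_tz)
    elif iceberg_type == "TIMESTAMP":
        # No shift: the wall-clock value is used as-is.
        instant = naive.replace(tzinfo=timezone.utc)
    else:
        raise ValueError(f"unsupported type: {iceberg_type}")
    return (instant - EPOCH) // timedelta(microseconds=1)


# Hypothetical session timezone: a fixed UTC+02:00 offset.
session_tz = timezone(timedelta(hours=2))
literal = "1970-01-01 02:00:00"

tz_micros = to_pushdown_micros(literal, "TIMESTAMPTZ", session_tz)
plain_micros = to_pushdown_micros(literal, "TIMESTAMP", session_tz)
print(tz_micros, plain_micros)
```

The same literal therefore yields two different pushed-down values,
depending only on the column type.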

Testing:
 * added e2e test with TIMESTAMPTZ
 * added e2e test with TIMESTAMP

Change-Id: I181be5d2fa004f69b457f69ff82dc2f9877f46fa
Reviewed-on: http://gerrit.cloudera.org:8080/18399
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Reviewed-by: Csaba Ringhofer <cs...@cloudera.com>


> Interpret timestamp predicates in local timezone in IcebergScanNode
> -------------------------------------------------------------------
>
>                 Key: IMPALA-10850
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10850
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Zoltán Borók-Nagy
>            Priority: Major
>              Labels: impala-iceberg
>
> IcebergScanNode interprets the timestamp literals as UTC timestamps:
> https://github.com/apache/impala/blob/b03d18863b31f0f3e66e9fa1f84cc9d625ecce29/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java#L197
> This can be confusing for users; we should probably interpret them based on the local timezone.
> We might need to update KuduScanNode as well:
> https://github.com/apache/impala/blob/b03d18863b31f0f3e66e9fa1f84cc9d625ecce29/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java#L559


