Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2023/03/05 23:54:00 UTC

[jira] [Commented] (IMPALA-10064) Support constant propagation for range predicates

    [ https://issues.apache.org/jira/browse/IMPALA-10064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696637#comment-17696637 ] 

ASF subversion and git services commented on IMPALA-10064:
----------------------------------------------------------

Commit 3573db68c83918126337c9267140b7fa36f153e4 in impala's branch refs/heads/master from Csaba Ringhofer
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=3573db68c ]

IMPALA-11960: Fix constant propagation from TIMESTAMP to DATE

The constant propagation introduced in IMPALA-10064 handled conversion
of < and > predicates from timestamps to dates incorrectly.

Example:
select * from functional.alltypes_date_partition
  where date_col = cast(timestamp_col as date)
    and timestamp_col > '2009-01-01 01:00:00'
    and timestamp_col < '2009-02-01 01:00:00';

Before this change, query rewrites added the following predicates:
date_col > DATE '2009-01-01' AND date_col < DATE '2009-02-01'
These incorrectly rejected all rows whose timestamps fall on the
days of the lower / upper bounds.

The fix is to rewrite < and > to <= and >= in the date predicates.
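
A minimal sketch of why the strict date predicates lose rows, written in
Python rather than Impala SQL so it can be run standalone. The sample rows
and the function names (matches_original, old_rewrite, fixed_rewrite) are
assumptions made for illustration, not Impala code; the bounds are taken
from the example query above.

```python
from datetime import datetime

# Hypothetical sample rows; each one satisfies the original timestamp range.
rows = [
    datetime(2009, 1, 1, 12, 0, 0),  # on the lower-bound day, after 01:00
    datetime(2009, 1, 15, 0, 0, 0),  # strictly between the two bound days
    datetime(2009, 2, 1, 0, 30, 0),  # on the upper-bound day, before 01:00
]

lo_ts = datetime(2009, 1, 1, 1, 0, 0)
hi_ts = datetime(2009, 2, 1, 1, 0, 0)
lo_date, hi_date = lo_ts.date(), hi_ts.date()


def matches_original(ts):
    """timestamp_col > lo AND timestamp_col < hi"""
    return lo_ts < ts < hi_ts


def old_rewrite(ts):
    """Derived date predicate before the fix: strict on both sides."""
    return lo_date < ts.date() < hi_date


def fixed_rewrite(ts):
    """Derived date predicate after the fix: non-strict on both sides."""
    return lo_date <= ts.date() <= hi_date


expected = [ts for ts in rows if matches_original(ts)]
kept_by_old = [ts for ts in expected if old_rewrite(ts)]
kept_by_fixed = [ts for ts in expected if fixed_rewrite(ts)]

print(len(expected), len(kept_by_old), len(kept_by_fixed))
```

In this sketch the strict date predicate drops the two rows that fall on
the bound days, while the non-strict one keeps every row that the
timestamp predicate accepts.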

< could be kept if the upper bound is a constant with no time-of-day
part, e.g. timestamp_col < "2009-01-01" could be rewritten to
date_col < "2009-01-01". This optimization is left out of this patch
to keep it simple.
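
The operator mapping described above can be sketched as a small Python
function. This is an illustration only: date_operator and its
keep_strict_at_midnight flag are hypothetical names, not part of Impala,
and the flag models the optimization that this patch does not implement.

```python
from datetime import datetime, time


def date_operator(op, ts_constant, keep_strict_at_midnight=False):
    """Return the operator to use on CAST(ts AS DATE) for 'ts <op> ts_constant'.

    Hypothetical helper for illustration; not Impala's implementation.
    """
    if op in ("=", "<=", ">="):
        return op
    if op == ">":
        # A timestamp later than the constant can still be on the same day.
        return ">="
    if op == "<":
        # With a midnight constant, every earlier timestamp is on an earlier
        # day, so the strict operator could be kept (optional optimization).
        if keep_strict_at_midnight and ts_constant.time() == time(0, 0):
            return "<"
        return "<="
    raise ValueError("unsupported operator: " + op)


print(date_operator(">", datetime(2009, 1, 1, 1, 0, 0)))
print(date_operator("<", datetime(2009, 2, 1, 1, 0, 0)))
```

Note that the sketch keeps > non-strict even for a midnight constant,
since a timestamp such as 2009-01-01 00:00:01 is greater than the
constant but still falls on the same date.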

Testing:
- added planner + EE regression tests

Change-Id: I1938bf5e91057b220daf8a1892940f674aac3d68
Reviewed-on: http://gerrit.cloudera.org:8080/19572
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Support constant propagation for range predicates
> -------------------------------------------------
>
>                 Key: IMPALA-10064
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10064
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 3.4.0
>            Reporter: Aman Sinha
>            Assignee: Aman Sinha
>            Priority: Major
>             Fix For: Impala 4.0.0
>
>
> Consider the following table schema, view and 2 queries on the view:
> {noformat}
> create table tt1 (a1 int, b1 int, ts timestamp) partitioned by (mydate date);
> create view tt1_view as (select a1, b1, ts from tt1 where mydate = cast(ts as date));
> // query 1:  (Good) constant on ts gets propagated
> explain select * from tt1_view where ts = '2019-07-01';
> 00:SCAN HDFS [db1.tt1]
>    partition predicates: mydate = DATE '2019-07-01'
>    HDFS partitions=1/3 files=2 size=48B
>    predicates: db1.tt1.ts = TIMESTAMP '2019-07-01 00:00:00'
>    row-size=24B cardinality=1
> // query 2: (Not good) constant on ts does not get propagated
> explain select * from tt1_view where ts > '2019-07-01';
> 00:SCAN HDFS [db1.tt1]
>    HDFS partitions=3/3 files=4 size=96B
>    predicates: db1.tt1.ts > TIMESTAMP '2019-07-01 00:00:00', mydate = CAST(ts AS DATE)
>    row-size=28B cardinality=1
> {noformat}
> Note that in query 1, with the equality condition on 'ts', the constant value is propagated to the 'mydate = CAST(ts as date)' predicate, which is then applied as a partition predicate.  In query 2, which has a range predicate, the constant is not propagated and no partition predicate is created for the scan.  We should support constant propagation for the second case as well.  Constant predicates using >, >=, <, <= that involve date or timestamp literals should be considered, but we have to analyze the cases where the propagation is valid.  E.g. with functions such as date_add or date_diff, there may be a potential for incorrect propagation.
> Note that a predicate can be a BETWEEN condition such as:
> {noformat}
> WHERE ts >= '2019-07-01' AND ts <= '2020-07-01'
> {noformat}
> In this case both bounds need to be applied.


