You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2021/09/16 01:30:00 UTC

[jira] [Created] (IMPALA-10916) Pushdown Stats predicates with different argument types to the ORC reader

Quanlong Huang created IMPALA-10916:
---------------------------------------

             Summary: Pushdown Stats predicates with different argument types to the ORC reader
                 Key: IMPALA-10916
                 URL: https://issues.apache.org/jira/browse/IMPALA-10916
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
            Reporter: Quanlong Huang
            Assignee: Quanlong Huang


IMPALA-6505 introduces min-max predicate (stats predicate) pushdown to the ORC reader. However, stats predicates with different argument types may not be pushed down. For example, for the query
{code:sql}
select count(*) from functional_orc_def.decimal_tbl where d1 > 132842;
{code}
the slot type of "d1" is DECIMAL(9,0), while the type of literal "132842" is INT.
 FE generates a stats predicate "d1 > 132842". After analyze, it becomes "CAST(d1 as INT) > CAST(132842 as INT)". The rhs is Literal(value=132842 type=INT) in the BE, which is good. But the lhs is no longer a single slot ref, but a expression. So we cannot simply push it down.

The following queries have the same issue:
{code:sql}
select count(*) from functional_orc_def.decimal_tbl where d1 > cast(132842.0 as float);
select count(*) from functional_orc_def.decimal_tbl where d1 > cast(132842.0 as double);
{code}
Currently, a workaround is avoid comparing slot ref with other types. For decimal, this query works:
{code:sql}
select count(*) from functional_orc_def.decimal_tbl where d1 > 132842.0;
{code}
FE will analyze "132842.0" to be DECIMAL(10,1), so we are good.

A possible solution is detecting such simple CAST expressions, and generating orc::PredicateDataType using the slot type instead of the literal type. We will need handcraft casting logics when generating the orc::Literal, e.g. converting a impala::Literal(value=132842.0 type=FLOAT) to orc::Literal in DECIMAL type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org