Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2021/08/23 06:12:00 UTC

[jira] [Comment Edited] (IMPALA-10873) Push down IN-list predicate to ORC reader

    [ https://issues.apache.org/jira/browse/IMPALA-10873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17402965#comment-17402965 ] 

Quanlong Huang edited comment on IMPALA-10873 at 8/23/21, 6:11 AM:
-------------------------------------------------------------------

Note that only EQUALS and IN-list predicates are evaluated against an ORC file's bloom filters:
https://github.com/apache/orc/blob/8e09664e53656544aac097421ad22bb0dadb391b/c++/src/sargs/PredicateLeaf.cc#L603
{code:cpp}
  static bool shouldEvaluateBloomFilter(PredicateLeaf::Operator op,
                                        TruthValue result,
                                        const BloomFilter * bloomFilter) {
    // evaluate bloom filter only when
    // 1) Bloom filter is available
    // 2) Min/Max evaluation yield YES or MAYBE
    // 3) Predicate is EQUALS or IN list
    // 4) Decimal type stores its string representation
    //    but has inconsistency in trailing zeros
{code}
IMPALA-6505 only pushes down non-equality binary predicates, such as <, <=, and >. In addition to the IN-list predicates, we can also push down EQUALS predicates to the ORC reader so that its bloom filters are put to use.
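
To illustrate why this matters, here is a minimal, self-contained sketch of the decision described in the quoted comment. It is not ORC's or Impala's code: Op, Truth, ToyBloomFilter, RowGroupStats, EvaluateMinMax and CanSkipRowGroup are hypothetical stand-ins, the truth values are simplified (no null-aware variants), and condition 4 (the decimal case) is left out. It only models conditions 1-3: a row group that survives the min/max check can still be skipped by the bloom filter, but only for EQUALS and IN-list predicates.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration; not ORC's or Impala's real types.
enum class Op { EQUALS, IN, LESS_THAN };
enum class Truth { YES, NO, MAYBE };  // simplified: null-aware values omitted

// Toy bloom filter over 64 bits with two fixed hash functions.
class ToyBloomFilter {
 public:
  void Add(int64_t v) { bits_.set(H1(v)); bits_.set(H2(v)); }
  // May return a false positive, never a false negative.
  bool MightContain(int64_t v) const {
    return bits_.test(H1(v)) && bits_.test(H2(v));
  }
 private:
  static std::size_t H1(int64_t v) { return static_cast<uint64_t>(v) % 61; }
  static std::size_t H2(int64_t v) { return static_cast<uint64_t>(v) % 59; }
  std::bitset<64> bits_;
};

struct RowGroupStats {
  int64_t min;
  int64_t max;
  const ToyBloomFilter* bloom;  // nullptr when no bloom filter was written
};

// Conditions 1-3 of the quoted comment; condition 4 (decimals) is omitted.
bool ShouldEvaluateBloomFilter(Op op, Truth min_max_result,
                               const ToyBloomFilter* bloom) {
  return bloom != nullptr && min_max_result != Truth::NO &&
         (op == Op::EQUALS || op == Op::IN);
}

// Simplified min/max check. For EQUALS and IN it only distinguishes
// NO from MAYBE; LESS_THAN uses the first literal only.
Truth EvaluateMinMax(Op op, const std::vector<int64_t>& literals,
                     const RowGroupStats& s) {
  assert(!literals.empty());
  if (op == Op::LESS_THAN) {
    const int64_t v = literals.front();
    if (s.max < v) return Truth::YES;
    if (s.min >= v) return Truth::NO;
    return Truth::MAYBE;
  }
  for (int64_t v : literals) {
    if (v >= s.min && v <= s.max) return Truth::MAYBE;
  }
  return Truth::NO;
}

// Returns true if the row group cannot contain a matching row.
bool CanSkipRowGroup(Op op, const std::vector<int64_t>& literals,
                     const RowGroupStats& s) {
  const Truth result = EvaluateMinMax(op, literals, s);
  if (result == Truth::NO) return true;
  if (ShouldEvaluateBloomFilter(op, result, s.bloom)) {
    for (int64_t v : literals) {
      if (s.bloom->MightContain(v)) return false;
    }
    return true;  // every literal is definitely absent
  }
  return false;  // range predicates get no help from the bloom filter
}
```

With a row group holding the values 10, 20 and 30, the predicate "x IN (15, 25)" passes the min/max check (both literals lie in [10, 30]) and is only eliminated by the bloom filter, whereas "x < 25" never consults it. This is the kind of case where pushing down only the range predicates leaves the bloom filters unused.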



> Push down IN-list predicate to ORC reader
> -----------------------------------------
>
>                 Key: IMPALA-10873
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10873
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>            Reporter: Quanlong Huang
>            Priority: Major
>
> IMPALA-6505 pushes down the min-max predicates into the ORC reader. Since ORC's SearchArguments also support IN-list predicates, we can consider pushing down IN-list and NOT IN-list predicates into it as well.


