You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2020/03/24 19:17:00 UTC

[jira] [Commented] (IMPALA-9539) Enable the conjunctive normal form rewrites by default

    [ https://issues.apache.org/jira/browse/IMPALA-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17066124#comment-17066124 ] 

ASF subversion and git services commented on IMPALA-9539:
---------------------------------------------------------

Commit 1411ca6a00d408956bd63d20995d13c3e6ded1b1 in impala's branch refs/heads/master from Aman Sinha
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=1411ca6 ]

IMPALA-9183: Convert disjunctive predicates to conjunctive normal form

Added an expression rewrite rule to convert a disjunctive predicate to
conjunctive normal form (CNF). Converting to CNF enables multi-table
predicates that were only evaluated by a Join operator to be converted
into either single-table conjuncts that are eligible for predicate pushdown
to the scan operator or other multi-table conjuncts that are eligible to
be pushed to a Join below. This helps improve performance for such queries.

Since converting to CNF expands the number of expressions, we place a
limit on the maximum number of CNF exprs (each AND is counted as 1 CNF expr)
that are considered. Once the MAX_CNF_EXPRS limit (default is unlimited) is
exceeded, whatever expression was supplied to the rule is returned without
further transformation. A setting of -1 or 0 allows unlimited number of
CNF exprs to be created upto int32 max. Another option ENABLE_CNF_REWRITES
enables or disables the entire rewrite. This is False by default until we
have done more thorough functional testing (tracking JIRA IMPALA-9539).

Examples of rewrites:
 original: (a AND b) OR c
 rewritten: (a OR c) AND (b OR c)

 original: (a AND b) OR (c AND d)
 rewritten: (a OR c) AND (a OR d) AND (b OR c) AND (b OR d)

 original: NOT(a OR b)
 rewritten: NOT(a) AND NOT(b)

Testing:
 - Added new unit tests with variations of disjunctive predicates
   and verified their Explain plans
 - Manually tested the result correctness on impala shell by running
   these queries with ENABLE_CNF_REWRITES enabled and disabled
 - Added TPC-H q7, q19 and TPC-DS q13 with the CNF rewrite enabled
 - Preliminary performance testing of TPC-DS q13 on a 10TB scale factor
   shows almost 5x improvement:
      Original baseline: 47.5 sec
      With this patch and CNF rewrite enabled: 9.4 sec

Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072
Reviewed-on: http://gerrit.cloudera.org:8080/15462
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Enable the conjunctive normal form rewrites by default
> ------------------------------------------------------
>
>                 Key: IMPALA-9539
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9539
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>            Reporter: Aman Sinha
>            Assignee: Aman Sinha
>            Priority: Major
>
> IMPALA-9183 adds the functionality to convert disjunctive predicates into conjunctive normal form (CNF).  Currently, the rewrite is disabled by default and  can be enabled by setting enable_cnf_rewrites to true.  Since it can cause several plan changes, we should ensure it does not cause regressions (both in terms of result correctness and performance) and then enable it by default.  This JIRA is a follow-up to IMPALA-9183. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org