Posted to commits@impala.apache.org by tm...@apache.org on 2021/03/03 00:38:45 UTC

[impala] 03/06: IMPALA-10492: Lower default MAX_CNF_EXPRS query option

This is an automated email from the ASF dual-hosted git repository.

tmarshall pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 801efdcf097e82c79a077141c753d6d4d44d83e8
Author: Riza Suminto <ri...@cloudera.com>
AuthorDate: Fri Feb 26 07:39:17 2021 -0800

    IMPALA-10492: Lower default MAX_CNF_EXPRS query option
    
    MAX_CNF_EXPRS was set to unlimited by default. The CNF rewrite can lead
    to significant frontend memory usage, and eventually an OutOfMemory
    error, for a complex query that contains many predicates. We need to
    lower the default value to avoid this memory problem while maintaining
    performance for our TPC-DS and TPC-H workloads.
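
    To illustrate why an unbounded CNF rewrite can exhaust memory, here is
    a minimal Python sketch. It is not Impala's ConvertToCNFRule code: the
    function names and the list-of-lists predicate representation are
    hypothetical, and it counts resulting clauses rather than counting
    each AND as one expression as MAX_CNF_EXPRS does. It only shows that
    distributing OR over AND multiplies the conjunct counts together, and
    how a cap of 200 would stop the expansion early.

```python
from itertools import product


def dnf_to_cnf(dnf):
    """Distribute OR over AND for a disjunction of conjunctions.

    `dnf` is a list of conjunctions, each a list of predicate names.
    Every CNF clause picks exactly one predicate from each conjunction.
    """
    return [tuple(choice) for choice in product(*dnf)]


def cnf_within_limit(dnf, max_exprs):
    """Return the CNF clauses, or None if there would be more than max_exprs.

    Hypothetical stand-in for a limit check: max_exprs <= 0 means no limit.
    The clause count is computed first so nothing is materialized when the
    limit would be exceeded.
    """
    total = 1
    for conjunction in dnf:
        total *= len(conjunction)
    if max_exprs > 0 and total > max_exprs:
        return None
    return dnf_to_cnf(dnf)


# Five OR-ed branches with three AND-ed predicates each: 3**5 clauses.
example = [[f"p{i}_{j}" for j in range(3)] for i in range(5)]
unlimited = cnf_within_limit(example, 0)
capped = cnf_within_limit(example, 200)
```

    In this sketch, adding one more three-predicate branch triples the
    clause count, which is the kind of growth a finite default guards
    against.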
    
    We investigated the maximum number of CNF expressions in TPC-DS and
    TPC-H by printing the final value of 'numCnfExprs_' from
    ConvertToCNFRule.java to the query profile. We found five queries that
    apply CNF rewrite rules, as follows:
    
    | Query     | numCnfExprs_ |
    |-----------+--------------|
    | TPCDS-Q13 |          168 |
    | TPCDS-Q85 |          100 |
    | TPCDS-Q48 |           34 |
    | TPCH-Q19  |          124 |
    | TPCH-Q7   |            3 |
    
    This patch lowers the default value from unlimited to 200 based on the
    results above.
    
    Testing:
    - Manually verify that MAX_CNF_EXPRS 200 is enough for our TPC-DS and
      TPC-H workloads.
    - Pass core tests.
    
    Change-Id: I7ca3d0e094ac01c24a046c25d6a1b56bf134faa8
    Reviewed-on: http://gerrit.cloudera.org:8080/17132
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 common/thrift/ImpalaInternalService.thrift | 2 +-
 common/thrift/ImpalaService.thrift         | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/common/thrift/ImpalaInternalService.thrift b/common/thrift/ImpalaInternalService.thrift
index fb5e628..2983d12 100644
--- a/common/thrift/ImpalaInternalService.thrift
+++ b/common/thrift/ImpalaInternalService.thrift
@@ -417,7 +417,7 @@ struct TQueryOptions {
   100: optional bool enable_cnf_rewrites = true;
 
   // See comment in ImpalaService.thrift
-  101: optional i32 max_cnf_exprs = 0;
+  101: optional i32 max_cnf_exprs = 200;
 
   // See comment in ImpalaService.thrift
   102: optional i64 kudu_snapshot_read_timestamp_micros = 0;
diff --git a/common/thrift/ImpalaService.thrift b/common/thrift/ImpalaService.thrift
index 66a8335..0a48f80 100644
--- a/common/thrift/ImpalaService.thrift
+++ b/common/thrift/ImpalaService.thrift
@@ -514,7 +514,7 @@ enum TImpalaQueryOptions {
 
   // The max number of conjunctive normal form (CNF) exprs to create when converting
   // a disjunctive expression to CNF. Each AND counts as 1 expression. A value of
-  // -1 or 0 means no limit. Default is 0 (unlimited).
+  // -1 or 0 means no limit. Default is 200.
   MAX_CNF_EXPRS = 100
 
   // Set the timestamp for Kudu snapshot reads in Unix time micros. Only valid if