Posted to commits@impala.apache.org by cs...@apache.org on 2020/04/27 18:54:13 UTC
[impala] branch master updated (0eef60c -> 6208088)
This is an automated email from the ASF dual-hosted git repository.
csringhofer pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git.
from 0eef60c IMPALA-9648: Exclude netty and netty-all from hadoop-hdfs mvn download
new f129a17 IMPALA-9648: Don't ban netty 3* from fe/pom.xml
new 6208088 IMPALA-9693: Analyze predicate in CNF rule if not previously done
The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
Summary of changes:
fe/pom.xml | 2 -
.../apache/impala/rewrite/ConvertToCNFRule.java | 3 +
.../queries/PlannerTest/convert-to-cnf.test | 67 ++++++++++++++++++----
3 files changed, 59 insertions(+), 13 deletions(-)
[impala] 02/02: IMPALA-9693: Analyze predicate in CNF rule if not previously done
Posted by cs...@apache.org.
csringhofer pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git
commit 62080883feffbad4ff945d541981e70207950be9
Author: Aman Sinha <am...@cloudera.com>
AuthorDate: Sun Apr 26 20:32:09 2020 -0700
IMPALA-9693: Analyze predicate in CNF rule if not previously done
The OrderByElement's expr that is used during the rewrite
phase was not analyzed, which caused an INVALID_TYPE
assert when the CNF rule tried to process the predicate
within the ORDER BY. This patch fixes the problem by
explicitly analyzing the compound predicate in the CNF
rule. This is a conservative approach, so that it also
catches other un-analyzed predicates that may be passed
in from any other clause. An alternative, replacing the
OrderByElement's expr with an analyzed version, works for
this scenario but causes test failures in ExprRewriterTest,
so I have opted for this approach instead.
Testing:
- Added tests with a compound predicate in the ORDER BY,
either in the main query block or within an analytic function.
- Ran 'mvn test' for FE.
Change-Id: Iff71871bd69a068f4b5807161cffa7a49d76226d
Reviewed-on: http://gerrit.cloudera.org:8080/15815
Reviewed-by: Quanlong Huang <hu...@gmail.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
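The guard described in the commit message can be illustrated in isolation. The sketch below is not Impala code: Predicate and Analyzer are hypothetical stand-ins, getIds() is simplified to one argument, and the IllegalStateException only simulates the INVALID_TYPE assert. Only the isAnalyzed()/analyzeNoThrow() call pattern is taken from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for Impala's analyzer and predicate classes.
// Only the isAnalyzed()/analyzeNoThrow() call pattern mirrors the patch.
class AnalyzeGuardSketch {
  static class Analyzer {}

  static class Predicate {
    private final List<Integer> referencedTupleIds;
    private boolean analyzed = false;

    Predicate(List<Integer> referencedTupleIds) {
      this.referencedTupleIds = referencedTupleIds;
    }

    boolean isAnalyzed() { return analyzed; }

    // Assumption: analysis cannot fail in this toy model.
    void analyzeNoThrow(Analyzer analyzer) { analyzed = true; }

    // Simulates the failure mode: tuple ids are unavailable before analysis.
    void getIds(List<Integer> tids) {
      if (!analyzed) throw new IllegalStateException("predicate not analyzed");
      for (Integer id : referencedTupleIds) {
        if (!tids.contains(id)) tids.add(id);
      }
    }
  }

  // Returns true when the predicate references more than one tuple,
  // i.e. when a CNF rewrite would be attempted rather than skipped.
  static boolean needsCnfRewrite(Predicate pred, Analyzer analyzer) {
    if (!pred.isAnalyzed()) {
      pred.analyzeNoThrow(analyzer);  // the guard this patch adds
    }
    List<Integer> tids = new ArrayList<>();
    pred.getIds(tids);
    return tids.size() > 1;
  }

  public static void main(String[] args) {
    Analyzer analyzer = new Analyzer();
    // Un-analyzed predicate spanning two tuples: analyzed on demand.
    System.out.println(needsCnfRewrite(new Predicate(Arrays.asList(0, 1)), analyzer)); // true
    // Un-analyzed predicate on a single tuple: rewrite can be skipped.
    System.out.println(needsCnfRewrite(new Predicate(Arrays.asList(0)), analyzer));    // false
  }
}
```

Placing the check inside the rule, rather than at each call site, is what makes the approach conservative: any caller that hands over an un-analyzed predicate is covered.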
---
.../apache/impala/rewrite/ConvertToCNFRule.java | 3 +
.../queries/PlannerTest/convert-to-cnf.test | 67 ++++++++++++++++++----
2 files changed, 59 insertions(+), 11 deletions(-)
diff --git a/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java b/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java
index b955a21..9b95f1a 100644
--- a/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java
+++ b/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java
@@ -109,6 +109,9 @@ public class ConvertToCNFRule implements ExprRewriteRule {
// check if this predicate references one or more tuples. If only 1 tuple,
// we can skip the rewrite since the disjunct can be pushed down as-is
List<TupleId> tids = new ArrayList<>();
+ if (!cpred.isAnalyzed()) {
+ cpred.analyzeNoThrow(analyzer);
+ }
cpred.getIds(tids, null);
if (tids.size() <= 1) {
return cpred;
diff --git a/testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test b/testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test
index 5fdc116..f643507 100644
--- a/testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test
+++ b/testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test
@@ -24,7 +24,7 @@ PLAN-ROOT SINK
| row-size=16B cardinality=150.00K
|
00:SCAN HDFS [tpch_parquet.lineitem]
- HDFS partitions=1/1 files=3 size=193.99MB
+ HDFS partitions=1/1 files=3 size=193.98MB
predicates: l_partkey > 0, l_suppkey > 10 OR l_suppkey > 30
runtime filters: RF000 -> l_orderkey
row-size=24B cardinality=600.12K
@@ -54,7 +54,7 @@ PLAN-ROOT SINK
| row-size=16B cardinality=150.00K
|
00:SCAN HDFS [tpch_parquet.lineitem]
- HDFS partitions=1/1 files=3 size=193.99MB
+ HDFS partitions=1/1 files=3 size=193.98MB
predicates: l_partkey > 0, l_suppkey > 10 OR l_suppkey > 30
row-size=24B cardinality=600.12K
====
@@ -84,7 +84,7 @@ PLAN-ROOT SINK
| row-size=16B cardinality=150.00K
|
00:SCAN HDFS [tpch_parquet.lineitem]
- HDFS partitions=1/1 files=3 size=193.99MB
+ HDFS partitions=1/1 files=3 size=193.98MB
predicates: l_partkey > 0, l_suppkey <= 30 OR l_suppkey >= 30 AND l_suppkey <= 50, l_suppkey >= 10 OR l_suppkey >= 30 AND l_suppkey <= 50
runtime filters: RF000 -> l_orderkey
row-size=24B cardinality=600.12K
@@ -118,7 +118,7 @@ PLAN-ROOT SINK
| row-size=16B cardinality=91
|
00:SCAN HDFS [tpch_parquet.lineitem]
- HDFS partitions=1/1 files=3 size=193.99MB
+ HDFS partitions=1/1 files=3 size=193.98MB
predicates: l_partkey > 0, l_suppkey IN (10, 30, 50)
runtime filters: RF000 -> l_orderkey
row-size=24B cardinality=586
@@ -144,7 +144,7 @@ PLAN-ROOT SINK
| row-size=40B cardinality=600.12K
|
|--00:SCAN HDFS [tpch_parquet.lineitem]
-| HDFS partitions=1/1 files=3 size=193.99MB
+| HDFS partitions=1/1 files=3 size=193.98MB
| predicates: l_partkey > 0
| row-size=24B cardinality=600.12K
|
@@ -182,7 +182,7 @@ PLAN-ROOT SINK
| row-size=16B cardinality=150.00K
|
00:SCAN HDFS [tpch_parquet.lineitem]
- HDFS partitions=1/1 files=3 size=193.99MB
+ HDFS partitions=1/1 files=3 size=193.98MB
predicates: l_partkey > 0
runtime filters: RF000 -> l_orderkey
row-size=24B cardinality=600.12K
@@ -216,7 +216,7 @@ PLAN-ROOT SINK
| row-size=16B cardinality=150.00K
|
00:SCAN HDFS [tpch_parquet.lineitem]
- HDFS partitions=1/1 files=3 size=193.99MB
+ HDFS partitions=1/1 files=3 size=193.98MB
predicates: l_partkey > 0, l_suppkey <= 50 OR l_suppkey >= 30 AND l_suppkey <= 90, l_suppkey >= 10 OR l_suppkey >= 30 AND l_suppkey <= 90
runtime filters: RF000 -> l_orderkey
row-size=24B cardinality=600.12K
@@ -250,7 +250,7 @@ PLAN-ROOT SINK
| row-size=16B cardinality=150.00K
|
00:SCAN HDFS [tpch_parquet.lineitem]
- HDFS partitions=1/1 files=3 size=193.99MB
+ HDFS partitions=1/1 files=3 size=193.98MB
predicates: l_partkey > 0, l_suppkey <= 50 OR l_suppkey >= 30 AND l_suppkey <= 90, l_suppkey >= 10 OR l_suppkey >= 30 AND l_suppkey <= 90
runtime filters: RF000 -> l_orderkey
row-size=24B cardinality=600.12K
@@ -278,7 +278,7 @@ PLAN-ROOT SINK
| row-size=40B cardinality=600.12K
|
|--00:SCAN HDFS [tpch_parquet.lineitem]
-| HDFS partitions=1/1 files=3 size=193.99MB
+| HDFS partitions=1/1 files=3 size=193.98MB
| predicates: l_partkey > 0
| row-size=24B cardinality=600.12K
|
@@ -308,7 +308,7 @@ PLAN-ROOT SINK
| row-size=20B cardinality=5
|
00:SCAN HDFS [tpch_parquet.lineitem]
- HDFS partitions=1/1 files=3 size=193.99MB
+ HDFS partitions=1/1 files=3 size=193.98MB
row-size=8B cardinality=6.00M
====
# IMPALA-9620: query2
@@ -325,6 +325,51 @@ PLAN-ROOT SINK
| row-size=1B cardinality=2
|
00:SCAN HDFS [tpch_parquet.lineitem]
- HDFS partitions=1/1 files=3 size=193.99MB
+ HDFS partitions=1/1 files=3 size=193.98MB
+ row-size=8B cardinality=6.00M
+====
+# Test predicates in the SELECT and ORDER-BY
+select l_quantity,
+ if(l_quantity < 5 or l_quantity > 45, 'invalid', 'valid')
+ from lineitem
+ order by l_quantity,
+ if(l_quantity < 5 or l_quantity > 45, 'invalid', 'valid')
+ limit 5
+---- QUERYOPTIONS
+ENABLE_CNF_REWRITES=true
+---- PLAN
+PLAN-ROOT SINK
+|
+01:TOP-N [LIMIT=5]
+| order by: l_quantity ASC, if(l_quantity < 5 OR l_quantity > 45, 'invalid', 'valid') ASC
+| row-size=20B cardinality=5
+|
+00:SCAN HDFS [tpch_parquet.lineitem]
+ HDFS partitions=1/1 files=3 size=193.98MB
+ row-size=8B cardinality=6.00M
+====
+# Test predicate in the ORDER BY of an analytic function.
+select rank() over
+ (order by if(l_quantity < 5 or l_quantity > 45, 'invalid', 'valid'))
+ from tpch.lineitem
+ limit 5;
+---- QUERYOPTIONS
+ENABLE_CNF_REWRITES=true
+---- PLAN
+PLAN-ROOT SINK
+|
+02:ANALYTIC
+| functions: rank()
+| order by: if(l_quantity < 5 OR l_quantity > 45, 'invalid', 'valid') ASC
+| window: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
+| limit: 5
+| row-size=28B cardinality=5
+|
+01:SORT
+| order by: if(l_quantity < 5 OR l_quantity > 45, 'invalid', 'valid') ASC
+| row-size=20B cardinality=6.00M
+|
+00:SCAN HDFS [tpch.lineitem]
+ HDFS partitions=1/1 files=1 size=718.94MB
row-size=8B cardinality=6.00M
====
[impala] 01/02: IMPALA-9648: Don't ban netty 3* from fe/pom.xml
Posted by cs...@apache.org.
csringhofer pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git
commit f129a179a2c1b304e4d15fe4950449c5786abda1
Author: David Knupp <dk...@cloudera.com>
AuthorDate: Mon Apr 27 10:26:16 2020 -0700
IMPALA-9648: Don't ban netty 3* from fe/pom.xml
netty-all < 4.1.46 and netty < 3.10.6 were banned in an earlier patch,
but we have found that in some environments, netty 3.10.5 is still present,
so this can cause build failures. The ban on netty-all is not an issue.
While we sort out what needs to be done with regard to netty 3.10.5, we'll
temporarily remove the ban. This change may become permanent based on
further investigation.
Change-Id: Ib1a55f22f1925872c0d19aaf0670404203dcca54
Reviewed-on: http://gerrit.cloudera.org:8080/15819
Reviewed-by: Csaba Ringhofer <cs...@cloudera.com>
Tested-by: David Knupp <dk...@cloudera.com>
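To make the version thresholds concrete, here is a minimal sketch of a "below the floor" check. It is an illustration only: VersionFloorSketch, numericParts and isBelow are hypothetical names, and the comparison looks only at the leading numeric components, which is an assumption and not how Maven itself orders versions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Impala or Maven.
class VersionFloorSketch {
  // Collects the leading numeric dotted components,
  // e.g. "3.10.5.Final" -> [3, 10, 5]. Qualifiers are ignored (assumption).
  static List<Integer> numericParts(String version) {
    List<Integer> parts = new ArrayList<>();
    for (String piece : version.split("\\.")) {
      if (!piece.matches("\\d+")) break;
      parts.add(Integer.parseInt(piece));
    }
    return parts;
  }

  // True when 'version' is strictly lower than 'floor'.
  // Missing components are treated as 0.
  static boolean isBelow(String version, String floor) {
    List<Integer> v = numericParts(version);
    List<Integer> f = numericParts(floor);
    int n = Math.max(v.size(), f.size());
    for (int i = 0; i < n; i++) {
      int a = i < v.size() ? v.get(i) : 0;
      int b = i < f.size() ? f.get(i) : 0;
      if (a != b) return a < b;
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println(isBelow("3.10.5.Final", "3.10.6")); // true
    System.out.println(isBelow("3.10.6.Final", "3.10.6")); // false
    System.out.println(isBelow("4.1.46.Final", "4.1.46")); // false
  }
}
```

Under this model, netty 3.10.5 falls below the 3.10.6 floor, which is why an environment that still ships it would trip the ban.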
---
fe/pom.xml | 2 --
1 file changed, 2 deletions(-)
diff --git a/fe/pom.xml b/fe/pom.xml
index 70d17d6..59d6eeb 100644
--- a/fe/pom.xml
+++ b/fe/pom.xml
@@ -758,8 +758,6 @@ under the License.
<exclude>org.fusesource.leveldbjni:*</exclude>
<!-- IMPALA-9647 (re: CVE-2014-3577, CVE-2015-5262) -->
<exclude>org.apache.httpcomponents:fluent-hc</exclude>
- <!-- IMPALA-9648: Ensure that netty < v3.10.6 is not present. -->
- <exclude>io.netty:netty:[3.10.6,)</exclude>
<!-- IMPALA-9648: Ensure that netty-all < v4.1.46 is not present. -->
<exclude>io.netty:netty-all:[4.1.46,)</exclude>
<!-- Assert that we only use artifacts from only the specified