You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Qifan Chen (Code Review)" <ge...@cloudera.org> on 2020/12/01 16:54:46 UTC
[Impala-ASF-CR] IMPALA-10360: Allow simple limit to be treated as sampling hint
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16792 )
Change subject: IMPALA-10360: Allow simple limit to be treated as sampling hint
......................................................................
Patch Set 2:
(6 comments)
Looks good to me!
http://gerrit.cloudera.org:8080/#/c/16792/2/fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
File fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java:
http://gerrit.cloudera.org:8080/#/c/16792/2/fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java@79
PS2, Line 79: if (Expr.IS_ALWAYS_TRUE_PREDICATE.apply(e1) &&
: Expr.IS_ALWAYS_TRUE_PREDICATE.apply(e2)) {
: setHasAlwaysTrueHint(true);
Maybe better case on the op here. For OR, when one branch is always true, we can set setHasAlwaysTrueHint(true).
http://gerrit.cloudera.org:8080/#/c/16792/2/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:
http://gerrit.cloudera.org:8080/#/c/16792/2/fe/src/main/java/org/apache/impala/analysis/Expr.java@520
PS2, Line 520: if (conjuncts.size() > 1) {
Seems we should do so even when for a single predicate case.
http://gerrit.cloudera.org:8080/#/c/16792/2/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java:
http://gerrit.cloudera.org:8080/#/c/16792/2/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@223
PS2, Line 223: if (getTableRefs().size() == 1)
It seems to me that we should check hasConvertLimitToSampleHint() on a single table case.
SELECT * FROM T WHERE ...
vs.
SELECT * FROM T [convert_limit_to_sample] WHERE ...
http://gerrit.cloudera.org:8080/#/c/16792/2/fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java
File fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java:
http://gerrit.cloudera.org:8080/#/c/16792/2/fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java@209
PS2, Line 209: estimatedTotalRows
When the #rows are indeed estimated due to missing or corrupt stats, it may be useful to inflate the sampling rate, say by 3 times to reduce the chance of under-sampling, or to discard the table for consideration completely.
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java#L1219
http://gerrit.cloudera.org:8080/#/c/16792/2/testdata/workloads/functional-planner/queries/PlannerTest/optimize-simple-limit.test
File testdata/workloads/functional-planner/queries/PlannerTest/optimize-simple-limit.test:
http://gerrit.cloudera.org:8080/#/c/16792/2/testdata/workloads/functional-planner/queries/PlannerTest/optimize-simple-limit.test@259
PS2, Line 259: # Query against a view that has WHERE clause hint.
: # Both the num partitions and files are pruned.
: select * from functional.alltypes_dp_2_view_1 limit 10;
: ---- PLAN
Nice test queries!
http://gerrit.cloudera.org:8080/#/c/16792/2/testdata/workloads/functional-planner/queries/PlannerTest/optimize-simple-limit.test@275
PS2, Line 275: select * from functional.alltypes_dp_2_view_2 limit 10;
May consider add one or two queries against tables directly (i.e., no views).
--
To view, visit http://gerrit.cloudera.org:8080/16792
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife05a5343c913006f7659949b327b63d3f10c04b
Gerrit-Change-Number: 16792
Gerrit-PatchSet: 2
Gerrit-Owner: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Shant Hovsepian <sh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 01 Dec 2020 16:54:46 +0000
Gerrit-HasComments: Yes