Posted to reviews@impala.apache.org by "wangsheng (Code Review)" <ge...@cloudera.org> on 2021/11/12 16:37:28 UTC

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

wangsheng has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18023 )


Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................

IMPALA-7942: Add query hints for cardinalities and selectivities

Currently, Impala only uses simple estimation to compute selectivity.
For some predicates, this may lead to a worse query plan from the
CBO. Hence, we add new hints to set these stats manually in a query
to help the CBO produce a better plan. In the future, we may be able
to use histograms to get more precise query plans.

This patch adds two query hints: 'HDFS_NUM_ROWS' and 'SELECTIVITY'.
'HDFS_NUM_ROWS' can be added after an HDFS table in a query like this:

  select col from t /* +HDFS_NUM_ROWS(1000) */;

If set, Impala will use this value as the number of rows scanned from
the table. However, the hint value is only used when the table has no
stats or its stats are corrupt. Otherwise, Impala uses the table's
original stats.
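
The fallback rule described above can be sketched in Java as follows. This is only an illustration of the behavior stated in this patch set: the class and method names are hypothetical, and the sketch assumes the caller has already mapped missing or corrupt stats to a negative value.

```java
// Hypothetical sketch; these names are illustrative, not Impala's actual API.
final class RowCountHintSketch {
  /** Sentinel for "no value available". */
  static final long UNKNOWN = -1;

  /**
   * Returns the row count a planner would use for one table reference.
   *
   * @param statsNumRows row count from table stats; negative if the stats
   *                     are missing or corrupt (assumption of this sketch)
   * @param hintNumRows  value of the HDFS_NUM_ROWS hint; negative if no hint
   */
  static long effectiveNumRows(long statsNumRows, long hintNumRows) {
    if (statsNumRows >= 0) return statsNumRows; // usable stats win over the hint
    if (hintNumRows >= 0) return hintNumRows;   // stats unusable: use the hint
    return UNKNOWN;                             // neither source is available
  }
}
```

With usable stats of 500 rows and a hint of 1000, the sketch returns 500. The hint only matters once the stats are unusable.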

The 'SELECTIVITY' hint can be used with these predicates:
* BinaryPredicate
* InPredicate
* IsNullPredicate
* LikePredicate, including 'not like' syntax
* BetweenPredicate, including 'not between and' syntax
The format is like this:

  select col from t where a=1 /* +SELECTIVITY(0.5) */;

This value replaces the original selectivity computation. These
formats are not allowed:

  select col from t where (a=1) /* +SELECTIVITY(0.5) */;
  select col from t where (a=1 and b<2) /* +SELECTIVITY(0.5) */;
  select col from t1 where exists (...) /* +SELECTIVITY(0.5) */;

Note that if you set a selectivity hint like this:

  select col from t where (a=1 /* +SELECTIVITY(0.5) */ and b>2);

Impala will set 0.5 for the first binary predicate, but the second
remains -1, so Impala cannot compute its selectivity and the
selectivity of the whole compound predicate is still unavailable.
Hence, for a compound predicate, we need to ensure that each child's
selectivity is either set by a hint or computable. Otherwise, the
hint may not take effect as you expect.
In addition, Impala transforms a 'BetweenPredicate' into a
'CompoundPredicate' with two 'BinaryPredicate' children. If a hint is
set for a 'BetweenPredicate' in the query, we split the hint value
between the two 'BinaryPredicate' children.
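
To illustrate why one unavailable child leaves the whole compound predicate without a selectivity, here is a minimal Java sketch. The class and method names are hypothetical, and combining the children by multiplication is only the usual independence assumption, not something this commit message states.

```java
// Hypothetical sketch; not the actual Impala implementation.
final class CompoundSelectivitySketch {
  /** Sentinel for "selectivity not set by a hint and not computable". */
  static final double UNKNOWN = -1.0;

  /**
   * Selectivity of "left AND right". If either child is unknown, the
   * result is unknown too. Otherwise the children are multiplied, which
   * assumes the two predicates are independent (assumption of this sketch).
   */
  static double andSelectivity(double left, double right) {
    if (left < 0 || right < 0) return UNKNOWN;
    return left * right;
  }
}
```

For `(a=1 /* +SELECTIVITY(0.5) */ and b>2)`, where `b>2` has no hint and cannot be estimated, the sketch receives 0.5 and -1 and reports the compound selectivity as unknown.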

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/hdfs-cardinality-hint.test
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
13 files changed, 1,445 insertions(+), 18 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/1
-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 1
Gerrit-Owner: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................

IMPALA-7942: Add query hints for cardinalities and selectivities

Currently, Impala only uses simple estimation to compute selectivity.
For some predicates, this may lead to a worse query plan from the
CBO. Hence, we add new hints to set these stats manually in a query
to help the CBO produce a better plan. In the future, we may be able
to use histograms to get more precise query plans.

This patch adds two query hints: 'HDFS_NUM_ROWS' and 'SELECTIVITY'.
'HDFS_NUM_ROWS' can be added after an HDFS table in a query like this:

  select col from t /* +HDFS_NUM_ROWS(1000) */;

If set, Impala will use this value as the number of rows scanned from
the table. However, the hint value is only used when the table has no
stats or its stats are corrupt. Otherwise, Impala uses the table's
original stats.

The 'SELECTIVITY' hint can be used with these predicates:
* BinaryPredicate
* InPredicate
* IsNullPredicate
* LikePredicate, including 'not like' syntax
* BetweenPredicate, including 'not between and' syntax
The format is like this:

  select col from t where a=1 /* +SELECTIVITY(0.5) */;

This value replaces the original selectivity computation. These
formats are not allowed:

  select col from t where (a=1) /* +SELECTIVITY(0.5) */;
  select col from t where (a=1 and b<2) /* +SELECTIVITY(0.5) */;
  select col from t1 where exists (...) /* +SELECTIVITY(0.5) */;

Note that if you set a selectivity hint like this:

  select col from t where (a=1 /* +SELECTIVITY(0.5) */ and b>2);

Impala will set 0.5 for the first binary predicate, but the second
remains -1, so Impala cannot compute its selectivity and the
selectivity of the whole compound predicate is still unavailable.
Hence, for a compound predicate, we need to ensure that each child's
selectivity is either set by a hint or computable. Otherwise, the
hint may not take effect as you expect.
In addition, Impala transforms a 'BetweenPredicate' into a
'CompoundPredicate' with two 'BinaryPredicate' children. If a hint is
set for a 'BetweenPredicate' in the query, we split the hint value
between the two 'BinaryPredicate' children.

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/hdfs-cardinality-hint.test
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
13 files changed, 1,475 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/2

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 2
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Kurt Deschler (Code Review)" <ge...@cloudera.org>.
Kurt Deschler has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 23:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/18023/23/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/18023/23/fe/src/main/java/org/apache/impala/analysis/Expr.java@1068
PS23, Line 1068:         && !((CompoundPredicate) this).selectivityValidHintSet()) {
> Why? This patch only add a selectivity hint check in original if-condition,
This is a generic function to collect conjuncts. The definition should not change if a predicate hint is set. If you want to parameterize this function to behave differently, I suggest that you pass a flag in so that the semantics are clear from all of the callers.


http://gerrit.cloudera.org:8080/#/c/18023/23/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
File fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java:

http://gerrit.cloudera.org:8080/#/c/18023/23/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java@70
PS23, Line 70:     result.setSelectivityHint(bp.getSelectivity());
> This rule is to transform BetweenPredicate to CompoundPredicate, which is t
What if the compound predicate is not a between predicate? Checks here seem too broad to isolate that case.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 23
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 30 Mar 2023 20:01:54 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 26:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12724/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 26
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Fri, 31 Mar 2023 08:09:21 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 30:

Finally, I removed the modification in 'Expr', so this patch does not support selectivity hints for 'AND' compound predicates. Maybe we can implement this in another patch.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 30
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 12 Apr 2023 16:00:01 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 30:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12784/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 30
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 12 Apr 2023 16:19:20 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 34:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12818/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 34
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 19 Apr 2023 13:23:08 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 34: Code-Review+2



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 34
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 20 Apr 2023 10:13:27 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 7:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@24
PS3, Line 24:   * InPredicate
> I meant to say in primitive form.
Done


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@49
PS3, Line 49: ' with two 'BinaryPredicate', if
            : set hint 
> Thanks a lot for the research. It is great. 
Thanks for your deep research, Qifan. As you mentioned above:
1. For predicate 'int_col > 1 and smallint_col > 3', the conjuncts are 'int_col > 1' and 'smallint_col > 3', which are two BinaryPredicates;  
2. For predicate 'int_col > 1 or smallint_col > 3', the only conjunct is 'int_col > 1 or smallint_col > 3', which is a CompoundPredicate;  
If we only handle a subset of complex expressions, this will confuse users. When can a selectivity hint be used for a compound predicate? I think it is difficult for users to judge in which situations the hint is valid, so the learning cost is very high.  
Besides, if we compute each child predicate's selectivity separately, we can reuse these values in different queries, for example 'int_col=1 and smallint_col=2' and 'int_col=1 and bool_col=0', since we believe that hot predicates appear in many queries.  
If setting the selectivity for a compound predicate is useful in your environment, we can consider this carefully in a sub-task, as I said. I think handling only a subset of complex expressions is not very suitable.  

I listed three cases in the commit message which are not supported:  
  * select col from t where (a=1) /* +SELECTIVITY(0.5) */;
  * select col from t where (a=1 and b<2) /* +SELECTIVITY(0.5) */;
  * select col from t1 where exists (...) /* +SELECTIVITY(0.5) */;


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@51
PS3, Line 51: value for two 'BinaryPredicate' children.
            : 
            : Testing:
            : - Added new fe tests in 'PlannerTest'
> See my comment on complex predicates above.
Done




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 7
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 15 Dec 2021 07:50:53 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 9:

(6 comments)

Looks good to me too. I have only reviewed the code section and will continue tomorrow to cover the tests. 

Thanks a lot for the rework, Wang Sheng!

http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/analysis/Predicate.java
File fe/src/main/java/org/apache/impala/analysis/Predicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/analysis/Predicate.java@78
PS9, Line 78: is
nit. should be a double value in (0, 1].
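
A range check for the (0, 1] interval mentioned here could look like the Java sketch below. The class and method names are hypothetical and do not come from the patch.

```java
// Hypothetical sketch of validating a SELECTIVITY hint value.
final class SelectivityHintCheck {
  /**
   * Returns true only for values in the half-open interval (0, 1].
   * NaN fails both comparisons and is therefore rejected as well.
   */
  static boolean isValidSelectivity(double value) {
    return value > 0.0 && value <= 1.0;
  }
}
```

Note that 0 is rejected while 1 is accepted, which matches the half-open interval in the comment above.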


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/analysis/TableRef.java
File fe/src/main/java/org/apache/impala/analysis/TableRef.java:

http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/analysis/TableRef.java@554
PS9, Line 554:         // This hint is only valid for hdfs table
nit. for this HDFS table reference. On paper, one can assign different # rows for distinct table refs referencing the same table. For example,

select * from T <+ TABLE_NUM_ROWS = 1000 >, T <+ TABLE_NUM_ROWS = 10 >;


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/analysis/TableRef.java@555
PS9, Line 555: valueOf
It seems we need to handle NumberFormatException thrown from this method. For example, TABLE_NUM_ROWS = 'abc'.
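
One way to handle the exception is sketched below in Java. The class and method names are hypothetical, and IllegalArgumentException is only a stand-in for whatever analysis error type the frontend would actually raise.

```java
// Hypothetical sketch of parsing a TABLE_NUM_ROWS hint argument.
final class NumRowsHintParser {
  /**
   * Parses the hint argument as a non-negative row count.
   * Non-numeric input such as "abc" is reported with a readable message
   * instead of letting NumberFormatException escape to the caller.
   */
  static long parseNumRows(String arg) {
    long rows;
    try {
      rows = Long.parseLong(arg.trim());
    } catch (NumberFormatException e) {
      // Stand-in error type; a real frontend would use its own exception.
      throw new IllegalArgumentException(
          "Invalid TABLE_NUM_ROWS value: '" + arg + "'", e);
    }
    if (rows < 0) {
      throw new IllegalArgumentException(
          "TABLE_NUM_ROWS must be non-negative: " + rows);
    }
    return rows;
  }
}
```

Calling `parseNumRows("1000")` returns 1000, while `parseNumRows("abc")` fails with the wrapped, descriptive error.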


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/planner/ScanNode.java
File fe/src/main/java/org/apache/impala/planner/ScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/planner/ScanNode.java@81
PS9, Line 81: Reserve
nit. Store


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/planner/ScanNode.java@81
PS9, Line 81: ,
nit. "."


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/planner/ScanNode.java@82
PS9, Line 82: if table does not have stats, or corrupt stats. Otherwise, Impala will
            :   // use table original stats.
I wonder if this is still correct. I thought the hint can be used to override any stats, valid or invalid.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 9
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Tue, 06 Sep 2022 21:39:02 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 29:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12756/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 29
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 06 Apr 2023 08:09:36 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Kurt Deschler (Code Review)" <ge...@cloudera.org>.
Kurt Deschler has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 29:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18023/29/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/18023/29/fe/src/main/java/org/apache/impala/analysis/Expr.java@1064
PS29, Line 1064:     if (! predicateHintValid(this)) {
> Sorry I didn't understand what you mean, can you show me the detail modific
Consider Expr.isTriviallyTrue() for example in the same file. That currently considers all of the Exprs returned by getConjuncts. Now if you have a hint, it is not going to consider the child conjuncts. I don't know if that will still be correct. If nothing else, it is hard to follow and a risk for future bugs. There are multiple other callers in the analyzer that I did not look at. My suggestion would be to not change the existing getConjuncts() or any logic that it calls. Instead, make a new function getLocalConjuncts and conditionally call that from where you need to, after determining that the conditional calls will not adversely affect correctness or other optimizations.
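
One possible reading of this suggestion is sketched below in Java on a miniature expression tree. Everything here is hypothetical: the Node class is not Impala's Expr, and the behavior given to getLocalConjuncts (keeping a hinted AND node whole) is an assumption about what the separate function would do.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of an expression tree; not Impala's Expr class.
final class ConjunctSketch {
  static final double NO_HINT = -1.0;

  static final class Node {
    final String op;              // "AND", "OR", or leaf text such as "a=1"
    final Node left, right;       // both null for leaves
    final double selectivityHint; // NO_HINT when no hint was given

    Node(String op, Node left, Node right, double selectivityHint) {
      this.op = op;
      this.left = left;
      this.right = right;
      this.selectivityHint = selectivityHint;
    }

    static Node leaf(String text) { return new Node(text, null, null, NO_HINT); }

    boolean isAnd() { return "AND".equals(op); }
  }

  /** Generic behavior, unchanged by hints: always split AND nodes. */
  static List<Node> getConjuncts(Node root) {
    List<Node> out = new ArrayList<>();
    collect(root, false, out);
    return out;
  }

  /** Hint-aware variant: an AND node that carries a hint is kept whole. */
  static List<Node> getLocalConjuncts(Node root) {
    List<Node> out = new ArrayList<>();
    collect(root, true, out);
    return out;
  }

  private static void collect(Node n, boolean keepHinted, List<Node> out) {
    boolean hinted = n.selectivityHint > 0;
    if (n.isAnd() && !(keepHinted && hinted)) {
      collect(n.left, keepHinted, out);
      collect(n.right, keepHinted, out);
    } else {
      out.add(n); // leaves, OR nodes, and (optionally) hinted AND nodes
    }
  }
}
```

Existing callers such as the isTriviallyTrue() example keep the generic getConjuncts() semantics, and only code that needs the hint opts in to the second function.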




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 29
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Mon, 10 Apr 2023 14:09:23 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 29:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18023/29/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/18023/29/fe/src/main/java/org/apache/impala/analysis/Expr.java@1067
PS29, Line 1067: // Selectivity hint cannot be set. If Predicate been set selectivity hint, we return
               :       // itself directly, such as CompoundPredicate. Otherwise, hint value will missing.
> Hi Qifan. What do you mean 'Note a Predicate contains the selectivityHint_ 
Option 1) probably will exclude lots of common predicates from utilizing this useful feature. But without some study, we may not get the right answer. So my vote for this patch is 1). 

Option 2) sounds like a good starting point to address 1).  But we need to find out how the selectivity at AND predicate is computed from child(0) and child(1). From that we may be able to back fit the selectivity hint from the AND predicate down. This will be for the general cases where  SELECT_HINT(AND) != -1.

For the special case:  SELECT_HINT(AND) = -1, then just store child(0) and child(1) in the list without changing their selectivity hint (even being -1) at all.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 29
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Tue, 11 Apr 2023 20:03:31 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 26:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/18023/23/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/18023/23/fe/src/main/java/org/apache/impala/analysis/Expr.java@1068
PS23, Line 1068:       // and conjunct evaluation.  This is not optimal for jitted exprs because it
> THis is a generic function to collect conjuncts. The definition should not 
I extracted a method 'allowConjunctsFromChild' here.

We added the selectivity hint check here because of CompoundPredicate. Without it, a selectivity hint on an AND CompoundPredicate would always be lost, since the conjuncts would be its children instead of the predicate itself.


http://gerrit.cloudera.org:8080/#/c/18023/23/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
File fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java:

http://gerrit.cloudera.org:8080/#/c/18023/23/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java@70
PS23, Line 70:     result.setSelectivityHint(bp.getSelectivity());
> What if the compound predicate is not a between predicate? Checks here seem
In my opinion, this rule is only used to transform a BetweenPredicate into a CompoundPredicate, so the input predicate must be a BetweenPredicate. If the current predicate is not a BetweenPredicate, the rule returns directly at line 41.
So I think setting the newly created CompoundPredicate's selectivity hint from the BetweenPredicate is reasonable.

I don't understand 'What if the compound predicate is not a between predicate'. Can you explain this in more detail?
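
A minimal sketch of the rewrite discussed here, under stated assumptions: BetweenSketch, CompoundSketch and BetweenRewriteSketch are hypothetical simplified types, not the real BetweenToCompoundRule, and this sketch copies the hint value while the reviewed line copies the result of bp.getSelectivity(). It only shows how a value attached to the BETWEEN predicate can be carried over to the compound predicate that replaces it.

```java
// Hypothetical, simplified predicate types; not Impala's real classes.
class BetweenSketch {
  final String col;
  final int lo;
  final int hi;
  final double selectivityHint; // -1 means "no hint"

  BetweenSketch(String col, int lo, int hi, double selectivityHint) {
    this.col = col;
    this.lo = lo;
    this.hi = hi;
    this.selectivityHint = selectivityHint;
  }
}

class CompoundSketch {
  final String left;
  final String right;
  double selectivityHint = -1;

  CompoundSketch(String left, String right) {
    this.left = left;
    this.right = right;
  }
}

class BetweenRewriteSketch {
  // Rewrites "col BETWEEN lo AND hi" into "col >= lo AND col <= hi"
  // and carries the hint over to the new compound predicate.
  static CompoundSketch rewrite(BetweenSketch bp) {
    CompoundSketch result =
        new CompoundSketch(bp.col + " >= " + bp.lo, bp.col + " <= " + bp.hi);
    result.selectivityHint = bp.selectivityHint;
    return result;
  }
}
```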



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 26
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Fri, 31 Mar 2023 07:49:01 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................

IMPALA-7942 (part 2): Add query hints for predicate selectivities

Currently, Impala only uses simple estimation to compute selectivity.
For some predicates, this may lead to worse query plan due to CBO.

This patch adds a new query hint: 'SELECTIVITY' to help specify a
selectivity value for a predicate.

The parser will interpret expressions wrapped in () followed by a
C-style comment /* <predicate hint> */ as a predicate hint. The
predicate hint currently can be in the form of +SELECTIVITY(f) where
'f' is a positive floating point number, in the range of (0, 1], to
use as the selectivity for the preceding expression.

Single predicate example:

  select col from t where (a=1) /* +SELECTIVITY(0.5) */;

Compound predicate example:

  select col from t where (a=1 or b=2) /* +SELECTIVITY(0.5) */;

As a limitation of this patch, the selectivity hints for 'AND' compound
predicates, either in the original SQL query or internally generated,
are ignored. We may support this in the near future.

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Reviewed-on: http://gerrit.cloudera.org:8080/18023
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Reviewed-by: Qifan Chen <qf...@hotmail.com>
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
10 files changed, 414 insertions(+), 5 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Qifan Chen: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 36
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 29:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18023/28/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/18023/28/fe/src/main/java/org/apache/impala/analysis/Expr.java@1065
PS28, Line 1065:       list.addAll(getLocalConjuncts());
> This still has not addressed the concern that getConjuncts() which
 > is a generic method for collecting conjuncts and the behavior
 > should not change based on a hint. Probably best to add a new
 > function that returns only the local conjuncts and call that with
 > the appropriate hint checking logic inline in the caller so the
 > context and intent is all clear.

I tried modifying the code, but I'm not sure whether this modification is what you described above.



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 29
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 06 Apr 2023 07:49:00 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 14:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12514/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 14
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 02 Mar 2023 13:18:57 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 7:

Here is a code modification that supports the above general version of the grammar for handling predicates and selectivity hints.

Since we put the hint in the class Predicate, it looks like we can get the hint for any simple or complex predicate at the right place.

[15:00:23 qchen@qifan-10229: Impala.07202021] git diff
diff --git a/fe/src/main/cup/sql-parser.cup b/fe/src/main/cup/sql-parser.cup
index bbd1e53d0..bef0b3c1d 100644
--- a/fe/src/main/cup/sql-parser.cup
+++ b/fe/src/main/cup/sql-parser.cup
@@ -307,7 +307,7 @@ terminal
   KW_RANGE, KW_RCFILE, KW_RECOVER, KW_REFERENCES, KW_REFRESH, KW_REGEXP, KW_RELY,
   KW_RENAME, KW_REPEATABLE, KW_REPLACE, KW_REPLICATION, KW_RESTRICT, KW_RETURNS,
   KW_REVOKE, KW_RIGHT, KW_RLIKE, KW_ROLE, KW_ROLES, KW_ROLLUP, KW_ROW, KW_ROWS, KW_SCHEMA,
-  KW_SCHEMAS, KW_SELECT, KW_SEMI, KW_SEQUENCEFILE, KW_SERDEPROPERTIES, KW_SERIALIZE_FN,
+  KW_SCHEMAS, KW_SELECT,  KW_SELECTIVITY, KW_SEMI, KW_SEQUENCEFILE, KW_SERDEPROPERTIES, KW_SERIALIZE_FN,
   KW_SET, KW_SHOW, KW_SMALLINT, KW_SETS, KW_SORT, KW_SPEC, KW_STORED, KW_STRAIGHT_JOIN,
   KW_STRING, KW_STRUCT, KW_SYMBOL, KW_SYSTEM_TIME, KW_SYSTEM_VERSION,
   KW_TABLE, KW_TABLES, KW_TABLESAMPLE, KW_TBLPROPERTIES,
@@ -431,7 +431,7 @@ nonterminal TimeTravelSpec opt_asof;
 nonterminal Subquery subquery;
 nonterminal JoinOperator join_operator;
 nonterminal opt_inner, opt_outer;
-nonterminal PlanHint plan_hint;
+nonterminal PlanHint plan_hint, selectivity_hint;
 nonterminal List<PlanHint> plan_hints, opt_plan_hints, plan_hint_list;
 nonterminal TypeDef type_def;
 nonterminal Type type;
@@ -629,6 +629,8 @@ precedence left KW_INTO;
 
 precedence left KW_OVER;
 
+precedence left COMMENTED_PLAN_HINT_START;
+
 start with stmt;
 
 stmt ::=
@@ -3720,6 +3722,15 @@ function_params ::=
   {: RESULT = new FunctionParams(false, true, exprs); :}
   ;
 
+selectivity_hint ::=
+  COMMENTED_PLAN_HINT_START KW_SELECTIVITY LPAREN DECIMAL_LITERAL:value RPAREN
+      COMMENTED_PLAN_HINT_END
+  {:
+    RESULT = new PlanHint("selectivity",
+      new ArrayList(Arrays.asList(value.toString())));
+  :}
+  ;
+
 predicate ::=
   expr:e KW_IS KW_NULL
   {: RESULT = new IsNullPredicate(e, false); :}
@@ -3746,6 +3757,8 @@ predicate ::=
   :}
   | expr:e1 KW_LOGICAL_OR expr:e2
   {: RESULT = new CompoundVerticalBarExpr(e1, e2); :}
+  | predicate:p selectivity_hint:h
+  {: RESULT = p; :}
   ;
 
 comparison_predicate ::=
diff --git a/fe/src/main/jflex/sql-scanner.flex b/fe/src/main/jflex/sql-scanner.flex
index e60676a9c..bc3a91513 100644
--- a/fe/src/main/jflex/sql-scanner.flex
+++ b/fe/src/main/jflex/sql-scanner.flex
@@ -236,6 +236,7 @@ import org.apache.impala.thrift.TReservedWordsVersion;
     keywordMap.put("schemas", SqlParserSymbols.KW_SCHEMAS);
     keywordMap.put("select", SqlParserSymbols.KW_SELECT);
     keywordMap.put("semi", SqlParserSymbols.KW_SEMI);
+    keywordMap.put("selectivity", SqlParserSymbols.KW_SELECTIVITY);
     keywordMap.put("sequencefile", SqlParserSymbols.KW_SEQUENCEFILE);
     keywordMap.put("serdeproperties", SqlParserSymbols.KW_SERDEPROPERTIES);
     keywordMap.put("serialize_fn", SqlParserSymbols.KW_SERIALIZE_FN);


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 7
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Fri, 17 Dec 2021 20:03:14 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 9:

> (7 comments)
 > 
 > This patch looks pretty good for me! I hope we can reduce the test
 > files by not showing the verbose plan. It's a little large for me
 > to go through all the test cases.

Hi Quanlong, thanks for the review. I split this patch into two parts. This patch will focus on the selectivity hint, and I will move the cardinality hint into a separate change. Please refer to: https://gerrit.cloudera.org/#/c/18829.
I will adjust this patch later.


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 9
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 10 Aug 2022 08:03:38 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 8:

"Besides, we can not modify like this directly yet:
| predicate:p selectivity_hint:h
{:
  if (hint != null) {
    p.setSelectivityHint(Double.valueOf(hint.getArgs().get(0)));
  }
  RESULT = p;
:}
This is because 'predicate' is an 'Expr' rather than a 'Predicate', and the 'setSelectivityHint' method belongs to 'Predicate'. Here is the nonterminal declaration:
nonterminal Expr predicate, bool_test_expr;"

I think we should verify p to be a Predicate first.

if (hint != null) {
  if (p instanceof Predicate) {
    ((Predicate) p).setSelectivityHint(Double.valueOf(hint.getArgs().get(0)));
  } else {
    // throw an error
  }
}
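
The check suggested above can be sketched as standalone Java. ExprSketch, PredicateSketch and HintApplier are hypothetical names used for illustration only, and the choice of IllegalArgumentException is an assumption; the actual parser action may report the error differently.

```java
// Hypothetical stand-ins for Expr and Predicate; not Impala's real classes.
class ExprSketch {}

class PredicateSketch extends ExprSketch {
  double selectivityHint = -1; // -1 means "no hint"

  void setSelectivityHint(double value) { selectivityHint = value; }
}

class HintApplier {
  // Applies the hint argument to 'p' if 'p' is a predicate.
  // A null hintArg means the query carried no hint for this expression.
  static ExprSketch apply(ExprSketch p, String hintArg) {
    if (hintArg == null) return p;
    if (!(p instanceof PredicateSketch)) {
      throw new IllegalArgumentException(
          "Selectivity hint is only allowed on predicates");
    }
    ((PredicateSketch) p).setSelectivityHint(Double.parseDouble(hintArg));
    return p;
  }
}
```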


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 8
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Tue, 22 Mar 2022 13:01:01 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 14: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 14
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Mon, 06 Mar 2023 13:37:09 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 28: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 28
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Mon, 03 Apr 2023 13:00:06 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 28:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12734/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 28
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Mon, 03 Apr 2023 07:59:14 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 17:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12597/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 17
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 09 Mar 2023 12:38:45 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 16:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12596/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 16
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 09 Mar 2023 09:57:24 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 21:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/main/java/org/apache/impala/analysis/Predicate.java
File fe/src/main/java/org/apache/impala/analysis/Predicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/main/java/org/apache/impala/analysis/Predicate.java@69
PS19, Line 69: 
> I think selectivity hint is valid for Predicate, instead of Expr, so I impl
I see. This makes sense to me. Thanks for the explanation!


http://gerrit.cloudera.org:8080/#/c/18023/21/fe/src/main/java/org/apache/impala/analysis/Predicate.java
File fe/src/main/java/org/apache/impala/analysis/Predicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/21/fe/src/main/java/org/apache/impala/analysis/Predicate.java@171
PS21, Line 171:     this.selectivityHint_ = selectivityHint;
Should we update hasValidSelectivityHint_ as well? Or is it enough to only set hasValidSelectivityHint_ in constructors and analyzeSelectivityHint()?


http://gerrit.cloudera.org:8080/#/c/18023/21/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
File fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java:

http://gerrit.cloudera.org:8080/#/c/18023/21/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java@68
PS21, Line 68:     result.setSelectivityHint(bp.getSelectivity());
Could you add a comment explaining why we always set the selectivity hint regardless of whether the selectivity of 'bp' comes from hints? Maybe this is always better than what the planner will estimate?


http://gerrit.cloudera.org:8080/#/c/18023/21/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
File testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test:

http://gerrit.cloudera.org:8080/#/c/18023/21/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@130
PS21, Line 130: where (l_shipdate BETWEEN '1998-09-01' AND '1998-09-03')/* +SELECTIVITY(0.5)*/
missing the SELECT part here, which causes the test failure



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 21
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 23 Mar 2023 00:49:20 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 4:

(30 comments)

Hi guys, thanks for your careful review. I've modified the code as much as possible. If I missed anything or you have any suggestions, please tell me. I look forward to your reply.

http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@9
PS3, Line 9: use
> nit uses
Done


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@10
PS3, Line 10: orse query plan
> nit: this may lead
Done


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@10
PS3, Line 10: , and this may lea
> nit. remove?
Done


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@11
PS3, Line 11: be in the future,
            : we can use histograms to get more precise qu
> nit. to reduce such errors.
Done


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@20
PS3, Line 20: 
> nit remove
Done


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@21
PS3, Line 21: 
> nit is valid
Done


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@22
PS3, Line 22: For 'SELECTIVITY' hint, we can use in these predicates:
> I think we should raise a warning when the hint is not used.
Done


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@24
PS3, Line 24: * InPredicate
> nit. types of
Sorry, I don't understand, can you explain this?


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@24
PS3, Line 24: 
> in non-compounding form.
Sorry, I don't understand, can you explain this?


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@34
PS3, Line 34: 
> nit: formats
Done


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@45
PS3, Line 45: ic
> isn't second 0.1?
I mean that predicates without a computed selectivity default to -1. For example, in 'a>1 and b>2', both predicates have a selectivity of -1. When computing the combined selectivity in PlanNode.computeCombinedSelectivity(), Impala uses a single 0.1 for all predicates whose selectivity is -1, which means no selectivity.
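
A simplified sketch of the behaviour described above, assuming only what this comment states: predicates with selectivity -1 share a single default of 0.1. The class CombinedSelectivitySketch is hypothetical, and combining the remaining selectivities with a plain product is an assumption made for brevity; the real PlanNode.computeCombinedSelectivity() may combine them differently.

```java
import java.util.List;

// Hypothetical helper; not Impala's PlanNode.
class CombinedSelectivitySketch {
  // Single default used for all conjuncts without a selectivity (from the comment above).
  static final double DEFAULT_SELECTIVITY = 0.1;

  // Multiplies the known selectivities and applies the default once
  // if at least one conjunct has no selectivity (encoded as a negative value).
  static double combine(List<Double> selectivities) {
    double result = 1.0;
    boolean sawUnknown = false;
    for (double s : selectivities) {
      if (s < 0) {
        sawUnknown = true;
      } else {
        result *= s;
      }
    }
    if (sawUnknown) result *= DEFAULT_SELECTIVITY;
    return result;
  }
}
```

Under these assumptions, two conjuncts that both have selectivity -1 contribute the default only once rather than once per conjunct.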


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@48
PS3, Line 48: Another thi
> nit: need to ensure
Done


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@48
PS3, Line 48: mpala will 
> nit: has been set / is set
Done


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@49
PS3, Line 49: ' with two 'BinaryPredicate', if
            : set hint 
> I wonder why we can not assign a selectivity to a complex predicate in this
Thanks for the advice, Qifan. I also considered this, but there are some problems.
1. When we set a selectivity for a compound predicate, such as (a>1 and b>2) /* +selectivity(0.5)*/, Impala executes PlanNode.computeCombinedSelectivity() with 'conjuncts_' being the list [(a>1),(b>2)], not [(a>1 and b>2)]. The two children both have selectivity -1, so we lose the original selectivity 0.5. Modifying computeCombinedSelectivity() and the related code would be complex.
2. Using a selectivity hint for a predicate like (xxx) causes 'shift/reduce conflicts', because both the predicate and the hint contain brackets. Maybe we need to modify the syntax; I'm not sure.
3. In my opinion, we usually compute selectivity for a single predicate like a=1, b IS NULL and so on. Setting a selectivity for a compound predicate, especially for multiple nested predicates, may be confusing.
But if you insist on this, we can create a sub-task after this feature and discuss it there. What do you think?


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@49
PS3, Line 49: ' with two 'BinaryPredicate', if
            : set hint 
> nit: might not take the expected effect
Done


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@51
PS3, Line 51: value for two 'BinaryPredicate' children.
            : 
            : Testing:
            : - Added new fe tests in 'PlannerTest'
> I wonder if this is correct. 
I understand, this computation may not be perfect. But consider setting a selectivity for a compound predicate: in 'BetweenToCompoundRule', Impala transforms a 'Between' into a 'Compound'.
E.g. 'a Between 1 and 2' becomes 'a>=1 and a<=2'. If we set a selectivity for the whole 'a>=1 and a<=2', then when 'computeCombinedSelectivity' executes, the conjuncts are 'a>=1' and 'a<=2'. Both children have selectivity -1, and we lose the original selectivity as mentioned above.


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/cup/sql-parser.cup@3260
PS3, Line 3260: computed 
> nit: computed
Done


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/cup/sql-parser.cup@3829
PS3, Line 3829: 
> nit: including
Done


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/cup/sql-parser.cup@3831
PS3, Line 3831: // Not like predic
> Could you please add a test for not like predicate predicate-selectivity-hi
Done


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@248
PS3, Line 248: selectivity_ > 0
> nit. This prevents a 0 selectivity. Maybe use -2 to indicate non-computable
I disallow 0 for this value, since a predicate selectivity hint of 0 makes no sense.
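
A minimal sketch of such a range check, assuming the range (0, 1] given in the merged commit message of this change. The class SelectivityHintValidator is a hypothetical name and not part of Impala.

```java
// Hypothetical helper; not part of Impala.
class SelectivityHintValidator {
  // A hint is accepted only if it lies in (0, 1]:
  // 0 is rejected, 1 is accepted, and NaN fails both comparisons.
  static boolean isValid(double hint) {
    return hint > 0.0 && hint <= 1.0;
  }
}
```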


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/InPredicate.java
File fe/src/main/java/org/apache/impala/analysis/InPredicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/InPredicate.java@177
PS3, Line 177: elect
> same comment as for BinaryPredicate.java
Done


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
File fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java@143
PS3, Line 143: electivity
> same comment as for BinaryPredicate.java
Done


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/Predicate.java
File fe/src/main/java/org/apache/impala/analysis/Predicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/Predicate.java@77
PS3, Line 77:         analyzer.addWarning("Selectivity hint is allowed for single column predicate: " +
> nit: maybe add more information about this predicate in the warning.
Done


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/TableRef.java
File fe/src/main/java/org/apache/impala/analysis/TableRef.java:

http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/TableRef.java@175
PS3, Line 175: tableNumRowsHint
> I wonder we can use a more generic name numRowsHint_ as the hint can be app
Done


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/TableRef.java@543
PS3, Line 543: st<String> args = hint.getArgs();
> nit. This comment probably should be moved to line 547.
Done


http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1556
PS1, Line 1556:     }
> I don't have a strong opinion here, just want to mention that table stats c
I see, you are right. Table stats can be stale, so I modified the code. Now the table rows hint replaces the original stats.


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@331
PS3, Line 331:   * 'partitions' being the partition
> I think we probably should move it to ScanNode with the name numRowsHints_.
Done


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5008
PS3, Line 5008: AnalyzesOk
> Why is it not an analysis error?
This follows the other hints in AnalyzeStmtsTest.TestTableHints(). I think it makes sense: illegal hints should not cause an exception, and a warning is probably enough. What do you think?


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5050
PS3, Line 5050: (0)
> Don't we allow zero as selectivity hint? E.g. at L5060 we say 'allowed valu
Right, we should not allow it. A zero value makes no sense for a query. I will modify that comment.


http://gerrit.cloudera.org:8080/#/c/18023/3/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
File testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test:

http://gerrit.cloudera.org:8080/#/c/18023/3/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@643
PS3, Line 643: 4.28M
> Above we have 2.42M for 0.5 selectivity.
Yes, you are right: in the cases above we have 2.42M for 0.5 selectivity. But for 'BetweenPredicate' we cannot compute this cardinality directly. In 'BetweenToCompoundRule' we need to split 0.5 across the two children; please refer to the comments in 'BetweenToCompoundRule'. This split causes the deviation.
For example, if the hint is 0.04 for 'Between', each child gets 0.2.
I know this computation may not be perfect, but we must set a selectivity on each child. When Impala executes 'computeCombinedSelectivity', the conjuncts are the two children, so we cannot set a selectivity on the whole compound predicate.
Do you have any suggestions about this? I look forward to your reply.
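
The 0.04 to 0.2 example in this reply is consistent with a square-root split, where each of the two generated children receives sqrt(hint) so that their product equals the original hint. That reading is an assumption; the sketch below only illustrates the arithmetic, uses hypothetical names, and does not model how the planner later combines the two children (which is where the reply says the deviation comes from).

```java
// Hypothetical helper for illustration; not the code in BetweenToCompoundRule.
final class BetweenHintSplit {
  // Splits a hint for 'a BETWEEN x AND y' across the generated children
  // 'a >= x' and 'a <= y' so that child * child equals the original hint.
  static double childSelectivity(double betweenHint) {
    if (!(betweenHint > 0.0 && betweenHint <= 1.0)) {
      throw new IllegalArgumentException("hint must be in (0, 1]: " + betweenHint);
    }
    return Math.sqrt(betweenHint);
  }

  public static void main(String[] args) {
    double child = childSelectivity(0.04);
    System.out.println("child selectivity: " + child);
    System.out.println("product of both children: " + child * child);
  }
}
```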



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 4
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Sat, 11 Dec 2021 09:03:57 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 5:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9912/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 5
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Sun, 12 Dec 2021 10:11:03 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 8:

(1 comment)

> Code modification to facilitate above general version of grammar to
 > handle predicates and selectivity hints.
 > 
 > Since we put the hint at the class Predicate, it looks like we can
 > get the hint for any simple or complex predicate at the right
 > place.
 > 
 > [15:00:23 qchen@qifan-10229: Impala.07202021] git diff
 > diff --git a/fe/src/main/cup/sql-parser.cup b/fe/src/main/cup/sql-parser.cup
 > index bbd1e53d0..bef0b3c1d 100644
 > --- a/fe/src/main/cup/sql-parser.cup
 > +++ b/fe/src/main/cup/sql-parser.cup
 > @@ -307,7 +307,7 @@ terminal
 > KW_RANGE, KW_RCFILE, KW_RECOVER, KW_REFERENCES, KW_REFRESH,
 > KW_REGEXP, KW_RELY,
 > KW_RENAME, KW_REPEATABLE, KW_REPLACE, KW_REPLICATION, KW_RESTRICT,
 > KW_RETURNS,
 > KW_REVOKE, KW_RIGHT, KW_RLIKE, KW_ROLE, KW_ROLES, KW_ROLLUP,
 > KW_ROW, KW_ROWS, KW_SCHEMA,
 > -  KW_SCHEMAS, KW_SELECT, KW_SEMI, KW_SEQUENCEFILE,
 > KW_SERDEPROPERTIES, KW_SERIALIZE_FN,
 > +  KW_SCHEMAS, KW_SELECT,  KW_SELECTIVITY, KW_SEMI,
 > KW_SEQUENCEFILE, KW_SERDEPROPERTIES, KW_SERIALIZE_FN,
 > KW_SET, KW_SHOW, KW_SMALLINT, KW_SETS, KW_SORT, KW_SPEC, KW_STORED,
 > KW_STRAIGHT_JOIN,
 > KW_STRING, KW_STRUCT, KW_SYMBOL, KW_SYSTEM_TIME, KW_SYSTEM_VERSION,
 > KW_TABLE, KW_TABLES, KW_TABLESAMPLE, KW_TBLPROPERTIES,
 > @@ -431,7 +431,7 @@ nonterminal TimeTravelSpec opt_asof;
 > nonterminal Subquery subquery;
 > nonterminal JoinOperator join_operator;
 > nonterminal opt_inner, opt_outer;
 > -nonterminal PlanHint plan_hint;
 > +nonterminal PlanHint plan_hint, selectivity_hint;
 > nonterminal List<PlanHint> plan_hints, opt_plan_hints,
 > plan_hint_list;
 > nonterminal TypeDef type_def;
 > nonterminal Type type;
 > @@ -629,6 +629,8 @@ precedence left KW_INTO;
 > 
 > precedence left KW_OVER;
 > 
 > +precedence left COMMENTED_PLAN_HINT_START;
 > +
 > start with stmt;
 > 
 > stmt ::=
 > @@ -3720,6 +3722,15 @@ function_params ::=
 > {: RESULT = new FunctionParams(false, true, exprs); :}
 > ;
 > 
 > +selectivity_hint ::=
 > +  COMMENTED_PLAN_HINT_START KW_SELECTIVITY LPAREN
 > DECIMAL_LITERAL:value RPAREN
 > +      COMMENTED_PLAN_HINT_END
 > +  {:
 > +    RESULT = new PlanHint("selectivity",
 > +      new ArrayList(Arrays.asList(value.toString())));
 > +  :}
 > +  ;
 > +
 > predicate ::=
 > expr:e KW_IS KW_NULL
 > {: RESULT = new IsNullPredicate(e, false); :}
 > @@ -3746,6 +3757,8 @@ predicate ::=
 > :}
 > | expr:e1 KW_LOGICAL_OR expr:e2
 > {: RESULT = new CompoundVerticalBarExpr(e1, e2); :}
 > +  | predicate:p selectivity_hint:h
 > +  {: RESULT = p; :}
 > ;
 > 
 > comparison_predicate ::=
 > diff --git a/fe/src/main/jflex/sql-scanner.flex b/fe/src/main/jflex/sql-scanner.flex
 > index e60676a9c..bc3a91513 100644
 > --- a/fe/src/main/jflex/sql-scanner.flex
 > +++ b/fe/src/main/jflex/sql-scanner.flex
 > @@ -236,6 +236,7 @@ import org.apache.impala.thrift.TReservedWordsVersion;
 > keywordMap.put("schemas", SqlParserSymbols.KW_SCHEMAS);
 > keywordMap.put("select", SqlParserSymbols.KW_SELECT);
 > keywordMap.put("semi", SqlParserSymbols.KW_SEMI);
 > +    keywordMap.put("selectivity", SqlParserSymbols.KW_SELECTIVITY);
 > keywordMap.put("sequencefile", SqlParserSymbols.KW_SEQUENCEFILE);
 > keywordMap.put("serdeproperties", SqlParserSymbols.KW_SERDEPROPERTIES);
 > keywordMap.put("serialize_fn", SqlParserSymbols.KW_SERIALIZE_FN);

Sorry for the delay, and thanks for testing so carefully! I found some problems with your modification above. You didn't assign the selectivity value to the 'Predicate' in sql-parser.cup, like this:
if (hint != null) {
  p.setSelectivityHint(Double.valueOf(hint.getArgs().get(0)));
}
So even though you can execute 'explain select * from functional.alltypestiny  where int_col = 1 /* +SELECTIVITY(0.5) */' successfully, '0.5' is not passed to 'BinaryPredicate', which means the hint has no effect. Whether you set it to 0.1, 0.4, or 0.8, the cardinality is always 4. I have already tried your code in my environment.
Besides, we cannot simply modify the rule like this either:
| predicate:p selectivity_hint:h
{:
  if (h != null) {
    p.setSelectivityHint(Double.valueOf(h.getArgs().get(0)));
  }
  RESULT = p;
:}
This is because 'predicate' is declared as an 'Expr' rather than a 'Predicate', and the 'setSelectivityHint' method belongs to 'Predicate'. Here is the declaration:
nonterminal Expr predicate, bool_test_expr;
That is why I define a 'predicate_with_hint' in this patch.
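
The typing problem described here can be sketched in plain Java: a value declared with the base type cannot call a method that only the subclass defines, so the hint can be attached only after a runtime type check. The classes below are hypothetical stand-ins, not Impala's Expr and Predicate, and the helper name is invented for illustration.

```java
// Hypothetical stand-ins for illustration; not Impala's classes.
class SketchExpr {}

class SketchPredicate extends SketchExpr {
  private double selectivityHint = -1.0;  // -1 means "no hint"

  void setSelectivityHint(double value) { selectivityHint = value; }
  double getSelectivityHint() { return selectivityHint; }
}

final class HintAttachment {
  // Attaches the hint only when the parsed node really is a predicate.
  // A node declared as SketchExpr has no setSelectivityHint method, so
  // calling it without the check and cast would not compile.
  static SketchExpr attach(SketchExpr node, String hintArg) {
    if (hintArg != null && node instanceof SketchPredicate) {
      ((SketchPredicate) node).setSelectivityHint(Double.parseDouble(hintArg));
    }
    return node;
  }

  public static void main(String[] args) {
    SketchExpr node = attach(new SketchPredicate(), "0.5");
    System.out.println(((SketchPredicate) node).getSelectivityHint());
  }
}
```

A dedicated grammar nonterminal typed as the subclass, as the reply describes, avoids the cast; the runtime check above is just one alternative for the sake of illustration.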

http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@49
PS3, Line 49: ' with two 'BinaryPredicate', if
            : set hint 
> "If we only handle a subset of complex expressions, this will confused user
Hi Qifan,
"Not sure the argument holds :-) as currently the patch allows selectivity hints to be applied to a subset of predicates anyway."
Here is my opinion: this patch indeed 'allows selectivity hints to be applied to a subset of predicates anyway', but if we use an invalid hint, the query will throw an exception or print a warning message. If we set a hint on 'a=1 and b=2', the original hint will be lost when the predicate is split into conjuncts, without any notice to the user. This is what I'm worried about.

"If there is a hint for the whole predicate, can we wrap the whole thing in CompoundPredicate()".
Do you mean rewriting the exprs according to a compound predicate hint? I'm not sure this is suitable, and it may be too complex for this patch.

Besides, I'm not sure whether we can implement '(xxx) /* SELECTIVITY(0.1) */' in sql-parser. To be honest, I haven't looked deeply into Java CUP...
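
The concern about a hint on 'a=1 and b=2' being silently lost can be sketched as follows. The node type, the flattening routine, and the warning text are all hypothetical and only mimic the behavior described in this thread: an 'AND' node is replaced by its children when conjuncts are collected, so a hint stored on the 'AND' node never reaches them. The sketch records a warning in that case instead of dropping the hint silently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for illustration; not Impala's Expr#getConjuncts.
final class ConjunctFlattening {
  static final class Node {
    final String text;
    final Node left;   // non-null only for 'AND' nodes in this sketch
    final Node right;  // non-null only for 'AND' nodes in this sketch
    double hint = -1.0;  // -1 means "no hint"

    Node(String text) { this(text, null, null); }

    Node(String text, Node left, Node right) {
      this.text = text;
      this.left = left;
      this.right = right;
    }

    boolean isAnd() { return left != null && right != null; }
  }

  // Collects the leaf conjuncts under 'n'. A hint on an 'AND' node cannot be
  // carried over to its children, so a warning is recorded for it.
  static void collect(Node n, List<Node> out, List<String> warnings) {
    if (n.isAnd()) {
      if (n.hint > 0) {
        warnings.add("Selectivity hint ignored for AND predicate: " + n.text);
      }
      collect(n.left, out, warnings);
      collect(n.right, out, warnings);
    } else {
      out.add(n);
    }
  }

  public static void main(String[] args) {
    Node and = new Node("a=1 AND b=2", new Node("a=1"), new Node("b=2"));
    and.hint = 0.5;
    List<Node> conjuncts = new ArrayList<>();
    List<String> warnings = new ArrayList<>();
    collect(and, conjuncts, warnings);
    for (Node c : conjuncts) System.out.println(c.text + " hint=" + c.hint);
    for (String w : warnings) System.out.println(w);
  }
}
```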



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 8
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Sat, 19 Mar 2022 07:30:10 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 27:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9188/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 27
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Fri, 31 Mar 2023 07:55:38 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 27: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 27
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Fri, 31 Mar 2023 13:02:50 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 28: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 28
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Tue, 04 Apr 2023 14:10:44 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has uploaded a new patch set (#30). ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................

IMPALA-7942 (part 2): Add query hints for predicate selectivities

Currently, Impala only uses simple estimation to compute selectivity.
For some predicates, this may lead to worse query plan due to CBO.

This patch adds a new query hint: 'SELECTIVITY', we can use this
hint to specify a selectivity value for a predicate.

The parser will interpret expressions wrapped in () followed by a
C-style comment /* <predicate hint> */ as a predicate hint. The
predicate hint currently can be in the form of +SELECTIVITY(f) where
'f' is a positive floating point number to use as the selectivity for
the preceding expression.

Single predicate example:

  select col from t where (a=1) /* +SELECTIVITY(0.5) */;

Compound predicate example:

  select col from t where (a=1 or b=2) /* +SELECTIVITY(0.5) */;

But pay attention, selectivity hint is invalid for 'AND' compound
predicates and between predicates in this patch. We may support
this in the near future.

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
10 files changed, 418 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/30
-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 30
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 29:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18023/29/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/18023/29/fe/src/main/java/org/apache/impala/analysis/Expr.java@1067
PS29, Line 1067: // Selectivity hint cannot be set. If Predicate been set selectivity hint, we return
               :       // itself directly, such as CompoundPredicate. Otherwise, hint value will missing.
This comment is not accurate and should be removed. The original comment from the Base version should be restored here.

It looks to me like the original form of getConjuncts() should work fine even when selectivity hints exist in the tree anchored at <this>. This is because both child(0) and child(1) should be a <Predicate> when <this> is a CompoundPredicate. Note that a Predicate contains selectivityHint_ as a new data member.

If the above is true, then we do not need getLocalConjuncts() at all.

IMHO, when selectivity hints are supplied, we should let them flow to the rest of the compilation phases as a single piece of data that serves as the source of truth. Allowing two or more representations can introduce unnecessary complexity downstream.



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 29
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Tue, 11 Apr 2023 13:46:31 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 33:

(7 comments)

Looks very good!

http://gerrit.cloudera.org:8080/#/c/18023/33//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18023/33//COMMIT_MSG@12
PS33, Line 12: , we can use this
             : hint
nit. remove and replace with "to help specify"


http://gerrit.cloudera.org:8080/#/c/18023/33//COMMIT_MSG@18
PS33, Line 18:  
, in the range of [0, 1],


http://gerrit.cloudera.org:8080/#/c/18023/33//COMMIT_MSG@29
PS33, Line 29: But pay attention, selectivity hint is invalid for 'AND' compound
             : predicates and between predicates in this patch
As a limitation of this patch, the selectivity hints for 'AND' compound predicates, either in the original SQL query or internally generated, are ignored.


http://gerrit.cloudera.org:8080/#/c/18023/33//COMMIT_MSG@31
PS33, Line 31: this in the near future.
Consider filing a new JIRA and mentioning its number here.


http://gerrit.cloudera.org:8080/#/c/18023/33/fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
File fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/33/fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java@153
PS33, Line 153:         // 'AND' compound predicates will replaced by children in Expr#getConjuncts, so
nit: add "be" ("will be replaced")


http://gerrit.cloudera.org:8080/#/c/18023/33/fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java@154
PS33, Line 154:         // selectivity hint will missing, we add a warning here.
nit: add "be" ("will be missing")


http://gerrit.cloudera.org:8080/#/c/18023/33/fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java@155
PS33, Line 155:         analyzer.addWarning("Selectivity hint is invalid for 'AND' compound predicates.");
> Currently, this warning will not appear for BETWEEN predicate, since I remo
There is one case in CaseExpr.java where the translation to AND predicate takes place. 

I think we can change the warning message to something like below to cover all such cases. 

"Selectivity hints are ignored for 'AND' compound predicates, either in the SQL query or internally generated."



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 33
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Tue, 18 Apr 2023 22:54:33 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 15:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9111/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 15
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Mon, 06 Mar 2023 16:08:42 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 21: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/9169/


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 21
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 22 Mar 2023 20:46:05 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 23:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9173/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 23
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 23 Mar 2023 15:48:25 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 20:

(25 comments)

Thanks for the review, Quanlong. I have modified the related code according to your comments.

http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/main/cup/sql-parser.cup@3407
PS19, Line 3407: // It's used to replace the selectivity estimated by the planner.
               : selectivity_hint ::=
> nit: I think we don't need to mention Impala since these are already Impala
Done


http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/main/java/org/apache/impala/analysis/Predicate.java
File fe/src/main/java/org/apache/impala/analysis/Predicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/main/java/org/apache/impala/analysis/Predicate.java@37
PS19, Line 37: s no sense for a query.
> nit: "0 makes no sense for a query"
Done


http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/main/java/org/apache/impala/analysis/Predicate.java@39
PS19, Line 39: true if the selectivity hint is set to a valid value.
> nit: "true if the selectivity hint is set to a valid value"
Done


http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/main/java/org/apache/impala/analysis/Predicate.java@57
PS19, Line 57:     hasValidSelectivityHint_ = other.hasValidSelectivityHint_;
> We should copy hasValidSelectivityHint_ as well.
Done


http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/main/java/org/apache/impala/analysis/Predicate.java@69
PS19, Line 69: 
> nit: Any reason we put this here instead of put it inside analyzeHints() ?
I think the selectivity hint is valid for Predicate rather than Expr, so I implemented 'analyzeSelectivityHint()' in Predicate instead of Expr. But 'analyzeHints()' is in Expr, so I didn't put it inside analyzeHints(). Otherwise, we would need to move 'analyzeSelectivityHint()' to Expr.


http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/main/java/org/apache/impala/analysis/Predicate.java@79
PS19, Line 79: 
> nit: "larger"
Done


http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/main/java/org/apache/impala/analysis/Predicate.java@177
PS19, Line 177:   public boolean selectivityValidHintSet() {
> Can we return hasValidSelectivityHint_ directly? Also the method name can c
Done

Yes, of course. I forgot to modify this method in the previous patch.


http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5163
PS19, Line 5163:         "Syntax error in line 1");
> Let's also test some expressions like "1/3" and very long decimal values li
Done


http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@1380
PS19, Line 1380: Test SELECTIVITY hints
> nit: "new" will be stale. Might reword to "Test SELECTIVITY hints"
Done


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
File testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test:

http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@1
PS19, Line 1: # Table 'tpch.lineitem' has 6001215 rows, so the scan on it has cardinality as 6.00M.
> nit: "Table 'tpch.lineitem' has 6001215 rows, so the scan on it has cardina
Done


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@2
PS19, Line 2: If
> nit: start a new sentence with "If the selectivity of the predicate is 0.1"
Done


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@4
PS19, Line 4: Planner assigns the default selectivity (0.1) to this predicate
> nit: "Planner assigns the default selectivity (0.1) to this predicate"
Done
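
The arithmetic behind these test comments can be sketched as follows: the estimated output cardinality of a scan is the table's row count multiplied by the predicate's selectivity. This is only an illustration; `CardinalitySketch` and `estimate` are hypothetical names, and the rounding shown is an assumption rather than the planner's actual behavior.

```java
// Hypothetical sketch of "row count x selectivity"; not the Impala planner code.
final class CardinalitySketch {
  // Multiplies the row count by the selectivity and rounds to the nearest
  // whole row (the rounding mode is an assumption of this sketch).
  static long estimate(long numRows, double selectivity) {
    if (numRows < 0 || selectivity < 0 || selectivity > 1) {
      throw new IllegalArgumentException("invalid row count or selectivity");
    }
    return Math.round(numRows * selectivity);
  }

  public static void main(String[] args) {
    long rows = 6001215L;  // tpch.lineitem row count quoted in the test file
    System.out.println("default 0.1: " + estimate(rows, 0.1));
    System.out.println("hinted 0.5:  " + estimate(rows, 0.5));
  }
}
```

With the default selectivity of 0.1 the estimate is roughly a tenth of the 6.00M rows; a hint of 0.5 would raise it to roughly half.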


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@15
PS19, Line 15: 98% of the
> nit: "98% of the values"
Done


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@50
PS19, Line 50: getStats().getNumNulls() / numRows
> I'm confused with this. Shouldn't the selectivity a value lower than 1? Thi
Done

This comment was not correct. For an IS NULL predicate, the selectivity is getStats().getNumNulls() / numRows.
I have updated the comment: in this case, the selectivity is 0 / rows = 0.
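
The IS NULL estimate described in this reply can be sketched as below. It is a minimal illustration, not the Impala implementation: `IsNullSelectivitySketch` and `isNullSelectivity` are hypothetical names, and returning -1 for unusable inputs is an assumed convention.

```java
// Hypothetical sketch; names do not come from the Impala code base.
final class IsNullSelectivitySketch {
  // Returns numNulls / numRows, or -1 when the inputs cannot produce a
  // meaningful estimate (assumed convention for "unknown").
  static double isNullSelectivity(long numNulls, long numRows) {
    if (numRows <= 0 || numNulls < 0) return -1;
    // Cap at 1 in case the null count exceeds the row count.
    return Math.min(1.0, (double) numNulls / numRows);
  }

  public static void main(String[] args) {
    // A column without NULLs: 0 / rows = 0, as stated in the reply.
    System.out.println(isNullSelectivity(0, 6001215L));
  }
}
```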


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@51
PS19, Line 51: value
> nit: "values"
Done


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@62
PS19, Line 62: Assuming the predicate has 0.5 as the selectivity by using the hint
> nit: "Assuming the predicate has 0.5 as the selectivity by using the hint"
Done


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@73
PS19, Line 73: Planner will assign the default selectivi
> nit: It's the planner instead of "this predicate" that computes the selecti
Done


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@84
PS19, Line 84: # The actual selectivity of this predicate is around 11.5%. Set it by the hint manually.
> nit: "The actual selectivity of this predicate is around 11.5%. Set it by t
Done


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@95
PS19, Line 95: # Planner will assign the default selectivity (0.1) on this predicate
> nit: reword this as well
Done


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@106
PS19, Line 106: # The actual selectivity of this LIKE predicate is around 11.5% (same as the above one).
              : # So the selectivity of the correspondin
> nit: "The actual selectivity of this LIKE predicate is around 11.5% (same a
Done


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@118
PS19, Line 118: # Planner will assign the default selectivity (0.1) on this predicate
> nit: reword this as well
Done


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@129
PS19, Line 129: # Planner will assign the default selectivity (0.1) on this predicate
> nit: reword or remove this
Done


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@141
PS19, Line 141: select * from tpch.lineitem where l_shipdate <= '1998-09-02' and l_shipdate >= '1997-09-02'
> nit: reword this as well
Done


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@209
PS19, Line 209: d add selectivity
              : # hint manually, the new join becomes:
> nit: "the planner assigns the default selectivity (0.1) to it"
Done


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@212
PS19, Line 212: 
> nit: remove "and"
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 20
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 22 Mar 2023 15:35:49 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 22:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/18023/21/fe/src/main/java/org/apache/impala/analysis/Predicate.java
File fe/src/main/java/org/apache/impala/analysis/Predicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/21/fe/src/main/java/org/apache/impala/analysis/Predicate.java@171
PS21, Line 171:   public void setSelectivityHint(double selectivityHint) {
> Should we update hasValidSelectivityHint_ as well? Or is it enough to only 
Done

Yes, we need to update 'hasValidSelectivityHint_' and 'selectivityHint_' at the same time, so I now update this boolean flag
according to the input selectivityHint.
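
A setter that keeps the flag and the value in step, as described in this reply, might look like the sketch below. The class `FlaggedHintSketch` is hypothetical, and the valid range (greater than 0, at most 1) is an assumption based on the earlier remark that 0 makes no sense for a query.

```java
// Hypothetical sketch; it mirrors the review discussion only loosely.
final class FlaggedHintSketch {
  private double selectivityHint = -1;             // -1 stands for "no hint" here
  private boolean hasValidSelectivityHint = false;

  // Updates the value and the validity flag together so they cannot drift apart.
  void setSelectivityHint(double hint) {
    selectivityHint = hint;
    // Assumed valid range: greater than 0 and at most 1.
    hasValidSelectivityHint = hint > 0 && hint <= 1;
  }

  boolean hasValidSelectivityHint() { return hasValidSelectivityHint; }

  double getSelectivityHint() { return selectivityHint; }

  public static void main(String[] args) {
    FlaggedHintSketch p = new FlaggedHintSketch();
    p.setSelectivityHint(0.5);
    System.out.println(p.hasValidSelectivityHint());
  }
}
```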


http://gerrit.cloudera.org:8080/#/c/18023/21/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
File fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java:

http://gerrit.cloudera.org:8080/#/c/18023/21/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java@68
PS21, Line 68:     // Selectivity hint value of this new CompoundPredicate not been set, so inherited
> Could you add a comment for why we always set the selectivity hint regradle
Done

A newly created Predicate never has a selectivity hint set unless a hint is used in the SQL. So whether or not the 'BetweenPredicate' has a selectivity hint set in the SQL, we can use its value to replace that of the newly created predicate.


http://gerrit.cloudera.org:8080/#/c/18023/21/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
File testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test:

http://gerrit.cloudera.org:8080/#/c/18023/21/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@130
PS21, Line 130: select * from tpch.lineitem
> missing the SELECT part here, which causes the test failure
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 22
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 23 Mar 2023 15:47:35 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Kurt Deschler (Code Review)" <ge...@cloudera.org>.
Kurt Deschler has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 23:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/18023/23//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18023/23//COMMIT_MSG@11
PS23, Line 11: Hence, we add new hints to reduce such errors. Maybe in the future,
No need to mention histograms here. Instead, just say this gives a way to override estimates.


http://gerrit.cloudera.org:8080/#/c/18023/23//COMMIT_MSG@14
PS23, Line 14: another
Don't say "another" unless you want to qualify what that means.


http://gerrit.cloudera.org:8080/#/c/18023/23//COMMIT_MSG@15
PS23, Line 15: hint to original selectivity computing.
Describe the syntax additions here at a high level, i.e., the parser will interpret expressions wrapped in () followed by a C-style /* comment */ as a predicate hint. The predicate hint currently supports +SELECTIVITY(f), where 'f' is a positive floating-point number to use as the selectivity for the preceding expression.
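
To make the hint syntax concrete, the sketch below extracts 'f' from a comment of the form /* +SELECTIVITY(f) */. It uses a regular expression purely as a stand-in for the real SQL grammar; `SelectivityHintParserSketch` is a hypothetical name, and the upper bound of 1 is an assumption beyond the "positive" requirement stated in the comment.

```java
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; the actual patch extends the SQL parser instead.
final class SelectivityHintParserSketch {
  // Matches a whole C-style comment such as "/* +SELECTIVITY(0.5) */".
  private static final Pattern HINT = Pattern.compile(
      "/\\*\\s*\\+\\s*SELECTIVITY\\s*\\(\\s*([0-9]*\\.?[0-9]+)\\s*\\)\\s*\\*/",
      Pattern.CASE_INSENSITIVE);

  // Returns the hinted selectivity, or empty if the comment is not a
  // well-formed hint or the value is outside (0, 1].
  static OptionalDouble parse(String comment) {
    Matcher m = HINT.matcher(comment.trim());
    if (!m.matches()) return OptionalDouble.empty();
    double value = Double.parseDouble(m.group(1));
    return (value > 0 && value <= 1)
        ? OptionalDouble.of(value) : OptionalDouble.empty();
  }

  public static void main(String[] args) {
    System.out.println(parse("/* +SELECTIVITY(0.5) */"));
  }
}
```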


http://gerrit.cloudera.org:8080/#/c/18023/23//COMMIT_MSG@17
PS23, Line 17: Format like this:
Single predicate example:


http://gerrit.cloudera.org:8080/#/c/18023/23//COMMIT_MSG@21
PS23, Line 21: Besides, this hint is also valid for compound predicate like this:
Compound Predicate Example:


http://gerrit.cloudera.org:8080/#/c/18023/23//COMMIT_MSG@25
PS23, Line 25: But pay attention, if we want to use 'SELECTIVITY' hint for predicate,
State this first when you are describing the general syntax.


http://gerrit.cloudera.org:8080/#/c/18023/23/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/18023/23/fe/src/main/java/org/apache/impala/analysis/Expr.java@1068
PS23, Line 1068:         && !((CompoundPredicate) this).selectivityValidHintSet()) {
Skipping child conjuncts here based on a hint does not seem legal


http://gerrit.cloudera.org:8080/#/c/18023/23/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
File fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java:

http://gerrit.cloudera.org:8080/#/c/18023/23/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java@70
PS23, Line 70:     result.setSelectivityHint(bp.getSelectivity());
Probably better not to propagate the selectivity. If users want it to apply across predicates, they should be able to wrap them in a () block.



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 23
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Fri, 24 Mar 2023 19:57:45 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 23:

(16 comments)

Hi Kurt Deschler and Xiang Yang, thanks for the review. I have modified the patch as far as possible.

http://gerrit.cloudera.org:8080/#/c/18023/23//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18023/23//COMMIT_MSG@11
PS23, Line 11: Hence, we add new hints to reduce such errors. Maybe in the future,
> No need to mention histograms here. Instead just say this is giving a way t
Done


http://gerrit.cloudera.org:8080/#/c/18023/23//COMMIT_MSG@14
PS23, Line 14: another
> Don't say another unless you want to qualify that means
Done


http://gerrit.cloudera.org:8080/#/c/18023/23//COMMIT_MSG@15
PS23, Line 15: hint to original selectivity computing.
> Describe the syntax additions here at a high level. i.e. The parser will in
Done


http://gerrit.cloudera.org:8080/#/c/18023/23//COMMIT_MSG@17
PS23, Line 17: Format like this:
> Single predicate example:
Done


http://gerrit.cloudera.org:8080/#/c/18023/23//COMMIT_MSG@21
PS23, Line 21: Besides, this hint is also valid for compound predicate like this:
> Compound Predicate Example:
Done


http://gerrit.cloudera.org:8080/#/c/18023/23//COMMIT_MSG@25
PS23, Line 25: But pay attention, if we want to use 'SELECTIVITY' hint for predicate,
> State this first when you are describing the general syntax.
Done


http://gerrit.cloudera.org:8080/#/c/18023/23//COMMIT_MSG@26
PS23, Line 26: braket
> nit: brackets
Done


http://gerrit.cloudera.org:8080/#/c/18023/23/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/18023/23/fe/src/main/java/org/apache/impala/analysis/Expr.java@1068
PS23, Line 1068:         && !((CompoundPredicate) this).selectivityValidHintSet()) {
> Skipping child conjuncts here based on a hint does not seem legal
Why? This patch only adds a selectivity hint check to the original if-condition. Is that illegal?
As you can see, the gerrit-verify-dryrun task succeeded, so I think this additional check is OK.


http://gerrit.cloudera.org:8080/#/c/18023/23/fe/src/main/java/org/apache/impala/analysis/Predicate.java
File fe/src/main/java/org/apache/impala/analysis/Predicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/23/fe/src/main/java/org/apache/impala/analysis/Predicate.java@38
PS23, Line 38:   protected double selectivityHint_;
             :   // true if the selectivity hint is set to a valid value.
             :   protected boolean hasValidSelectivityHint_;
> From my point of view, it's not necessary to use the redundant variable 'ha
I think you are right. I removed 'hasValidSelectivityHint_' in the latest patch, keeping only 'selectivityHint_', and use 'hasValidSelectivityHint()' in each kind of Predicate.
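
The simplification agreed on here, deriving validity from the stored value instead of keeping a second field, can be sketched as follows. `DerivedHintSketch` is a hypothetical class, and the range check (greater than 0, at most 1) is an assumption about what counts as a valid hint.

```java
// Hypothetical sketch; not the code from the patch.
final class DerivedHintSketch {
  private double selectivityHint = -1;  // -1 stands for "no hint" here

  void setSelectivityHint(double hint) { selectivityHint = hint; }

  // Validity is computed on demand, so there is no flag to keep in sync.
  boolean hasValidSelectivityHint() {
    return selectivityHint > 0 && selectivityHint <= 1;
  }

  public static void main(String[] args) {
    DerivedHintSketch p = new DerivedHintSketch();
    System.out.println(p.hasValidSelectivityHint());
    p.setSelectivityHint(0.115);
    System.out.println(p.hasValidSelectivityHint());
  }
}
```

Because the check is a pure function of the field, copying a predicate only needs to copy one value.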


http://gerrit.cloudera.org:8080/#/c/18023/23/fe/src/main/java/org/apache/impala/analysis/Predicate.java@183
PS23, Line 183:   public boolean selectivityValidHintSet() {
> nit: rename to hasValidSelectivityHint() ?
Done


http://gerrit.cloudera.org:8080/#/c/18023/23/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
File fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java:

http://gerrit.cloudera.org:8080/#/c/18023/23/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java@70
PS23, Line 70:     result.setSelectivityHint(bp.getSelectivity());
> Probably better not to propagate the selectivity. If user want it to apply a
This rule transforms a BetweenPredicate into a CompoundPredicate, which is transparent to the user. If we don't assign the BetweenPredicate's selectivity hint to the newly created CompoundPredicate, the hint value is lost in this transformation, which means a selectivity hint set by the user on a BetweenPredicate would never take effect. I think this would be inappropriate.
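
The argument in this reply can be illustrated with a toy rewrite that turns "col BETWEEN lo AND hi" into "col >= lo AND col <= hi" and carries the hint across. Everything here (`BetweenRewriteSketch`, `Pred`, `rewriteBetween`) is hypothetical and string-based; the real rule operates on analyzed expression trees.

```java
// Hypothetical sketch of hint propagation through a BETWEEN rewrite.
final class BetweenRewriteSketch {
  static final class Pred {
    final String sql;
    final double selectivityHint;  // -1 stands for "no hint" in this sketch

    Pred(String sql, double selectivityHint) {
      this.sql = sql;
      this.selectivityHint = selectivityHint;
    }
  }

  // Builds the compound form and copies the BETWEEN predicate's hint so the
  // user's hint survives the rewrite. Passing -1 leaves the result unhinted.
  static Pred rewriteBetween(String col, String lo, String hi, double betweenHint) {
    String sql = col + " >= " + lo + " AND " + col + " <= " + hi;
    return new Pred(sql, betweenHint);
  }

  public static void main(String[] args) {
    Pred p = rewriteBetween("l_shipdate", "'1997-09-02'", "'1998-09-02'", 0.5);
    System.out.println(p.sql + "  hint=" + p.selectivityHint);
  }
}
```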


http://gerrit.cloudera.org:8080/#/c/18023/21/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/18023/21/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5144
PS21, Line 5144: >
> nit: add space on both sides of the '>'. same the followings.
Done


http://gerrit.cloudera.org:8080/#/c/18023/21/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5144
PS21, Line 5144: >
> nit: add space on both sides of the '>'. same the followings.
Done


http://gerrit.cloudera.org:8080/#/c/18023/21/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5181
PS21, Line 5181: >
> nit: same as above.
Done


http://gerrit.cloudera.org:8080/#/c/18023/21/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5181
PS21, Line 5181: >
> nit: same as above.
Done


http://gerrit.cloudera.org:8080/#/c/18023/21/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
File testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test:

http://gerrit.cloudera.org:8080/#/c/18023/21/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@161
PS21, Line 161: ====
> Does there need a compound 'OR' predicate without hint test case?
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 23
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 30 Mar 2023 08:54:16 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 18: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 18
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 15 Mar 2023 07:53:21 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 7:

Tried a slightly different SQL parser and did not see any shift-reduce or reduce-reduce conflicts.

With this change, it is possible to handle selectivity hints for predicates in general. Examples are shown below.

Query: explain select * from functional.alltypestiny  where int_col = 1 /* +SELECTIVITY(0.5) */
+------------------------------------------------------------+
| Explain String                                             |
+------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=4.01MB Threads=2 |
| Per-Host Resource Estimates: Memory=20MB                   |
| Codegen disabled by planner                                |
|                                                            |
| PLAN-ROOT SINK                                             |
| |                                                          |
| 00:SCAN HDFS [functional.alltypestiny]                     |
|    HDFS partitions=4/4 files=4 size=460B                   |
|    predicates: int_col = 1                                 |
|    row-size=89B cardinality=4                              |
+------------------------------------------------------------+
Fetched 10 row(s) in 0.09s
[14:54:50 qchen@qifan-10229: IMPALA-7942_Add_query_hints_for_cardinalities_and_selectivities_wangsheng] !vi
vi dml
[14:55:02 qchen@qifan-10229: IMPALA-7942_Add_query_hints_for_cardinalities_and_selectivities_wangsheng] sql dml
Starting Impala Shell with no authentication using Python 2.7.16
Warning: live_progress only applies to interactive shell sessions, and is being skipped for now.
Opened TCP connection to localhost:21050
Connected to localhost:21050
Server version: impalad version 4.1.0-SNAPSHOT DEBUG (build ad29ce70b3f4cfab4105be52ef664129060fa04e)
Query: explain select * from functional.alltypestiny  where 
(int_col = 1) /* +SELECTIVITY(0.5) */
+------------------------------------------------------------+
| Explain String                                             |
+------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=4.01MB Threads=2 |
| Per-Host Resource Estimates: Memory=20MB                   |
| Codegen disabled by planner                                |
|                                                            |
| PLAN-ROOT SINK                                             |
| |                                                          |
| 00:SCAN HDFS [functional.alltypestiny]                     |
|    HDFS partitions=4/4 files=4 size=460B                   |
|    predicates: (int_col = 1)                               |
|    row-size=89B cardinality=4                              |
+------------------------------------------------------------+
Fetched 10 row(s) in 0.01s
[14:55:04 qchen@qifan-10229: IMPALA-7942_Add_query_hints_for_cardinalities_and_selectivities_wangsheng] !vi
vi dml
[14:55:18 qchen@qifan-10229: IMPALA-7942_Add_query_hints_for_cardinalities_and_selectivities_wangsheng] sql dml
Starting Impala Shell with no authentication using Python 2.7.16
Warning: live_progress only applies to interactive shell sessions, and is being skipped for now.
Opened TCP connection to localhost:21050
Connected to localhost:21050
Server version: impalad version 4.1.0-SNAPSHOT DEBUG (build ad29ce70b3f4cfab4105be52ef664129060fa04e)
Query: explain select * from functional.alltypestiny  where 
(int_col = 1 or bigint_col < 10000) /* +SELECTIVITY(0.5) */
+------------------------------------------------------------+
| Explain String                                             |
+------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=4.01MB Threads=2 |
| Per-Host Resource Estimates: Memory=20MB                   |
| Codegen disabled by planner                                |
|                                                            |
| PLAN-ROOT SINK                                             |
| |                                                          |
| 00:SCAN HDFS [functional.alltypestiny]                     |
|    HDFS partitions=4/4 files=4 size=460B                   |
|    predicates: (int_col = 1 OR bigint_col < 10000)         |
|    row-size=89B cardinality=1                              |
+------------------------------------------------------------+
Fetched 10 row(s) in 0.01s
[14:55:19 qchen@qifan-10229: IMPALA-7942_Add_query_hints_for_cardinalities_and_selectivities_wangsheng] !vi
vi dml
[14:55:33 qchen@qifan-10229: IMPALA-7942_Add_query_hints_for_cardinalities_and_selectivities_wangsheng] cp dml dml1
[14:55:35 qchen@qifan-10229: IMPALA-7942_Add_query_hints_for_cardinalities_and_selectivities_wangsheng] vi dml1
[14:56:19 qchen@qifan-10229: IMPALA-7942_Add_query_hints_for_cardinalities_and_selectivities_wangsheng] sql dml1
Starting Impala Shell with no authentication using Python 2.7.16
Warning: live_progress only applies to interactive shell sessions, and is being skipped for now.
Opened TCP connection to localhost:21050
Connected to localhost:21050
Server version: impalad version 4.1.0-SNAPSHOT DEBUG (build ad29ce70b3f4cfab4105be52ef664129060fa04e)
Query: explain select * from functional.alltypestiny  where 
(int_col = 1 or bigint_col < 10000) /* +SELECTIVITY(0.5) */
or 
float_col  < 2.0 /* +SELECTIVITY(0.5) */
+-----------------------------------------------------------------------+
| Explain String                                                        |
+-----------------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=4.01MB Threads=2            |
| Per-Host Resource Estimates: Memory=20MB                              |
| Codegen disabled by planner                                           |
|                                                                       |
| PLAN-ROOT SINK                                                        |
| |                                                                     |
| 00:SCAN HDFS [functional.alltypestiny]                                |
|    HDFS partitions=4/4 files=4 size=460B                              |
|    predicates: (int_col = 1 OR bigint_col < 10000) OR float_col < 2.0 |
|    row-size=89B cardinality=1                                         |
+-----------------------------------------------------------------------+
Fetched 10 row(s) in 0.01s
[14:56:21 qchen@qifan-10229: IMPALA-7942_Add_query_hints_for_cardinalities_and_selectivities_wangsheng] !vi
vi dml1
[14:57:04 qchen@qifan-10229: IMPALA-7942_Add_query_hints_for_cardinalities_and_selectivities_wangsheng] sql dml1
Starting Impala Shell with no authentication using Python 2.7.16
Warning: live_progress only applies to interactive shell sessions, and is being skipped for now.
Opened TCP connection to localhost:21050
Connected to localhost:21050
Server version: impalad version 4.1.0-SNAPSHOT DEBUG (build ad29ce70b3f4cfab4105be52ef664129060fa04e)
Query: explain select * from functional.alltypestiny  where 
(int_col = 1 or bigint_col < 10000) /* +SELECTIVITY(0.5) */
or 
float_col  < 2.0 /* +SELECTIVITY(0.5) */
or 
(float_col  < 2.0) /* +SELECTIVITY(0.5) */ and double_col < 0.0
+---------------------------------------------------------------------------------------------------------------+
| Explain String                                                                                                |
+---------------------------------------------------------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=4.01MB Threads=2                                                    |
| Per-Host Resource Estimates: Memory=20MB                                                                      |
| Codegen disabled by planner                                                                                   |
|                                                                                                               |
| PLAN-ROOT SINK                                                                                                |
| |                                                                                                             |
| 00:SCAN HDFS [functional.alltypestiny]                                                                        |
|    HDFS partitions=4/4 files=4 size=460B                                                                      |
|    predicates: (int_col = 1 OR bigint_col < 10000) OR float_col < 2.0 OR (float_col < 2.0) AND double_col < 0 |
|    row-size=89B cardinality=1                                                                                 |
+---------------------------------------------------------------------------------------------------------------+
Fetched 10 row(s) in 0.01s


explain                                                                                       
select * from functional.alltypestiny  where                                                  
(int_col = 1 or bigint_col < 10000) /* +SELECTIVITY(0.5) */                                   
or                                                                                            
float_col  < 2.0 /* +SELECTIVITY(0.5) */                                                      
or                                                                                            
(float_col  < 2.0) /* +SELECTIVITY(0.5) */ and double_col < 0.0                               
;


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 7
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Fri, 17 Dec 2021 20:01:00 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "Amogh Margoor (Code Review)" <ge...@cloudera.org>.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/java/org/apache/impala/analysis/Predicate.java
File fe/src/main/java/org/apache/impala/analysis/Predicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/java/org/apache/impala/analysis/Predicate.java@32
PS1, Line 32:   protected double selectivityHint_;
nit: Documenting the allowed values and what 0 and 1 would mean would help here.


http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/java/org/apache/impala/analysis/TableRef.java
File fe/src/main/java/org/apache/impala/analysis/TableRef.java:

http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/java/org/apache/impala/analysis/TableRef.java@526
PS1, Line 526:           addHintWarning(hint, analyzer);
nit: Warning here should tell the correct format of specifying the HDFS_NUM_ROWS.


http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1556
PS1, Line 1556:     // Use table original stats if correct, otherwise, use 'HDFS_NUM_ROWS' hint value.
Shouldn't a user-provided hint be given higher precedence?


http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@4971
PS1, Line 4971:   public void testCardinalitySelectivityHintsNegative() {
What happens when we provide a SELECTIVITY hint for join predicates?


http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@4974
PS1, Line 4974:     AnalyzesOk("select * from tpch.lineitem /* +HDFS_NUM_ROWS */ where " +
Can we include tests where SELECTIVITY is applied to expressions containing a scalar subquery, like 'x > (select a from foo)'?



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 1
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Mon, 15 Nov 2021 12:09:39 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................

IMPALA-7942: Add query hints for cardinalities and selectivities

Currently, Impala only uses simple estimation to compute selectivity
for some predicates, and this may lead to worse query plan due to CBO.
Hence, we add new hints to reduce such errors. Maybe in the future,
we can use histograms to get more precise query plan.

This patch adds two query hints: 'HDFS_NUM_ROWS' and 'SELECTIVITY'.
We can add 'HDFS_NUM_ROWS' after a hdfs table in query like this:

  * select col from t /* +TABLE_NUM_ROWS(1000) */;

If set, Impala will use this value as the number of rows scanned from
the table, even if the table has stats.
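
The two precedence rules discussed on this change can be sketched side by side. This is only an illustration under stated assumptions: the class ScanCardinalitySketch and its methods are hypothetical, and a negative value is assumed to stand for "missing or corrupt stats" or "no hint". It is not the HdfsScanNode code.

```java
/** Hypothetical sketch of choosing a scan row count from stats and a hint. */
final class ScanCardinalitySketch {
  /** Assumed marker for "no usable value". */
  static final long UNKNOWN = -1L;

  /** Behaviour described in this patch set: the hint wins even if stats exist. */
  static long hintFirst(long statsNumRows, long hintNumRows) {
    return hintNumRows >= 0 ? hintNumRows : statsNumRows;
  }

  /** Behaviour described for the first patch set: stats win when usable. */
  static long statsFirst(long statsNumRows, long hintNumRows) {
    return statsNumRows >= 0 ? statsNumRows : hintNumRows;
  }
}
```

With stats of 1000 rows and a hint of 50, hintFirst yields 50 and statsFirst yields 1000. The two rules only agree when one of the inputs is unknown.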

For 'SELECTIVITY' hint, we can use in these 'Predicate':
  * BinaryPredicate
  * InPredicate
  * IsNullPredicate
  * LikePredicate, including 'not like' syntax
  * BetweenPredicate, including 'not between and' syntax
Format like this:

  select col from t where a=1 /* +SELECTIVITY(0.5) */;

This value will replace original selectivity computing. These formats
are not allowed:

  * select col from t where (a=1) /* +SELECTIVITY(0.5) */;
  * select col from t where (a=1 and b<2) /* +SELECTIVITY(0.5) */;
  * select col from t1 where exists (...) /* +SELECTIVITY(0.5) */;

Pay attention, if you set selectivity hint like this:

  * select col from t where (a=1 /* +SELECTIVITY(0.5) */ and b>2);

Impala will set 0.5 for the first binary predicate, but the second is
-1, so Impala cannot compute this predicate. The whole compound predicate
selectivity is still unavailable. Hence, for a compound predicate, we
need to ensure that each child selectivity has been set by hint or is
computable. Otherwise, this hint might not take the expected effect.
Also, for 'BetweenPredicate', Impala will transform this predicate
into a 'CompoundPredicate' with two 'BinaryPredicate' children. If a
hint is set for a 'BetweenPredicate' in the query, we will split this
hint value between the two 'BinaryPredicate' children.
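
The behaviour described above can be sketched roughly as follows. Everything here is an assumption for illustration: the class CompoundSelectivitySketch is hypothetical, AND is assumed to multiply child selectivities (treating them as independent), and the square-root split for BETWEEN is just one possible rule that keeps the product equal to the hint. The message above does not state how the patch actually splits the value.

```java
/** Hypothetical sketch of combining child selectivities under AND. */
final class CompoundSelectivitySketch {
  /** Assumed marker for "selectivity not available". */
  static final double UNKNOWN = -1.0;

  /**
   * Selectivity of (left AND right). If either child is unknown the result
   * stays unknown, which is why hinting only one child has no effect.
   */
  static double and(double left, double right) {
    if (left < 0.0 || right < 0.0) return UNKNOWN;
    return left * right;
  }

  /**
   * One possible way to split a BETWEEN hint across its two children:
   * give each child sqrt(hint), so that and(child, child) equals the hint.
   */
  static double splitForBetween(double hint) {
    if (hint < 0.0 || hint > 1.0) {
      throw new IllegalArgumentException("hint must be in [0, 1], got " + hint);
    }
    return Math.sqrt(hint);
  }
}
```

For example, and(0.5, UNKNOWN) stays unknown, while a BETWEEN hint of 0.25 would give each child 0.5 under this split rule.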

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/hdfs-cardinality-hint.test
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
14 files changed, 1,635 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/7
-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 7
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 9:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/10570/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 9
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Fri, 13 May 2022 09:07:58 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 8:

> Code modification to facilitate above general version of grammar to
 > handle predicates and selectivity hints.
 > 
 > Since we put the hint at the class Predicate, it looks like we can
 > get the hint for any simple or complex predicate at the right
 > place.
 > 
 > [15:00:23 qchen@qifan-10229: Impala.07202021] git diff
 > diff --git a/fe/src/main/cup/sql-parser.cup b/fe/src/main/cup/sql-parser.cup
 > index bbd1e53d0..bef0b3c1d 100644
 > --- a/fe/src/main/cup/sql-parser.cup
 > +++ b/fe/src/main/cup/sql-parser.cup
 > @@ -307,7 +307,7 @@ terminal
 > KW_RANGE, KW_RCFILE, KW_RECOVER, KW_REFERENCES, KW_REFRESH,
 > KW_REGEXP, KW_RELY,
 > KW_RENAME, KW_REPEATABLE, KW_REPLACE, KW_REPLICATION, KW_RESTRICT,
 > KW_RETURNS,
 > KW_REVOKE, KW_RIGHT, KW_RLIKE, KW_ROLE, KW_ROLES, KW_ROLLUP,
 > KW_ROW, KW_ROWS, KW_SCHEMA,
 > -  KW_SCHEMAS, KW_SELECT, KW_SEMI, KW_SEQUENCEFILE,
 > KW_SERDEPROPERTIES, KW_SERIALIZE_FN,
 > +  KW_SCHEMAS, KW_SELECT,  KW_SELECTIVITY, KW_SEMI,
 > KW_SEQUENCEFILE, KW_SERDEPROPERTIES, KW_SERIALIZE_FN,
 > KW_SET, KW_SHOW, KW_SMALLINT, KW_SETS, KW_SORT, KW_SPEC, KW_STORED,
 > KW_STRAIGHT_JOIN,
 > KW_STRING, KW_STRUCT, KW_SYMBOL, KW_SYSTEM_TIME, KW_SYSTEM_VERSION,
 > KW_TABLE, KW_TABLES, KW_TABLESAMPLE, KW_TBLPROPERTIES,
 > @@ -431,7 +431,7 @@ nonterminal TimeTravelSpec opt_asof;
 > nonterminal Subquery subquery;
 > nonterminal JoinOperator join_operator;
 > nonterminal opt_inner, opt_outer;
 > -nonterminal PlanHint plan_hint;
 > +nonterminal PlanHint plan_hint, selectivity_hint;
 > nonterminal List<PlanHint> plan_hints, opt_plan_hints,
 > plan_hint_list;
 > nonterminal TypeDef type_def;
 > nonterminal Type type;
 > @@ -629,6 +629,8 @@ precedence left KW_INTO;
 > 
 > precedence left KW_OVER;
 > 
 > +precedence left COMMENTED_PLAN_HINT_START;
 > +
 > start with stmt;
 > 
 > stmt ::=
 > @@ -3720,6 +3722,15 @@ function_params ::=
 > {: RESULT = new FunctionParams(false, true, exprs); :}
 > ;
 > 
 > +selectivity_hint ::=
 > +  COMMENTED_PLAN_HINT_START KW_SELECTIVITY LPAREN
 > DECIMAL_LITERAL:value RPAREN
 > +      COMMENTED_PLAN_HINT_END
 > +  {:
 > +    RESULT = new PlanHint("selectivity",
 > +      new ArrayList(Arrays.asList(value.toString())));
 > +  :}
 > +  ;
 > +
 > predicate ::=
 > expr:e KW_IS KW_NULL
 > {: RESULT = new IsNullPredicate(e, false); :}
 > @@ -3746,6 +3757,8 @@ predicate ::=
 > :}
 > | expr:e1 KW_LOGICAL_OR expr:e2
 > {: RESULT = new CompoundVerticalBarExpr(e1, e2); :}
 > +  | predicate:p selectivity_hint:h
 > +  {: RESULT = p; :}
 > ;
 > 
 > comparison_predicate ::=
 > diff --git a/fe/src/main/jflex/sql-scanner.flex b/fe/src/main/jflex/sql-scanner.flex
 > index e60676a9c..bc3a91513 100644
 > --- a/fe/src/main/jflex/sql-scanner.flex
 > +++ b/fe/src/main/jflex/sql-scanner.flex
 > @@ -236,6 +236,7 @@ import org.apache.impala.thrift.TReservedWordsVersion;
 > keywordMap.put("schemas", SqlParserSymbols.KW_SCHEMAS);
 > keywordMap.put("select", SqlParserSymbols.KW_SELECT);
 > keywordMap.put("semi", SqlParserSymbols.KW_SEMI);
 > +    keywordMap.put("selectivity", SqlParserSymbols.KW_SELECTIVITY);
 > keywordMap.put("sequencefile", SqlParserSymbols.KW_SEQUENCEFILE);
 > keywordMap.put("serdeproperties", SqlParserSymbols.KW_SERDEPROPERTIES);
 > keywordMap.put("serialize_fn", SqlParserSymbols.KW_SERIALIZE_FN);

Besides, different predicates need different code logic. For example, we can use 'p.setSelectivityHint(Double.valueOf(hint.getArgs().get(0)))' for a normal binary predicate, but for a 'NOT LIKE' predicate, Impala will transform it into a 'CompoundPredicate' with the 'NOT' operator, so we set '1 - hint_value' for its child, and this is what I did above. Maybe we need to think more about supporting selectivity hints for compound predicates.
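
The '1 - hint_value' step can be shown in isolation. This is a minimal sketch, assuming NOT has selectivity one minus that of its child; NotHintSketch and its method names are hypothetical and not part of the patch.

```java
/** Hypothetical sketch of pushing a hint on NOT(child) down to the child. */
final class NotHintSketch {
  /** Selectivity of NOT(child), assuming it is the complement of the child's. */
  static double not(double childSelectivity) {
    return 1.0 - childSelectivity;
  }

  /**
   * Given a hint written on the whole NOT predicate (e.g. 'a NOT LIKE ...'),
   * returns the value to store on the child so that not(child) equals the hint.
   */
  static double childSelectivityForNot(double hint) {
    if (hint < 0.0 || hint > 1.0) {
      throw new IllegalArgumentException("hint must be in [0, 1], got " + hint);
    }
    return 1.0 - hint;
  }
}
```

A hint of 0.25 on the NOT LIKE predicate would therefore store 0.75 on the LIKE child, and evaluating NOT over that child gives 0.25 again.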


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 8
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Sat, 19 Mar 2022 10:37:38 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 29:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/18023/29/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/18023/29/fe/src/main/java/org/apache/impala/analysis/Expr.java@1064
PS29, Line 1064:     if (! predicateHintValid(this)) {
> Consider Expr.isTriviallyTrue() for example in the same file. That currentl
Hi Kurt, I think you are right. I will discuss with other people to find a better way to solve this problem.


http://gerrit.cloudera.org:8080/#/c/18023/29/fe/src/main/java/org/apache/impala/analysis/Expr.java@1064
PS29, Line 1064:     if (! predicateHintValid(this)) {
> The concern makes sense to me. Expr#getConjuncts() is widely used. Chaning 
Hi Quanlong, you are right, this change is only required by hints on compound predicates. The first version of this patch only supported single predicates. Qifan suggested that we'd better support compound predicates as well, since they are often used in production environments, so I tried to add this in a later change. I will discuss with Qifan again; maybe we will not provide hints for AND compound predicates in this patch.


http://gerrit.cloudera.org:8080/#/c/18023/29/fe/src/main/java/org/apache/impala/analysis/Expr.java@1067
PS29, Line 1067: // Selectivity hint cannot be set. If Predicate been set selectivity hint, we return
               :       // itself directly, such as CompoundPredicate. Otherwise, hint value will missing.
> This comment is not accurate and should  be removed. The original comment f
Hi Qifan. What do you mean by 'Note a Predicate contains the selectivityHint_ as the new data member.'? And by 'If the above is true, then we do not need getLocalConjuncts() at all.'? I don't understand.
If we do not add the hint check in 'getConjuncts()', a hint on an AND compound predicate will be invalid, but it will still be valid for OR compound predicates. So maybe we have these options:
1. Remove the hint check directly, do not support hints for AND compound predicates, and maybe give some warnings;
2. Transfer the hint value to the AND compound predicate's children, but how to decide the hint value of each child is a problem.
This is my opinion, what do you think?



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 29
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Tue, 11 Apr 2023 16:25:31 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Kurt Deschler (Code Review)" <ge...@cloudera.org>.
Kurt Deschler has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 29:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18023/29/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/18023/29/fe/src/main/java/org/apache/impala/analysis/Expr.java@1064
PS29, Line 1064:     if (! predicateHintValid(this)) {
This should be checked by the caller. If there is only one caller, it may be better to have a boolean flag, i.e. includeChildConjuncts.
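
One reading of the flag suggestion, as a minimal sketch: the caller decides whether an AND predicate is flattened into its children or returned whole. ConjunctSketch and its Node type are hypothetical stand-ins, not Impala's Expr, and the exact meaning of includeChildConjuncts is an assumption here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of conjunct extraction controlled by a caller flag. */
final class ConjunctSketch {
  /** Minimal expression node: either a leaf or an AND of two nodes. */
  static final class Node {
    final String text;
    final Node left;
    final Node right;

    private Node(String text, Node left, Node right) {
      this.text = text;
      this.left = left;
      this.right = right;
    }

    static Node leaf(String text) { return new Node(text, null, null); }

    static Node and(Node l, Node r) {
      return new Node("(" + l.text + " AND " + r.text + ")", l, r);
    }

    boolean isAnd() { return left != null; }
  }

  /**
   * Returns the conjuncts of root. With includeChildConjuncts set, AND nodes
   * are flattened recursively; otherwise root is returned as a single conjunct,
   * so a hint attached to it would not be lost.
   */
  static List<Node> getConjuncts(Node root, boolean includeChildConjuncts) {
    List<Node> out = new ArrayList<>();
    collect(root, includeChildConjuncts, out);
    return out;
  }

  private static void collect(Node n, boolean flatten, List<Node> out) {
    if (flatten && n.isAnd()) {
      collect(n.left, flatten, out);
      collect(n.right, flatten, out);
    } else {
      out.add(n);
    }
  }
}
```

Under this reading, a caller that knows the predicate carries a hint would pass false, and every other caller would keep the existing flattening behaviour by passing true.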



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 29
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 06 Apr 2023 21:12:16 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Kurt Deschler (Code Review)" <ge...@cloudera.org>.
Kurt Deschler has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 33:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18023/33/fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
File fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/33/fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java@155
PS33, Line 155:         analyzer.addWarning("Selectivity hint is invalid for 'AND' compound predicates.");
Will this also appear for BETWEEN predicates?



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 33
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Mon, 17 Apr 2023 16:45:05 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 1:

Hi Zoltan and Quanlong, I've already finished this feature. Hope you can give some suggestions, thanks!


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 1
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Sun, 14 Nov 2021 08:03:08 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................

IMPALA-7942: Add query hints for cardinalities and selectivities

Currently, Impala only uses simple estimation to compute selectivity
for some predicates, and this may lead to worse query plan due to CBO.
Hence, we add new hints to reduce such errors. Maybe in the future,
we can use histograms to get more precise query plan.

This patch adds two query hints: 'HDFS_NUM_ROWS' and 'SELECTIVITY'.
We can add 'HDFS_NUM_ROWS' after a hdfs table in query like this:

  * select col from t /* +TABLE_NUM_ROWS(1000) */;

If set, Impala will use this value as table scanned rows, even if
table has stats.

For 'SELECTIVITY' hint, we can use in these 'Predicate':
  * BinaryPredicate
  * InPredicate
  * IsNullPredicate
  * LikePredicate, including 'not like' syntax
  * BetweenPredicate, including 'not between and' syntax
Format like this:

  select col from t where a=1 /* +SELECTIVITY(0.5) */;

This value will replace original selectivity computing. These formats
are not allowed:

  * select col from t where (a=1) /* +SELECTIVITY(0.5) */;
  * select col from t where (a=1 and b<2) /* +SELECTIVITY(0.5) */;
  * select col from t1 where exists (...) /* +SELECTIVITY(0.5) */;

Pay attention, if you set selectivity hint like this:

  * select col from t where (a=1 /* +SELECTIVITY(0.5) */ and b>2);

Impala will set 0.5 for the first binary predicate, but the second is
-1, so Impala cannot compute this predicate. The whole compound predicate
selectivity is still unavailable. Hence, for a compound predicate, we
need to ensure that each child selectivity has been set by hint or is
computable. Otherwise, this hint might not take the expected effect.
Also, for 'BetweenPredicate', Impala will transform this predicate
into a 'CompoundPredicate' with two 'BinaryPredicate' children. If a
hint is set for a 'BetweenPredicate' in the query, we will split this
hint value between the two 'BinaryPredicate' children.

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/hdfs-cardinality-hint.test
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
16 files changed, 1,588 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/9
-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 9
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "Zoltan Borok-Nagy (Code Review)" <ge...@cloudera.org>.
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 3:

(15 comments)

Thanks for working on this functionality!

http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@10
PS3, Line 10: this maybe lead
nit: this may lead


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@34
PS3, Line 34: format
nit: formats


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@45
PS3, Line 45: -1
isn't second 0.1?
Based on IMPALA-7601


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@48
PS3, Line 48: is been set
nit: has been set / is set


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@48
PS3, Line 48: need ensure
nit: need to ensure


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@49
PS3, Line 49: maybe does not take effect as you
            : expected.
nit: might not take the expected effect


http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/cup/sql-parser.cup@3763
PS1, Line 3763: 
including


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/cup/sql-parser.cup@3260
PS3, Line 3260: computing
nit: computed


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/cup/sql-parser.cup@3829
PS3, Line 3829: incliding
nit: including


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/cup/sql-parser.cup@3831
PS3, Line 3831: not_like_predicate
Could you please add a test for the NOT LIKE predicate to predicate-selectivity-hint.test?


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/Predicate.java
File fe/src/main/java/org/apache/impala/analysis/Predicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/Predicate.java@77
PS3, Line 77:         analyzer.addWarning("Selectivity hint is allowed for single column predicate");
nit: maybe add more information about this predicate in the warning.


http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1556
PS1, Line 1556: 
> In my opinion, table rows is precise, unless table not stats or has corrupt
I don't have a strong opinion here, just want to mention that table stats can be stale. Though in that case we can issue COMPUTE STATS, or update the stats manually via ALTER TABLE stmt.


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5008
PS3, Line 5008: AnalyzesOk
Why is it not an analysis error?


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5050
PS3, Line 5050: (0)
Don't we allow zero as selectivity hint? E.g. at L5060 we say 'allowed value is [0,1]'


http://gerrit.cloudera.org:8080/#/c/18023/3/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
File testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test:

http://gerrit.cloudera.org:8080/#/c/18023/3/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@643
PS3, Line 643: 2.87M
Above we have 2.42M for 0.5 selectivity.



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 3
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Tue, 07 Dec 2021 19:16:20 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 17:

All tests passed, here is the jenkins url:
https://jenkins.impala.io/job/pre-review-test/1519/


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 17
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Fri, 10 Mar 2023 07:02:31 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 34:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9234/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 34
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 19 Apr 2023 13:03:11 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has uploaded a new patch set (#24). ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................

IMPALA-7942 (part 2): Add query hints for predicate selectivities

Currently, Impala uses only simple estimation to compute selectivity
for some predicates, which may lead to a worse query plan from the CBO.
Hence, we add a new hint to reduce such errors.

This patch adds a new query hint, 'SELECTIVITY'. We can use this
hint to replace the original selectivity computation.

The parser will interpret expressions wrapped in () followed by a
C-style /* comment */ as a predicate hint. The predicate hint currently
supports +SELECTIVITY(f) where 'f' is a positive floating point number
to use as the selectivity for the preceding expression.

Single predicate example:

  select col from t where (a=1) /* +SELECTIVITY(0.5) */;

Compound Predicate Example:

  select col from t where (a=1 and b=2) /* +SELECTIVITY(0.5) */;
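
As an illustration only, the hint form described above can be recognized with a regular expression. This is a minimal sketch and not Impala's parser (the patch implements the hint in sql-parser.cup and sql-scanner.flex). The class SelectivityHintSketch and its parse method are hypothetical, and the accepted range (0, 1] is an assumption based on a check discussed in the review of Predicate.java.

```java
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Impala.
final class SelectivityHintSketch {
  // Matches a C-style comment such as "/* +SELECTIVITY(0.5) */".
  private static final Pattern HINT = Pattern.compile(
      "/\\*\\s*\\+SELECTIVITY\\(\\s*([0-9]*\\.?[0-9]+)\\s*\\)\\s*\\*/",
      Pattern.CASE_INSENSITIVE);

  // Returns the hinted selectivity, or empty if the comment is not a
  // well-formed hint or the value falls outside (0, 1] (assumed range).
  static OptionalDouble parse(String comment) {
    Matcher m = HINT.matcher(comment.trim());
    if (!m.matches()) return OptionalDouble.empty();
    double value = Double.parseDouble(m.group(1));
    if (value > 0 && value <= 1.0) return OptionalDouble.of(value);
    return OptionalDouble.empty();
  }
}
```

In this sketch, a comment that is not a hint, or a value such as 0 or 1.5, yields an empty result, so a caller would fall back to the planner's own estimate.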

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
12 files changed, 429 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/24
-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 24
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 28:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9195/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 28
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Mon, 03 Apr 2023 07:41:20 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 12:

Can you please provide feedback on all comments? 

For example, for the following comment, the feedback can be DONE, a reply, etc.

Commit Message
Line 14:
TABLE_NUM_ROWS?

The feedback is useful for judging the rework. Thanks!


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 12
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Mon, 20 Feb 2023 13:22:52 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 25: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 25
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 30 Mar 2023 14:18:04 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 29: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 29
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 06 Apr 2023 12:49:53 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 29:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18023/29/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/18023/29/fe/src/main/java/org/apache/impala/analysis/Expr.java@1064
PS29, Line 1064:     if (! predicateHintValid(this)) {
> This should be checked by the caller. I there is only one caller, maybe bet
Sorry, I didn't understand what you mean. Can you show me the detailed modification? Then I can modify the code following your suggestion.



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 29
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Fri, 07 Apr 2023 04:21:44 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 20:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12667/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 20
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 22 Mar 2023 15:55:01 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 22:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12682/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 22
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 23 Mar 2023 16:07:59 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has removed Fucun Chu from this change.  ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Removed reviewer Fucun Chu.
-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 25
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 25:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9185/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 25
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 30 Mar 2023 09:01:31 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 33:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18023/33/fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
File fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/33/fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java@155
PS33, Line 155:         analyzer.addWarning("Selectivity hint is invalid for 'AND' compound predicates.");
> Will this appear for BETWEEN predicate also?
Currently, this warning will not appear for BETWEEN predicates, since I removed the selectivity assignment in 'BetweenToCompoundRule.java' when transforming a BetweenPredicate into a CompoundPredicate.

If we want it to appear for BETWEEN predicates as well, we need to add code in 'BetweenToCompoundRule.java' like this:

  CompoundPredicate result = new CompoundPredicate(compoundOperator, lower, upper);
  result.setSelectivityHint(bp.getSelectivity());
  return result;

What do you think?
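
To make the two options concrete, here is a minimal, self-contained sketch of a BETWEEN rewrite that either drops or carries over the hint. The classes below are hypothetical stand-ins and not Impala's BetweenToCompoundRule or CompoundPredicate; using -1 to mean "no hint" is an assumption of the sketch.

```java
// Hypothetical stand-in classes, not Impala's analysis classes.
final class BetweenRewriteSketch {
  // A predicate that may carry a selectivity hint.
  static class Pred {
    final String sql;
    double selectivityHint = -1;  // -1 means "no hint" in this sketch
    Pred(String sql) { this.sql = sql; }
  }

  // Rewrites "col BETWEEN lo AND hi" into "col >= lo AND col <= hi" and,
  // if requested, copies the hint from the BETWEEN onto the result.
  static Pred rewriteBetween(String col, String lo, String hi,
      double betweenHint, boolean propagateHint) {
    Pred result = new Pred(col + " >= " + lo + " AND " + col + " <= " + hi);
    if (propagateHint) result.selectivityHint = betweenHint;
    return result;
  }
}
```

With propagateHint set to false, the rewritten AND predicate carries no hint, so a check on hinted AND predicates would stay silent. With it set to true, the same check would see a hinted AND predicate.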



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 33
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Tue, 18 Apr 2023 02:30:44 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has uploaded a new patch set (#29). ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................

IMPALA-7942 (part 2): Add query hints for predicate selectivities

Currently, Impala uses only simple estimation to compute selectivity.
For some predicates, this may lead to a worse query plan from the CBO.

This patch adds a new query hint, 'SELECTIVITY'. We can use this
hint to specify a selectivity value for a predicate.

The parser will interpret expressions wrapped in () followed by a
C-style comment /* <predicate hint> */ as a predicate hint. The
predicate hint currently can be in the form of +SELECTIVITY(f) where
'f' is a positive floating point number to use as the selectivity for
the preceding expression.

Single predicate example:

  select col from t where (a=1) /* +SELECTIVITY(0.5) */;

Compound predicate example:

  select col from t where (a=1 and b=2) /* +SELECTIVITY(0.5) */;

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
12 files changed, 455 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/29
-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 29
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has uploaded a new patch set (#12). ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................

IMPALA-7942 (part 2): Add query hints for predicate selectivities

Currently, Impala uses only simple estimation to compute selectivity
for some predicates, which may lead to a worse query plan from the CBO.
Hence, we add new hints to reduce such errors. In the future, we may
be able to use histograms to get more precise query plans.

This patch adds another query hint, 'SELECTIVITY'. We can use this
hint to replace the original selectivity computation.

The format is like this:

  select col from t where (a=1) /* +SELECTIVITY(0.5) */;

Besides, this hint is also valid for compound predicates like this:

  select col from t where (a=1 and b=2) /* +SELECTIVITY(0.5) */;

Note that to use the 'SELECTIVITY' hint on a predicate, we must wrap
the predicate in parentheses, even for a single binary predicate.

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
12 files changed, 334 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/12
-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 12
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 15: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/9111/


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 15
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Mon, 06 Mar 2023 21:16:38 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Yifan Zhang (Code Review)" <ge...@cloudera.org>.
Yifan Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 18: Code-Review+1


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 18
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Tue, 21 Mar 2023 02:04:40 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 19:

(25 comments)

Thanks for your continuous work on this, Sheng Wang! There are some merge conflicts. Could you rebase the patch to the latest master branch?

The only issue I found is we need to copy hasValidSelectivityHint_ in the constructor of Predicate. Also left some tiny comments.

http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/main/cup/sql-parser.cup@3407
PS19, Line 3407: // If set, Impala will use this value to replace original selectiviy
               : // computed in sql analysis phase.
nit: I think we don't need to mention Impala since this is already Impala code. We can make it shorter, e.g.

"It's used to replace the selectivity estimated by the planner."


http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/main/java/org/apache/impala/analysis/Predicate.java
File fe/src/main/java/org/apache/impala/analysis/Predicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/main/java/org/apache/impala/analysis/Predicate.java@37
PS19, Line 37: 0 is make no sense for a query
nit: "0 makes no sense for a query"


http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/main/java/org/apache/impala/analysis/Predicate.java@39
PS19, Line 39: true if selectivity predicate been set as a valid value
nit: "true if the selectivity hint is set to a valid value"


http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/main/java/org/apache/impala/analysis/Predicate.java@57
PS19, Line 57:   }
We should copy hasValidSelectivityHint_ as well.
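
A minimal sketch of the point, assuming a class with just the two fields named in this review. PredicateCopySketch is a hypothetical stand-in, not Impala's Predicate, and the (0, 1] validity check mirrors the expression quoted from line 177.

```java
// Hypothetical stand-in for the fields discussed in this review.
final class PredicateCopySketch {
  double selectivityHint_;
  boolean hasValidSelectivityHint_;

  PredicateCopySketch(double hint) {
    selectivityHint_ = hint;
    // Validity is decided once, when the hint is set.
    hasValidSelectivityHint_ = hint > 0 && hint <= 1.0;
  }

  // Copy constructor: both fields are copied so a clone keeps its hint state.
  PredicateCopySketch(PredicateCopySketch other) {
    selectivityHint_ = other.selectivityHint_;
    hasValidSelectivityHint_ = other.hasValidSelectivityHint_;
  }

  // Returns the stored flag directly instead of re-checking the range.
  boolean hasValidSelectivityHint() { return hasValidSelectivityHint_; }
}
```

If the copy constructor skipped the flag, a cloned predicate would report no valid hint even though the hint value itself was copied.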


http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/main/java/org/apache/impala/analysis/Predicate.java@69
PS19, Line 69:     analyzeSelectivityHint(analyzer);
nit: Is there any reason we put this here instead of putting it inside analyzeHints()?


http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/main/java/org/apache/impala/analysis/Predicate.java@79
PS19, Line 79: bigger
nit: "larger"


http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/main/java/org/apache/impala/analysis/Predicate.java@177
PS19, Line 177:     return selectivityHint_ > 0 && selectivityHint_ <= 1.0;
Can we return hasValidSelectivityHint_ directly? Also the method name can change to hasValidSelectivityHint().


http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5163
PS19, Line 5163:         "Syntax error in line 1");
Let's also test some expressions like "1/3" and very long decimal values like "0.3333333333333333333333333333333333".
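
For context, the sketch below shows how plain Java treats these two inputs when they are read as decimal literals. It says nothing about how Impala's parser handles them, which is what the requested tests would establish. HintLiteralSketch is a hypothetical helper, and returning -1 for non-literals is an assumption of the sketch.

```java
import java.math.BigDecimal;

// Hypothetical helper, unrelated to Impala's parser.
final class HintLiteralSketch {
  // Returns the numeric value of a plain decimal literal, or -1 if the text
  // is not a literal (for example an expression such as "1/3").
  static double literalValue(String text) {
    try {
      return new BigDecimal(text.trim()).doubleValue();
    } catch (NumberFormatException e) {
      return -1;
    }
  }
}
```

Here the long decimal is accepted and collapses to a double close to one third, while "1/3" is rejected because it is an expression rather than a literal.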


http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

http://gerrit.cloudera.org:8080/#/c/18023/19/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@1380
PS19, Line 1380: Test new hint of 'SELECTIVITY'
nit: "new" will be stale. Might reword to "Test SELECTIVITY hints"


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
File testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test:

http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@1
PS19, Line 1: # Table 'tpch.lineitem' records count is 6001215, so cardinality is 6.00M,
nit: "Table 'tpch.lineitem' has 6001215 rows, so the scan on it has cardinality as 6.00M."


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@2
PS19, Line 2: if
nit: start a new sentence with "If the selectivity of the predicate is 0.1"


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@4
PS19, Line 4: This predicate cannot compute selectivity, default value is 0.1
nit: "Planner assigns the default selectivity (0.1) to this predicate"
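
The arithmetic behind these test comments can be sketched as follows. CardinalitySketch is a hypothetical helper; the rule that a hint in (0, 1] replaces the planner's estimate follows the discussion in this review, while the rounding is an assumption and may differ from what the planner does.

```java
// Hypothetical helper illustrating the arithmetic, not Impala's planner.
final class CardinalitySketch {
  // Uses the hint when it lies in (0, 1]; otherwise falls back to the
  // planner's own estimate.
  static double effectiveSelectivity(double plannerEstimate, double hint) {
    return (hint > 0 && hint <= 1.0) ? hint : plannerEstimate;
  }

  // Estimated output rows of a scan after applying one predicate.
  // Rounding to the nearest integer is an assumption of this sketch.
  static long cardinality(long numRows, double selectivity) {
    return Math.round(numRows * selectivity);
  }
}
```

For the 6001215 rows of 'tpch.lineitem', the default selectivity of 0.1 gives roughly 600K rows, and a hint of 0.5 gives roughly 3.00M rows.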


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@15
PS19, Line 15: 98% values
nit: "98% of the values"


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@50
PS19, Line 50: numRows - getStats().getNumNulls()
I'm confused by this. Shouldn't the selectivity be a value lower than 1? This seems to be the cardinality of the scan with "l_shipdate IS NOT NULL".
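
The distinction can be sketched like this: the row count minus the null count is a cardinality, and dividing it by the row count gives a selectivity between 0 and 1. NotNullSelectivitySketch is a hypothetical helper, and returning -1 when the stats are unusable is an assumption of the sketch.

```java
// Hypothetical helper, not Impala's IsNullPredicate.
final class NotNullSelectivitySketch {
  // Rows expected to pass "col IS NOT NULL" (a cardinality, not a selectivity).
  static long notNullRows(long numRows, long numNulls) {
    return numRows - numNulls;
  }

  // Fraction of rows with a non-NULL value, or -1 when it cannot be computed.
  static double isNotNullSelectivity(long numRows, long numNulls) {
    if (numRows <= 0 || numNulls < 0 || numNulls > numRows) return -1;
    return (double) notNullRows(numRows, numNulls) / numRows;
  }
}
```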


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@51
PS19, Line 51: value
nit: "values"


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@62
PS19, Line 62: We assume that this predicate actual selectivity is 0.5 for testing, and set hint manually
nit: "Assuming the predicate has 0.5 as the selectivity by using the hint"


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@73
PS19, Line 73: This predicate cannot compute selectivity
nit: It's the planner instead of "this predicate" that computes the selectivity. Might reword to "Planner will assign the default selectivity (0.1) on this predicate"


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@84
PS19, Line 84: # This predicate's actual selectivity is almost 11.5%, we set hint manually
nit: "The actual selectivity of this predicate is around 11.5%. Set it by the hint manually."


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@95
PS19, Line 95: # This predicate cannot compute selectivity, default value is 0.1
nit: reword this as well


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@106
PS19, Line 106: # This like predicate actual selectivity is almost 11.5% as above, so not like
              : # predicate actual selectivity is 88.5%.
nit: "The actual selectivity of this LIKE predicate is around 11.5% (same as the above one). So the selectivity of the corresponding NOT LIKE predicate is 88.5%"
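
The arithmetic behind the suggested wording can be sketched as below. This is an illustration only: it assumes the NOT LIKE selectivity is the simple complement of the LIKE selectivity, which holds when the column has no NULLs (a NULL operand satisfies neither predicate). The function name is hypothetical.

```python
def complement_selectivity(selectivity: float) -> float:
    """Selectivity of the negated predicate, ignoring NULLs (sketch)."""
    if not 0.0 <= selectivity <= 1.0:
        raise ValueError(f"selectivity out of range: {selectivity}")
    return 1.0 - selectivity


like_selectivity = 0.115  # "around 11.5%" from the test comment
not_like_selectivity = complement_selectivity(like_selectivity)
```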


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@118
PS19, Line 118: # This predicate cannot compute selectivity, default value is 0.1
nit: reword this as well


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@129
PS19, Line 129: # We assume that this predicate actual selectivity is 0.5 for testing, and set hint manually
nit: reword or remove this


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@141
PS19, Line 141: # This predicate cannot compute selectivity, default value is 0.1
nit: reword this as well


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@209
PS19, Line 209: Impala can not compute selectivity
              : # Impala can not compute selectivity and assumes the default value of 0.1
nit: "the planner assigns the default selectivity (0.1) to it"


http://gerrit.cloudera.org:8080/#/c/18023/19/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@212
PS19, Line 212: and
nit: remove "and"



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 19
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 22 Mar 2023 03:00:15 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has uploaded a new patch set (#32). ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................

IMPALA-7942 (part 2): Add query hints for predicate selectivities

Currently, Impala only uses simple estimation to compute selectivity.
For some predicates, this may lead to a worse query plan due to CBO.

This patch adds a new query hint, 'SELECTIVITY'. We can use this
hint to specify a selectivity value for a predicate.

The parser will interpret expressions wrapped in () followed by a
C-style comment /* <predicate hint> */ as a predicate hint. The
predicate hint currently can be in the form of +SELECTIVITY(f) where
'f' is a positive floating point number to use as the selectivity for
the preceding expression.

Single predicate example:

  select col from t where (a=1) /* +SELECTIVITY(0.5) */;

Compound predicate example:

  select col from t where (a=1 or b=2) /* +SELECTIVITY(0.5) */;

But note that the selectivity hint is invalid for 'AND' compound
predicates and between predicates in this patch. We may support
this in the near future.
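
As a rough illustration of the syntax described above, the following Python sketch finds hints of the form ") /* +SELECTIVITY(f) */" with a regular expression. This is not how the patch works (it changes the CUP grammar and the JFlex scanner); the regex and the function name are assumptions made only to show which shapes carry a hint.

```python
import re

# A closing parenthesis, then a C-style comment holding +SELECTIVITY(f).
HINT_RE = re.compile(
    r"\)\s*/\*\s*\+SELECTIVITY\(\s*([0-9]*\.?[0-9]+)\s*\)\s*\*/",
    re.IGNORECASE,
)


def find_selectivity_hints(sql: str) -> list:
    """Return the hint values that follow a parenthesized expression."""
    return [float(m.group(1)) for m in HINT_RE.finditer(sql)]


single = "select col from t where (a=1) /* +SELECTIVITY(0.5) */;"
compound = "select col from t where (a=1 or b=2) /* +SELECTIVITY(0.5) */;"
unwrapped = "select col from t where a=1 /* +SELECTIVITY(0.5) */;"
```

In this sketch the first two queries yield one hint each, while the third yields none because the predicate is not wrapped in ().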

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
10 files changed, 412 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/32
-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 32
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 9:

(23 comments)

Great test section. I completed the review on the section and added some comments, most of them are minor. 

Seems this patch can provide another simple and powerful tool for people to better optimize queries.

http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5053
PS9, Line 5053: "Syntax error in line 1"
This error may not be user-friendly for a very long SQL query.

Could we report the error specifically as follows?

Table hint not recognized for <T>: a negative value (-1).


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5056
PS9, Line 5056: "Syntax error in line 1")
same comment as above.


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5060
PS9, Line 5060:  "Syntax error in line 1"
same comment as above.


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5064
PS9, Line 5064: table
nit. may add a qualifier "non-HDFS" before 'table' to make it clear.


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5071
PS9, Line 5071: "Syntax error in line 1"
See the comment on TABLE_NUM_ROWS for "syntax error in line 1".


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5085
PS9, Line 5085:   "Syntax error in line 1")
Is this a mistake? 0.1 is a perfectly valid selectivity value.


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5087
PS9, Line 5087: Syntax error in line 1"
Same as above. 0 is okay.


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5093
PS9, Line 5093: (
should be [0,1], right?


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5098
PS9, Line 5098: 0.0
See the comment above. Should be allowed.
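
The range check and error reporting suggested in these comments could look roughly like the Python sketch below. It assumes the closed interval [0, 1] proposed here (a later commit message describes the value as a positive number, so the final bounds may differ). The function name and the message text are hypothetical.

```python
def validate_selectivity_hint(value: float) -> float:
    """Accept values in [0, 1]; otherwise raise a specific error."""
    # NaN fails both comparisons, so it is rejected as well.
    if not (0.0 <= value <= 1.0):
        raise ValueError(
            f"Invalid SELECTIVITY hint value: {value}. "
            "Expected a value in [0, 1]."
        )
    return value
```

A specific message like this names the offending value, which is easier to locate in a long query than a generic syntax error.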


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5102
PS9, Line 5102: l_shipdate <= (select '1998-09-02') " +
              :             "/* +SELECTIVITY(0.5)
From the updated parser, it seems this hint should be accepted, in that the entire predicate l_shipdate <= (select '1998-09-02') is recognized first, followed by the hint.


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/hdfs-cardinality-hint.test
File testdata/workloads/functional-planner/queries/PlannerTest/hdfs-cardinality-hint.test:

http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/hdfs-cardinality-hint.test@90
PS9, Line 90: DISTRIBUTEDPLAN
same argument as for selectivity hints. We probably do not need to test PARALLEL and DISTRIBUTED plan.


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/hdfs-cardinality-hint.test@261
PS9, Line 261: 100000
Maybe in the future we could allow short-hand notation such as 100K, 10G etc.
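
A minimal Python sketch of such a short-hand notation follows. It is purely hypothetical: the patch does not implement it, and the suffix meanings (K = 10^3, M = 10^6, G = 10^9) are an assumption.

```python
# Assumed decimal multipliers for the hypothetical short-hand suffixes.
_SUFFIXES = {"K": 10**3, "M": 10**6, "G": 10**9}


def parse_row_count(text: str) -> int:
    """Parse '100000', '100K' or '10G' into a non-negative row count."""
    text = text.strip().upper()
    multiplier = 1
    if text and text[-1] in _SUFFIXES:
        multiplier = _SUFFIXES[text[-1]]
        text = text[:-1]
    # Only plain decimal digits are accepted, so '-1' and '' are rejected.
    if not text.isdecimal():
        raise ValueError("row count must be a non-negative integer")
    return int(text) * multiplier
```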


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
File testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test:

http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@2
PS9, Line 2: default value is 0.1
Please add a comment after it to indicate that the cardinality clause in scan node 00 shows the actual value after the selectivity of 0.1 is applied: 6.00M x 0.1 = 600.12K.
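
The arithmetic can be checked with a small Python sketch. The helper names are hypothetical, and the formatting only imitates the two-decimal K/M style seen in the plan output quoted in this thread.

```python
def estimate_cardinality(num_rows: int, selectivity: float) -> int:
    """Estimated output rows of a scan with one filtering predicate."""
    return round(num_rows * selectivity)


def format_cardinality(n: int) -> str:
    """Render a row count with two decimals and a K or M suffix."""
    if n >= 1_000_000:
        return f"{n / 1_000_000:.2f}M"
    if n >= 1_000:
        return f"{n / 1_000:.2f}K"
    return str(n)


NUM_ROWS = 6001215  # tpch.lineitem
table_card = format_cardinality(NUM_ROWS)
default_card = format_cardinality(estimate_cardinality(NUM_ROWS, 0.1))
```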


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@70
PS9, Line 70: little
nit. Since almost 98% values are less than '1998-09-02', we set


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@95
PS9, Line 95: 5.88M
nice. 

I agree with Quanlong that we only need the serial plan in this test, as the distributed plan is generated well after the cardinality is determined.


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@344
PS9, Line 344: There are no null value in 'l_shipdate' column, so this selectivity is 0
nit. Delete this sentence and add "Here we assume ...".


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@481
PS9, Line 481: predicate
nit. predicate's


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@643
PS9, Line 643: 5.31M
nice!


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@754
PS9, Line 754: This predicate cannot compute selectivity, default value is 0.1
nit. Delete this sentence and add "Here we assume ...".


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@780
PS9, Line 780: 3.57M
Can you double check? It seems to me 3M should be the right number. 

See the test case for the following query 

select count(1) from tpch.lineitem
where l_shipdate IS NULL /* +SELECTIVITY(0.5) */

at line 345.


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@822
PS9, Line 822: A simple example for selectivity hint to change join mode, just for testing!
nit. A simple example to show that selectivity hint can help change join mode and be used as an optimization tool.


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@826
PS9, Line 826: Default value is 0.1,
nit. Impala can not compute selectivity and assumes the default value of 0.1.  If we assume the actual ...


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@827
PS9, Line 827: will be
nit. becomes



-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 9
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 07 Sep 2022 17:35:33 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9767/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 1
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Fri, 12 Nov 2021 16:59:41 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 9:

Hi Quanlong, Qifan and Zoltan, I've already modified the patch to support the selectivity hint for CompoundPredicate:

1. We need to add parentheses around each predicate to use the selectivity hint, such as: (a = 1) /* +SELECTIVITY(0.2)*/, (a = 1 AND b = 2) /* +SELECTIVITY(0.2)*/. The parentheses are necessary in order to support CompoundPredicate;
2. This syntax is allowed: ((a = 1) /* +SELECTIVITY(0.2)*/) /* +SELECTIVITY(0.3)*/, and the outer selectivity will override the inner selectivity;
3. The FE planner will keep an 'AND' CompoundPredicate as a single conjunct when a selectivity hint is set, instead of splitting it into two BinaryPredicates.

I will modify the test cases and the commit message after we discuss this adjustment. I look forward to your advice.
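
Rule 2 can be sketched in Python as follows. This is only a toy model of nested parenthesized expressions, not the FE classes; the class and function names are hypothetical.

```python
class Paren:
    """A parenthesized expression with an optional selectivity hint."""

    def __init__(self, child, hint=None):
        self.child = child  # a predicate string or another Paren
        self.hint = hint    # None when no hint follows the parentheses


def effective_hint(node):
    """Return the outermost hint, or None if no level carries one."""
    while isinstance(node, Paren):
        if node.hint is not None:
            return node.hint
        node = node.child
    return None


# ((a = 1) /* +SELECTIVITY(0.2)*/) /* +SELECTIVITY(0.3)*/
nested = Paren(Paren("a = 1", hint=0.2), hint=0.3)
```

In this model the outer value 0.3 wins, and the inner value is used only when the outer parentheses carry no hint.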


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 9
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Fri, 13 May 2022 08:48:27 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "Fucun Chu (Code Review)" <ge...@cloudera.org>.
Fucun Chu has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18023/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18023/1//COMMIT_MSG@21
PS1, Line 21: hint value only valid when table does not have stats or stats is corrupt.
nit: line should have 72 or fewer characters



-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 1
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Mon, 15 Nov 2021 14:34:21 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................

IMPALA-7942: Add query hints for cardinalities and selectivities

Currently, Impala only uses simple estimation to compute selectivity
for some predicates, and this may lead to worse query plan due to CBO.
Hence, we add new hints to reduce such errors. Maybe in the future,
we can use histograms to get more precise query plan.

This patch adds two query hints: 'HDFS_NUM_ROWS' and 'SELECTIVITY'.
We can add 'HDFS_NUM_ROWS' after a hdfs table in query like this:

  select col from t /* +TABLE_NUM_ROWS(1000) */;

If set, Impala will use this value as table scanned rows, even if
table has stats.

For 'SELECTIVITY' hint, we can use in these predicates:
* BinaryPredicate
* InPredicate
* IsNullPredicate
* LikePredicate, including 'not like' syntax
* BetweenPredicate, including 'not between and' syntax
Format like this:

  select col from t where a=1 /* +SELECTIVITY(0.5) */;

This value will replace original selectivity computing. These formats
are not allowed:

  select col from t where (a=1) /* +SELECTIVITY(0.5) */;
  select col from t where (a=1 and b<2) /* +SELECTIVITY(0.5) */;
  select col from t1 where exists (...) /* +SELECTIVITY(0.5) */;

Pay attention, if you set selectivity hint like this:

  select col from t where (a=1 /* +SELECTIVITY(0.5) */ and b>2);

Impala will set 0.5 for the first binary predicate, but the second is
-1, which Impala cannot compute, so the selectivity of the whole
compound predicate is still unavailable. Hence, for a compound
predicate, we need to ensure that each child selectivity has been set
by a hint or is computable. Otherwise, this hint might not take the
expected effect.
Another thing: for 'BetweenPredicate', Impala will transform this
predicate into a 'CompoundPredicate' with two 'BinaryPredicate'
children. If a hint is set for a 'BetweenPredicate' in the query, we
will split the hint value between the two 'BinaryPredicate' children.
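
The behaviour described above can be sketched in Python under two loudly stated assumptions: that an 'AND' selectivity is the product of its children, and that a BETWEEN hint is split by taking its square root so that the two children multiply back to the hint. The commit message does not say how the split is done, so treat both helpers as hypothetical.

```python
import math

UNKNOWN = -1.0  # marker for "selectivity not computable"


def and_selectivity(children):
    """Combine child selectivities; unknown if any child is unknown."""
    if any(s < 0 for s in children):
        return UNKNOWN
    result = 1.0
    for s in children:
        result *= s  # assumption: children are independent
    return result


def split_between_hint(hint: float):
    """Assumed split: each child gets sqrt(hint)."""
    part = math.sqrt(hint)
    return part, part


# (a=1 /* +SELECTIVITY(0.5) */ and b>2): the second child is unknown.
partial = and_selectivity([0.5, UNKNOWN])
# BETWEEN with hint 0.5, rewritten into two binary predicates.
between = and_selectivity(split_between_hint(0.5))
```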

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/hdfs-cardinality-hint.test
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
14 files changed, 1,626 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/4
-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 4
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 8:

"Maybe I can trytrywang this. Here is another question I want to discuss with you. If we support set selectivity hint for compound predicate."

Sounds like a good idea. 


"For 'a=1 /* +SELECTIVITY(0.1) */', this selectivity is definately belong to 'a=1'
For 'a=1 and (b=2 /* +SELECTIVITY(0.1) */)', this selectivity is definately belong to 'b=2'
For '(a=1 and b=2) /* +SELECTIVITY(0.1) */', this selectivity is definately belong to 'a=1 and b=2'"

Yeah, this is about right. 


"But for these cases:
1. For 'a=1 and b=2 /* +SELECTIVITY(0.1) */'
2. For '(a=1 and b=2 /* +SELECTIVITY(0.1) */)'
3. For '(a=1) and (b=2) /* +SELECTIVITY(0.1) */'
Selectivity should belong to which predicate? 'b=2' or 'a=1 and b=2'?"

Intuitively, I would think the hint should associate with b=2. Does the parser give the same result?


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 8
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Fri, 25 Mar 2022 13:36:28 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9853/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 2
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Tue, 30 Nov 2021 09:32:35 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................

IMPALA-7942 (part 2): Add query hints for predicate selectivities

Currently, Impala only uses simple estimation to compute selectivity
for some predicates, and this may lead to worse query plan due to CBO.
Hence, we add new hints to reduce such errors. Maybe in the future,
we can use histograms to get more precise query plan.

This patch adds another query hint, 'SELECTIVITY'. We can use this
hint to replace the original selectivity computation.

Format like this:

  select col from t where (a=1) /* +SELECTIVITY(0.5) */;

Besides, this hint is also valid for compound predicate like this:

  select col from t where (a=1 and b=2) /* +SELECTIVITY(0.5) */;

But note that to use the 'SELECTIVITY' hint for a predicate, we need
to wrap the predicate in parentheses, even for a single binary
predicate.

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
12 files changed, 334 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/10
-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 10
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 12:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12386/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 12
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 16 Feb 2023 06:43:10 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has uploaded a new patch set (#14). ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................

IMPALA-7942 (part 2): Add query hints for predicate selectivities

Currently, Impala only uses simple estimation to compute selectivity
for some predicates, and this may lead to worse query plan due to CBO.
Hence, we add new hints to reduce such errors. Maybe in the future,
we can use histograms to get more precise query plan.

This patch adds another query hint, 'SELECTIVITY'. We can use this
hint to replace the original selectivity computation.

Format like this:

  select col from t where (a=1) /* +SELECTIVITY(0.5) */;

Besides, this hint is also valid for compound predicate like this:

  select col from t where (a=1 and b=2) /* +SELECTIVITY(0.5) */;

But note that to use the 'SELECTIVITY' hint for a predicate, we need
to wrap the predicate in parentheses, even for a single binary
predicate.

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
12 files changed, 385 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/14
-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 14
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has uploaded a new patch set (#20). ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................

IMPALA-7942 (part 2): Add query hints for predicate selectivities

Currently, Impala only uses simple estimation to compute selectivity
for some predicates, and this may lead to a worse query plan from the
CBO. Hence, we add new hints to reduce such errors. Maybe in the
future, we can use histograms to get more precise query plans.

This patch adds another query hint, 'SELECTIVITY', which we can use to
override the original selectivity computation.

The format is:

  select col from t where (a=1) /* +SELECTIVITY(0.5) */;

Besides, this hint is also valid for compound predicate like this:

  select col from t where (a=1 and b=2) /* +SELECTIVITY(0.5) */;

Note that to use the 'SELECTIVITY' hint for a predicate, we need to
wrap the predicate in parentheses, even for a single binary
predicate.

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
12 files changed, 418 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/20
-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 20
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 27: Code-Review+1

(8 comments)

Looks great.  Just have some minor comments on the commit message.

http://gerrit.cloudera.org:8080/#/c/18023/27//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18023/27//COMMIT_MSG@9
PS27, Line 9: Currently, Impala only uses simple estimation to compute selectivity
nit. stop the sentence here (.)


http://gerrit.cloudera.org:8080/#/c/18023/27//COMMIT_MSG@10
PS27, Line 10: for some predicates, and this may lead
nit. 

For some predicates, this may lead to ...


http://gerrit.cloudera.org:8080/#/c/18023/27//COMMIT_MSG@11
PS27, Line 11: Hence, we add new hints to reduce such errors.
nit. The next sentence at line 13 mentions the specific work done in this patch. I think we can remove this sentence.


http://gerrit.cloudera.org:8080/#/c/18023/27//COMMIT_MSG@13
PS27, Line 13: s
nit. hint.


http://gerrit.cloudera.org:8080/#/c/18023/27//COMMIT_MSG@14
PS27, Line 14: original selectivity computing
nit. specify a selectivity value for a predicate.


http://gerrit.cloudera.org:8080/#/c/18023/27//COMMIT_MSG@17
PS27, Line 17: C-style /* comment */
nit. C-style comment /* <predicate hint> */


http://gerrit.cloudera.org:8080/#/c/18023/27//COMMIT_MSG@17
PS27, Line 17: currently
             : supports
nit. can be in the form of


http://gerrit.cloudera.org:8080/#/c/18023/27//COMMIT_MSG@25
PS27, Line 25: Predicate Example
nit. predicate example



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 27
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Sun, 02 Apr 2023 13:29:08 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 29:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18023/29/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/18023/29/fe/src/main/java/org/apache/impala/analysis/Expr.java@1064
PS29, Line 1064:     if (! predicateHintValid(this)) {
> Consider Expr.isTriviallyTrue() for example in the same file. That currentl
The concern makes sense to me. Expr#getConjuncts() is widely used. Changing its behavior might unintentionally impact other optimizations, e.g. the ExtractCommonConjunct rewrite or predicate pushdown. We need to carefully examine the related code paths.

Is this change only required by hints on compound predicates? If so, we can split this patch into two and merge the support for single predicate hints first.



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 29
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Tue, 11 Apr 2023 01:06:52 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 32:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12789/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 32
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 13 Apr 2023 14:04:17 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has uploaded a new patch set (#34). ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................

IMPALA-7942 (part 2): Add query hints for predicate selectivities

Currently, Impala only uses simple estimation to compute selectivity.
For some predicates, this may lead to a worse query plan from the CBO.

This patch adds a new query hint: 'SELECTIVITY' to help specify a
selectivity value for a predicate.

The parser will interpret expressions wrapped in () followed by a
C-style comment /* <predicate hint> */ as a predicate hint. The
predicate hint currently can be in the form of +SELECTIVITY(f) where
'f' is a positive floating point number, in the range of (0, 1], to
use as the selectivity for the preceding expression.
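
The rule described above can be illustrated with a minimal Java sketch.
It is not the parser code from this patch (which lives in
sql-parser.cup and sql-scanner.flex); the class name, method name, and
regular expression are hypothetical. The sketch only shows how the body
of a hint comment could be recognized and how the (0, 1] range check
could be applied.

```java
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not the parser code in the patch.
final class SelectivityHintSketch {
  // Matches a comment body such as " +SELECTIVITY(0.5) " (case-insensitive).
  private static final Pattern HINT = Pattern.compile(
      "^\\s*\\+SELECTIVITY\\s*\\(\\s*([0-9]*\\.?[0-9]+)\\s*\\)\\s*$",
      Pattern.CASE_INSENSITIVE);

  // Returns the hinted selectivity, or empty if the text is not a
  // SELECTIVITY hint or the value falls outside (0, 1].
  static OptionalDouble parse(String commentBody) {
    Matcher m = HINT.matcher(commentBody);
    if (!m.matches()) return OptionalDouble.empty();
    double f = Double.parseDouble(m.group(1));
    if (f > 0.0 && f <= 1.0) return OptionalDouble.of(f);
    return OptionalDouble.empty();
  }
}
```

In this sketch an out-of-range value is simply treated as "no hint";
whether the real implementation warns or raises an analysis error in
that case is not shown here.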

Single predicate example:

  select col from t where (a=1) /* +SELECTIVITY(0.5) */;

Compound predicate example:

  select col from t where (a=1 or b=2) /* +SELECTIVITY(0.5) */;

As a limitation of this patch, the selectivity hints for 'AND' compound
predicates, either in the original SQL query or internally generated,
are ignored. We may support this in the near future.

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
10 files changed, 414 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/34
-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 34
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 13:

(29 comments)

> Can you please provide feedback on all comments?
 > 
 > For example, for the  following comment, a feedback can be DONE, a
 > reply etc.
 > 
 > Commit Message
 > Line 14:
 > TABLE_NUM_ROWS?
 > 
 > The feedbacks are useful to judge the rework. Thanks!

OK, Qifan, thanks for the reply. I've already answered all comments.

http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/analysis/Predicate.java
File fe/src/main/java/org/apache/impala/analysis/Predicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/analysis/Predicate.java@78
PS9, Line 78: ti
> nit. should be a double value in (0, 1].
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/analysis/TableRef.java
File fe/src/main/java/org/apache/impala/analysis/TableRef.java:

http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/analysis/TableRef.java@547
PS9, Line 547:           return;
> The above hints are all limited to hdfs tables since they are related to hd
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/analysis/TableRef.java@554
PS9, Line 554:     }
> nit. for this HDFS table reference. On paper, one can assign different # ro
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/analysis/TableRef.java@555
PS9, Line 555: 
> It seems we need to handle NumberFormatException thrown from this method. F
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/planner/ScanNode.java
File fe/src/main/java/org/apache/impala/planner/ScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/planner/ScanNode.java@81
PS9, Line 81: H
> nit. "."
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/planner/ScanNode.java@82
PS9, Line 82: ng tableNumRowsHint_ = -1;
            : 
> I wonder if this is still correct. I thought the hint can be used to overri
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5053
PS9, Line 5053:  void testCreatePartitio
> This error may not be user friendly for super long SQL query.
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5056
PS9, Line 5056: "PARTITIONED BY SPEC (BUC
> same comment as above.
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5060
PS9, Line 5060:  "STORED AS ICEBERG" + tb
> same comment as above.
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5064
PS9, Line 5064:  int,
> nit. may add a qualifier "non-HDFS" before 'table' to make it clear.
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5071
PS9, Line 5071: "PARTITIONED BY SPEC (BU
> See the comment on TABLE_NUM_ROWS for "syntax error in line 1".
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5074
PS9, Line 5074:         "PARTITIONED BY SPEC (TRUNCATE(0, p1), DAY(p2)) STORED AS ICEBERG" +
> Isn't this supported now in patch set 9?
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5085
PS9, Line 5085:  Legal hint return correct 
> is this a mistake? 0.1 is a perfect selectivity value.
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5087
PS9, Line 5087: zesOk("select * from tp
> Same as above. 0 is okay.
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5093
PS9, Line 5093:  
> should be [0,1], right?
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5098
PS9, Line 5098: gal
> See the comment about. Should be allowed.
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5102
PS9, Line 5102: 
              :     // Multiple illegal hints wil
> From the updated parser, it seems this hint should be accepted in that the 
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/hdfs-cardinality-hint.test
File testdata/workloads/functional-planner/queries/PlannerTest/hdfs-cardinality-hint.test:

http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/hdfs-cardinality-hint.test@90
PS9, Line 90: 
> same argument as for selectivity hints. We probably do not need to test PAR
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
File testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test:

http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@2
PS9, Line 2: CAN HDFS' is 6001215
> Please add a comment after it to indicate the cardinality clause in the sca
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@70
PS9, Line 70: ty=3.0
> nit. Since almost 98% values are less than '1998-09-02', we set
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@95
PS9, Line 95: y, de
> nice. 
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@344
PS9, Line 344: 
> nit. Delete this sentence and add "Here we assume ...".
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@481
PS9, Line 481: 
> nit. predicate's
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@643
PS9, Line 643: 
> nice!
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@754
PS9, Line 754: 
> nit. Delete this sentence and add "Here we assume ...".
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@780
PS9, Line 780: 
> Can you double check? It seems to me 3M should be the right number. 
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@822
PS9, Line 822: 
> nit. A simple example to show that selectivity hint can help change join mo
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@826
PS9, Line 826: 
> nit. Impala can not compute selectivity and assumes the default value of 0.
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@827
PS9, Line 827: 
> nit. becomes
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 13
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Sat, 25 Feb 2023 04:03:55 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 2:

(6 comments)

Thanks for the code review. I've already modified the code. Sorry for the late reply; it was due to a bug that has already been fixed in IMPALA-11021.

http://gerrit.cloudera.org:8080/#/c/18023/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18023/1//COMMIT_MSG@21
PS1, Line 21: hint value only valid when table does not have stats or stats is
> nit: line should have 72 or fewer characters
Done


http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/java/org/apache/impala/analysis/Predicate.java
File fe/src/main/java/org/apache/impala/analysis/Predicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/java/org/apache/impala/analysis/Predicate.java@32
PS1, Line 32:   // The allowed values is [0,1], 1 means all records are eligible, 0 mean all records
> nit: Documenting the allowed values and what would 0 and 1 mean would help 
Done


http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/java/org/apache/impala/analysis/TableRef.java
File fe/src/main/java/org/apache/impala/analysis/TableRef.java:

http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/java/org/apache/impala/analysis/TableRef.java@526
PS1, Line 526:           return;
> nit: Warning here should tell the correct format of specifying the HDFS_NUM
Done


http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1556
PS1, Line 1556: 
> Shouldn't user provided hint be given higher preference ?
In my opinion, the table row count is precise unless the table has no stats or has corrupt stats. This is different from selectivity. For selectivity, Impala uses a very simple computation, which may lead to a worse plan even when the table has precise stats. So the selectivity hint has higher priority than the computed value, while the num_rows hint has lower priority than the stats. What do you think?
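
The priority described in this reply can be sketched in Java as
follows. This is only an illustration of the proposed ordering, not
code from HdfsScanNode; the class and method names are hypothetical,
and a negative stats value is used here as a stand-in for "no stats or
corrupt stats".

```java
// Hypothetical sketch; names are illustrative and not taken from HdfsScanNode.
final class NumRowsHintSketch {
  static final long UNKNOWN = -1;

  // Prefer table stats when they look usable; fall back to the hint otherwise.
  // statsNumRows < 0 models "no stats or corrupt stats" in this sketch.
  static long chooseNumRows(long statsNumRows, long hintNumRows) {
    if (statsNumRows >= 0) return statsNumRows;
    if (hintNumRows >= 0) return hintNumRows;
    return UNKNOWN;
  }
}
```

Giving the hint precedence over stats, as the reviewer suggested, would
amount to swapping the first two checks.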


http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@4971
PS1, Line 4971:   public void testPredicateHint() {
> What happens when we provide SELECTIVITY hint for Join predicates ?
Cool! I hadn't considered this situation before, so I added a check for single-column predicates. If a 'Predicate' is not a single-column predicate, Impala will print a warning message, and the hint is ignored for that predicate.


http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@4974
PS1, Line 4974:     AnalyzesOk("select * from tpch.lineitem where /* +ALWAYS_TRUE */ " +
> Can we include tests where SELECTIVITY is applied on expressions having sca
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 2
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Tue, 30 Nov 2021 09:12:30 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@49
PS3, Line 49: ' with two 'BinaryPredicate', if
            : set hint 
> Thanks for you deep research, Qifan. As you mentioned above:
"If we only handle a subset of complex expressions, this will confused users.". 

Not sure the argument holds :-) as currently the patch allows selectivity hints to be applied to a subset of predicates anyway.

On paper, suppose we have a tree representing a predicate, where non-leaf nodes are operators and leaf nodes are columns, values, etc. Let us say selectivity is a property of every node in the tree. Then a selectivity hint at node <n> just specifies that selectivity value and overrides the need to compute it at all from the subtree rooted at <n>. The selectivity for the entire predicate can be computed as before.

This view is a little bit different from the logic in the patch, and may serve as a model for computation with complex predicates.

Consider the predicate 'int_col > 1 and smallint_col > 3'. If there is a hint for the whole predicate, can we wrap the whole thing in CompoundPredicate()? If a hint is only available for int_col > 1, then we use the form of two BinaryPredicates, and rely on PlanNode. BinaryPredicate() to compute a final score. Note that we may not apply the backoff logic there for a conjunct with a hint.
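
The tree model above can be sketched in Java as follows. This is a toy
model for discussion, not Impala's Expr hierarchy: the PredNode class,
its factory methods, and the independence-based formulas for AND and OR
are assumptions made for the example, and the backoff logic mentioned
above is deliberately left out.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model for illustration; it does not mirror Impala's Expr classes.
final class PredNode {
  enum Op { LEAF, AND, OR }
  static final double UNKNOWN = -1.0;

  final Op op;
  final List<PredNode> children;
  final double estimate;  // LEAF only: the planner's own estimate, or UNKNOWN
  double hint = UNKNOWN;  // set when the user attaches a selectivity hint

  private PredNode(Op op, double estimate, PredNode... children) {
    this.op = op;
    this.estimate = estimate;
    this.children = Arrays.asList(children);
  }

  static PredNode leaf(double estimate) { return new PredNode(Op.LEAF, estimate); }
  static PredNode and(PredNode l, PredNode r) { return new PredNode(Op.AND, UNKNOWN, l, r); }
  static PredNode or(PredNode l, PredNode r) { return new PredNode(Op.OR, UNKNOWN, l, r); }
  PredNode withHint(double h) { this.hint = h; return this; }

  // A hint at this node short-circuits the subtree; otherwise the children
  // are combined assuming independence. An unknown child makes the result
  // unknown.
  double selectivity() {
    if (hint != UNKNOWN) return hint;
    if (op == Op.LEAF) return estimate;
    double l = children.get(0).selectivity();
    double r = children.get(1).selectivity();
    if (l == UNKNOWN || r == UNKNOWN) return UNKNOWN;
    return op == Op.AND ? l * r : l + r - l * r;
  }
}
```

For example, PredNode.and(PredNode.leaf(0.1), PredNode.leaf(PredNode.UNKNOWN)).withHint(0.5)
evaluates to the hinted 0.5 without visiting either child, whereas a
hint on only one child still leaves the whole AND unknown if the other
child cannot be estimated.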



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 7
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 15 Dec 2021 17:02:17 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................

IMPALA-7942: Add query hints for cardinalities and selectivities

Currently, Impala only uses simple estimation to compute selectivity
for some predicates, and this may lead to a worse query plan from the
CBO. Hence, we add new hints to reduce such errors. Maybe in the
future, we can use histograms to get more precise query plans.

This patch adds two query hints: 'HDFS_NUM_ROWS' and 'SELECTIVITY'.
We can add 'HDFS_NUM_ROWS' after an HDFS table in a query like this:

  select col from t /* +TABLE_NUM_ROWS(1000) */;

If set, Impala will use this value as the number of rows scanned from
the table, even if the table has stats.

The 'SELECTIVITY' hint can be used with these predicates:
* BinaryPredicate
* InPredicate
* IsNullPredicate
* LikePredicate, including 'not like' syntax
* BetweenPredicate, including 'not between and' syntax
The format is:

  select col from t where a=1 /* +SELECTIVITY(0.5) */;

This value replaces the original selectivity computation. These
formats are not allowed:

  select col from t where (a=1) /* +SELECTIVITY(0.5) */;
  select col from t where (a=1 and b<2) /* +SELECTIVITY(0.5) */;
  select col from t1 where exists (...) /* +SELECTIVITY(0.5) */;

Note that if you set a selectivity hint like this:

  select col from t where (a=1 /* +SELECTIVITY(0.5) */ and b>2);

Impala will set 0.5 for the first binary predicate, but the second is
-1, so Impala cannot compute this predicate and the selectivity of the
whole compound predicate is still unavailable. Hence, for a compound
predicate, we need to ensure that each child's selectivity has been
set by a hint or is computable. Otherwise, the hint might not take the
expected effect.
In addition, for a 'BetweenPredicate', Impala will transform the
predicate into a 'CompoundPredicate' with two 'BinaryPredicate'
children. If a hint is set for a 'BetweenPredicate' in the query, we
will split the hint value between the two 'BinaryPredicate' children.
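
The commit message does not say how the hint value is split. One
possibility, shown here purely as an assumption, is to give each child
the square root of the hint so that multiplying the two children
reproduces the hinted value. The class and method names are
hypothetical, and the [0, 1] range follows the comment quoted from
Predicate.java in patch set 1.

```java
// Hypothetical sketch: the square-root split is an assumption, not the patch's rule.
final class BetweenHintSplitSketch {
  // If "x BETWEEN a AND b" becomes "x >= a AND x <= b" and the two child
  // selectivities are multiplied, sqrt(hint) per child yields the hint again.
  static double perChildSelectivity(double betweenHint) {
    if (!(betweenHint >= 0.0 && betweenHint <= 1.0)) {
      throw new IllegalArgumentException("hint must be in [0, 1]");
    }
    return Math.sqrt(betweenHint);
  }
}
```

With a hint of 0.25, each child would get 0.5 under this assumption.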

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/hdfs-cardinality-hint.test
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
14 files changed, 1,635 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/5
-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 5
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@24
PS3, Line 24: * InPredicate
> Sorry, I don't understand, can you explain this?
I meant to say in primitive form.


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@49
PS3, Line 49: ' with two 'BinaryPredicate', if
            : set hint 
> Thanks for advice, Qifan. I also consider about this, but here are some pro
Thanks a lot for the research. It is great. 

It seems the data representation of the conjuncts when PlanNode.computeCombinedSelectivity() is called can be the following.

1. Conjuncts: as a list of primitive predicates. 

select * from functional.alltypes where int_col > 1 and smallint_col > 3                   
                                                                                              
2904 E1213 10:09:09.713835 2970698 PlanNode.java:648] 294617a1c2347b6c:c5b2d79400000000] computeCombinedSelectivity(): conjuncts=
BinaryPredicate{op=>, exprid=0 SlotRef{label=int_col, path=int_col, type=INT, id=4} NumericLiteral{value=1, type=INT}}
BinaryPredicate{op=>, exprid=1 SlotRef{label=smallint_col, path=smallint_col, type=SMALLINT, id=3} NumericLiteral{value=3, type=SMALLINT}}

2. Disjuncts: A list of single CompoundPredicate with nesting.

select * from functional.alltypes where                                                 
int_col > 1 or smallint_col > 3 or                                                         
int_col > 10 or smallint_col < 30                                                          
                                                                                           
CompoundPredicate{op=OR, exprid=0                                                          
  CompoundPredicate{op=OR,                                                                 
    CompoundPredicate{op=OR,                                                               
       BinaryPredicate{op=>, SlotRef{label=int_col, path=int_col, type=INT, id=4} NumericLiteral{value=1, type=INT}} 
       BinaryPredicate{op=>, SlotRef{label=smallint_col, path=smallint_col, type=SMALLINT, id=3} NumericLiteral{value=3, type=SMALLINT}}                                                                                    
   }                                                                                       
   BinaryPredicate{op=>, SlotRef{label=int_col, path=int_col, type=INT, id=4} NumericLiteral{value=10, type=INT}}
 }                                                                                         
 BinaryPredicate{op=<, SlotRef{label=smallint_col, path=smallint_col, type=SMALLINT, id=3} NumericLiteral{value=30, type=SMALLINT}}
}      

Since the selectivity hint is already represented at Predicate, maybe we could handle a subset of complex expressions where a single complex expression C is represented as a single item (as shown in case 2 above). In this case, we just return the value from the hint directly instead of performing the computation in PlanNode.computeCombinedSelectivity().

When we receive a list of conjunct predicates (as shown in case 1 above) with a hint, I do not have a good answer on mapping the list back to the hint. Maybe we can make use of CompoundPredicate() which can deal with AND by design?
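
A rough sketch of the "return the hint directly" idea for case 2, under the assumption that a hint is stored as a double and a negative value means "no hint". The names are hypothetical, and the fallback (a plain product of the computed estimates) is only a placeholder, not the logic of PlanNode.computeCombinedSelectivity():

```java
// Hypothetical illustration; not the actual Impala implementation.
final class CombinedSelectivitySketch {
  static final double NO_HINT = -1.0;

  // hints[i] is the hint attached to item i (NO_HINT when absent);
  // computed[i] is the planner's own estimate for item i.
  static double combine(double[] hints, double[] computed) {
    // Case 2: a single (possibly nested) predicate that carries a hint.
    // Return the hint and skip the computation entirely.
    if (hints.length == 1 && hints[0] >= 0) return hints[0];
    // Case 1 or no hint: placeholder fallback, a simple product of the
    // computed estimates. How to map a hint onto a list is left open.
    double result = 1.0;
    for (double s : computed) result *= s;
    return result;
  }
}
```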

On  'shift/reduce conflicts', I wonder how we deal with it for the case (a >1) /* +selectivity 0.4 */.


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@51
PS3, Line 51: value for two 'BinaryPredicate' children.
            : 
            : Testing:
            : - Added new fe tests in 'PlannerTest'
> I understand, this computing maybe not perfect. But if we set selectivity f
See my comment on complex predicates above.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 6
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Mon, 13 Dec 2021 15:59:54 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 10:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/12356/ : Initial code review checks failed. See linked job for details on the failure.


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 10
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Sat, 11 Feb 2023 09:15:15 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 13:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12444/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 13
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Sat, 25 Feb 2023 04:22:57 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 13: Code-Review+1

Thanks!


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 13
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Tue, 28 Feb 2023 14:32:35 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 18:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9144/ DRY_RUN=false


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 18
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 15 Mar 2023 02:43:37 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has uploaded a new patch set (#22). ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................

IMPALA-7942 (part 2): Add query hints for predicate selectivities

Currently, Impala only uses simple estimation to compute selectivity
for some predicates, and this may lead the CBO to a worse query plan.
Hence, we add new hints to reduce such errors. Maybe in the future,
we can use histograms to get more precise query plans.

This patch adds another query hint, 'SELECTIVITY'. We can use this
hint to replace the original selectivity computation.

The format is:

  select col from t where (a=1) /* +SELECTIVITY(0.5) */;

Besides, this hint is also valid for compound predicate like this:

  select col from t where (a=1 and b=2) /* +SELECTIVITY(0.5) */;

But note that to use the 'SELECTIVITY' hint for a predicate, we need
to wrap the predicate in parentheses, even for a single binary
predicate.

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
12 files changed, 427 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/22

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 22
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 21:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9169/ DRY_RUN=false


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 21
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 22 Mar 2023 15:36:58 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 15: Code-Review+2


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 15
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Mon, 06 Mar 2023 16:08:42 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 34: Verified+1


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 34
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 19 Apr 2023 18:17:02 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has uploaded a new patch set (#28). ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................

IMPALA-7942 (part 2): Add query hints for predicate selectivities

Currently, Impala only uses simple estimation to compute selectivity.
For some predicates, this may lead the CBO to a worse query plan.

This patch adds a new query hint, 'SELECTIVITY', which specifies a
selectivity value for a predicate.

The parser will interpret expressions wrapped in () followed by a
C-style comment /* <predicate hint> */ as a predicate hint. The
predicate hint currently can be in the form of +SELECTIVITY(f) where
'f' is a positive floating point number to use as the selectivity for
the preceding expression.
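
As a rough illustration of the hint text only, the sketch below extracts
'f' from a comment of the form /* +SELECTIVITY(f) */. It uses a regular
expression as a stand-in for the real CUP/JFlex grammar. The class name,
the case-insensitive match and the upper bound of 1.0 are assumptions
made for this example:

```java
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; the real parsing is done by the SQL grammar.
final class SelectivityHintSketch {
  // Matches a whole comment such as "/* +SELECTIVITY(0.5) */".
  private static final Pattern HINT = Pattern.compile(
      "/\\*\\s*\\+SELECTIVITY\\(\\s*([0-9]*\\.?[0-9]+)\\s*\\)\\s*\\*/",
      Pattern.CASE_INSENSITIVE);

  // Returns the hinted selectivity, or empty if the comment is not a
  // valid hint.
  static OptionalDouble parse(String comment) {
    Matcher m = HINT.matcher(comment.trim());
    if (!m.matches()) return OptionalDouble.empty();
    double f = Double.parseDouble(m.group(1));
    // 'f' must be positive; capping it at 1.0 is an assumption.
    if (f <= 0.0 || f > 1.0) return OptionalDouble.empty();
    return OptionalDouble.of(f);
  }
}
```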

Single predicate example:

  select col from t where (a=1) /* +SELECTIVITY(0.5) */;

Compound predicate example:

  select col from t where (a=1 and b=2) /* +SELECTIVITY(0.5) */;

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
12 files changed, 446 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/28

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 28
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 28:

(8 comments)

Thanks for review, Qifan.

http://gerrit.cloudera.org:8080/#/c/18023/27//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18023/27//COMMIT_MSG@9
PS27, Line 9: Currently, Impala only uses simple estimation to compute selectivity.
> nit. stop the sentence here (.)
Done


http://gerrit.cloudera.org:8080/#/c/18023/27//COMMIT_MSG@10
PS27, Line 10: For some predicates, this may lead to 
> nit. 
Done


http://gerrit.cloudera.org:8080/#/c/18023/27//COMMIT_MSG@11
PS27, Line 11: 
> nit. The next sentence at line 13 mentions the specific work done in this p
Done


http://gerrit.cloudera.org:8080/#/c/18023/27//COMMIT_MSG@13
PS27, Line 13: l
> nit. hint.
Done


http://gerrit.cloudera.org:8080/#/c/18023/27//COMMIT_MSG@14
PS27, Line 14: 
> nit. specify a selectivity value for a predicate.
Done


http://gerrit.cloudera.org:8080/#/c/18023/27//COMMIT_MSG@17
PS27, Line 17: predicate hint curren
> nit. C-style comment /* <predicate hint> */
Done


http://gerrit.cloudera.org:8080/#/c/18023/27//COMMIT_MSG@17
PS27, Line 17:  where
             : 'f' is a
> nit. can be in the form of
Done


http://gerrit.cloudera.org:8080/#/c/18023/27//COMMIT_MSG@25
PS27, Line 25: predicate example
> nit. predicate example
Done



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 28
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Mon, 03 Apr 2023 07:39:03 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 24:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12715/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 24
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 30 Mar 2023 09:16:43 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 3:

(16 comments)

The idea behind this patch provides great value and I like it a lot.

My concern with this patch is that the hint can only be applied to simple predicates. I wonder if we can allow hints on complex predicates, such as

select * from t where (t.a > 1 or t.a < 1000) /* +selectivity (0.4) */

http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@9
PS3, Line 9: use
nit uses


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@10
PS3, Line 10: , other predicates
nit. remove?


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@11
PS3, Line 11: set these stats
            : manually in query to help us get better CBO.
nit. to reduce such errors.


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@20
PS3, Line 20: But
nit remove


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@21
PS3, Line 21: hint value only valid when table does not have stats or stats is
nit is valid


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@22
PS3, Line 22: corrupt. Otherwise, Impala will use table original stats.
I think we should raise a warning when the hint is not used.


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@24
PS3, Line 24: :
in non-compounding form.


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@24
PS3, Line 24: For 'SELECTIVITY' hint, we can use in these predicates:
nit. types of


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@49
PS3, Line 49: maybe does not take effect as you
            : expected.
> nit: might not take the expected effect
I wonder why we can not assign a selectivity to a complex predicate in this patch, as it is usually difficult to figure such selectivity out.


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@51
PS3, Line 51: Another thing, for 'BetweenPredicate', Impala will transfom this
            : predicate to a 'CompoundPredicate' with two 'BinaryPredicate', if
            : set hint for 'BetweenPredicate' in query, we will split this hint
            : value for two 'BinaryPredicate' children.
I wonder if this is correct. 

The selectivity at the BETWEEN level has nothing to do with the selectivity at the two transformed binary predicates.

If we support selectivity for complex predicate, we do not have to "push" the selectivity down.


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@248
PS3, Line 248: selectivity_ > 0
nit. This prevents a 0 selectivity. Maybe use -2 to indicate non-computable selectivity?
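
A small sketch of the point above, assuming the field defaults to a negative sentinel when no selectivity has been set. The names are hypothetical:

```java
// Hypothetical illustration; not the actual Impala implementation.
final class SelectivitySentinelSketch {
  // Negative sentinel meaning "no selectivity has been set".
  static final double NOT_SET = -1.0;

  // Too strict: a legitimate selectivity of exactly 0 is rejected.
  static boolean isSetStrict(double selectivity) {
    return selectivity > 0;
  }

  // Accepts 0, because "not set" is a distinct negative value.
  static boolean isSet(double selectivity) {
    return selectivity >= 0;
  }
}
```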


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/InPredicate.java
File fe/src/main/java/org/apache/impala/analysis/InPredicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/InPredicate.java@177
PS3, Line 177: elect
same comment as for BinaryPredicate.java


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
File fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java@143
PS3, Line 143: electivity
same comment as for BinaryPredicate.java


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/TableRef.java
File fe/src/main/java/org/apache/impala/analysis/TableRef.java:

http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/TableRef.java@175
PS3, Line 175: hdfsNumRowsHint_
I wonder if we can use a more generic name, numRowsHint_, as the hint can be applied to other types of tables equally well.

We can apply the hint to HDFS tables only in this patch, and a check is already in place in this file:

    // BaseTableRef will always have their path resolved at this point.
    Preconditions.checkState(getResolvedPath() != null);
    if (getResolvedPath().destTable() != null &&
        !(getResolvedPath().destTable() instanceof FeFsTable)) {
      analyzer.addWarning("Table hints only supported for Hdfs tables");
    }


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/TableRef.java@543
PS3, Line 543: // This hint is only valid for hdfs table
nit. This comment probably should be moved to line 547.


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@331
PS3, Line 331:  private long hdfsNumRowsHint_ = -1;
I think we probably should move it to ScanNode with the name numRowsHints_.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 3
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 08 Dec 2021 17:44:16 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9908/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 4
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Sat, 11 Dec 2021 09:24:06 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 7:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9929/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 7
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 15 Dec 2021 08:12:08 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 9:

(7 comments)

This patch looks pretty good to me! I hope we can reduce the size of the test files by not showing the verbose plan. They are a little too large for me to go through all the test cases.

http://gerrit.cloudera.org:8080/#/c/18023/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18023/9//COMMIT_MSG@14
PS9, Line 14: HDFS_NUM_ROWS
TABLE_NUM_ROWS?


http://gerrit.cloudera.org:8080/#/c/18023/9//COMMIT_MSG@15
PS9, Line 15: HDFS_NUM_ROWS
TABLE_NUM_ROWS?


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/cup/sql-parser.cup@3772
PS9, Line 3772:     if (p instanceof Predicate) {
              :       if (hint != null) {
nit: "if (p instanceof Predicate && hint != null)"


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/analysis/TableRef.java
File fe/src/main/java/org/apache/impala/analysis/TableRef.java:

http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/analysis/TableRef.java@547
PS9, Line 547:       } else if (hint.is("TABLE_NUM_ROWS")) {
The above hints are all limited to hdfs tables since they are related to hdfs file/replica. But it's not clear to me why we can't support this hint for kudu tables.


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5074
PS9, Line 5074:     AnalysisError("select * from t1 where (a>1 and b<1) /* +SELECTIVITY(0.1) */",
Isn't this supported now in patch set 9?


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@1294
PS9, Line 1294:     options.setExplain_level(TExplainLevel.VERBOSE);
Why do we need verbose plan? Is there anything else we want to check except cardinality?
I don't see this in similar tests: https://github.com/apache/impala/blob/c0b0875bda59771fb1b5c55a5eaf45f3dcfaa63c/fe/src/test/java/org/apache/impala/planner/PlannerTest.java#L72-L73


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@1295
PS9, Line 1295:     runPlannerTestFile("hdfs-cardinality-hint", options);
I think we need PlannerTestOption.VALIDATE_CARDINALITY here.



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 9
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Tue, 02 Aug 2022 02:56:15 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 8:

> "Besides, we can not modify like this directly yet:
 > | predicate:p selectivity_hint:h
 > {:
 > if (hint != null) {
 > p.setSelectivityHint(Double.valueOf(hint.getArgs().get(0)));
 > }
 > RESULT = p;
 > :}
 > This is because 'predicate' is an 'Expr' rather than a 'Predicate',
 > and the 'setSelectivityHint' method belongs to 'Predicate'. Here is
 > the declaration:
 > nonterminal Expr predicate, bool_test_expr;"
 > 
 > I think we should verify p to be a Predicate first.
 > 
 > If there exists a hint {
 >   if (p instanceof Predicate) {
 >     ((Predicate) p).setSelectivityHint(Double.valueOf(hint.getArgs().get(0)));
 >   } else {
 >     // throw an error
 >   }
 > }

Maybe I can try this. Here is another question I want to discuss with you: what if we support setting a selectivity hint for a compound predicate?

For 'a=1 /* +SELECTIVITY(0.1) */', the selectivity definitely belongs to 'a=1'.
For 'a=1 and (b=2 /* +SELECTIVITY(0.1) */)', the selectivity definitely belongs to 'b=2'.
For '(a=1 and b=2) /* +SELECTIVITY(0.1) */', the selectivity definitely belongs to 'a=1 and b=2'.

But for these cases:
1. For 'a=1 and b=2 /* +SELECTIVITY(0.1) */'
2. For '(a=1 and b=2 /* +SELECTIVITY(0.1) */)'
3. For '(a=1) and (b=2) /* +SELECTIVITY(0.1) */'
Which predicate should the selectivity belong to: 'b=2' or 'a=1 and b=2'?
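
For reference, the type check suggested in the quoted comment could look roughly like the sketch below. The classes Expr, Predicate and HintAttacher here are hypothetical stand-ins written only for illustration, not the real Impala classes, and the -1 "no hint" sentinel is an assumption taken from the original commit message.

```java
// Hypothetical stand-ins for the analysis classes; not Impala's real API.
class Expr {}

class Predicate extends Expr {
  // -1 means "no hint set" (assumed sentinel for this sketch).
  private double selectivityHint = -1;

  void setSelectivityHint(double value) { selectivityHint = value; }

  double getSelectivityHint() { return selectivityHint; }
}

class HintAttacher {
  // Attaches the hint argument to 'p' only if 'p' is a Predicate.
  // 'hintArg' is the raw string argument of the hint, or null if no hint was given.
  static Expr attach(Expr p, String hintArg) {
    if (hintArg == null) return p;  // nothing to attach
    if (!(p instanceof Predicate)) {
      // Corresponds to the "throw an error" branch in the quoted pseudocode.
      throw new IllegalArgumentException("SELECTIVITY hint requires a predicate");
    }
    ((Predicate) p).setSelectivityHint(Double.parseDouble(hintArg));
    return p;
  }
}
```

With this shape, a hint on a non-predicate expression is rejected instead of being silently dropped. Which expression 'p' is bound to in the three ambiguous cases is still a grammar decision that the sketch does not answer.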


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 8
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Fri, 25 Mar 2022 13:16:28 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has uploaded a new patch set (#13). ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................

IMPALA-7942 (part 2): Add query hints for predicate selectivities

Currently, Impala only uses simple estimation to compute selectivity
for some predicates, and this may lead to worse query plan due to CBO.
Hence, we add new hints to reduce such errors. Maybe in the future,
we can use histograms to get more precise query plan.

This patch adds another query hint, 'SELECTIVITY', which we can use to
replace the original selectivity computation.

Format like this:

  select col from t where (a=1) /* +SELECTIVITY(0.5) */;

Besides, this hint is also valid for compound predicate like this:

  select col from t where (a=1 and b=2) /* +SELECTIVITY(0.5) */;

But pay attention: to use the 'SELECTIVITY' hint for a predicate,
we need to wrap the predicate in brackets, even for a single binary
predicate.

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
12 files changed, 375 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/13
-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 13
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 13:

It seems that this patch causes some test cases to fail. I will fix them as soon as possible!


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 13
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 01 Mar 2023 02:18:16 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has uploaded a new patch set (#17). ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................

IMPALA-7942 (part 2): Add query hints for predicate selectivities

Currently, Impala only uses simple estimation to compute selectivity
for some predicates, and this may lead to worse query plan due to CBO.
Hence, we add new hints to reduce such errors. Maybe in the future,
we can use histograms to get more precise query plan.

This patch adds another query hint, 'SELECTIVITY', which we can use to
replace the original selectivity computation.

Format like this:

  select col from t where (a=1) /* +SELECTIVITY(0.5) */;

Besides, this hint is also valid for compound predicate like this:

  select col from t where (a=1 and b=2) /* +SELECTIVITY(0.5) */;

But pay attention: to use the 'SELECTIVITY' hint for a predicate,
we need to wrap the predicate in brackets, even for a single binary
predicate.

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
12 files changed, 415 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/17
-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 17
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has uploaded a new patch set (#16). ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................

IMPALA-7942 (part 2): Add query hints for predicate selectivities

Currently, Impala only uses simple estimation to compute selectivity
for some predicates, and this may lead to worse query plan due to CBO.
Hence, we add new hints to reduce such errors. Maybe in the future,
we can use histograms to get more precise query plan.

This patch adds another query hint, 'SELECTIVITY', which we can use to
replace the original selectivity computation.

Format like this:

  select col from t where (a=1) /* +SELECTIVITY(0.5) */;

Besides, this hint is also valid for compound predicate like this:

  select col from t where (a=1 and b=2) /* +SELECTIVITY(0.5) */;

But pay attention: to use the 'SELECTIVITY' hint for a predicate,
we need to wrap the predicate in brackets, even for a single binary
predicate.

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
12 files changed, 415 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/16
-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 16
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 29:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9205/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 29
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 06 Apr 2023 07:50:06 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Kurt Deschler (Code Review)" <ge...@cloudera.org>.
Kurt Deschler has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 29:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18023/29/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/18023/29/fe/src/main/java/org/apache/impala/analysis/Expr.java@1064
PS29, Line 1064:     if (! predicateHintValid(this)) {
> Hi Kurt, I think you are right. I will discuss with other people to find a 
Back to the earlier suggestion: it's OK to add a boolean flag here, or else a new function to collect expressions. Just make separate calls for applying hints, and don't change what is logically collected in other cases. If you need to propagate anything through the expression hierarchy, a visitor function would be more appropriate, as a collector will collect into a flat vector.
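
To illustrate the difference between the two styles, here is a minimal sketch on a toy expression tree. Node and NodeVisitor are hypothetical names invented for this example and do not correspond to Impala's actual Expr API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal expression tree; names are illustrative only.
class Node {
  final String label;
  final List<Node> children = new ArrayList<>();
  double selectivityHint = -1;  // -1 means no hint (assumed sentinel)

  Node(String label, Node... kids) {
    this.label = label;
    for (Node k : kids) children.add(k);
  }

  // Collector style: gathers hinted nodes into a flat list.
  // The parent/child relationship of the collected nodes is lost.
  void collectHinted(List<Node> out) {
    if (selectivityHint >= 0) out.add(this);
    for (Node c : children) c.collectHinted(out);
  }

  // Visitor style: the callback sees every node in pre-order together with
  // its depth, so information can be carried through the hierarchy.
  void accept(NodeVisitor v, int depth) {
    v.visit(this, depth);
    for (Node c : children) c.accept(v, depth + 1);
  }
}

interface NodeVisitor {
  void visit(Node n, int depth);
}
```

The collector only answers "which nodes match", while the visitor can also see where in the tree each node sits.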



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 29
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 12 Apr 2023 14:31:20 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 33: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 33
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 13 Apr 2023 19:05:09 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 23: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 23
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 23 Mar 2023 21:00:18 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Xiang Yang (Code Review)" <ge...@cloudera.org>.
Xiang Yang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 23:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/18023/23//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18023/23//COMMIT_MSG@26
PS23, Line 26: braket
nit: brackets


http://gerrit.cloudera.org:8080/#/c/18023/23/fe/src/main/java/org/apache/impala/analysis/Predicate.java
File fe/src/main/java/org/apache/impala/analysis/Predicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/23/fe/src/main/java/org/apache/impala/analysis/Predicate.java@38
PS23, Line 38:   protected double selectivityHint_;
             :   // true if the selectivity hint is set to a valid value.
             :   protected boolean hasValidSelectivityHint_;
In my view, the redundant variable 'hasValidSelectivityHint_' is not needed to cache the result for performance reasons; I suggest replacing it with a function that computes the result on demand.
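
A minimal sketch of what that could look like is below. HintedPredicate is a hypothetical stand-in, not the actual Predicate class, and both the -1 sentinel and the accepted range (0, 1] are assumptions made for this example.

```java
// Hypothetical stand-in for the fields under discussion; not the actual Predicate class.
class HintedPredicate {
  // Selectivity supplied by the hint; -1 when no hint was given (assumed sentinel).
  private double selectivityHint = -1;

  void setSelectivityHint(double value) { selectivityHint = value; }

  // Computed on demand from the stored value instead of cached in a second field,
  // so the two can never get out of sync.
  // The accepted range (0, 1] is an assumption for this sketch.
  boolean hasValidSelectivityHint() {
    return selectivityHint > 0 && selectivityHint <= 1;
  }
}
```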


http://gerrit.cloudera.org:8080/#/c/18023/23/fe/src/main/java/org/apache/impala/analysis/Predicate.java@183
PS23, Line 183:   public boolean selectivityValidHintSet() {
nit: rename to hasValidSelectivityHint() ?


http://gerrit.cloudera.org:8080/#/c/18023/21/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/18023/21/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5144
PS21, Line 5144: >
nit: add a space on both sides of the '>'. Same for the following ones.


http://gerrit.cloudera.org:8080/#/c/18023/21/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5181
PS21, Line 5181: >
nit: same as above.


http://gerrit.cloudera.org:8080/#/c/18023/21/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
File testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test:

http://gerrit.cloudera.org:8080/#/c/18023/21/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@161
PS21, Line 161: ====
Do we need a test case for a compound 'OR' predicate without a hint?



-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 23
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Sun, 26 Mar 2023 12:05:43 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has uploaded a new patch set (#26). ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................

IMPALA-7942 (part 2): Add query hints for predicate selectivities

Currently, Impala only uses simple estimation to compute selectivity
for some predicates, and this may lead to worse query plan due to CBO.
Hence, we add new hints to reduce such errors.

This patch adds a new query hint, 'SELECTIVITY', which we can use to
replace the original selectivity computation.

The parser will interpret expressions wrapped in () followed by a
C-style /* comment */ as a predicate hint. The predicate hint currently
supports +SELECTIVITY(f) where 'f' is a positive floating point number
to use as the selectivity for the preceding expression.

Single predicate example:

  select col from t where (a=1) /* +SELECTIVITY(0.5) */;

Compound Predicate Example:

  select col from t where (a=1 and b=2) /* +SELECTIVITY(0.5) */;

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
12 files changed, 446 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/26
-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 26
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has removed Yifan Zhang from this change.  ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Removed reviewer Yifan Zhang.
-- 
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 25
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Kurt Deschler (Code Review)" <ge...@cloudera.org>.
Kurt Deschler has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 28:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18023/28/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/18023/28/fe/src/main/java/org/apache/impala/analysis/Expr.java@1065
PS28, Line 1065:     if (allowConjunctsFromChild(this)) {
This still does not address the concern that getConjuncts() is a generic method for collecting conjuncts, and its behavior should not change based on a hint. It is probably best to add a new function that returns only the local conjuncts, and to call it from the caller with the appropriate hint-checking logic inline, so that the context and intent are clear.
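
One possible shape for this suggestion, as a rough sketch only. All names here (SketchExpr, SketchAnd, SketchLeaf, SketchCaller, getLocalConjuncts, selectivityHint) are hypothetical stand-ins and not Impala's actual classes or API, and "local conjuncts" is assumed to mean the expression itself, without splitting it into its children. The point is that getConjuncts() stays hint-agnostic and the hint check sits in the caller:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for an expression node; not Impala's real Expr class.
abstract class SketchExpr {
  // Hypothetical hint field: a value in [0, 1], or -1 when no hint was given.
  double selectivityHint = -1;

  boolean hasSelectivityHint() { return selectivityHint >= 0; }

  // Generic collection: always flattens nested ANDs and never looks at hints.
  List<SketchExpr> getConjuncts() {
    List<SketchExpr> out = new ArrayList<>();
    flatten(this, out);
    return out;
  }

  // New helper (assumed semantics): returns only this expression,
  // without descending into its children.
  List<SketchExpr> getLocalConjuncts() {
    return Collections.singletonList(this);
  }

  private static void flatten(SketchExpr e, List<SketchExpr> out) {
    if (e instanceof SketchAnd) {
      SketchAnd and = (SketchAnd) e;
      flatten(and.left, out);
      flatten(and.right, out);
    } else {
      out.add(e);
    }
  }
}

// Hypothetical leaf predicate such as "a = 1".
final class SketchLeaf extends SketchExpr {
  final String text;
  SketchLeaf(String text) { this.text = text; }
}

// Hypothetical 'AND' compound predicate.
final class SketchAnd extends SketchExpr {
  final SketchExpr left;
  final SketchExpr right;
  SketchAnd(SketchExpr left, SketchExpr right) {
    this.left = left;
    this.right = right;
  }
}

// The caller owns the hint check, so the intent is visible at the call site.
final class SketchCaller {
  static List<SketchExpr> conjunctsFor(SketchExpr e) {
    if (e.hasSelectivityHint()) {
      // Keep a hinted predicate whole so that its hint is not lost.
      return e.getLocalConjuncts();
    }
    return e.getConjuncts();
  }
}
```

With this split, any other caller of getConjuncts() keeps the flattening behavior it had before, and only the call sites that care about hints opt into the hint-aware path.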



-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 28
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 05 Apr 2023 14:11:21 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Kurt Deschler (Code Review)" <ge...@cloudera.org>.
Kurt Deschler has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 28:

Please wait to merge this until the concerns are resolved.


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 28
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 05 Apr 2023 14:12:18 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "wangsheng (Code Review)" <ge...@cloudera.org>.
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 34:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/18023/33//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18023/33//COMMIT_MSG@12
PS33, Line 12:  to help specify a
             : sele
> nit. remove and replace with "to help specify"
Done


http://gerrit.cloudera.org:8080/#/c/18023/33//COMMIT_MSG@18
PS33, Line 18: ,
> , in the range of [0, 1],
Done


http://gerrit.cloudera.org:8080/#/c/18023/33//COMMIT_MSG@29
PS33, Line 29: As a limitation of this path, the selectivity hints for 'AND' compound
             : predicates, either in the original SQL query or
> As a limitation of this path, the selectivity hints for 'AND' compound pred
Done


http://gerrit.cloudera.org:8080/#/c/18023/33//COMMIT_MSG@31
PS33, Line 31: are ignored. We may supported this in the near future.
> May file a new JIRA and mention the JIRA number here.
I haven't created a new JIRA yet, so no JIRA number is provided here.


http://gerrit.cloudera.org:8080/#/c/18023/33/fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
File fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/33/fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java@153
PS33, Line 153:         // 'AND' compound predicates will be replaced by children in Expr#getConjuncts,
> nit be
Done


http://gerrit.cloudera.org:8080/#/c/18023/33/fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java@154
PS33, Line 154:         // so selectivity hint will be missing, we add a warning here.
> nit be
Done


http://gerrit.cloudera.org:8080/#/c/18023/33/fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java@155
PS33, Line 155:         analyzer.addWarning("Selectivity hints are ignored for 'AND' compound "
> There is one case in CaseExpr.java where the translation to AND predicate t
Done



-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 34
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 19 Apr 2023 13:02:31 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 31:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9220/ DRY_RUN=false


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 31
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 12 Apr 2023 16:01:29 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 31: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/9220/


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 31
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Wed, 12 Apr 2023 21:14:32 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate selectivities
......................................................................


Patch Set 33:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9224/ DRY_RUN=false


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 33
Gerrit-Owner: wangsheng <sk...@163.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: wangsheng <sk...@163.com>
Gerrit-Comment-Date: Thu, 13 Apr 2023 13:45:14 +0000
Gerrit-HasComments: No