Posted to reviews@impala.apache.org by "liuyao (Code Review)" <ge...@cloudera.org> on 2021/06/25 09:35:58 UTC

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

liuyao has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17637 )


Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................

IMPALA-10766: Better selectivity for =,not distinct

For = :
If the right side is null, then selectivity is 0.
If the left side is null, null should be excluded when calculating
selectivity.

For is not distinct from :
If the right side is null, non null should be excluded when calculating
selectivity, and only null should be included.
If the left side is null and the right side is not null, null should be
excluded when calculating selectivity, including part of non-null.

Testing :
Change the UT, modify the selectivity calculation error, add two new
cases column != null and column = null
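
The rules in the commit message above can be sketched in a few lines of Java. This is only an illustration, not the code in BinaryPredicate.java: the class SelectivitySketch, its Op enum and the estimate() method are hypothetical names, and the sketch assumes that the left side is a column with known statistics (ndv, numRows, numNulls), that the right side is a literal, and that non-null rows are spread evenly over the distinct values.

```java
/** Hypothetical sketch of the null-aware rules; not Impala's implementation. */
final class SelectivitySketch {
  enum Op { EQ, NOT_DISTINCT }

  /**
   * Estimates the fraction of rows for which "col <op> literal" is true.
   *
   * @param ndv           distinct non-null values in the column (assumed > 0)
   * @param numRows       total number of rows (assumed > 0)
   * @param numNulls      number of rows where the column is NULL
   * @param literalIsNull whether the right side is the NULL literal
   */
  static double estimate(Op op, long ndv, long numRows, long numNulls,
      boolean literalIsNull) {
    double nullFraction = (double) numNulls / numRows;
    double nonNullFraction = 1.0 - nullFraction;
    if (literalIsNull) {
      // "col = NULL" is never true, so nothing is selected.
      // "col IS NOT DISTINCT FROM NULL" matches exactly the NULL rows.
      return op == Op.EQ ? 0.0 : nullFraction;
    }
    // Non-null literal: NULL rows cannot match for either operator, so only
    // the non-null rows are divided among the ndv distinct values.
    return nonNullFraction / ndv;
  }
}
```

With ndv = 10, numRows = 1000 and numNulls = 200, the sketch gives roughly 0.08 for "col = 5", 0 for "col = NULL" and 0.2 for "col IS NOT DISTINCT FROM NULL".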

Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
---
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/card-inner-join.test
M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
6 files changed, 70 insertions(+), 55 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/17637/1
-- 
To view, visit http://gerrit.cloudera.org:8080/17637
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 1
Gerrit-Owner: liuyao <li...@sensorsdata.cn>

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "liuyao (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17637

to look at the new patch set (#4).

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................

IMPALA-10766: Better selectivity for =,not distinct

For = :
If the right side is null, then selectivity is 0.
If the left side is null, null should be excluded when calculating
selectivity.

For is not distinct from :
If the right side is null, non null should be excluded when calculating
selectivity, and only null should be included.
If the left side is null and the right side is not null, null should be
excluded when calculating selectivity, including part of non-null.

Testing :
Change the UT, modify the selectivity calculation error, add two new
cases column != null and column = null

Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
---
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/card-inner-join.test
M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q13.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q16.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q19.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24a.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24b.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q30.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q33.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q42.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q44.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q48.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q52.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q55.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q56.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q60.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q61.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q66.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q71.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q75.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q80.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q81.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q84.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q85.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q91.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q94.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q95.test
32 files changed, 1,198 insertions(+), 1,185 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/17637/4
-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 4
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "liuyao (Code Review)" <ge...@cloudera.org>.
liuyao has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17637/5/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

http://gerrit.cloudera.org:8080/#/c/17637/5/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@250
PS5, Line 250: rChildIsNull
> nit. Do we need to worry about left child is null here, or not since the le
Considering this special situation, the selectivity will be more accurate.


http://gerrit.cloudera.org:8080/#/c/17637/5/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@280
PS5, Line 280: // For =, !=, "is distinct from null" and "is not distinct from non-null",
             :           // all null values are false.
             :           selectivity_ *= (double) (numRows - numNulls) / numRows;
> nit. may combine them into a single sentence:
Done


http://gerrit.cloudera.org:8080/#/c/17637/5/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@284
PS5, Line 284: stinct from null, only null values are
> Another case here is "is distinct from not-null".
Done



-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 6
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: liuyao <li...@sensorsdata.cn>
Gerrit-Comment-Date: Mon, 12 Jul 2021 08:40:21 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7267/


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 2
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Fri, 25 Jun 2021 16:02:29 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "liuyao (Code Review)" <ge...@cloudera.org>.
Hello Aman Sinha, Qifan Chen, Zoltan Borok-Nagy, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17637

to look at the new patch set (#5).

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................

IMPALA-10766: Better selectivity for =,not distinct

For = :
If the right side is null, then selectivity is 0.
If the left side is null, null should be excluded when calculating
selectivity.

For is not distinct from :
If the right side is null, non null should be excluded when calculating
selectivity, and only null should be included.
If the left side is null and the right side is not null, null should be
excluded when calculating selectivity, including part of non-null.

Testing :
Change the UT, modify the selectivity calculation error, add two new
cases column != null and column = null

Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
---
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/card-inner-join.test
M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q13.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q16.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q19.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24a.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24b.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q30.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q33.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q42.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q44.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q48.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q52.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q55.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q56.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q60.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q61.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q66.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q71.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q75.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q80.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q81.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q84.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q85.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q91.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q94.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q95.test
32 files changed, 1,198 insertions(+), 1,185 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/17637/5
-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 5
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7272/ DRY_RUN=true


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 4
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 29 Jun 2021 03:07:26 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 7: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17637/6/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

http://gerrit.cloudera.org:8080/#/c/17637/6/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@288
PS6, Line 288:  selectivity_ = selectivity_ * (double) (numRows - numNulls) / numRows +
             :               numNulls / numRows;
> This calculation formula may be (1- 1/ndv)* (numRows - numNulls) / numRows 
Done



-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 7
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: liuyao <li...@sensorsdata.cn>
Gerrit-Comment-Date: Tue, 13 Jul 2021 13:26:51 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17637/6/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

http://gerrit.cloudera.org:8080/#/c/17637/6/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@288
PS6, Line 288:  selectivity_ = selectivity_ * (double) (numRows - numNulls) / numRows +
             :               numNulls / numRows;
nit. In this case, can we use selectivity = #nulls / #rows directly?
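
A small Java sketch may help compare the quoted expression with the suggested nulls / rows form. It assumes, hypothetically, that numRows and numNulls are long values; the explicit (double) cast in the quoted line hints at integer operands, but the declarations are not shown in this thread. The class NullFractionSketch and its method names are invented for the illustration. Under that assumption the trailing numNulls / numRows term is integer division and contributes 0 whenever numNulls < numRows.

```java
/** Hypothetical comparison of the two forms; types are assumed, not confirmed. */
final class NullFractionSketch {
  // Same shape as the quoted PS6 expression. The first term is computed in
  // double because of the cast; the second term divides two longs.
  static double quotedForm(double selectivity, long numRows, long numNulls) {
    return selectivity * (double) (numRows - numNulls) / numRows
        + numNulls / numRows;
  }

  // The direct form from the review comment: the fraction of NULL rows,
  // with the cast applied before the division.
  static double direct(long numRows, long numNulls) {
    return (double) numNulls / numRows;
  }
}
```

For selectivity = 0.1, numRows = 1000 and numNulls = 200, quotedForm() yields about 0.08 because the null term truncates to 0, while direct() yields 0.2, the share of rows that are NULL.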



-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 6
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: liuyao <li...@sensorsdata.cn>
Gerrit-Comment-Date: Mon, 12 Jul 2021 17:28:16 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9006/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 1
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Fri, 25 Jun 2021 09:58:49 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................

IMPALA-10766: Better selectivity for =,not distinct

For = :
If the right side is null, then selectivity is 0.
If the left side is null, null should be excluded when calculating
selectivity.

For is not distinct from :
If the right side is null, non null should be excluded when calculating
selectivity, and only null should be included.
If the left side is null and the right side is not null, null should be
excluded when calculating selectivity, including part of non-null.

Testing :
Change the UT, modify the selectivity calculation error, add two new
cases column != null and column = null

Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Reviewed-on: http://gerrit.cloudera.org:8080/17637
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Reviewed-by: Qifan Chen <qc...@cloudera.com>
---
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/card-inner-join.test
M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q13.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q16.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q19.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24a.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24b.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q30.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q33.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q42.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q44.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q48.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q52.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q55.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q56.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q60.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q61.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q66.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q71.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q75.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q80.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q81.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q84.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q85.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q91.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q94.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q95.test
32 files changed, 1,203 insertions(+), 1,186 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Qifan Chen: Looks good to me, approved

-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 8
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: liuyao <li...@sensorsdata.cn>

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 6: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7285/


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 6
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: liuyao <li...@sensorsdata.cn>
Gerrit-Comment-Date: Mon, 12 Jul 2021 14:46:39 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9018/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 3
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 28 Jun 2021 08:02:09 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7285/ DRY_RUN=true


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 6
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: liuyao <li...@sensorsdata.cn>
Gerrit-Comment-Date: Mon, 12 Jul 2021 08:43:31 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7267/ DRY_RUN=true


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 2
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Fri, 25 Jun 2021 09:59:51 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9021/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 4
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 29 Jun 2021 03:24:39 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 5:

(3 comments)

Looks very good. Thanks a lot for the follow-up work!

http://gerrit.cloudera.org:8080/#/c/17637/5/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

http://gerrit.cloudera.org:8080/#/c/17637/5/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@250
PS5, Line 250: rChildIsNull
nit. Do we need to worry about left child is null here, or not since the left child is always a column reference?


http://gerrit.cloudera.org:8080/#/c/17637/5/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@280
PS5, Line 280: // For = and !=, all null values are false
             :           // For is distinct from null, all null values are false
             :           // For is not distinct from non-null, all null values are false
nit. may combine them into a single sentence:

For =, !=, "is distinct from null" and "is not distinct from non-null", all null values must be excluded.


http://gerrit.cloudera.org:8080/#/c/17637/5/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@284
PS5, Line 284: Operator.NOT_DISTINCT && rChildIsNull)
Another case here is "is distinct from not-null".



-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 5
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: liuyao <li...@sensorsdata.cn>
Gerrit-Comment-Date: Fri, 09 Jul 2021 17:55:28 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 5:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9061/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 5
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: liuyao <li...@sensorsdata.cn>
Gerrit-Comment-Date: Fri, 09 Jul 2021 10:10:24 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 6:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9072/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 6
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: liuyao <li...@sensorsdata.cn>
Gerrit-Comment-Date: Mon, 12 Jul 2021 09:01:27 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 7: Verified+1


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 7
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: liuyao <li...@sensorsdata.cn>
Gerrit-Comment-Date: Tue, 13 Jul 2021 08:43:23 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "liuyao (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17637

to look at the new patch set (#3).

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................

IMPALA-10766: Better selectivity for =,not distinct

For = :
If the right side is null, then selectivity is 0.
If the left side is null, null should be excluded when calculating
selectivity.

For is not distinct from :
If the right side is null, non null should be excluded when calculating
selectivity, and only null should be included.
If the left side is null and the right side is not null, null should be
excluded when calculating selectivity, including part of non-null.

Tesing :
Change the UT, modify the selectivity calculation error, add two new
cases column != null and column = null

Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
---
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/card-inner-join.test
M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q13.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q16.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q19.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24a.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24b.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q30.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q33.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q42.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q44.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q48.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q52.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q55.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q56.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q60.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q61.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q66.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q71.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q75.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q80.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q81.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q84.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q85.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q91.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q94.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q95.test
32 files changed, 1,188 insertions(+), 1,173 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/17637/3
-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 3
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7271/ DRY_RUN=true


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 3
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 28 Jun 2021 07:40:35 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 4: Verified+1


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 4
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 29 Jun 2021 09:04:40 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7294/ DRY_RUN=false


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 7
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: liuyao <li...@sensorsdata.cn>
Gerrit-Comment-Date: Tue, 13 Jul 2021 13:35:22 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Zoltan Borok-Nagy (Code Review)" <ge...@cloudera.org>.
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 5: Code-Review+1


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 5
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: liuyao <li...@sensorsdata.cn>
Gerrit-Comment-Date: Fri, 09 Jul 2021 12:33:27 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "liuyao (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17637

to look at the new patch set (#2).

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................

IMPALA-10766: Better selectivity for =,not distinct

For = :
If the right side is null, then selectivity is 0.
If the left side is null, null should be excluded when calculating
selectivity.

For is not distinct from :
If the right side is null, non null should be excluded when calculating
selectivity, and only null should be included.
If the left side is null and the right side is not null, null should be
excluded when calculating selectivity, including part of non-null.

Tesing :
Change the UT, modify the selectivity calculation error, add two new
cases column != null and column = null

Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
---
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/card-inner-join.test
M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
6 files changed, 70 insertions(+), 55 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/17637/2
-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 2
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 3: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7271/


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 3
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 28 Jun 2021 13:42:25 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "liuyao (Code Review)" <ge...@cloudera.org>.
liuyao has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17637/6/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

http://gerrit.cloudera.org:8080/#/c/17637/6/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@288
PS6, Line 288:  selectivity_ = selectivity_ * (double) (numRows - numNulls) / numRows +
             :               numNulls / numRows;
> nit. In this case, can we use selectivity = #nulls / #rows directly?
The formula may need to be (1 - 1/ndv) * (numRows - numNulls) / numRows + numNulls / numRows, because some non-null values also satisfy "is distinct from not-null", e.g. 'aa' is distinct from 'bb'.
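
As a rough sketch of how the formula in this reply fits with the other cases from the commit message, the standalone Java below estimates selectivity for a predicate of the form "column op literal". It is not the code in BinaryPredicate.java: the class SelectivitySketch, the Op enum and the estimate() signature are hypothetical, it assumes ndv counts only non-null distinct values, and the (1 - 1/ndv) factor for != is an assumption carried over from the "is distinct from" formula. Fractions are computed in double on purpose, since in Java an expression like numNulls / numRows truncates to an integer if both operands are integral types.

```java
// Hypothetical illustration class; not part of Impala.
class SelectivitySketch {
  enum Op { EQ, NE, DISTINCT_FROM, NOT_DISTINCT }

  // Returns an estimated selectivity in [0, 1], or -1 when stats are unusable.
  static double estimate(Op op, boolean literalIsNull, long numRows, long numNulls, long ndv) {
    if (numRows <= 0 || ndv <= 0 || numNulls < 0 || numNulls > numRows) return -1;
    double nullFrac = (double) numNulls / numRows;  // fraction of null rows
    double nonNullFrac = 1.0 - nullFrac;            // fraction of non-null rows
    double matchFrac = 1.0 / ndv;                   // chance a non-null row equals the literal
    switch (op) {
      case EQ:
        // "= null" is never TRUE; otherwise only non-null rows can match.
        return literalIsNull ? 0.0 : matchFrac * nonNullFrac;
      case NE:
        // "!= null" is never TRUE; otherwise only non-null rows can differ.
        return literalIsNull ? 0.0 : (1.0 - matchFrac) * nonNullFrac;
      case NOT_DISTINCT:
        // Against null only null rows pass; against a value only matching non-null rows pass.
        return literalIsNull ? nullFrac : matchFrac * nonNullFrac;
      case DISTINCT_FROM:
        // Against null all non-null rows pass; against a value, differing
        // non-null rows pass and every null row passes as well.
        return literalIsNull ? nonNullFrac : (1.0 - matchFrac) * nonNullFrac + nullFrac;
      default:
        throw new IllegalArgumentException("unexpected operator: " + op);
    }
  }

  public static void main(String[] args) {
    // Made-up stats: 4 rows, 2 nulls, 2 distinct non-null values.
    System.out.println(estimate(Op.DISTINCT_FROM, false, 4, 2, 2));
    System.out.println(estimate(Op.NOT_DISTINCT, true, 4, 2, 2));
    System.out.println(estimate(Op.EQ, true, 4, 2, 2));
  }
}
```

With these made-up stats, "is distinct from not-null" comes out larger than numNulls / numRows alone, which is the point of the reply: the non-null rows that differ from the literal also count.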



-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 7
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: liuyao <li...@sensorsdata.cn>
Gerrit-Comment-Date: Tue, 13 Jul 2021 02:36:56 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "liuyao (Code Review)" <ge...@cloudera.org>.
liuyao has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17637/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17637/4//COMMIT_MSG@20
PS4, Line 20: Testin
> Testing
Done


http://gerrit.cloudera.org:8080/#/c/17637/4/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

http://gerrit.cloudera.org:8080/#/c/17637/4/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@253
PS4, Line 253: !
> !=
Done


http://gerrit.cloudera.org:8080/#/c/17637/4/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@283
PS4, Line 283:  (numRows - numNulls) / numRows;
> nit: I think it's worth extracting it into a variable, just like numRows
Done



-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 5
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: liuyao <li...@sensorsdata.cn>
Gerrit-Comment-Date: Fri, 09 Jul 2021 09:52:41 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17637/1/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

http://gerrit.cloudera.org:8080/#/c/17637/1/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@270
PS1, Line 270:     
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17637/1/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@276
PS1, Line 276:         if (op_ == Operator.EQ || op_ == Operator.NE 
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17637/1/fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java:

http://gerrit.cloudera.org:8080/#/c/17637/1/fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java@268
PS1, Line 268:     
line has trailing whitespace



-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 1
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Fri, 25 Jun 2021 09:36:42 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9007/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 2
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Fri, 25 Jun 2021 10:04:55 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Zoltan Borok-Nagy (Code Review)" <ge...@cloudera.org>.
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 4: Code-Review+1

(3 comments)

Found a few nits, but otherwise LGTM.

http://gerrit.cloudera.org:8080/#/c/17637/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17637/4//COMMIT_MSG@20
PS4, Line 20: Tesing
Testing


http://gerrit.cloudera.org:8080/#/c/17637/4/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

http://gerrit.cloudera.org:8080/#/c/17637/4/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@253
PS4, Line 253: =
!=


http://gerrit.cloudera.org:8080/#/c/17637/4/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@283
PS4, Line 283: slotDesc.getStats().getNumNulls()
nit: I think it's worth extracting it into a variable, just like numRows



-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 4
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Fri, 09 Jul 2021 08:23:52 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "liuyao (Code Review)" <ge...@cloudera.org>.
Hello Aman Sinha, Qifan Chen, Zoltan Borok-Nagy, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17637

to look at the new patch set (#6).

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................

IMPALA-10766: Better selectivity for =,not distinct

For = :
If the right side is null, then selectivity is 0.
If the left side is null, null should be excluded when calculating
selectivity.

For is not distinct from :
If the right side is null, non null should be excluded when calculating
selectivity, and only null should be included.
If the left side is null and the right side is not null, null should be
excluded when calculating selectivity, including part of non-null.

Testing :
Change the UT, modify the selectivity calculation error, add two new
cases column != null and column = null

Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
---
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/card-inner-join.test
M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q13.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q16.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q19.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24a.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24b.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q30.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q33.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q42.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q44.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q48.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q52.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q55.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q56.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q60.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q61.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q66.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q71.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q75.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q80.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q81.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q84.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q85.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q91.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q94.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q95.test
32 files changed, 1,203 insertions(+), 1,186 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/17637/6
-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 6
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: liuyao <li...@sensorsdata.cn>

[Impala-ASF-CR] IMPALA-10766: Better selectivity for =,not distinct

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )

Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7292/ DRY_RUN=true


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 7
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: liuyao <li...@sensorsdata.cn>
Gerrit-Comment-Date: Tue, 13 Jul 2021 02:37:30 +0000
Gerrit-HasComments: No