Posted to reviews@impala.apache.org by "Anonymous Coward (Code Review)" <ge...@cloudera.org> on 2022/07/20 07:29:10 UTC

[Impala-ASF-CR] IMPALA-11446: Push-down NOT IN predicate to iceberg

lipenglin@sensorsdata.cn has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18760 )


Change subject: IMPALA-11446: Push-down NOT_IN predicate to iceberg
......................................................................

IMPALA-11446: Push-down NOT_IN predicate to iceberg

The column value bounds stored in the Iceberg metadata are not
necessarily an actual min or max value, so NOT_IN cannot be answered
using them: NOT_IN(col, {X, ...}) with bounds (X, Y) doesn't guarantee
that X is a value in col. However, push-down does work when the column
is a partition column, so it is still very helpful.

Testing:
  - add e2e tests

Change-Id: Ib8bdaf6f31a4438e11c4eb27485bb413fe6df9a3
---
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-in-predicate-push-down.test
2 files changed, 31 insertions(+), 6 deletions(-)
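
The reasoning in the commit message can be illustrated with a minimal sketch. This is not the code from IcebergScanNode.java; the class and method names are hypothetical, and it assumes an identity-partitioned column where every row in a file shares one partition value.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; none of these names come from Impala or Iceberg.
class NotInPruningSketch {

    // Decide from column bounds alone whether a file may contain rows that
    // satisfy "col NOT IN excluded". Bounds only bracket the values: they
    // are not guaranteed to be values of col, and they say nothing about
    // what lies between them. This sketch therefore never prunes on bounds.
    static boolean mightMatchByBounds(int lower, int upper, Set<Integer> excluded) {
        return true;
    }

    // Assumption: identity partitioning, so every row in the file carries
    // exactly partitionValue. The file can be skipped when that value is
    // in the excluded set.
    static boolean mightMatchByPartition(int partitionValue, Set<Integer> excluded) {
        return !excluded.contains(partitionValue);
    }

    public static void main(String[] args) {
        Set<Integer> excluded = new HashSet<>(Arrays.asList(1, 5));
        // Bounds (1, 5): the file may still hold only rows with value 3.
        System.out.println(mightMatchByBounds(1, 5, excluded));  // true
        // Partition value 1 is excluded, so the whole file can be skipped.
        System.out.println(mightMatchByPartition(1, excluded));  // false
        System.out.println(mightMatchByPartition(3, excluded));  // true
    }
}
```

In this model a file is only ever skipped on the partition value, which is known exactly, and never on bounds.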



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/18760/2
-- 
To view, visit http://gerrit.cloudera.org:8080/18760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib8bdaf6f31a4438e11c4eb27485bb413fe6df9a3
Gerrit-Change-Number: 18760
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward <li...@sensorsdata.cn>

[Impala-ASF-CR] IMPALA-11446: Push-down NOT IN predicate to iceberg

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18760 )

Change subject: IMPALA-11446: Push-down NOT_IN predicate to iceberg
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11018/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8bdaf6f31a4438e11c4eb27485bb413fe6df9a3
Gerrit-Change-Number: 18760
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward <li...@sensorsdata.cn>
Gerrit-Reviewer: Anonymous Coward <li...@sensorsdata.cn>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Comment-Date: Fri, 22 Jul 2022 10:06:34 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11446: Push-down NOT IN predicate to iceberg

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18760 )

Change subject: IMPALA-11446: Push-down NOT_IN predicate to iceberg
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/10994/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8bdaf6f31a4438e11c4eb27485bb413fe6df9a3
Gerrit-Change-Number: 18760
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward <li...@sensorsdata.cn>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 20 Jul 2022 07:50:13 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11446: Push-down NOT IN predicate to iceberg

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/18760 )

Change subject: IMPALA-11446: Push-down NOT_IN predicate to iceberg
......................................................................

IMPALA-11446: Push-down NOT_IN predicate to iceberg

The column value bounds stored in the Iceberg metadata are not
necessarily an actual min or max value, so NOT_IN cannot be answered
using them: NOT_IN(col, {X, ...}) with bounds (X, Y) doesn't guarantee
that X is a value in col. However, push-down does work when the column
is a partition column, so it is still very helpful.

Testing:
  - add e2e tests

Change-Id: Ib8bdaf6f31a4438e11c4eb27485bb413fe6df9a3
Reviewed-on: http://gerrit.cloudera.org:8080/18760
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-in-predicate-push-down.test
2 files changed, 34 insertions(+), 6 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ib8bdaf6f31a4438e11c4eb27485bb413fe6df9a3
Gerrit-Change-Number: 18760
Gerrit-PatchSet: 5
Gerrit-Owner: Anonymous Coward <li...@sensorsdata.cn>
Gerrit-Reviewer: Anonymous Coward <li...@sensorsdata.cn>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>

[Impala-ASF-CR] IMPALA-11446: Push-down NOT IN predicate to iceberg

Posted by "Tamas Mate (Code Review)" <ge...@cloudera.org>.
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/18760 )

Change subject: IMPALA-11446: Push-down NOT_IN predicate to iceberg
......................................................................


Patch Set 2:

Thanks for working on this!

I think it is great to push down the 'not in' predicate for partition columns; however, for non-partition columns I believe it would be better to throw an AnalysisException. Otherwise the user won’t know that push-down of the specified predicate is silently skipped for those columns.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8bdaf6f31a4438e11c4eb27485bb413fe6df9a3
Gerrit-Change-Number: 18760
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward <li...@sensorsdata.cn>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Comment-Date: Fri, 22 Jul 2022 08:54:55 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11446: Push-down NOT IN predicate to iceberg

Posted by "Gergely Fürnstáhl (Code Review)" <ge...@cloudera.org>.
Gergely Fürnstáhl has posted comments on this change. ( http://gerrit.cloudera.org:8080/18760 )

Change subject: IMPALA-11446: Push-down NOT_IN predicate to iceberg
......................................................................


Patch Set 3: Code-Review+1

Thanks for working on this, LGTM!



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8bdaf6f31a4438e11c4eb27485bb413fe6df9a3
Gerrit-Change-Number: 18760
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward <li...@sensorsdata.cn>
Gerrit-Reviewer: Anonymous Coward <li...@sensorsdata.cn>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Comment-Date: Mon, 25 Jul 2022 14:59:40 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11446: Push-down NOT IN predicate to iceberg

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
lipenglin@sensorsdata.cn has posted comments on this change. ( http://gerrit.cloudera.org:8080/18760 )

Change subject: IMPALA-11446: Push-down NOT_IN predicate to iceberg
......................................................................


Patch Set 3:

That's right, users will be really confused, as I was, about why NOT_IN on non-partition columns does not work. But if we throw an exception, users won't be able to use NOT_IN on non-partition columns in the WHERE clause at all. I think we need to let others know about this in the Impala-Iceberg documentation. I'll also add a comment about this in the code!
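
To make the trade-off concrete, here is a small hypothetical model (not Impala code; all names are made up for illustration). It assumes that file pruning is purely an optimization and that the scan re-applies the predicate to every row, so a NOT_IN on a non-partition column still returns correct results and merely reads more files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of planning plus scanning; not Impala code.
class NotInFallbackSketch {

    static final class DataFile {
        final int partitionValue; // shared by all rows in the file
        final int[] values;       // per-row values of a non-partition column

        DataFile(int partitionValue, int... values) {
            this.partitionValue = partitionValue;
            this.values = values;
        }
    }

    // Planning: prune files only when the predicate is on the partition
    // column; otherwise keep every file instead of throwing.
    static List<DataFile> planFiles(List<DataFile> files, boolean onPartitionColumn,
                                    Set<Integer> excluded) {
        if (!onPartitionColumn) return files;
        List<DataFile> kept = new ArrayList<>();
        for (DataFile f : files) {
            if (!excluded.contains(f.partitionValue)) kept.add(f);
        }
        return kept;
    }

    // Scanning: the predicate is always evaluated row by row, so the
    // result does not depend on how many files were pruned.
    static int countMatchingRows(List<DataFile> planned, boolean onPartitionColumn,
                                 Set<Integer> excluded) {
        int rows = 0;
        for (DataFile f : planned) {
            for (int v : f.values) {
                int candidate = onPartitionColumn ? f.partitionValue : v;
                if (!excluded.contains(candidate)) rows++;
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<DataFile> files = Arrays.asList(new DataFile(1, 10, 20), new DataFile(2, 10, 30));
        Set<Integer> notInPartition = new HashSet<>(Arrays.asList(1));
        Set<Integer> notInValue = new HashSet<>(Arrays.asList(10));
        // Partition column: one file pruned at planning time.
        System.out.println(planFiles(files, true, notInPartition).size());
        // Non-partition column: no pruning, rows filtered during the scan.
        System.out.println(planFiles(files, false, notInValue).size());
        System.out.println(countMatchingRows(files, false, notInValue));
    }
}
```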



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8bdaf6f31a4438e11c4eb27485bb413fe6df9a3
Gerrit-Change-Number: 18760
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward <li...@sensorsdata.cn>
Gerrit-Reviewer: Anonymous Coward <li...@sensorsdata.cn>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Comment-Date: Fri, 22 Jul 2022 09:45:57 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11446: Push-down NOT IN predicate to iceberg

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18760 )

Change subject: IMPALA-11446: Push-down NOT_IN predicate to iceberg
......................................................................


Patch Set 4: Verified+1



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8bdaf6f31a4438e11c4eb27485bb413fe6df9a3
Gerrit-Change-Number: 18760
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward <li...@sensorsdata.cn>
Gerrit-Reviewer: Anonymous Coward <li...@sensorsdata.cn>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Comment-Date: Tue, 26 Jul 2022 18:11:24 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11446: Push-down NOT IN predicate to iceberg

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18760 )

Change subject: IMPALA-11446: Push-down NOT_IN predicate to iceberg
......................................................................


Patch Set 4: Code-Review+2



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8bdaf6f31a4438e11c4eb27485bb413fe6df9a3
Gerrit-Change-Number: 18760
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward <li...@sensorsdata.cn>
Gerrit-Reviewer: Anonymous Coward <li...@sensorsdata.cn>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Comment-Date: Tue, 26 Jul 2022 13:19:35 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11446: Push-down NOT IN predicate to iceberg

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18760 )

Change subject: IMPALA-11446: Push-down NOT_IN predicate to iceberg
......................................................................


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8366/ DRY_RUN=false



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8bdaf6f31a4438e11c4eb27485bb413fe6df9a3
Gerrit-Change-Number: 18760
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward <li...@sensorsdata.cn>
Gerrit-Reviewer: Anonymous Coward <li...@sensorsdata.cn>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Comment-Date: Tue, 26 Jul 2022 13:19:36 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11446: Push-down NOT IN predicate to iceberg

Posted by "Tamas Mate (Code Review)" <ge...@cloudera.org>.
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/18760 )

Change subject: IMPALA-11446: Push-down NOT_IN predicate to iceberg
......................................................................


Patch Set 3: Code-Review+1

Indeed, makes sense. Thanks for adding an extra clarification comment! The change LGTM!



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8bdaf6f31a4438e11c4eb27485bb413fe6df9a3
Gerrit-Change-Number: 18760
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward <li...@sensorsdata.cn>
Gerrit-Reviewer: Anonymous Coward <li...@sensorsdata.cn>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Comment-Date: Fri, 22 Jul 2022 10:46:28 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11446: Push-down NOT IN predicate to iceberg

Posted by "Tamas Mate (Code Review)" <ge...@cloudera.org>.
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/18760 )

Change subject: IMPALA-11446: Push-down NOT_IN predicate to iceberg
......................................................................


Patch Set 3: Code-Review+2



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8bdaf6f31a4438e11c4eb27485bb413fe6df9a3
Gerrit-Change-Number: 18760
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward <li...@sensorsdata.cn>
Gerrit-Reviewer: Anonymous Coward <li...@sensorsdata.cn>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Comment-Date: Tue, 26 Jul 2022 13:19:06 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11446: Push-down NOT IN predicate to iceberg

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
lipenglin@sensorsdata.cn has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/18760 )

Change subject: IMPALA-11446: Push-down NOT_IN predicate to iceberg
......................................................................

IMPALA-11446: Push-down NOT_IN predicate to iceberg

The column value bounds stored in the Iceberg metadata are not
necessarily an actual min or max value, so NOT_IN cannot be answered
using them: NOT_IN(col, {X, ...}) with bounds (X, Y) doesn't guarantee
that X is a value in col. However, push-down does work when the column
is a partition column, so it is still very helpful.

Testing:
  - add e2e tests

Change-Id: Ib8bdaf6f31a4438e11c4eb27485bb413fe6df9a3
---
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-in-predicate-push-down.test
2 files changed, 34 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/18760/3

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib8bdaf6f31a4438e11c4eb27485bb413fe6df9a3
Gerrit-Change-Number: 18760
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward <li...@sensorsdata.cn>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>