You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Xianqing He (Code Review)" <ge...@cloudera.org> on 2021/01/03 13:38:30 UTC

[Impala-ASF-CR] IMPALA-9356: fix the predicate with empty tuple id invalid pushed down

Xianqing He has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16915


Change subject: IMPALA-9356: fix the predicate with empty tuple id invalid pushed down
......................................................................

IMPALA-9356: fix the predicate with empty tuple id invalid pushed down

Before the conjuncts with empty tuple id will migrate to the first
child of the joins. But for full outer join or right outer joins,
it may produce the wrong result.

E.g. if 'rand()=2' migrates to the first of full join will still
produce rows.

For full outer join and right outer join, the predicate with empty
tuple id at least migrate to left child, right child to make sure
the results are correct.

Tests:
  * Add planner tests

Change-Id: I50ae71ef31ff138adeeb57008a01eb02e7bb976f
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
3 files changed, 238 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/16915/1
-- 
To view, visit http://gerrit.cloudera.org:8080/16915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I50ae71ef31ff138adeeb57008a01eb02e7bb976f
Gerrit-Change-Number: 16915
Gerrit-PatchSet: 1
Gerrit-Owner: Xianqing He <he...@126.com>

[Impala-ASF-CR] IMPALA-9356: fix the predicate with empty tuple id invalid pushed down

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16915 )

Change subject: IMPALA-9356: fix the predicate with empty tuple id invalid pushed down
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/7929/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/16915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I50ae71ef31ff138adeeb57008a01eb02e7bb976f
Gerrit-Change-Number: 16915
Gerrit-PatchSet: 2
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Sun, 03 Jan 2021 14:17:25 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9356: fix the predicate with empty tuple id invalid pushed down

Posted by "Xianqing He (Code Review)" <ge...@cloudera.org>.
Xianqing He has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/16915 )

Change subject: IMPALA-9356: fix the predicate with empty tuple id invalid pushed down
......................................................................

IMPALA-9356: fix the predicate with empty tuple id invalid pushed down

Before the conjuncts with empty tuple id will migrate to the first
child of the joins. But for full outer join or right outer joins,
it may produce the wrong result.

E.g. if 'rand()=2' migrates to the first of full join will still
produce rows.

For full outer join and right outer join, the predicate with empty
tuple id at least migrate to left child, right child to make sure
the results are correct.

Tests:
  * Add planner tests

Change-Id: I50ae71ef31ff138adeeb57008a01eb02e7bb976f
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
3 files changed, 244 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/16915/2
-- 
To view, visit http://gerrit.cloudera.org:8080/16915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I50ae71ef31ff138adeeb57008a01eb02e7bb976f
Gerrit-Change-Number: 16915
Gerrit-PatchSet: 2
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-9356: fix the predicate with empty tuple id invalid pushed down

Posted by "Aman Sinha (Code Review)" <ge...@cloudera.org>.
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16915 )

Change subject: IMPALA-9356: fix the predicate with empty tuple id invalid pushed down
......................................................................


Patch Set 2:

(7 comments)

Thanks for the patch ! I have been meaning to review it sooner. I left some comments about the plan correctness in the JIRA. It might be easier to discuss it there.

http://gerrit.cloudera.org:8080/#/c/16915/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16915/2//COMMIT_MSG@7
PS2, Line 7: IMPALA-9356: fix the predicate with empty tuple id invalid pushed down
The 'empty tuple id' is an internal detail of the implementation.  Since this issue can occur for any non-deterministic predicate such as rand(),  can we call it 'Fix the predicate pushdown for non-deterministic predicates for outer joins'


http://gerrit.cloudera.org:8080/#/c/16915/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/16915/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2655
PS2, Line 2655: unssigned
typo: unssigned-> unassigned


http://gerrit.cloudera.org:8080/#/c/16915/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2655
PS2, Line 2655: can reassigned
nit: 'can be'


http://gerrit.cloudera.org:8080/#/c/16915/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2655
PS2, Line 2655: node
nit: 'nodes'


http://gerrit.cloudera.org:8080/#/c/16915/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/16915/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1767
PS2, Line 1767:     if (analyzer.isOuterJoined(tid)) {
Could you move this check into a separate Analyzer method ?


http://gerrit.cloudera.org:8080/#/c/16915/2/testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
File testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test:

http://gerrit.cloudera.org:8080/#/c/16915/2/testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test@1608
PS2, Line 1608: |     predicates: rand() = 2
I think for non-deterministic functions such as rand(), it would be wrong to push to both the scans in case of the Full OJ.  I think it should only be evaluated by the Hash Join above.  I left some comments in the JIRA instead of re-typing it here.


http://gerrit.cloudera.org:8080/#/c/16915/2/testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test@1744
PS2, Line 1744: |     predicates: rand() = 2
For ROJ or even for LOJ, for reasons mentioned above,  it would be wrong to push to both sides of the join.  Either we push only to 1 side (the non-null producing side) or keep it at the outer join node as other predicate.  Specifically for this query, since ROJ will produce non-null rows from the right child, we should only push to the right child which is the 00:SCAN HDFS node.



-- 
To view, visit http://gerrit.cloudera.org:8080/16915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I50ae71ef31ff138adeeb57008a01eb02e7bb976f
Gerrit-Change-Number: 16915
Gerrit-PatchSet: 2
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Sat, 20 Feb 2021 23:51:30 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9356: fix the predicate with empty tuple id invalid pushed down

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16915 )

Change subject: IMPALA-9356: fix the predicate with empty tuple id invalid pushed down
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/7928/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/16915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I50ae71ef31ff138adeeb57008a01eb02e7bb976f
Gerrit-Change-Number: 16915
Gerrit-PatchSet: 1
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Sun, 03 Jan 2021 14:00:12 +0000
Gerrit-HasComments: No