You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Aman Sinha (Code Review)" <ge...@cloudera.org> on 2020/04/09 01:32:13 UTC

[Impala-ASF-CR] IMPALA-9620: Ensure group-by and cnf exprs are analyzed

Aman Sinha has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15693


Change subject: IMPALA-9620: Ensure group-by and cnf exprs are analyzed
......................................................................

IMPALA-9620: Ensure group-by and cnf exprs are analyzed

This change initializes the SelectStmt's groupingExprs_
with the analyzed version. It also analyzes the new predicates
created by the Conjunctive Normal Form rewrite rule such that
potential consumers of this rewrite don't encounter problems.

Before this change, the SelectStmt.analyzeGroupingExprs() made
a deep copy of the original grouping exprs, then analyzed the
copy but left the original intact. This causes problems because
a rewrite rule (invoked by SelectStmt.rewriteExprs()) may try to
process the original grouping exprs and encounter INVALID_TYPE
(types are only assigned after analyze). This was the root cause
of the problem described in the JIRA. Although this was a pre-
existing behavior, it gets exposed when enable_cnf_rewrites=true.
Note that the deep-copied analyzed grouping exprs are supplied
to MultiAggregateInfo and since many operations are using
this data structure, we don't see widespread issues.

This patch fixes it and as a conservative measure, does the
analyze of new predicates in the CNF rule. (note: there are likely
other rewrite rules where explicit analyze should be done but
that is outside the scope for this issue).

Testing:
 - Added new unit tests with predicates in SELECT and GROUP BY
 - Ran 'mvn test' for the FE

Change-Id: I6da4a17c6e648f466ce118c4646520ff68f9878e
---
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java
M testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test
3 files changed, 53 insertions(+), 6 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/15693/1
-- 
To view, visit http://gerrit.cloudera.org:8080/15693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6da4a17c6e648f466ce118c4646520ff68f9878e
Gerrit-Change-Number: 15693
Gerrit-PatchSet: 1
Gerrit-Owner: Aman Sinha <am...@cloudera.com>

[Impala-ASF-CR] IMPALA-9620: Ensure group-by and cnf exprs are analyzed

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15693 )

Change subject: IMPALA-9620: Ensure group-by and cnf exprs are analyzed
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/5766/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/15693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6da4a17c6e648f466ce118c4646520ff68f9878e
Gerrit-Change-Number: 15693
Gerrit-PatchSet: 1
Gerrit-Owner: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 09 Apr 2020 02:14:40 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9620: Ensure group-by and cnf exprs are analyzed

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15693 )

Change subject: IMPALA-9620: Ensure group-by and cnf exprs are analyzed
......................................................................

IMPALA-9620: Ensure group-by and cnf exprs are analyzed

This change initializes the SelectStmt's groupingExprs_
with the analyzed version. It also analyzes the new predicates
created by the Conjunctive Normal Form rewrite rule such that
potential consumers of this rewrite don't encounter problems.

Before this change, the SelectStmt.analyzeGroupingExprs() made
a deep copy of the original grouping exprs, then analyzed the
copy but left the original intact. This causes problems because
a rewrite rule (invoked by SelectStmt.rewriteExprs()) may try to
process the original grouping exprs and encounter INVALID_TYPE
(types are only assigned after analyze). This was the root cause
of the problem described in the JIRA. Although this was a pre-
existing behavior, it gets exposed when enable_cnf_rewrites=true.
Note that the deep-copied analyzed grouping exprs are supplied
to MultiAggregateInfo and since many operations are using
this data structure, we don't see widespread issues.

This patch fixes it and as a conservative measure, does the
analyze of new predicates in the CNF rule. (note: there are likely
other rewrite rules where explicit analyze should be done but
that is outside the scope for this issue).

Testing:
 - Added new unit tests with predicates in SELECT and GROUP BY
 - Ran 'mvn test' for the FE

Change-Id: I6da4a17c6e648f466ce118c4646520ff68f9878e
Reviewed-on: http://gerrit.cloudera.org:8080/15693
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java
M testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test
3 files changed, 53 insertions(+), 6 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/15693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6da4a17c6e648f466ce118c4646520ff68f9878e
Gerrit-Change-Number: 15693
Gerrit-PatchSet: 4
Gerrit-Owner: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>

[Impala-ASF-CR] IMPALA-9620: Ensure group-by and cnf exprs are analyzed

Posted by "Aman Sinha (Code Review)" <ge...@cloudera.org>.
Hello Quanlong Huang, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/15693

to look at the new patch set (#2).

Change subject: IMPALA-9620: Ensure group-by and cnf exprs are analyzed
......................................................................

IMPALA-9620: Ensure group-by and cnf exprs are analyzed

This change initializes the SelectStmt's groupingExprs_
with the analyzed version. It also analyzes the new predicates
created by the Conjunctive Normal Form rewrite rule such that
potential consumers of this rewrite don't encounter problems.

Before this change, the SelectStmt.analyzeGroupingExprs() made
a deep copy of the original grouping exprs, then analyzed the
copy but left the original intact. This causes problems because
a rewrite rule (invoked by SelectStmt.rewriteExprs()) may try to
process the original grouping exprs and encounter INVALID_TYPE
(types are only assigned after analyze). This was the root cause
of the problem described in the JIRA. Although this was a pre-
existing behavior, it gets exposed when enable_cnf_rewrites=true.
Note that the deep-copied analyzed grouping exprs are supplied
to MultiAggregateInfo and since many operations are using
this data structure, we don't see widespread issues.

This patch fixes it and as a conservative measure, does the
analyze of new predicates in the CNF rule. (note: there are likely
other rewrite rules where explicit analyze should be done but
that is outside the scope for this issue).

Testing:
 - Added new unit tests with predicates in SELECT and GROUP BY
 - Ran 'mvn test' for the FE

Change-Id: I6da4a17c6e648f466ce118c4646520ff68f9878e
---
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java
M testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test
3 files changed, 53 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/15693/2
-- 
To view, visit http://gerrit.cloudera.org:8080/15693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6da4a17c6e648f466ce118c4646520ff68f9878e
Gerrit-Change-Number: 15693
Gerrit-PatchSet: 2
Gerrit-Owner: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>

[Impala-ASF-CR] IMPALA-9620: Ensure group-by and cnf exprs are analyzed

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15693 )

Change subject: IMPALA-9620: Ensure group-by and cnf exprs are analyzed
......................................................................


Patch Set 3: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/15693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6da4a17c6e648f466ce118c4646520ff68f9878e
Gerrit-Change-Number: 15693
Gerrit-PatchSet: 3
Gerrit-Owner: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Thu, 09 Apr 2020 17:03:58 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9620: Ensure group-by and cnf exprs are analyzed

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15693 )

Change subject: IMPALA-9620: Ensure group-by and cnf exprs are analyzed
......................................................................


Patch Set 1: Code-Review+2

(2 comments)

The fix makes sense to me.

http://gerrit.cloudera.org:8080/#/c/15693/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15693/1//COMMIT_MSG@16
PS1, Line 16: This causes problems because
            : a rewrite rule (invoked by SelectStmt.rewriteExprs()) may try to
            : process the original grouping exprs and encounter INVALID_TYPE
            : (types are only assigned after analyze)
Is ConvertToCNFRule the only rewrite rule that requires Exprs are analyzed?


http://gerrit.cloudera.org:8080/#/c/15693/1/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/15693/1/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@690
PS1, Line 690: in case we need to print them later
nit: Looks like we only print them at line 700. Maybe we can make this comment more concrete.



-- 
To view, visit http://gerrit.cloudera.org:8080/15693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6da4a17c6e648f466ce118c4646520ff68f9878e
Gerrit-Change-Number: 15693
Gerrit-PatchSet: 1
Gerrit-Owner: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Thu, 09 Apr 2020 05:58:09 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9620: Ensure group-by and cnf exprs are analyzed

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15693 )

Change subject: IMPALA-9620: Ensure group-by and cnf exprs are analyzed
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/5771/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/15693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6da4a17c6e648f466ce118c4646520ff68f9878e
Gerrit-Change-Number: 15693
Gerrit-PatchSet: 2
Gerrit-Owner: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Thu, 09 Apr 2020 07:52:55 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9620: Ensure group-by and cnf exprs are analyzed

Posted by "Aman Sinha (Code Review)" <ge...@cloudera.org>.
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/15693 )

Change subject: IMPALA-9620: Ensure group-by and cnf exprs are analyzed
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15693/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15693/1//COMMIT_MSG@16
PS1, Line 16: This causes problems because
            : a rewrite rule (invoked by SelectStmt.rewriteExprs()) may try to
            : process the original grouping exprs and encounter INVALID_TYPE
            : (types are only assigned after analyze)
> Is ConvertToCNFRule the only rewrite rule that requires Exprs are analyzed?
That seems to be the case because until now we did not encounter the issue, which is a bit of a surprise to me. Ideally, we should not have INVALID_TYPE exprs lying around.


http://gerrit.cloudera.org:8080/#/c/15693/1/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/15693/1/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@690
PS1, Line 690: in case we need to print them later
> nit: Looks like we only print them at line 700. Maybe we can make this comm
ok, will update this comment.



-- 
To view, visit http://gerrit.cloudera.org:8080/15693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6da4a17c6e648f466ce118c4646520ff68f9878e
Gerrit-Change-Number: 15693
Gerrit-PatchSet: 1
Gerrit-Owner: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Thu, 09 Apr 2020 06:08:20 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9620: Ensure group-by and cnf exprs are analyzed

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15693 )

Change subject: IMPALA-9620: Ensure group-by and cnf exprs are analyzed
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15693/1/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java
File fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java:

http://gerrit.cloudera.org:8080/#/c/15693/1/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@160
PS1, Line 160:                                                 Analyzer analyzer) throws AnalysisException {
line too long (93 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/15693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6da4a17c6e648f466ce118c4646520ff68f9878e
Gerrit-Change-Number: 15693
Gerrit-PatchSet: 1
Gerrit-Owner: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 09 Apr 2020 01:32:56 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9620: Ensure group-by and cnf exprs are analyzed

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15693 )

Change subject: IMPALA-9620: Ensure group-by and cnf exprs are analyzed
......................................................................


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5635/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/15693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6da4a17c6e648f466ce118c4646520ff68f9878e
Gerrit-Change-Number: 15693
Gerrit-PatchSet: 3
Gerrit-Owner: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Thu, 09 Apr 2020 12:38:14 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9620: Ensure group-by and cnf exprs are analyzed

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15693 )

Change subject: IMPALA-9620: Ensure group-by and cnf exprs are analyzed
......................................................................


Patch Set 3: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/15693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6da4a17c6e648f466ce118c4646520ff68f9878e
Gerrit-Change-Number: 15693
Gerrit-PatchSet: 3
Gerrit-Owner: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Thu, 09 Apr 2020 12:38:13 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9620: Ensure group-by and cnf exprs are analyzed

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15693 )

Change subject: IMPALA-9620: Ensure group-by and cnf exprs are analyzed
......................................................................


Patch Set 2: Code-Review+2

(2 comments)

The solution looks good to me.

http://gerrit.cloudera.org:8080/#/c/15693/2/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/15693/2/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@710
PS2, Line 710:       groupingExprs_ = groupingExprsCopy_;
As offline discussed, this may affect the result of toSql(). However, it's just the difference between non-analyzed and analyzed Exprs. I gone throught all subclasses of Expr. None of their toSqlImpl() depends on analyzed results. So I think this is ok as long as no rewrites happen at this place.


http://gerrit.cloudera.org:8080/#/c/15693/2/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java
File fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java:

http://gerrit.cloudera.org:8080/#/c/15693/2/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@112
PS2, Line 112:         cpred.getIds(tids, null);
Another solution is do cpred.analyze() here if it's not analyzed. Not a blocker. We can consider this later.



-- 
To view, visit http://gerrit.cloudera.org:8080/15693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6da4a17c6e648f466ce118c4646520ff68f9878e
Gerrit-Change-Number: 15693
Gerrit-PatchSet: 2
Gerrit-Owner: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Thu, 09 Apr 2020 09:22:14 +0000
Gerrit-HasComments: Yes