You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Alex Behm (Code Review)" <ge...@cloudera.org> on 2016/10/23 19:28:08 UTC

[Impala-ASF-CR] IMPALA-4336: Cast exprs after unnesting union operands.

Alex Behm has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4815

Change subject: IMPALA-4336: Cast exprs after unnesting union operands.
......................................................................

IMPALA-4336: Cast exprs after unnesting union operands.

The bug was that we cast the result exprs of operands before
unnesting them. If we unnested an operands, casts were missing
on those unnested operands' result exprs.

The fix is to first unnest operands and then cast result exprs.

Also clarifies the use of resultExprs vs. baseTblResultExprs
in unions. We should consistently use resultExprs because the
final expr substitution happens during planning.
The only place where baseTblResultExprs should be used is in
UnionStmt.materializeRequiredSlots() because that is called
before plan generation and we need to mark the slots of resolved
exprs as materialized.

Change-Id: I5e3ab7349df7d67d0d9c2baf4a56342d3f04e76d
---
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M testdata/workloads/functional-query/queries/QueryTest/union.test
5 files changed, 111 insertions(+), 89 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/4815/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I5e3ab7349df7d67d0d9c2baf4a56342d3f04e76d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>

[Impala-ASF-CR] IMPALA-4336: Cast exprs after unnesting union operands.

Posted by "Internal Jenkins (Code Review)" <ge...@cloudera.org>.
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4336: Cast exprs after unnesting union operands.
......................................................................


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5e3ab7349df7d67d0d9c2baf4a56342d3f04e76d
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4336: Cast exprs after unnesting union operands.

Posted by "Dimitris Tsirogiannis (Code Review)" <ge...@cloudera.org>.
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-4336: Cast exprs after unnesting union operands.
......................................................................


Patch Set 1:

(8 comments)

Flushing some initial comments. Still need to wrap my head around all the moving parts.

http://gerrit.cloudera.org:8080/#/c/4815/1//COMMIT_MSG
Commit Message:

PS1, Line 15: Also clarifies the use of resultExprs vs. baseTblResultExprs
            : in unions. We should consistently use resultExprs because the
            : final expr substitution happens during planning.
            : The only place where baseTblResultExprs should be used is in
            : UnionStmt.materializeRequiredSlots() because that is called
            : before plan generation and we need to mark the slots of resolved
            : exprs as materialized.
I am not sure this belongs to the commit message. If you don't have it already, you may want to put it as a comment somewhere in the code.


http://gerrit.cloudera.org:8080/#/c/4815/1/fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
File fe/src/main/java/org/apache/impala/analysis/UnionStmt.java:

PS1, Line 202: Propagates DISTINCT from left to right, and checks that all union operands are
             :    * are union compatible, adding implicit casts if necessary.
I think you need to update this comment. This function seems to be doing a lot more than simply propagating distinct and checking union operand compatibility.


PS1, Line 203: are
remove


PS1, Line 218: // Analyze all operands and make sure they return an equal number of exprs.
             :     for (int i = 0; i < operands_.size(); ++i) {
             :       try {
             :         operands_.get(i).analyze(analyzer);
             :         QueryStmt firstQuery = operands_.get(0).getQueryStmt();
             :         List<Expr> firstExprs = operands_.get(0).getQueryStmt().getResultExprs();
             :         QueryStmt query = operands_.get(i).getQueryStmt();
             :         List<Expr> exprs = query.getResultExprs();
             :         if (firstExprs.size() != exprs.size()) {
             :           throw new AnalysisException("Operands have unequal number of columns:\n" +
             :               "'" + queryStmtToSql(firstQuery) + "' has " +
             :               firstExprs.size() + " column(s)\n" +
             :               "'" + queryStmtToSql(query) + "' has " + exprs.size() + " column(s)");
             :         }
             :       } catch (AnalysisException e) {
             :         if (analyzer.getMissingTbls().isEmpty()) throw e;
             :       }
             :     }
This function has grown a bit too much. I would put this block is a separate function called analyzeOperands().


Line 241:     // Remember SQL string before unnesting operands.
the


PS1, Line 258: // Cast all result exprs to a compatible type.
move above L263


PS1, Line 282: // Should never happen
             :         throw new AnalysisException("Error creating agg info in UnionStmt.analyze()");
Maybe convert it to Preconditions.checkState(false, "Error....")?


http://gerrit.cloudera.org:8080/#/c/4815/1/testdata/workloads/functional-query/queries/QueryTest/union.test
File testdata/workloads/functional-query/queries/QueryTest/union.test:

PS1, Line 1030: (select 80 union all select 90)
Maybe add a test case with nested union distinct, if not already there.


-- 
To view, visit http://gerrit.cloudera.org:8080/4815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5e3ab7349df7d67d0d9c2baf4a56342d3f04e76d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4336: Cast exprs after unnesting union operands.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-4336: Cast exprs after unnesting union operands.
......................................................................


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5e3ab7349df7d67d0d9c2baf4a56342d3f04e76d
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4336: Cast exprs after unnesting union operands.

Posted by "Internal Jenkins (Code Review)" <ge...@cloudera.org>.
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4336: Cast exprs after unnesting union operands.
......................................................................


IMPALA-4336: Cast exprs after unnesting union operands.

The bug was that we cast the result exprs of operands before
unnesting them. If we unnested an operands, casts were missing
on those unnested operands' result exprs.

The fix is to first unnest operands and then cast result exprs.

Also clarifies the use of resultExprs vs. baseTblResultExprs.

Change-Id: I5e3ab7349df7d67d0d9c2baf4a56342d3f04e76d
Reviewed-on: http://gerrit.cloudera.org:8080/4815
Reviewed-by: Alex Behm <al...@cloudera.com>
Tested-by: Internal Jenkins
---
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M testdata/workloads/functional-query/queries/QueryTest/union.test
5 files changed, 137 insertions(+), 104 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I5e3ab7349df7d67d0d9c2baf4a56342d3f04e76d
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins

[Impala-ASF-CR] IMPALA-4336: Cast exprs after unnesting union operands.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-4336: Cast exprs after unnesting union operands.
......................................................................


Patch Set 3: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/4815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5e3ab7349df7d67d0d9c2baf4a56342d3f04e76d
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4336: Cast exprs after unnesting union operands.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-4336: Cast exprs after unnesting union operands.
......................................................................


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/4815/1//COMMIT_MSG
Commit Message:

PS1, Line 15: Also clarifies the use of resultExprs vs. baseTblResultExprs
            : in unions. We should consistently use resultExprs because the
            : final expr substitution happens during planning.
            : The only place where baseTblResultExprs should be used is in
            : UnionStmt.materializeRequiredSlots() because that is called
            : before plan generation and we need to mark the slots of resolved
            : exprs as materialized.
> I am not sure this belongs to the commit message. If you don't have it alre
Shrunk comment here. Moved expanded comment to UnionStmt class comment.


http://gerrit.cloudera.org:8080/#/c/4815/1/fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
File fe/src/main/java/org/apache/impala/analysis/UnionStmt.java:

PS1, Line 202: Propagates DISTINCT from left to right, and checks that all union operands are
             :    * are union compatible, adding implicit casts if necessary.
> I think you need to update this comment. This function seems to be doing a 
I removed this comment because it's redundant with the class comment. Let me know if you feel the class comment still needs to be expanded/moved.


PS1, Line 203: are
> remove
Done


PS1, Line 218: // Analyze all operands and make sure they return an equal number of exprs.
             :     for (int i = 0; i < operands_.size(); ++i) {
             :       try {
             :         operands_.get(i).analyze(analyzer);
             :         QueryStmt firstQuery = operands_.get(0).getQueryStmt();
             :         List<Expr> firstExprs = operands_.get(0).getQueryStmt().getResultExprs();
             :         QueryStmt query = operands_.get(i).getQueryStmt();
             :         List<Expr> exprs = query.getResultExprs();
             :         if (firstExprs.size() != exprs.size()) {
             :           throw new AnalysisException("Operands have unequal number of columns:\n" +
             :               "'" + queryStmtToSql(firstQuery) + "' has " +
             :               firstExprs.size() + " column(s)\n" +
             :               "'" + queryStmtToSql(query) + "' has " + exprs.size() + " column(s)");
             :         }
             :       } catch (AnalysisException e) {
             :         if (analyzer.getMissingTbls().isEmpty()) throw e;
             :       }
             :     }
> This function has grown a bit too much. I would put this block is a separat
Done


Line 241:     // Remember SQL string before unnesting operands.
> the
Done


PS1, Line 258: // Cast all result exprs to a compatible type.
> move above L263
We need to collect them first before we can analyzer.castToUnionCompatibleTypes(), so seems relevant to this loop as well. Modified comment.


PS1, Line 282: // Should never happen
             :         throw new AnalysisException("Error creating agg info in UnionStmt.analyze()");
> Maybe convert it to Preconditions.checkState(false, "Error....")?
Modified to an ISE which is what the Preconditions would also throw. I prefer the ISe because we can give it a cause.


http://gerrit.cloudera.org:8080/#/c/4815/1/testdata/workloads/functional-query/queries/QueryTest/union.test
File testdata/workloads/functional-query/queries/QueryTest/union.test:

PS1, Line 1030: (select 80 union all select 90)
> Maybe add a test case with nested union distinct, if not already there.
A nested union distinct cannot be unnested, so is not really related to this bug. We have tests for that elsewhere.


-- 
To view, visit http://gerrit.cloudera.org:8080/4815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5e3ab7349df7d67d0d9c2baf4a56342d3f04e76d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4336: Cast exprs after unnesting union operands.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has uploaded a new patch set (#2).

Change subject: IMPALA-4336: Cast exprs after unnesting union operands.
......................................................................

IMPALA-4336: Cast exprs after unnesting union operands.

The bug was that we cast the result exprs of operands before
unnesting them. If we unnested an operands, casts were missing
on those unnested operands' result exprs.

The fix is to first unnest operands and then cast result exprs.

Also clarifies the use of resultExprs vs. baseTblResultExprs.

Change-Id: I5e3ab7349df7d67d0d9c2baf4a56342d3f04e76d
---
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M testdata/workloads/functional-query/queries/QueryTest/union.test
5 files changed, 137 insertions(+), 104 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/4815/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5e3ab7349df7d67d0d9c2baf4a56342d3f04e76d
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>

[Impala-ASF-CR] IMPALA-4336: Cast exprs after unnesting union operands.

Posted by "Dimitris Tsirogiannis (Code Review)" <ge...@cloudera.org>.
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-4336: Cast exprs after unnesting union operands.
......................................................................


Patch Set 2: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4815/1/fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
File fe/src/main/java/org/apache/impala/analysis/UnionStmt.java:

PS1, Line 282: operands_.get(i).analyze(analyzer);
             :         QueryStmt firstQuery = operands_.get(0).getQueryStmt();
> Modified to an ISE which is what the Preconditions would also throw. I pref
Sounds good to me.


-- 
To view, visit http://gerrit.cloudera.org:8080/4815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5e3ab7349df7d67d0d9c2baf4a56342d3f04e76d
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-HasComments: Yes