You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Tianyi Wang (Code Review)" <ge...@cloudera.org> on 2017/09/02 00:13:55 UTC

[Impala-ASF-CR] IMPALA-4620: Refactor evalcost computation in query analysis

Tianyi Wang has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7948

Change subject: IMPALA-4620: Refactor evalcost computation in query analysis
......................................................................

IMPALA-4620: Refactor evalcost computation in query analysis

This patch adds a computeEvalCost abstract function to class Expr and
call it explicitly when an expr is analyzed. Existing code sets evalcost
at random places in analyzeImpl(), causing a bug casting child expr
without recomputing evalcost. Furthermore, if a child of an Expr is
substituted from one with known evalcost to one with unkwown evalcost,
evalcost of the parent won't be updatde by subsequent analyze() calls.
This patch fixes these problems.
A planner test case with wrong predicate order is also fixed.

Change-Id: Ibec3d648532c185d4318476796b1ba432b0fe59e
---
M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java
M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java
M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/ExistsPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNotEmptyPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java
M fe/src/main/java/org/apache/impala/analysis/LikePredicate.java
M fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java
M fe/src/main/java/org/apache/impala/analysis/NullLiteral.java
M fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/StringLiteral.java
M fe/src/main/java/org/apache/impala/analysis/Subquery.java
M fe/src/main/java/org/apache/impala/analysis/TimestampArithmeticExpr.java
M fe/src/main/java/org/apache/impala/analysis/TimestampLiteral.java
M fe/src/main/java/org/apache/impala/analysis/TupleIsNullPredicate.java
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
26 files changed, 153 insertions(+), 76 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/7948/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibec3d648532c185d4318476796b1ba432b0fe59e
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang <tw...@cloudera.com>

[Impala-ASF-CR] IMPALA-4620: Refactor evalcost computation in query analysis

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4620: Refactor evalcost computation in query analysis
......................................................................


IMPALA-4620: Refactor evalcost computation in query analysis

This patch adds a computeEvalCost abstract function to class Expr and
call it explicitly when an expr is analyzed. Existing code sets evalcost
at random places in analyzeImpl(), causing a bug casting child expr
without recomputing evalcost. Furthermore, if a child of an Expr is
substituted from one with known evalcost to one with unkwown evalcost,
evalcost of the parent won't be updatde by subsequent analyze() calls.
This patch fixes these problems.
A planner test case with wrong predicate order is also fixed.

Change-Id: Ibec3d648532c185d4318476796b1ba432b0fe59e
Reviewed-on: http://gerrit.cloudera.org:8080/7948
Reviewed-by: Alex Behm <al...@cloudera.com>
Tested-by: Impala Public Jenkins
---
M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java
M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java
M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/ExistsPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNotEmptyPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java
M fe/src/main/java/org/apache/impala/analysis/LikePredicate.java
M fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java
M fe/src/main/java/org/apache/impala/analysis/NullLiteral.java
M fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/StringLiteral.java
M fe/src/main/java/org/apache/impala/analysis/Subquery.java
M fe/src/main/java/org/apache/impala/analysis/TimestampArithmeticExpr.java
M fe/src/main/java/org/apache/impala/analysis/TimestampLiteral.java
M fe/src/main/java/org/apache/impala/analysis/TupleIsNullPredicate.java
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
26 files changed, 117 insertions(+), 62 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ibec3d648532c185d4318476796b1ba432b0fe59e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang <tw...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com>

[Impala-ASF-CR] IMPALA-4620: Refactor evalcost computation in query analysis

Posted by "Tianyi Wang (Code Review)" <ge...@cloudera.org>.
Tianyi Wang has uploaded a new patch set (#2).

Change subject: IMPALA-4620: Refactor evalcost computation in query analysis
......................................................................

IMPALA-4620: Refactor evalcost computation in query analysis

This patch adds a computeEvalCost abstract function to class Expr and
call it explicitly when an expr is analyzed. Existing code sets evalcost
at random places in analyzeImpl(), causing a bug casting child expr
without recomputing evalcost. Furthermore, if a child of an Expr is
substituted from one with known evalcost to one with unkwown evalcost,
evalcost of the parent won't be updatde by subsequent analyze() calls.
This patch fixes these problems.
A planner test case with wrong predicate order is also fixed.

Change-Id: Ibec3d648532c185d4318476796b1ba432b0fe59e
---
M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java
M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java
M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/ExistsPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNotEmptyPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java
M fe/src/main/java/org/apache/impala/analysis/LikePredicate.java
M fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java
M fe/src/main/java/org/apache/impala/analysis/NullLiteral.java
M fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/StringLiteral.java
M fe/src/main/java/org/apache/impala/analysis/Subquery.java
M fe/src/main/java/org/apache/impala/analysis/TimestampArithmeticExpr.java
M fe/src/main/java/org/apache/impala/analysis/TimestampLiteral.java
M fe/src/main/java/org/apache/impala/analysis/TupleIsNullPredicate.java
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
26 files changed, 117 insertions(+), 62 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/7948/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibec3d648532c185d4318476796b1ba432b0fe59e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang <tw...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>

[Impala-ASF-CR] IMPALA-4620: Refactor evalcost computation in query analysis

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-4620: Refactor evalcost computation in query analysis
......................................................................


Patch Set 2: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7948/2/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

Line 240:           (float) (getAvgStringLength(getChild(0)) + getAvgStringLength(getChild(1))) *
> The original code here (with different parenthesis position) doesn't seem c
Good point.


http://gerrit.cloudera.org:8080/#/c/7948/1/testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
File testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test:

Line 218: |     predicates: functional.alltypessmall.string_col = '15', functional.alltypessmall.id + 15 = 27
> string_col = '15':
Thanks for clarifying and nice to hear that this patch fixed a few bugs!


-- 
To view, visit http://gerrit.cloudera.org:8080/7948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibec3d648532c185d4318476796b1ba432b0fe59e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang <tw...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4620: Refactor evalcost computation in query analysis

Posted by "Tianyi Wang (Code Review)" <ge...@cloudera.org>.
Tianyi Wang has posted comments on this change.

Change subject: IMPALA-4620: Refactor evalcost computation in query analysis
......................................................................


Patch Set 2:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/7948/1/fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
File fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java:

Line 556:   protected float computeEvalCost() { return UNKNOWN_COST; }
> single line (and elsewhere)
Done


http://gerrit.cloudera.org:8080/#/c/7948/2/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

Line 240:           (float) (getAvgStringLength(getChild(0)) + getAvgStringLength(getChild(1))) *
The original code here (with different parenthesis position) doesn't seem correct.


http://gerrit.cloudera.org:8080/#/c/7948/1/fe/src/main/java/org/apache/impala/analysis/CaseExpr.java
File fe/src/main/java/org/apache/impala/analysis/CaseExpr.java:

Line 352:     if (!hasChildCosts()) return UNKNOWN_COST;
> Prefer this to avoid code indentation and fewer lines:
Done


http://gerrit.cloudera.org:8080/#/c/7948/1/fe/src/main/java/org/apache/impala/analysis/CastExpr.java
File fe/src/main/java/org/apache/impala/analysis/CastExpr.java:

Line 68:       computeNumDistinctValues();
> move after computeNumDistinctValues() for consistency
Done


http://gerrit.cloudera.org:8080/#/c/7948/1/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

Line 247:   protected float evalCost_;
> We should be able to make this private now.
Not really. We need to initialize it in SlotRef, CastExpr and LiteralExpr.


Line 369:    * computed. Should be called bottom-up whenever the structure of subtree is modified.
> double 'the'
Done


http://gerrit.cloudera.org:8080/#/c/7948/1/fe/src/main/java/org/apache/impala/analysis/InPredicate.java
File fe/src/main/java/org/apache/impala/analysis/InPredicate.java:

Line 191:     if (!hasChildCosts()) return UNKNOWN_COST;
> if (!hasChildCosts()) return UNKNOWN_COST;
Done


http://gerrit.cloudera.org:8080/#/c/7948/1/fe/src/main/java/org/apache/impala/analysis/IsNotEmptyPredicate.java
File fe/src/main/java/org/apache/impala/analysis/IsNotEmptyPredicate.java:

Line 55:     if (!getChild(0).hasCost()) return UNKNOWN_COST;
> if (!getChild(0).hasCost()) return UNKNOWN_COST;
Done


http://gerrit.cloudera.org:8080/#/c/7948/1/fe/src/main/java/org/apache/impala/analysis/LikePredicate.java
File fe/src/main/java/org/apache/impala/analysis/LikePredicate.java:

Line 150:     if (!hasChildCosts()) return UNKNOWN_COST;
> single line and make the else if a regular if
Done


http://gerrit.cloudera.org:8080/#/c/7948/1/testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
File testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test:

Line 218: |     predicates: functional.alltypessmall.string_col = '15', functional.alltypessmall.id + 15 = 27
> The string predicate should be more expensive. Can you confirm the costs of
string_col = '15':
SLOT_REF_COST + LITERAL_COST + (AVGLEN(string_col)  + LEN('15')) * BINARY_PREDICATE_COST = 1 + 1 + (1 + 2) * 1 = 5

Note AVGLEN(string_col) = 1 in functional.alltypessmall

id + 15 = 27:
SLOT_REF_COST + CAST_COST + 2 * LITERAL_COST + ARITHMETIC_OP_COST + BINARY_PREDICATE_COST = 1 + 1 + 2 * 1 + 1 + 1 = 6

id is cast from int to bigint before adding

Without this patch the CAST_COST was not updated into its parent because the CastExpr was created somewhere in analyzeImpl(). The cost computed will be 5 instead of 6.


-- 
To view, visit http://gerrit.cloudera.org:8080/7948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibec3d648532c185d4318476796b1ba432b0fe59e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang <tw...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4620: Refactor evalcost computation in query analysis

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-4620: Refactor evalcost computation in query analysis
......................................................................


Patch Set 1:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/7948/1/fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
File fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java:

Line 556:   protected float computeEvalCost() {
single line (and elsewhere)


http://gerrit.cloudera.org:8080/#/c/7948/1/fe/src/main/java/org/apache/impala/analysis/CaseExpr.java
File fe/src/main/java/org/apache/impala/analysis/CaseExpr.java:

Line 352:     if (hasChildCosts()) {
Prefer this to avoid code indentation and fewer lines:

if (!hasChildCosts()) return UNKNOWN_COST;


http://gerrit.cloudera.org:8080/#/c/7948/1/fe/src/main/java/org/apache/impala/analysis/CastExpr.java
File fe/src/main/java/org/apache/impala/analysis/CastExpr.java:

Line 68:       evalCost_ = computeEvalCost();
move after computeNumDistinctValues() for consistency


http://gerrit.cloudera.org:8080/#/c/7948/1/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

Line 247:   protected float evalCost_;
We should be able to make this private now.


Line 369:    * computed. Should be called bottom-up whenever the the structure of subtree is
double 'the'


http://gerrit.cloudera.org:8080/#/c/7948/1/fe/src/main/java/org/apache/impala/analysis/InPredicate.java
File fe/src/main/java/org/apache/impala/analysis/InPredicate.java:

Line 191:     if (hasChildCosts()) {
if (!hasChildCosts()) return UNKNOWN_COST;


http://gerrit.cloudera.org:8080/#/c/7948/1/fe/src/main/java/org/apache/impala/analysis/IsNotEmptyPredicate.java
File fe/src/main/java/org/apache/impala/analysis/IsNotEmptyPredicate.java:

Line 55:     if (getChild(0).hasCost()) {
if (!getChild(0).hasCost()) return UNKNOWN_COST;


http://gerrit.cloudera.org:8080/#/c/7948/1/fe/src/main/java/org/apache/impala/analysis/LikePredicate.java
File fe/src/main/java/org/apache/impala/analysis/LikePredicate.java:

Line 150:     if (!hasChildCosts()) {
single line and make the else if a regular if


http://gerrit.cloudera.org:8080/#/c/7948/1/fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java
File fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java:

Line 93
Thanks for cleaning this up! Pretty confusing in how many places the eval cost was set... who knows if it was necessary


http://gerrit.cloudera.org:8080/#/c/7948/1/testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
File testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test:

Line 218: |     predicates: functional.alltypessmall.string_col = '15', functional.alltypessmall.id + 15 = 27
The string predicate should be more expensive. Can you confirm the costs of these predicates?

I looked at this briefly and the string predicate should be something like:

SLOT_REF_COST + BINARY_PREDICTAE_COST + LITERAL_COST + AVG(sizeof(string_col)) + AVG(sizeof('15'))

that seems much higher than the other binary predicate


-- 
To view, visit http://gerrit.cloudera.org:8080/7948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibec3d648532c185d4318476796b1ba432b0fe59e
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang <tw...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4620: Refactor evalcost computation in query analysis

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4620: Refactor evalcost computation in query analysis
......................................................................


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1205/

-- 
To view, visit http://gerrit.cloudera.org:8080/7948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibec3d648532c185d4318476796b1ba432b0fe59e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang <tw...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4620: Refactor evalcost computation in query analysis

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4620: Refactor evalcost computation in query analysis
......................................................................


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibec3d648532c185d4318476796b1ba432b0fe59e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang <tw...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com>
Gerrit-HasComments: No