You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Zoltan Borok-Nagy (Code Review)" <ge...@cloudera.org> on 2021/02/23 16:54:33 UTC

[Impala-ASF-CR] IMPALA-10377: Improve the accuracy of resource estimation PlanNode does not consider some factors when estimating memory, this will cause a large error rate

Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16842 )

Change subject: IMPALA-10377: Improve the accuracy of resource estimation PlanNode does not consider some factors when estimating memory, this will cause a large error rate
......................................................................


Patch Set 16:

(8 comments)

Thanks for your contribution. The change lgtm, I've mostly found style issues.

http://gerrit.cloudera.org:8080/#/c/16842/16//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16842/16//COMMIT_MSG@8
PS16, Line 8: PlanNode does not consider some factors when estimating memory,
nit: Please put a blank line after the title


http://gerrit.cloudera.org:8080/#/c/16842/16//COMMIT_MSG@23
PS16, Line 23: Datas
nit: just 'data', or 'data rows'?


http://gerrit.cloudera.org:8080/#/c/16842/16/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
File fe/src/main/java/org/apache/impala/planner/HashJoinNode.java:

http://gerrit.cloudera.org:8080/#/c/16842/16/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java@220
PS16, Line 220: computeJoinResourceProfile
Could you please add a comment to describe the overal calculation?


http://gerrit.cloudera.org:8080/#/c/16842/16/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java@230
PS16, Line 230:       long rhsCard = getChild(1).getCardinality();
              :       long rhsNdv = 1;
              :       for (Expr eqJoinPredicate: eqJoinConjuncts_) {
              :         long rhsPdNdv = getNdv(eqJoinPredicate.getChild(1));
              :         rhsPdNdv = Math.min(rhsPdNdv, rhsCard);
              :         if (rhsPdNdv != -1) {
              :           rhsNdv = PlanNode.checkedMultiply(rhsNdv, rhsPdNdv);
              :         }
              :       }
Could you please add some comments for the computation of rhsNdv?


http://gerrit.cloudera.org:8080/#/c/16842/16/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
File fe/src/test/java/org/apache/impala/planner/CardinalityTest.java:

http://gerrit.cloudera.org:8080/#/c/16842/16/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@838
PS16, Line 838: 10L * 1024L * 1024L
nit: probably we should just use MIN_HASH_TBL_MEM here


http://gerrit.cloudera.org:8080/#/c/16842/16/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@921
PS16, Line 921: 
              : 
              : 
              : 
              : 
nit: too many blank lines


http://gerrit.cloudera.org:8080/#/c/16842/16/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@1101
PS16, Line 1101: /*
nit: doc comments start with /**


http://gerrit.cloudera.org:8080/#/c/16842/16/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@1143
PS16, Line 1143: isDistributedPlan
Seems like all tests use distributed plans. It'd be good to add a few tests for non distributed plans.



-- 
To view, visit http://gerrit.cloudera.org:8080/16842
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic01db168ff2c6d6de33ee553a8175599f035d7a1
Gerrit-Change-Number: 16842
Gerrit-PatchSet: 16
Gerrit-Owner: liuyao <li...@sensorsdata.cn>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Tue, 23 Feb 2021 16:54:33 +0000
Gerrit-HasComments: Yes