Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2019/04/11 16:42:53 UTC

[GitHub] [drill] amansinha100 commented on a change in pull request #1744: Drill 7148 - Join order, multi-col ndv and aggregate rowcount fixes for TPCH queries

amansinha100 commented on a change in pull request #1744: Drill 7148 - Join order, multi-col ndv and aggregate rowcount fixes for TPCH queries
URL: https://github.com/apache/drill/pull/1744#discussion_r274498821
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillRelMdRowCount.java
 ##########
 @@ -53,7 +53,21 @@ public Double getRowCount(Aggregate rel, RelMetadataQuery mq) {
 
     if (groupKey.isEmpty()) {
       return 1.0;
-    } else {
+    } else if (rel instanceof AggPrelBase &&
+            ((AggPrelBase) rel).getOperatorPhase() == AggPrelBase.OperatorPhase.PHASE_1of2) {
+      // Phase 1 Aggregate would return rows in the range [NDV, input_rows]. Hence, use the
+      // existing estimate of 1/10 * input_rows
+        Double distinctRowCount = mq.getRowCount(rel.getInput()) / 10;
 
 Review comment:
   Can we get this default value from an existing Calcite utility instead of hardcoding it? That would keep this estimate in sync if the ratio changes in the future.
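
   If no suitable Calcite utility turns out to exist, one fallback is to name the ratio once and reference it wherever the default is needed. The sketch below is only an illustration of that idea: the class `AggRowCountDefaults`, the constant `DEFAULT_AGG_ROW_REDUCTION_DIVISOR`, and the method `estimatePhase1RowCount` are hypothetical names, not existing Drill or Calcite APIs, and the value 10 is taken from the `/ 10` in the diff above.

```java
// Hypothetical helper, not part of Drill or Calcite: it only illustrates
// replacing the literal "/ 10" with a single named constant.
final class AggRowCountDefaults {

  // Assumed default taken from the diff: when no better estimate is
  // available, an aggregate is assumed to emit input_rows / 10 rows.
  static final double DEFAULT_AGG_ROW_REDUCTION_DIVISOR = 10.0;

  private AggRowCountDefaults() {
    // Utility class, no instances.
  }

  // Returns the default row-count estimate for a phase-1 aggregate,
  // given the estimated row count of its input.
  static double estimatePhase1RowCount(double inputRows) {
    return inputRows / DEFAULT_AGG_ROW_REDUCTION_DIVISOR;
  }
}
```

   With something like this, the new branch would call `AggRowCountDefaults.estimatePhase1RowCount(mq.getRowCount(rel.getInput()))`, so a later change to the ratio happens in one place.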
