You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/07/08 06:11:45 UTC

[GitHub] [doris] EmmyMiao87 commented on a diff in pull request #10659: [enhancement](nereids) make aggregate works

EmmyMiao87 commented on code in PR #10659:
URL: https://github.com/apache/doris/pull/10659#discussion_r916467348


##########
fe/fe-core/src/main/java/org/apache/doris/analysis/FunctionCallExpr.java:
##########
@@ -1257,6 +1257,16 @@ public String forJSON(String str) {
 
     @Override
     public void finalizeImplForNereids() throws AnalysisException {
-        super.finalizeImplForNereids();
+        if (fnName.getFunction().equalsIgnoreCase("sum")) {
+            // Prevent the cast type in vector exec engine
+            Type childType = getChild(0).type.getMaxResolutionType();
+            fn = getBuiltinFunction(fnName.getFunction(), new Type[]{childType},
+                    Function.CompareMode.IS_NONSTRICT_SUPERTYPE_OF);
+            type = fn.getReturnType();
+        }
+    }
+
+    public void setMergeAggFn(boolean mergeAggFn) {

Review Comment:
   It seems that this value should be set when expr is translated



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/LogicalPlanAdapter.java:
##########
@@ -77,4 +77,8 @@ public void setResultExprs(List<Expr> resultExprs) {
     public void setColLabels(ArrayList<String> colLabels) {
         this.colLabels = colLabels;
     }
+
+    public String toDigest() {

Review Comment:
   // Add todo comment



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PlanTranslatorContext.java:
##########
@@ -66,32 +65,28 @@ public TupleDescriptor generateTupleDesc() {
         return descTable.createTupleDescriptor();
     }
 
-    public PlanNodeId nextNodeId() {
+    public PlanFragmentId nextFragmentId() {
+        return fragmentIdGenerator.getNextId();
+    }
+
+    public PlanNodeId nextPlanNodeId() {
         return nodeIdGenerator.getNextId();
     }
 
     public SlotDescriptor addSlotDesc(TupleDescriptor t) {
         return descTable.addSlotDescriptor(t);
     }
 
-    public SlotDescriptor addSlotDesc(TupleDescriptor t, int id) {
-        return descTable.addSlotDescriptor(t, id);
-    }
-
-    public PlanFragmentId nextFragmentId() {
-        return fragmentIdGenerator.getNextId();
-    }
-
     public void addPlanFragment(PlanFragment planFragment) {
         this.planFragmentList.add(planFragment);
     }
 
-    public void addSlotRefMapping(Expression expression, Expr expr) {
-        expressionToExecExpr.put(expression, expr);
+    public void addIdPair(ExprId exprId, SlotRef slotRef) {

Review Comment:
   ```suggestion
       public void addExprIdPair(ExprId exprId, SlotRef slotRef) {
   ```



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java:
##########
@@ -288,23 +333,24 @@ public PlanFragment visitPhysicalHashJoin(
             leftFragment.setPlanRoot(crossJoinNode);
             context.addPlanFragment(leftFragment);
             return leftFragment;
+        } else {
+            Expression eqJoinExpression = physicalHashJoin.getCondition().get();
+            List<Expr> execEqConjunctList = ExpressionUtils.extractConjunct(eqJoinExpression).stream()
+                    .map(EqualTo.class::cast)
+                    .map(e -> swapEqualToForChildrenOrder(e, hashJoin.left().getOutput()))
+                    .map(e -> ExpressionTranslator.translate(e, context))
+                    .collect(Collectors.toList());
+
+            HashJoinNode hashJoinNode = new HashJoinNode(context.nextPlanNodeId(), leftFragmentPlanRoot,
+                    rightFragmentPlanRoot,
+                    JoinType.toJoinOperator(physicalHashJoin.getJoinType()), execEqConjunctList, Lists.newArrayList());
+
+            hashJoinNode.setDistributionMode(DistributionMode.BROADCAST);

Review Comment:
   Can plannode also make a finalizeForNereids function to uniformly process some properties that need to be set?



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java:
##########
@@ -263,23 +313,18 @@ public PlanFragment visitPhysicalHashJoin(
         // NOTICE: We must visit from right to left, to ensure the last fragment is root fragment
         PlanFragment rightFragment = visit(hashJoin.child(1), context);
         PlanFragment leftFragment = visit(hashJoin.child(0), context);
-        PhysicalHashJoin physicalHashJoin = hashJoin.getOperator();
-
-        //        Expression predicateExpr = physicalHashJoin.getCondition().get();
-        //        List<Expression> eqExprList = Utils.getEqConjuncts(hashJoin.child(0).getOutput(),
-        //                hashJoin.child(1).getOutput(), predicateExpr);
-        JoinType joinType = physicalHashJoin.getJoinType();
-
         PlanNode leftFragmentPlanRoot = leftFragment.getPlanRoot();
         PlanNode rightFragmentPlanRoot = rightFragment.getPlanRoot();
+        PhysicalHashJoin physicalHashJoin = hashJoin.getOperator();
+        JoinType joinType = physicalHashJoin.getJoinType();
 
         if (joinType.equals(JoinType.CROSS_JOIN)

Review Comment:
   Cross join can only use the physical operator nestedloopjoin, so this situation should be handled in visitPhysicalNestedLoopJoin, not here.



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundFunction.java:
##########
@@ -52,6 +54,14 @@ public List<Expression> getArguments() {
         return children();
     }
 
+    @Override
+    public String sql() throws UnboundException {

Review Comment:
   ```suggestion
       public String toSql() throws UnboundException {
   ```



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java:
##########
@@ -112,62 +131,92 @@ public PlanFragment visit(Plan plan, PlanTranslatorContext context) {
 
     /**
      * Translate Agg.
+     * todo: support DISTINCT
      */
     @Override
-    public PlanFragment visitPhysicalAggregation(
-            PhysicalUnaryPlan<PhysicalAggregation, Plan> agg, PlanTranslatorContext context) {
-
-        PlanFragment inputPlanFragment = visit(agg.child(0), context);
-
-        AggregationNode aggregationNode;
-        List<Slot> slotList = new ArrayList<>();
-        PhysicalAggregation physicalAggregation = agg.getOperator();
-        AggregateInfo.AggPhase phase = physicalAggregation.getAggPhase().toExec();
-
-        List<Expression> groupByExpressionList = physicalAggregation.getGroupByExprList();
+    public PlanFragment visitPhysicalAggregate(
+            PhysicalUnaryPlan<PhysicalAggregate, Plan> aggregate, PlanTranslatorContext context) {
+        PlanFragment inputPlanFragment = visit(aggregate.child(0), context);
+        PhysicalAggregate physicalAggregate = aggregate.getOperator();
+
+        // TODO: stale planner generate aggregate tuple in a special way. tuple include 2 parts:
+        //    1. group by expressions: removing duplicate expressions add to tuple
+        //    2. agg functions: only removing duplicate agg functions in output expression should appear in tuple.
+        //       e.g. select sum(v1) + 1, sum(v1) + 2 from t1 should only generate one sum(v1) in tuple
+        //    We need:
+        //    1. add a project after agg, if agg function is not the top output expression.
+        //    2. introduce canonicalized, semanticEquals and deterministic in Expression
+        //       for removing duplicate.
+        List<Expression> groupByExpressionList = physicalAggregate.getGroupByExprList();
+        List<NamedExpression> outputExpressionList = physicalAggregate.getOutputExpressionList();
+
+        // 1. generate slot reference for each group expression
+        List<SlotReference> groupSlotList = Lists.newArrayList();
+        for (Expression e : groupByExpressionList) {
+            if (e instanceof SlotReference && outputExpressionList.stream().anyMatch(o -> o.anyMatch(e::equals))) {
+                groupSlotList.add((SlotReference) e);
+            } else {
+                groupSlotList.add(new SlotReference(e.sql(), e.getDataType(), e.nullable(), Collections.emptyList()));
+            }
+        }
         ArrayList<Expr> execGroupingExpressions = groupByExpressionList.stream()
-                // Since output of plan doesn't contain the slots of groupBy, which is actually needed by
-                // the BE execution, so we have to collect them and add to the slotList to generate corresponding
-                // TupleDesc.
-                .peek(x -> slotList.addAll(x.collect(SlotReference.class::isInstance)))
                 .map(e -> ExpressionTranslator.translate(e, context)).collect(Collectors.toCollection(ArrayList::new));
-        slotList.addAll(agg.getOutput());
-        TupleDescriptor outputTupleDesc = generateTupleDesc(slotList, context, null);
-
-        List<NamedExpression> outputExpressionList = physicalAggregation.getOutputExpressionList();
-        ArrayList<FunctionCallExpr> execAggExpressions = outputExpressionList.stream()
-                .map(e -> e.<List<AggregateFunction>>collect(AggregateFunction.class::isInstance))
+        // 2. collect agg functions and generate agg function to slot reference map

Review Comment:
   What if the agg function does not appear in output but in having?



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/ExpressionTranslator.java:
##########
@@ -122,6 +123,13 @@ public Expr visitLessThanEqual(LessThanEqual lessThanEqual, PlanTranslatorContex
                 lessThanEqual.child(1).accept(this, context));
     }
 
+    @Override
+    public Expr visitNullSafeEqual(NullSafeEqual nullSafeEqual, PlanTranslatorContext context) {
+        return new BinaryPredicate(Operator.EQ_FOR_NULL,

Review Comment:
   Use `NullSafeEqual` instead



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/operators/plans/logical/LogicalAggregate.java:
##########
@@ -127,14 +129,19 @@ public boolean equals(Object o) {
             return false;
         }
         LogicalAggregate that = (LogicalAggregate) o;
-        return Objects.equals(groupByExprList, that.groupByExprList)
+        return Objects.equals(groupByExpressionList, that.groupByExpressionList)
                 && Objects.equals(outputExpressionList, that.outputExpressionList)
                 && Objects.equals(partitionExprList, that.partitionExprList)
                 && aggPhase == that.aggPhase;
     }
 
     @Override
     public int hashCode() {
-        return Objects.hash(groupByExprList, outputExpressionList, partitionExprList, aggPhase);
+        return Objects.hash(groupByExpressionList, outputExpressionList, partitionExprList, aggPhase);
+    }
+
+    public LogicalAggregate withGroupByAndOutput(List<Expression> groupByExprList,

Review Comment:
   What's different between `withGroupByAndOutput` function and `LogicalAggregate` constructor on Line57



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org