You are viewing a plain text version of this content. The canonical link for it is here.

Posted to commits@druid.apache.org by "rohangarg (via GitHub)" <gi...@apache.org> on 2023/05/09 06:26:46 UTC

[GitHub] [druid] rohangarg opened a new pull request, #14232: Decouple logical planning and native query generation in SQL planning

rohangarg opened a new pull request, #14232:
URL: https://github.com/apache/druid/pull/14232

This patch starts the process of decoupling logical planning and native query generation in SQL planning. Currently, the DAG optimization and the native query generation are both done simultaneously in the planner via calcite optimization and conversion rules.
That coupling leads to difficulty in obtaining a logical DAG from the planner if needed, as well as the debugging and logging for erroneous queries also becomes very difficult to do or comprehend.
The patch decouples the two things by allowing logical planning to happen via the calcite planner and let it generate a DAG which contains `RelNode` nodes as per a defined logical Druid convention. Post that, the DAG is converted to a native query through a visitor over the produced `RelNode` tree.
The patch is put behind a feature flag `plannerStrategy` which can have two modes `COUPLED` and `DECOUPLED`. Currently, the default mode is `COUPLED` which means that the mode of planning currently is same as before.
The new planning mode is enabled via setting the mode as `DECOUPLED`.
Further, the feature is WIP in the sense that some operators (like join/window/union) haven't been rewritten to allow planning through the new mechanism. Also, in the currently converted operators (scan/filter/project/sort/aggregate), there are a few test failures in the corner cases. Both of those things are planned to be fixed in future patches for the feature.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org

[GitHub] [druid] abhishekagarwal87 commented on pull request #14232: Decouple logical planning and native query generation in SQL planning

Posted by "abhishekagarwal87 (via GitHub)" <gi...@apache.org>.

abhishekagarwal87 commented on PR #14232:
URL: https://github.com/apache/druid/pull/14232#issuecomment-1598096281

   Those problems are indeed relatable. I am wondering if the new design will help with some other problems that you didn't intend to solve immediately. The one thing I had in mind was plan caching. The plan generation can be costly and sometimes forces us to switch to native queries in favor of high QPS. I am wondering if that will be easier to do in this new de-coupled mode where druid conversion happens after the DAG is generated. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org

[GitHub] [druid] rohangarg commented on a diff in pull request #14232: Decouple logical planning and native query generation in SQL planning

Posted by "rohangarg (via GitHub)" <gi...@apache.org>.

rohangarg commented on code in PR #14232:
URL: https://github.com/apache/druid/pull/14232#discussion_r1233920200


##########
sql/src/main/java/org/apache/druid/sql/calcite/planner/PlannerFactory.java:
##########
@@ -163,6 +164,7 @@ private FrameworkConfig buildFrameworkConfig(PlannerContext plannerContext)
         .typeSystem(DruidTypeSystem.INSTANCE)
         .defaultSchema(rootSchema.getSubSchema(druidSchemaName))
         .sqlToRelConverterConfig(sqlToRelConverterConfig)
+        .costFactory(new DruidVolcanoCost.Factory())

Review Comment:
   Good catch, yeah the new cost model doesn't seem to affect the tests. But agree to keep this gated for now, fixed it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org

[GitHub] [druid] abhishekagarwal87 commented on pull request #14232: Decouple logical planning and native query generation in SQL planning

Posted by "abhishekagarwal87 (via GitHub)" <gi...@apache.org>.

abhishekagarwal87 commented on PR #14232:
URL: https://github.com/apache/druid/pull/14232#issuecomment-1541451337

   Hi Rohan - can you write a separate GH proposal that adds the motivation behind these changes with a bit more details? Seems like this is the first PR towards that end goal, you have in mind. The proposal can also include a short overview of the upcoming changes. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org

[GitHub] [druid] imply-cheddar commented on pull request #14232: Decouple logical planning and native query generation in SQL planning

Posted by "imply-cheddar (via GitHub)" <gi...@apache.org>.

imply-cheddar commented on PR #14232:
URL: https://github.com/apache/druid/pull/14232#issuecomment-1596495350

There are requests for a design doc. The description of this PR attempts to lay out the explanation (though it's very terse and doesn't actually do it in a way that is consumable to anybody who isn't involved in the work).

The basic part of this PR is that it is

1) Introducing a new way of planning SQL. This is introduced as yet-another method of planning and doesn't impact the current methods of planning
2) You can opt-in to the new method of planning via setting a context parameter, but it's also very verbose in terms of logging and has queries that are actually plannable, but the new way just doesn't yet know how to plan them

The new method "decouples" planning of native queries into a 2-phase process

1. SQL -> DAG
2. DAG -> native query

The DAG is returned from the SQL planner and then that DAG is used to build the native query.

The reason that we are doing this is to make it easier to iterate and work with SQL for Druid. Druid's native query planning follows a relatively uncommon pattern of building up the physical execution plan as the output of the volcano planner. On the plus side, this was done so that we could leverage the volcano planner to explore different methods of physical planning in case something wasn't natively plannable. On the negative side, this makes it incredibly difficult to actually work with the SQL planning as it exists in Druid. This is seen through the following symptoms:

1) There is a class of errors that it is nigh-impossible to generate good error messages for. This is because an error could just mean "volcano planner, explore something else" or it could mean "user person, you did something wrong, re-write your query". The fundamental source of these poor error codes is that we have coupled the physical execution conversion with the planning, but the current code tries to work around this by allowing you to set a possible error, which the planner will surface. This "possible error" basically means that we show users an error message which is the last error message that caused the volcano planner to stop, and might or might not actually be related to the real issue. In fact, a lot of the time, this error is not related to the actual issue at all. Instead of working around the root cause (the coupling of the planning with the building of the execution), we should decouple things
2) When a query fails to plan, but we think it should be plannable, it is extremely difficult to figure out if it is an issue with the rules (i.e. we need a rule that orders nodes differently) OR if it is an issue with the physical query planning (i.e. something in `PartialDruidQuery` and peripheral classes needs update). The "normal" debugging process for this would be, (1) is the SQL turning into a "good" DAG? If yes, figure out why that "good" DAG is not being converted, if no, figure out why the SQL is not turning into a "good" DAG. This debugging process is, in fact, what we normally follow, we just have to do it by setting a breakpoint in the volcano planner code and inspect all of the various different plans that it explores to figure out if there's a good one in there or not. Instead of needing to fire up a debugger to do the first-degree split, we should be able to do it much more directly. That is, the normal flow of the code could be SQL -> DAG -> native query, wit
h an object representing each of these., which is exactly what the decoupling is accomplishing.
3) It is very difficult to identify and understand what patterns of logical DAG are not runnable by native Druid queries. Given that the current code leverages "I cannot run that" as a method of telling the planner to explore other things, it's not abundantly clear which combinations of DAG elements lead to a failure that can be successfully planned by re-arranging things VS. which combinations of DAG elements lead to a failure that will never be converted. This makes it quite difficult to inventory the things that Druid's native queries can and cannot do when it comes to running SQL. The decoupling effort eliminates this because the first pass of SQL -> DAG should be able to happen successfully regardless of whether a Druid native query can be generated. When the generated DAG gets converted to a native query, the code that builds the native query will be able to understand and identify that it's not possible to convert that specific set of DAG elements and generate an error
that explains why. So, what happens when we get an error that says we cannot plan a specific query, but we think that it should be plannable? We follow the normal debug procedure: we evaluate the generated DAG and convince ourselves that it is a "good" DAG given that input SQL, if it was a "bad" DAG, we figure out what rules will help us generate a "good" DAG, if it is a "good" DAG, we figure out how to change the conversion logic so that it stops generating an exception.

The change itself is not impacting the current behavior and is being done as an add-on. Once we have it passing all of the existing tests, then we will be able to switch the default planning mode over to it and deprecate the old native query planning. Given that this isn't actually changing behaviors and that it's not fundamentally changing any sort of public API, I believe it should be safe to iterate on this work quite rapidly.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org

[GitHub] [druid] github-code-scanning[bot] commented on a diff in pull request #14232: Decouple logical planning and native query generation in SQL planning

Posted by "github-code-scanning[bot] (via GitHub)" <gi...@apache.org>.

github-code-scanning[bot] commented on code in PR #14232:
URL: https://github.com/apache/druid/pull/14232#discussion_r1188216381


##########
sql/src/main/java/org/apache/calcite/plan/volcano/DruidVolcanoCost.java:
##########
@@ -0,0 +1,288 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+//CHECKSTYLE.OFF: PackageName - Must be in Calcite
+
+package org.apache.calcite.plan.volcano;
+
+import org.apache.calcite.plan.RelOptCost;
+import org.apache.calcite.plan.RelOptCostFactory;
+import org.apache.calcite.plan.RelOptUtil;
+
+import java.util.Objects;
+
+/**
+ * Druid's extension to {@link VolcanoCost}. The difference between the two is in
+ * comparing two costs. Druid's cost model gives most weightage to rowCount, then to cpuCost and then lastly ioCost.
+ */
+public class DruidVolcanoCost implements RelOptCost
+{
+
+  static final DruidVolcanoCost INFINITY = new DruidVolcanoCost(
+      Double.POSITIVE_INFINITY,
+      Double.POSITIVE_INFINITY,
+      Double.POSITIVE_INFINITY
+  )
+  {
+    @Override
+    public String toString()
+    {
+      return "{inf}";
+    }
+  };
+
+  //CHECKSTYLE.OFF: Regexp
+  static final DruidVolcanoCost HUGE = new DruidVolcanoCost(Double.MAX_VALUE, Double.MAX_VALUE, Double.MAX_VALUE) {
+    @Override
+    public String toString()
+    {
+      return "{huge}";
+    }
+  };
+  //CHECKSTYLE.ON: Regexp
+
+  static final DruidVolcanoCost ZERO =
+      new DruidVolcanoCost(0.0, 0.0, 0.0)
+      {
+        @Override
+        public String toString()
+        {
+          return "{0}";
+        }
+      };
+
+  static final DruidVolcanoCost TINY =
+      new DruidVolcanoCost(1.0, 1.0, 0.0)
+      {
+        @Override
+        public String toString()
+        {
+          return "{tiny}";
+        }
+      };
+
+  public static final RelOptCostFactory FACTORY = new Factory();
+
+  final double cpu;
+  final double io;
+  final double rowCount;
+
+  DruidVolcanoCost(double rowCount, double cpu, double io)
+  {
+    this.rowCount = rowCount;
+    this.cpu = cpu;
+    this.io = io;
+  }
+
+  @Override
+  public double getCpu()
+  {
+    return cpu;
+  }
+
+  @Override
+  public boolean isInfinite()
+  {
+    return (this == INFINITY)
+           || (this.rowCount == Double.POSITIVE_INFINITY)
+           || (this.cpu == Double.POSITIVE_INFINITY)
+           || (this.io == Double.POSITIVE_INFINITY);
+  }
+
+  @Override
+  public double getIo()
+  {
+    return io;
+  }
+
+  @Override
+  public boolean isLe(RelOptCost other)
+  {
+    DruidVolcanoCost that = (DruidVolcanoCost) other;
+    return (this == that)
+           || ((this.rowCount < that.rowCount)
+               || (this.rowCount == that.rowCount && this.cpu < that.cpu)
+               || (this.rowCount == that.rowCount && this.cpu == that.cpu && this.io <= that.io));
+  }
+
+  @Override
+  public boolean isLt(RelOptCost other)
+  {
+    return isLe(other) && !equals(other);
+  }
+
+  @Override
+  public double getRows()
+  {
+    return rowCount;
+  }
+
+  @Override
+  public int hashCode()
+  {
+    return Objects.hash(rowCount, cpu, io);
+  }
+
+  @Override
+  public boolean equals(RelOptCost other)
+  {
+    return this == other
+           || other instanceof DruidVolcanoCost
+              && (this.rowCount == ((DruidVolcanoCost) other).rowCount)
+              && (this.cpu == ((DruidVolcanoCost) other).cpu)
+              && (this.io == ((DruidVolcanoCost) other).io);
+  }
+
+  @Override
+  public boolean equals(Object obj)

Review Comment:
   ## Confusing overloading of methods
   
   Method DruidVolcanoCost.equals(..) could be confused with overloaded method [equals](1), since dispatch depends on static types.
   
   [Show more details](https://github.com/apache/druid/security/code-scanning/4927)



##########
sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidQueryGenerator.java:
##########
@@ -0,0 +1,336 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.sql.calcite.planner;
+
+import com.google.common.collect.ImmutableList;
+import org.apache.calcite.plan.RelOptTable;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.rel.RelShuttleImpl;
+import org.apache.calcite.rel.core.Aggregate;
+import org.apache.calcite.rel.core.Filter;
+import org.apache.calcite.rel.core.Project;
+import org.apache.calcite.rel.core.Sort;
+import org.apache.calcite.rel.core.TableFunctionScan;
+import org.apache.calcite.rel.core.TableScan;
+import org.apache.calcite.rel.logical.LogicalAggregate;
+import org.apache.calcite.rel.logical.LogicalCorrelate;
+import org.apache.calcite.rel.logical.LogicalExchange;
+import org.apache.calcite.rel.logical.LogicalFilter;
+import org.apache.calcite.rel.logical.LogicalIntersect;
+import org.apache.calcite.rel.logical.LogicalJoin;
+import org.apache.calcite.rel.logical.LogicalMatch;
+import org.apache.calcite.rel.logical.LogicalMinus;
+import org.apache.calcite.rel.logical.LogicalProject;
+import org.apache.calcite.rel.logical.LogicalSort;
+import org.apache.calcite.rel.logical.LogicalUnion;
+import org.apache.calcite.rel.logical.LogicalValues;
+import org.apache.calcite.rex.RexLiteral;
+import org.apache.druid.java.util.common.ISE;
+import org.apache.druid.query.InlineDataSource;
+import org.apache.druid.segment.column.RowSignature;
+import org.apache.druid.sql.calcite.rel.PartialDruidQuery;
+import org.apache.druid.sql.calcite.rel.logical.DruidTableScan;
+import org.apache.druid.sql.calcite.rule.DruidLogicalValuesRule;
+import org.apache.druid.sql.calcite.table.DruidTable;
+import org.apache.druid.sql.calcite.table.InlineTable;
+import org.apache.druid.sql.calcite.table.RowSignatures;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.stream.Collectors;
+
+
+/**
+ * Converts a DAG of {@link org.apache.druid.sql.calcite.rel.logical.DruidLogicalNode} convention to a native
+ * Druid query for execution. The convertion is done via a {@link org.apache.calcite.rel.RelShuttle} visitor
+ * implementation.
+ */
+public class DruidQueryGenerator extends RelShuttleImpl
+{
+  private final List<PartialDruidQuery> queryList = new ArrayList<>();
+  private final List<DruidTable> queryTables = new ArrayList<>();
+  private final PlannerContext plannerContext;
+  private PartialDruidQuery partialDruidQuery;
+  private PartialDruidQuery.Stage currentStage = null;
+  private DruidTable currentTable = null;
+  private boolean isRoot = true;
+
+  public DruidQueryGenerator(PlannerContext plannerContext)
+  {
+    this.plannerContext = plannerContext;
+  }
+
+  @Override
+  public RelNode visit(TableScan scan)
+  {
+    if (!(scan instanceof DruidTableScan)) {
+      throw new ISE("Planning hasn't converted logical table scan to druid convention");
+    }
+    DruidTableScan druidTableScan = (DruidTableScan) scan;
+    isRoot = false;
+    RelNode result = super.visit(scan);
+    partialDruidQuery = PartialDruidQuery.create(scan);
+    currentStage = PartialDruidQuery.Stage.SCAN;
+    final RelOptTable table = scan.getTable();
+    final DruidTable druidTable = table.unwrap(DruidTable.class);
+    if (druidTable != null) {
+      currentTable = druidTable;
+    }
+    if (druidTableScan.getProject() != null) {
+      partialDruidQuery = partialDruidQuery.withSelectProject(druidTableScan.getProject());
+      currentStage = PartialDruidQuery.Stage.SELECT_PROJECT;
+    }
+    return result;
+  }
+
+  @Override
+  public RelNode visit(TableFunctionScan scan)
+  {
+    return null;
+  }
+
+  @Override
+  public RelNode visit(LogicalValues values)
+  {
+    isRoot = false;
+    RelNode result = super.visit(values);
+    final List<ImmutableList<RexLiteral>> tuples = values.getTuples();
+    final List<Object[]> objectTuples = tuples
+        .stream()
+        .map(tuple -> tuple
+            .stream()
+            .map(v -> DruidLogicalValuesRule.getValueFromLiteral(v, plannerContext))
+            .collect(Collectors.toList())
+            .toArray(new Object[0])
+        )
+        .collect(Collectors.toList());
+    RowSignature rowSignature = RowSignatures.fromRelDataType(
+        values.getRowType().getFieldNames(),
+        values.getRowType()
+    );
+    currentTable = new InlineTable(InlineDataSource.fromIterable(objectTuples, rowSignature));
+    if (currentStage == null) {
+      partialDruidQuery = PartialDruidQuery.create(values);
+      currentStage = PartialDruidQuery.Stage.SCAN;
+    } else {
+      throw new ISE("Values node found at non leaf node in the plan");
+    }
+    return result;
+  }
+
+  @Override
+  public RelNode visit(LogicalFilter filter)
+  {
+    return visitFilter(filter);
+  }
+
+  public RelNode visitFilter(Filter filter)
+  {
+    isRoot = false;
+    RelNode result = super.visit(filter);
+    if (currentStage == PartialDruidQuery.Stage.AGGREGATE) {
+      partialDruidQuery = partialDruidQuery.withHavingFilter(filter);
+      currentStage = PartialDruidQuery.Stage.HAVING_FILTER;
+    } else if (currentStage == PartialDruidQuery.Stage.SCAN) {
+      partialDruidQuery = partialDruidQuery.withWhereFilter(filter);
+      currentStage = PartialDruidQuery.Stage.WHERE_FILTER;
+    } else if (currentStage == PartialDruidQuery.Stage.SELECT_PROJECT) {
+      PartialDruidQuery old = partialDruidQuery;
+      partialDruidQuery = PartialDruidQuery.create(old.getScan());
+      partialDruidQuery = partialDruidQuery.withWhereFilter(filter);
+      partialDruidQuery = partialDruidQuery.withSelectProject(old.getSelectProject());
+      currentStage = PartialDruidQuery.Stage.SELECT_PROJECT;
+    } else {
+      queryList.add(partialDruidQuery);
+      queryTables.add(currentTable);
+      partialDruidQuery = PartialDruidQuery.createOuterQuery(partialDruidQuery).withWhereFilter(filter);
+      currentStage = PartialDruidQuery.Stage.WHERE_FILTER;
+    }
+    return result;
+  }
+
+  @Override
+  public RelNode visit(LogicalProject project)
+  {
+    return visitProject(project);
+  }
+
+  @Override
+  public RelNode visit(LogicalJoin join)
+  {
+    throw new UnsupportedOperationException("Found join");
+  }
+
+  @Override
+  public RelNode visit(LogicalCorrelate correlate)
+  {
+    return null;
+  }
+
+  @Override
+  public RelNode visit(LogicalUnion union)
+  {
+    throw new UnsupportedOperationException("Found union");
+  }
+
+  @Override
+  public RelNode visit(LogicalIntersect intersect)
+  {
+    return null;
+  }
+
+  @Override
+  public RelNode visit(LogicalMinus minus)
+  {
+    return null;
+  }
+
+  @Override
+  public RelNode visit(LogicalAggregate aggregate)
+  {
+    isRoot = false;
+    RelNode result = super.visit(aggregate);
+    if (PartialDruidQuery.Stage.AGGREGATE.canFollow(currentStage)) {
+      partialDruidQuery = partialDruidQuery.withAggregate(aggregate);
+    } else {
+      queryList.add(partialDruidQuery);
+      queryTables.add(currentTable);
+      partialDruidQuery = PartialDruidQuery.createOuterQuery(partialDruidQuery).withAggregate(aggregate);
+    }
+    currentStage = PartialDruidQuery.Stage.AGGREGATE;
+    return result;
+  }
+
+  @Override
+  public RelNode visit(LogicalMatch match)
+  {
+    return null;
+  }
+
+  @Override
+  public RelNode visit(LogicalSort sort)
+  {
+    return visitSort(sort);
+  }
+
+  @Override
+  public RelNode visit(LogicalExchange exchange)
+  {
+    return null;
+  }
+
+  private RelNode visitProject(Project project)
+  {
+    boolean rootForReal = isRoot;
+    isRoot = false;
+    RelNode result = super.visit(project);
+    if (rootForReal && (currentStage == PartialDruidQuery.Stage.AGGREGATE
+                        || currentStage == PartialDruidQuery.Stage.HAVING_FILTER)) {
+      partialDruidQuery = partialDruidQuery.withAggregateProject(project);
+      currentStage = PartialDruidQuery.Stage.AGGREGATE_PROJECT;
+    } else if (currentStage == PartialDruidQuery.Stage.SCAN || currentStage == PartialDruidQuery.Stage.WHERE_FILTER) {
+      partialDruidQuery = partialDruidQuery.withSelectProject(project);
+      currentStage = PartialDruidQuery.Stage.SELECT_PROJECT;
+    } else if (currentStage == PartialDruidQuery.Stage.SELECT_PROJECT) {
+      partialDruidQuery = partialDruidQuery.mergeProject(project);
+      currentStage = PartialDruidQuery.Stage.SELECT_PROJECT;
+    } else if (currentStage == PartialDruidQuery.Stage.SORT) {
+      partialDruidQuery = partialDruidQuery.withSortProject(project);
+      currentStage = PartialDruidQuery.Stage.SORT_PROJECT;
+    } else {
+      queryList.add(partialDruidQuery);
+      queryTables.add(currentTable);
+      partialDruidQuery = PartialDruidQuery.createOuterQuery(partialDruidQuery).withSelectProject(project);
+      currentStage = PartialDruidQuery.Stage.SELECT_PROJECT;
+    }
+    return result;
+  }
+
+  private RelNode visitSort(Sort sort)
+  {
+    isRoot = false;
+    RelNode result = super.visit(sort);
+    if (PartialDruidQuery.Stage.SORT.canFollow(currentStage)) {
+      partialDruidQuery = partialDruidQuery.withSort(sort);
+    } else {
+      queryList.add(partialDruidQuery);
+      queryTables.add(currentTable);
+      partialDruidQuery = PartialDruidQuery.createOuterQuery(partialDruidQuery).withSort(sort);
+    }
+    currentStage = PartialDruidQuery.Stage.SORT;
+    return result;
+  }
+
+  private RelNode visitAggregate(Aggregate aggregate)
+  {
+    isRoot = false;
+    RelNode result = super.visit(aggregate);
+    if (PartialDruidQuery.Stage.AGGREGATE.canFollow(currentStage)) {
+      partialDruidQuery = partialDruidQuery.withAggregate(aggregate);
+    } else {
+      queryList.add(partialDruidQuery);
+      queryTables.add(currentTable);
+      partialDruidQuery = PartialDruidQuery.createOuterQuery(partialDruidQuery).withAggregate(aggregate);
+    }
+    currentStage = PartialDruidQuery.Stage.AGGREGATE;
+    return result;
+  }
+
+  @Override
+  public RelNode visit(RelNode other)
+  {
+    if (other instanceof TableScan) {
+      return visit((TableScan) other);
+    } else if (other instanceof Project) {
+      return visitProject((Project) other);
+    } else if (other instanceof Sort) {
+      return visitSort((Sort) other);
+    } else if (other instanceof Aggregate) {
+      return visitAggregate((Aggregate) other);
+    } else if (other instanceof Filter) {
+      return visitFilter((Filter) other);
+    } else if (other instanceof LogicalValues) {
+      return visit((LogicalValues) other);
+    }
+
+    return super.visit(other);
+  }
+
+  public PartialDruidQuery getPartialDruidQuery()
+  {
+    return partialDruidQuery;
+  }
+
+  public List<PartialDruidQuery> getQueryList()
+  {
+    return queryList;
+  }
+
+  public List<DruidTable> getQueryTables()

Review Comment:
   ## Exposing internal representation
   
   getQueryTables exposes the internal representation stored in field queryTables. The value may be modified [after this call to getQueryTables](1).
   
   [Show more details](https://github.com/apache/druid/security/code-scanning/4928)



##########
sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidQueryGenerator.java:
##########
@@ -0,0 +1,336 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.sql.calcite.planner;
+
+import com.google.common.collect.ImmutableList;
+import org.apache.calcite.plan.RelOptTable;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.rel.RelShuttleImpl;
+import org.apache.calcite.rel.core.Aggregate;
+import org.apache.calcite.rel.core.Filter;
+import org.apache.calcite.rel.core.Project;
+import org.apache.calcite.rel.core.Sort;
+import org.apache.calcite.rel.core.TableFunctionScan;
+import org.apache.calcite.rel.core.TableScan;
+import org.apache.calcite.rel.logical.LogicalAggregate;
+import org.apache.calcite.rel.logical.LogicalCorrelate;
+import org.apache.calcite.rel.logical.LogicalExchange;
+import org.apache.calcite.rel.logical.LogicalFilter;
+import org.apache.calcite.rel.logical.LogicalIntersect;
+import org.apache.calcite.rel.logical.LogicalJoin;
+import org.apache.calcite.rel.logical.LogicalMatch;
+import org.apache.calcite.rel.logical.LogicalMinus;
+import org.apache.calcite.rel.logical.LogicalProject;
+import org.apache.calcite.rel.logical.LogicalSort;
+import org.apache.calcite.rel.logical.LogicalUnion;
+import org.apache.calcite.rel.logical.LogicalValues;
+import org.apache.calcite.rex.RexLiteral;
+import org.apache.druid.java.util.common.ISE;
+import org.apache.druid.query.InlineDataSource;
+import org.apache.druid.segment.column.RowSignature;
+import org.apache.druid.sql.calcite.rel.PartialDruidQuery;
+import org.apache.druid.sql.calcite.rel.logical.DruidTableScan;
+import org.apache.druid.sql.calcite.rule.DruidLogicalValuesRule;
+import org.apache.druid.sql.calcite.table.DruidTable;
+import org.apache.druid.sql.calcite.table.InlineTable;
+import org.apache.druid.sql.calcite.table.RowSignatures;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.stream.Collectors;
+
+
+/**
+ * Converts a DAG of {@link org.apache.druid.sql.calcite.rel.logical.DruidLogicalNode} convention to a native
+ * Druid query for execution. The convertion is done via a {@link org.apache.calcite.rel.RelShuttle} visitor
+ * implementation.
+ */
+public class DruidQueryGenerator extends RelShuttleImpl
+{
+  private final List<PartialDruidQuery> queryList = new ArrayList<>();
+  private final List<DruidTable> queryTables = new ArrayList<>();
+  private final PlannerContext plannerContext;
+  private PartialDruidQuery partialDruidQuery;
+  private PartialDruidQuery.Stage currentStage = null;
+  private DruidTable currentTable = null;
+  private boolean isRoot = true;
+
+  public DruidQueryGenerator(PlannerContext plannerContext)
+  {
+    this.plannerContext = plannerContext;
+  }
+
+  @Override
+  public RelNode visit(TableScan scan)
+  {
+    if (!(scan instanceof DruidTableScan)) {
+      throw new ISE("Planning hasn't converted logical table scan to druid convention");
+    }
+    DruidTableScan druidTableScan = (DruidTableScan) scan;
+    isRoot = false;
+    RelNode result = super.visit(scan);
+    partialDruidQuery = PartialDruidQuery.create(scan);
+    currentStage = PartialDruidQuery.Stage.SCAN;
+    final RelOptTable table = scan.getTable();
+    final DruidTable druidTable = table.unwrap(DruidTable.class);
+    if (druidTable != null) {
+      currentTable = druidTable;
+    }
+    if (druidTableScan.getProject() != null) {
+      partialDruidQuery = partialDruidQuery.withSelectProject(druidTableScan.getProject());
+      currentStage = PartialDruidQuery.Stage.SELECT_PROJECT;
+    }
+    return result;
+  }
+
+  @Override
+  public RelNode visit(TableFunctionScan scan)
+  {
+    return null;
+  }
+
+  @Override
+  public RelNode visit(LogicalValues values)
+  {
+    isRoot = false;
+    RelNode result = super.visit(values);
+    final List<ImmutableList<RexLiteral>> tuples = values.getTuples();
+    final List<Object[]> objectTuples = tuples
+        .stream()
+        .map(tuple -> tuple
+            .stream()
+            .map(v -> DruidLogicalValuesRule.getValueFromLiteral(v, plannerContext))
+            .collect(Collectors.toList())
+            .toArray(new Object[0])
+        )
+        .collect(Collectors.toList());
+    RowSignature rowSignature = RowSignatures.fromRelDataType(
+        values.getRowType().getFieldNames(),
+        values.getRowType()
+    );
+    currentTable = new InlineTable(InlineDataSource.fromIterable(objectTuples, rowSignature));
+    if (currentStage == null) {
+      partialDruidQuery = PartialDruidQuery.create(values);
+      currentStage = PartialDruidQuery.Stage.SCAN;
+    } else {
+      throw new ISE("Values node found at non leaf node in the plan");
+    }
+    return result;
+  }
+
+  @Override
+  public RelNode visit(LogicalFilter filter)
+  {
+    return visitFilter(filter);
+  }
+
+  public RelNode visitFilter(Filter filter)
+  {
+    isRoot = false;
+    RelNode result = super.visit(filter);
+    if (currentStage == PartialDruidQuery.Stage.AGGREGATE) {
+      partialDruidQuery = partialDruidQuery.withHavingFilter(filter);
+      currentStage = PartialDruidQuery.Stage.HAVING_FILTER;
+    } else if (currentStage == PartialDruidQuery.Stage.SCAN) {
+      partialDruidQuery = partialDruidQuery.withWhereFilter(filter);
+      currentStage = PartialDruidQuery.Stage.WHERE_FILTER;
+    } else if (currentStage == PartialDruidQuery.Stage.SELECT_PROJECT) {
+      PartialDruidQuery old = partialDruidQuery;
+      partialDruidQuery = PartialDruidQuery.create(old.getScan());
+      partialDruidQuery = partialDruidQuery.withWhereFilter(filter);
+      partialDruidQuery = partialDruidQuery.withSelectProject(old.getSelectProject());
+      currentStage = PartialDruidQuery.Stage.SELECT_PROJECT;
+    } else {
+      queryList.add(partialDruidQuery);
+      queryTables.add(currentTable);
+      partialDruidQuery = PartialDruidQuery.createOuterQuery(partialDruidQuery).withWhereFilter(filter);
+      currentStage = PartialDruidQuery.Stage.WHERE_FILTER;
+    }
+    return result;
+  }
+
+  @Override
+  public RelNode visit(LogicalProject project)
+  {
+    return visitProject(project);
+  }
+
+  @Override
+  public RelNode visit(LogicalJoin join)
+  {
+    throw new UnsupportedOperationException("Found join");
+  }
+
+  @Override
+  public RelNode visit(LogicalCorrelate correlate)
+  {
+    return null;
+  }
+
+  @Override
+  public RelNode visit(LogicalUnion union)
+  {
+    throw new UnsupportedOperationException("Found union");
+  }
+
+  @Override
+  public RelNode visit(LogicalIntersect intersect)
+  {
+    return null;
+  }
+
+  @Override
+  public RelNode visit(LogicalMinus minus)
+  {
+    return null;
+  }
+
+  @Override
+  public RelNode visit(LogicalAggregate aggregate)
+  {
+    isRoot = false;
+    RelNode result = super.visit(aggregate);
+    if (PartialDruidQuery.Stage.AGGREGATE.canFollow(currentStage)) {
+      partialDruidQuery = partialDruidQuery.withAggregate(aggregate);
+    } else {
+      queryList.add(partialDruidQuery);
+      queryTables.add(currentTable);
+      partialDruidQuery = PartialDruidQuery.createOuterQuery(partialDruidQuery).withAggregate(aggregate);
+    }
+    currentStage = PartialDruidQuery.Stage.AGGREGATE;
+    return result;
+  }
+
+  @Override
+  public RelNode visit(LogicalMatch match)
+  {
+    return null;
+  }
+
+  @Override
+  public RelNode visit(LogicalSort sort)
+  {
+    return visitSort(sort);
+  }
+
+  @Override
+  public RelNode visit(LogicalExchange exchange)
+  {
+    return null;
+  }
+
+  private RelNode visitProject(Project project)
+  {
+    boolean rootForReal = isRoot;
+    isRoot = false;
+    RelNode result = super.visit(project);
+    if (rootForReal && (currentStage == PartialDruidQuery.Stage.AGGREGATE
+                        || currentStage == PartialDruidQuery.Stage.HAVING_FILTER)) {
+      partialDruidQuery = partialDruidQuery.withAggregateProject(project);
+      currentStage = PartialDruidQuery.Stage.AGGREGATE_PROJECT;
+    } else if (currentStage == PartialDruidQuery.Stage.SCAN || currentStage == PartialDruidQuery.Stage.WHERE_FILTER) {
+      partialDruidQuery = partialDruidQuery.withSelectProject(project);
+      currentStage = PartialDruidQuery.Stage.SELECT_PROJECT;
+    } else if (currentStage == PartialDruidQuery.Stage.SELECT_PROJECT) {
+      partialDruidQuery = partialDruidQuery.mergeProject(project);
+      currentStage = PartialDruidQuery.Stage.SELECT_PROJECT;
+    } else if (currentStage == PartialDruidQuery.Stage.SORT) {
+      partialDruidQuery = partialDruidQuery.withSortProject(project);
+      currentStage = PartialDruidQuery.Stage.SORT_PROJECT;
+    } else {
+      queryList.add(partialDruidQuery);
+      queryTables.add(currentTable);
+      partialDruidQuery = PartialDruidQuery.createOuterQuery(partialDruidQuery).withSelectProject(project);
+      currentStage = PartialDruidQuery.Stage.SELECT_PROJECT;
+    }
+    return result;
+  }
+
+  private RelNode visitSort(Sort sort)
+  {
+    isRoot = false;
+    RelNode result = super.visit(sort);
+    if (PartialDruidQuery.Stage.SORT.canFollow(currentStage)) {
+      partialDruidQuery = partialDruidQuery.withSort(sort);
+    } else {
+      queryList.add(partialDruidQuery);
+      queryTables.add(currentTable);
+      partialDruidQuery = PartialDruidQuery.createOuterQuery(partialDruidQuery).withSort(sort);
+    }
+    currentStage = PartialDruidQuery.Stage.SORT;
+    return result;
+  }
+
+  private RelNode visitAggregate(Aggregate aggregate)
+  {
+    isRoot = false;
+    RelNode result = super.visit(aggregate);
+    if (PartialDruidQuery.Stage.AGGREGATE.canFollow(currentStage)) {
+      partialDruidQuery = partialDruidQuery.withAggregate(aggregate);
+    } else {
+      queryList.add(partialDruidQuery);
+      queryTables.add(currentTable);
+      partialDruidQuery = PartialDruidQuery.createOuterQuery(partialDruidQuery).withAggregate(aggregate);
+    }
+    currentStage = PartialDruidQuery.Stage.AGGREGATE;
+    return result;
+  }
+
+  @Override
+  public RelNode visit(RelNode other)
+  {
+    if (other instanceof TableScan) {

Review Comment:
   ## Chain of 'instanceof' tests
   
   This if block performs a chain of 6 type tests - consider alternatives, e.g. polymorphism or the visitor pattern.
   
   [Show more details](https://github.com/apache/druid/security/code-scanning/4930)



##########
sql/src/main/java/org/apache/druid/sql/calcite/rel/logical/DruidTableScan.java:
##########
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.sql.calcite.rel.logical;
+
+import com.google.common.collect.ImmutableList;
+import org.apache.calcite.plan.RelOptCluster;
+import org.apache.calcite.plan.RelOptCost;
+import org.apache.calcite.plan.RelOptPlanner;
+import org.apache.calcite.plan.RelOptTable;
+import org.apache.calcite.plan.RelTraitSet;
+import org.apache.calcite.rel.RelCollationTraitDef;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.rel.RelWriter;
+import org.apache.calcite.rel.core.Project;
+import org.apache.calcite.rel.core.TableScan;
+import org.apache.calcite.rel.metadata.RelMetadataQuery;
+import org.apache.calcite.rel.type.RelDataType;
+import org.apache.calcite.schema.Table;
+
+import java.util.List;
+
+/**
+ * {@link DruidLogicalNode} convention node for {@link TableScan} plan node.
+ */
+public class DruidTableScan extends TableScan implements DruidLogicalNode

Review Comment:
   ## No clone method
   
   No clone method, yet implements Cloneable.
   
   [Show more details](https://github.com/apache/druid/security/code-scanning/4932)



##########
sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidQueryGenerator.java:
##########
@@ -0,0 +1,336 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.sql.calcite.planner;
+
+import com.google.common.collect.ImmutableList;
+import org.apache.calcite.plan.RelOptTable;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.rel.RelShuttleImpl;
+import org.apache.calcite.rel.core.Aggregate;
+import org.apache.calcite.rel.core.Filter;
+import org.apache.calcite.rel.core.Project;
+import org.apache.calcite.rel.core.Sort;
+import org.apache.calcite.rel.core.TableFunctionScan;
+import org.apache.calcite.rel.core.TableScan;
+import org.apache.calcite.rel.logical.LogicalAggregate;
+import org.apache.calcite.rel.logical.LogicalCorrelate;
+import org.apache.calcite.rel.logical.LogicalExchange;
+import org.apache.calcite.rel.logical.LogicalFilter;
+import org.apache.calcite.rel.logical.LogicalIntersect;
+import org.apache.calcite.rel.logical.LogicalJoin;
+import org.apache.calcite.rel.logical.LogicalMatch;
+import org.apache.calcite.rel.logical.LogicalMinus;
+import org.apache.calcite.rel.logical.LogicalProject;
+import org.apache.calcite.rel.logical.LogicalSort;
+import org.apache.calcite.rel.logical.LogicalUnion;
+import org.apache.calcite.rel.logical.LogicalValues;
+import org.apache.calcite.rex.RexLiteral;
+import org.apache.druid.java.util.common.ISE;
+import org.apache.druid.query.InlineDataSource;
+import org.apache.druid.segment.column.RowSignature;
+import org.apache.druid.sql.calcite.rel.PartialDruidQuery;
+import org.apache.druid.sql.calcite.rel.logical.DruidTableScan;
+import org.apache.druid.sql.calcite.rule.DruidLogicalValuesRule;
+import org.apache.druid.sql.calcite.table.DruidTable;
+import org.apache.druid.sql.calcite.table.InlineTable;
+import org.apache.druid.sql.calcite.table.RowSignatures;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.stream.Collectors;
+
+
+/**
+ * Converts a DAG of {@link org.apache.druid.sql.calcite.rel.logical.DruidLogicalNode} convention to a native
+ * Druid query for execution. The convertion is done via a {@link org.apache.calcite.rel.RelShuttle} visitor
+ * implementation.
+ */
+public class DruidQueryGenerator extends RelShuttleImpl
+{
+  private final List<PartialDruidQuery> queryList = new ArrayList<>();
+  private final List<DruidTable> queryTables = new ArrayList<>();
+  private final PlannerContext plannerContext;
+  private PartialDruidQuery partialDruidQuery;
+  private PartialDruidQuery.Stage currentStage = null;
+  private DruidTable currentTable = null;
+  private boolean isRoot = true;
+
+  public DruidQueryGenerator(PlannerContext plannerContext)
+  {
+    this.plannerContext = plannerContext;
+  }
+
+  @Override
+  public RelNode visit(TableScan scan)
+  {
+    if (!(scan instanceof DruidTableScan)) {
+      throw new ISE("Planning hasn't converted logical table scan to druid convention");
+    }
+    DruidTableScan druidTableScan = (DruidTableScan) scan;
+    isRoot = false;
+    RelNode result = super.visit(scan);
+    partialDruidQuery = PartialDruidQuery.create(scan);
+    currentStage = PartialDruidQuery.Stage.SCAN;
+    final RelOptTable table = scan.getTable();
+    final DruidTable druidTable = table.unwrap(DruidTable.class);
+    if (druidTable != null) {
+      currentTable = druidTable;
+    }
+    if (druidTableScan.getProject() != null) {
+      partialDruidQuery = partialDruidQuery.withSelectProject(druidTableScan.getProject());
+      currentStage = PartialDruidQuery.Stage.SELECT_PROJECT;
+    }
+    return result;
+  }
+
+  @Override
+  public RelNode visit(TableFunctionScan scan)
+  {
+    return null;
+  }
+
+  @Override
+  public RelNode visit(LogicalValues values)
+  {
+    isRoot = false;
+    RelNode result = super.visit(values);
+    final List<ImmutableList<RexLiteral>> tuples = values.getTuples();
+    final List<Object[]> objectTuples = tuples
+        .stream()
+        .map(tuple -> tuple
+            .stream()
+            .map(v -> DruidLogicalValuesRule.getValueFromLiteral(v, plannerContext))
+            .collect(Collectors.toList())
+            .toArray(new Object[0])
+        )
+        .collect(Collectors.toList());
+    RowSignature rowSignature = RowSignatures.fromRelDataType(
+        values.getRowType().getFieldNames(),
+        values.getRowType()
+    );
+    currentTable = new InlineTable(InlineDataSource.fromIterable(objectTuples, rowSignature));
+    if (currentStage == null) {
+      partialDruidQuery = PartialDruidQuery.create(values);
+      currentStage = PartialDruidQuery.Stage.SCAN;
+    } else {
+      throw new ISE("Values node found at non leaf node in the plan");
+    }
+    return result;
+  }
+
+  @Override
+  public RelNode visit(LogicalFilter filter)
+  {
+    return visitFilter(filter);
+  }
+
+  public RelNode visitFilter(Filter filter)
+  {
+    isRoot = false;
+    RelNode result = super.visit(filter);
+    if (currentStage == PartialDruidQuery.Stage.AGGREGATE) {
+      partialDruidQuery = partialDruidQuery.withHavingFilter(filter);
+      currentStage = PartialDruidQuery.Stage.HAVING_FILTER;
+    } else if (currentStage == PartialDruidQuery.Stage.SCAN) {
+      partialDruidQuery = partialDruidQuery.withWhereFilter(filter);
+      currentStage = PartialDruidQuery.Stage.WHERE_FILTER;
+    } else if (currentStage == PartialDruidQuery.Stage.SELECT_PROJECT) {
+      PartialDruidQuery old = partialDruidQuery;
+      partialDruidQuery = PartialDruidQuery.create(old.getScan());
+      partialDruidQuery = partialDruidQuery.withWhereFilter(filter);
+      partialDruidQuery = partialDruidQuery.withSelectProject(old.getSelectProject());
+      currentStage = PartialDruidQuery.Stage.SELECT_PROJECT;
+    } else {
+      queryList.add(partialDruidQuery);
+      queryTables.add(currentTable);
+      partialDruidQuery = PartialDruidQuery.createOuterQuery(partialDruidQuery).withWhereFilter(filter);
+      currentStage = PartialDruidQuery.Stage.WHERE_FILTER;
+    }
+    return result;
+  }
+
+  @Override
+  public RelNode visit(LogicalProject project)
+  {
+    return visitProject(project);
+  }
+
+  @Override
+  public RelNode visit(LogicalJoin join)
+  {
+    throw new UnsupportedOperationException("Found join");
+  }
+
+  @Override
+  public RelNode visit(LogicalCorrelate correlate)
+  {
+    return null;
+  }
+
+  @Override
+  public RelNode visit(LogicalUnion union)
+  {
+    throw new UnsupportedOperationException("Found union");
+  }
+
+  @Override
+  public RelNode visit(LogicalIntersect intersect)
+  {
+    return null;
+  }
+
+  @Override
+  public RelNode visit(LogicalMinus minus)
+  {
+    return null;
+  }
+
+  @Override
+  public RelNode visit(LogicalAggregate aggregate)
+  {
+    isRoot = false;
+    RelNode result = super.visit(aggregate);
+    if (PartialDruidQuery.Stage.AGGREGATE.canFollow(currentStage)) {
+      partialDruidQuery = partialDruidQuery.withAggregate(aggregate);
+    } else {
+      queryList.add(partialDruidQuery);
+      queryTables.add(currentTable);
+      partialDruidQuery = PartialDruidQuery.createOuterQuery(partialDruidQuery).withAggregate(aggregate);
+    }
+    currentStage = PartialDruidQuery.Stage.AGGREGATE;
+    return result;
+  }
+
+  @Override
+  public RelNode visit(LogicalMatch match)
+  {
+    return null;
+  }
+
+  @Override
+  public RelNode visit(LogicalSort sort)
+  {
+    return visitSort(sort);
+  }
+
+  @Override
+  public RelNode visit(LogicalExchange exchange)
+  {
+    return null;
+  }
+
+  private RelNode visitProject(Project project)
+  {
+    boolean rootForReal = isRoot;
+    isRoot = false;
+    RelNode result = super.visit(project);
+    if (rootForReal && (currentStage == PartialDruidQuery.Stage.AGGREGATE
+                        || currentStage == PartialDruidQuery.Stage.HAVING_FILTER)) {
+      partialDruidQuery = partialDruidQuery.withAggregateProject(project);
+      currentStage = PartialDruidQuery.Stage.AGGREGATE_PROJECT;
+    } else if (currentStage == PartialDruidQuery.Stage.SCAN || currentStage == PartialDruidQuery.Stage.WHERE_FILTER) {
+      partialDruidQuery = partialDruidQuery.withSelectProject(project);
+      currentStage = PartialDruidQuery.Stage.SELECT_PROJECT;
+    } else if (currentStage == PartialDruidQuery.Stage.SELECT_PROJECT) {
+      partialDruidQuery = partialDruidQuery.mergeProject(project);
+      currentStage = PartialDruidQuery.Stage.SELECT_PROJECT;
+    } else if (currentStage == PartialDruidQuery.Stage.SORT) {
+      partialDruidQuery = partialDruidQuery.withSortProject(project);
+      currentStage = PartialDruidQuery.Stage.SORT_PROJECT;
+    } else {
+      queryList.add(partialDruidQuery);
+      queryTables.add(currentTable);
+      partialDruidQuery = PartialDruidQuery.createOuterQuery(partialDruidQuery).withSelectProject(project);
+      currentStage = PartialDruidQuery.Stage.SELECT_PROJECT;
+    }
+    return result;
+  }
+
+  private RelNode visitSort(Sort sort)
+  {
+    isRoot = false;
+    RelNode result = super.visit(sort);
+    if (PartialDruidQuery.Stage.SORT.canFollow(currentStage)) {
+      partialDruidQuery = partialDruidQuery.withSort(sort);
+    } else {
+      queryList.add(partialDruidQuery);
+      queryTables.add(currentTable);
+      partialDruidQuery = PartialDruidQuery.createOuterQuery(partialDruidQuery).withSort(sort);
+    }
+    currentStage = PartialDruidQuery.Stage.SORT;
+    return result;
+  }
+
+  private RelNode visitAggregate(Aggregate aggregate)
+  {
+    isRoot = false;
+    RelNode result = super.visit(aggregate);
+    if (PartialDruidQuery.Stage.AGGREGATE.canFollow(currentStage)) {
+      partialDruidQuery = partialDruidQuery.withAggregate(aggregate);
+    } else {
+      queryList.add(partialDruidQuery);
+      queryTables.add(currentTable);
+      partialDruidQuery = PartialDruidQuery.createOuterQuery(partialDruidQuery).withAggregate(aggregate);
+    }
+    currentStage = PartialDruidQuery.Stage.AGGREGATE;
+    return result;
+  }
+
+  @Override
+  public RelNode visit(RelNode other)
+  {
+    if (other instanceof TableScan) {
+      return visit((TableScan) other);
+    } else if (other instanceof Project) {
+      return visitProject((Project) other);
+    } else if (other instanceof Sort) {
+      return visitSort((Sort) other);
+    } else if (other instanceof Aggregate) {
+      return visitAggregate((Aggregate) other);
+    } else if (other instanceof Filter) {
+      return visitFilter((Filter) other);
+    } else if (other instanceof LogicalValues) {
+      return visit((LogicalValues) other);
+    }
+
+    return super.visit(other);
+  }
+
+  public PartialDruidQuery getPartialDruidQuery()
+  {
+    return partialDruidQuery;
+  }
+
+  public List<PartialDruidQuery> getQueryList()

Review Comment:
   ## Exposing internal representation
   
   getQueryList exposes the internal representation stored in field queryList. The value may be modified [after this call to getQueryList](1).
   
   [Show more details](https://github.com/apache/druid/security/code-scanning/4929)



##########
sql/src/main/java/org/apache/druid/sql/calcite/rel/logical/DruidAggregate.java:
##########
@@ -0,0 +1,101 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.sql.calcite.rel.logical;
+
+import org.apache.calcite.plan.RelOptCluster;
+import org.apache.calcite.plan.RelOptCost;
+import org.apache.calcite.plan.RelOptPlanner;
+import org.apache.calcite.plan.RelTraitSet;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.rel.RelWriter;
+import org.apache.calcite.rel.core.Aggregate;
+import org.apache.calcite.rel.core.AggregateCall;
+import org.apache.calcite.rel.metadata.RelMetadataQuery;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.druid.sql.calcite.planner.PlannerContext;
+import org.apache.druid.sql.calcite.rel.CostEstimates;
+
+import java.util.List;
+
+/**
+ * {@link DruidLogicalNode} convention node for {@link Aggregate} plan node.
+ */
+public class DruidAggregate extends Aggregate implements DruidLogicalNode

Review Comment:
   ## No clone method
   
   No clone method, yet implements Cloneable.
   
   [Show more details](https://github.com/apache/druid/security/code-scanning/4931)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org

[GitHub] [druid] cheddar merged pull request #14232: Decouple logical planning and native query generation in SQL planning

Posted by "cheddar (via GitHub)" <gi...@apache.org>.

cheddar merged PR #14232:
URL: https://github.com/apache/druid/pull/14232


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org

[GitHub] [druid] imply-cheddar commented on a diff in pull request #14232: Decouple logical planning and native query generation in SQL planning

Posted by "imply-cheddar (via GitHub)" <gi...@apache.org>.

imply-cheddar commented on code in PR #14232:
URL: https://github.com/apache/druid/pull/14232#discussion_r1233505654


##########
sql/src/main/java/org/apache/druid/sql/calcite/planner/PlannerFactory.java:
##########
@@ -163,6 +164,7 @@ private FrameworkConfig buildFrameworkConfig(PlannerContext plannerContext)
         .typeSystem(DruidTypeSystem.INSTANCE)
         .defaultSchema(rootSchema.getSubSchema(druidSchemaName))
         .sqlToRelConverterConfig(sqlToRelConverterConfig)
+        .costFactory(new DruidVolcanoCost.Factory())

Review Comment:
   You have the if statement below to add this if it's in DECOUPLED mode, but it would appear as though you are adding it in all instances as well.  It's interesting that all tests continue to pass, which should give us confidence to just attach this in all cases, but let's be conservative and minimize impact to other modes of planning for now, so I think we can remove this.



##########
sql/src/main/java/org/apache/druid/sql/calcite/planner/PlannerFactory.java:
##########
@@ -191,7 +193,15 @@ public SqlConformance conformance()
               return null;
             }
           }
-        })
-        .build();
+        });
+
+    if (plannerConfig().getNativeQuerySqlPlanningMode()
+                       .equals(PlannerConfig.NATIVE_QUERY_SQL_PLANNING_MODE_DECOUPLED)

Review Comment:
   `getNativeQuerySqlPlanningMode()` could maybe be null and generate an NPE.  There is perhaps a `@NotNull` annotation to try to protect from it, but you can just write your equals better and not have to rely on such things:
   
   ```
   PlannerConfig.NATIVE_QUERY_SQL_PLANNING_MODE_DECOUPLED.equals(plannerConfig().getNativeSqlPlanningMode())
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org

[GitHub] [druid] rohangarg commented on pull request #14232: Decouple logical planning and native query generation in SQL planning

Posted by "rohangarg (via GitHub)" <gi...@apache.org>.

rohangarg commented on PR #14232:
URL: https://github.com/apache/druid/pull/14232#issuecomment-1600508796

   @abhishekagarwal87 : An example of the intermediate DAG looks like : 
   
   ```
   DruidSort(sort0=[$2], sort1=[$0], dir0=[DESC], dir1=[ASC], druid=[logical]): rowcount = 100.0, cumulative cost = {201.0 rows, 1067.0 cpu, 0.0 io}, id = 136
     DruidProject(dim1=[$0], EXPR$1=[SUBSTRING($0, 2)], EXPR$2=[CHAR_LENGTH($0)], druid=[logical]): rowcount = 100.0, cumulative cost = {101.0 rows, 67.0 cpu, 0.0 io}, id = 135
       DruidAggregate(group=[{1}], druid=[logical]): rowcount = 100.0, cumulative cost = {101.0 rows, 17.0 cpu, 0.0 io}, id = 134
         DruidTableScan(table=[[druid, foo]], druid=[logical]): rowcount = 1000.0, cumulative cost = {tiny}, id = 130
   ```
   
   This is for the sql query : 
   ```
   SELECT
     dim1, SUBSTRING(dim1, 2)
   FROM druid.foo
   GROUP BY dim1
   ORDER BY CHARACTER_LENGTH(dim1) DESC, dim1
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org

[GitHub] [druid] clintropolis commented on pull request #14232: Decouple logical planning and native query generation in SQL planning

Posted by "clintropolis (via GitHub)" <gi...@apache.org>.

clintropolis commented on PR #14232:
URL: https://github.com/apache/druid/pull/14232#issuecomment-1558508190

   +1 for design proposal


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org

[GitHub] [druid] abhishekagarwal87 commented on pull request #14232: Decouple logical planning and native query generation in SQL planning

Posted by "abhishekagarwal87 (via GitHub)" <gi...@apache.org>.

abhishekagarwal87 commented on PR #14232:
URL: https://github.com/apache/druid/pull/14232#issuecomment-1598096912

   @rohangarg - also what does that intermediate DAG look like? can you post an example? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org

[GitHub] [druid] rohangarg commented on a diff in pull request #14232: Decouple logical planning and native query generation in SQL planning

Posted by "rohangarg (via GitHub)" <gi...@apache.org>.

rohangarg commented on code in PR #14232:
URL: https://github.com/apache/druid/pull/14232#discussion_r1233919269


##########
sql/src/main/java/org/apache/druid/sql/calcite/planner/PlannerFactory.java:
##########
@@ -191,7 +193,15 @@ public SqlConformance conformance()
               return null;
             }
           }
-        })
-        .build();
+        });
+
+    if (plannerConfig().getNativeQuerySqlPlanningMode()
+                       .equals(PlannerConfig.NATIVE_QUERY_SQL_PLANNING_MODE_DECOUPLED)

Review Comment:
   thanks for the catch, fixed it!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org

[GitHub] [druid] cheddar commented on pull request #14232: Decouple logical planning and native query generation in SQL planning

Posted by "cheddar (via GitHub)" <gi...@apache.org>.

cheddar commented on PR #14232:
URL: https://github.com/apache/druid/pull/14232#issuecomment-1597856537

   I validated that this won't impact the current code paths, it requires a specific context parameter to be set to opt-in and will not impact any current behaviors.  As such, I'm going to go ahead and merge this so that we can get it in and keep iterating on making more tests pass.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org