Posted to commits@calcite.apache.org by GitBox <gi...@apache.org> on 2022/02/14 09:31:32 UTC

[GitHub] [calcite] zabetak commented on a change in pull request #2690: RepeatUnion improvements: CALCITE-3673 + CALCITE-4054

zabetak commented on a change in pull request #2690:
URL: https://github.com/apache/calcite/pull/2690#discussion_r805648149



##########
File path: core/src/main/java/org/apache/calcite/rel/core/RepeatUnion.java
##########
@@ -61,12 +64,20 @@
    */
   public final int iterationLimit;
 
+  /**
+   * Transient table where repeat union's intermediate results will be stored.
+   */
+  protected final RelOptTable transientTable;
+

Review comment:
       Do we need to update the class javadoc based on this change to explain how this table is supposed to be used? Also, can this be null (optional) in case some concrete implementations do not want to make use of it?
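
       To make the second question concrete, here is a minimal sketch of what an optional transient table could look like. It is illustrative only: `RepeatUnionSketch`, its nested `Table` interface, and the `transientTable()` accessor are hypothetical stand-ins so the snippet compiles on its own; they are not the Calcite `RepeatUnion` or `RelOptTable` types and not what this PR implements.

```java
import java.util.Optional;

/** Hypothetical stand-in for RepeatUnion; NOT the Calcite class. */
class RepeatUnionSketch {
  /** Hypothetical stand-in for RelOptTable, reduced to a name. */
  interface Table {
    String name();
  }

  /**
   * Transient table where intermediate results would be stored.
   * In this sketch it may be null when an implementation does not use it.
   */
  private final Table transientTable;

  RepeatUnionSketch(Table transientTable) {
    this.transientTable = transientTable; // null is allowed here by assumption
  }

  /** Returns the transient table, or empty if none was supplied. */
  Optional<Table> transientTable() {
    return Optional.ofNullable(transientTable);
  }
}
```

       If the field stays mandatory instead, stating that in the javadoc (and checking it in the constructor) would answer the same question from the other direction.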

##########
File path: core/src/test/java/org/apache/calcite/test/enumerable/EnumerableRepeatUnionTest.java
##########
@@ -233,4 +245,83 @@
         .returnsOrdered("i=1", "i=2", "i=null", "i=3", "i=2", "i=3", "i=3");
   }
 
+  /** Test case for
+   * <a href="https://issues.apache.org/jira/browse/CALCITE-4054">[CALCITE-4054]
+   * RepeatUnion containing a Correlate with a transientScan on its RHS causes NPE</a>. */
+  @Test void testRepeatUnionWithCorrelateWithTransientScanOnItsRight() {
+    CalciteAssert.that()
+        .with(CalciteConnectionProperty.LEX, Lex.JAVA)
+        .with(CalciteConnectionProperty.FORCE_DECORRELATE, false)
+        .withSchema("s", new ReflectiveSchema(new HierarchySchema()))
+        .query("?")
+        .withHook(Hook.PLANNER, (Consumer<RelOptPlanner>) planner -> {
+          planner.addRule(JoinToCorrelateRule.Config.DEFAULT.toRule());
+          planner.removeRule(JoinCommuteRule.Config.DEFAULT.toRule());
+          planner.removeRule(EnumerableRules.ENUMERABLE_MERGE_JOIN_RULE);
+          planner.removeRule(EnumerableRules.ENUMERABLE_JOIN_RULE);
+        })
+        .withRel(builder -> {
+          builder
+              //   WITH RECURSIVE delta(empid, name) as (
+              //     SELECT empid, name FROM emps WHERE empid = 2
+              //     UNION ALL
+              //     SELECT e.empid, e.name FROM delta d
+              //                            JOIN hierarchies h ON d.empid = h.managerid
+              //                            JOIN emps e        ON h.subordinateid = e.empid
+              //   )
+              //   SELECT empid, name FROM delta
+              .scan("s", "emps")
+              .filter(
+                  builder.equals(
+                      builder.field("empid"),
+                      builder.literal(2)))
+              .project(
+                  builder.field("emps", "empid"),
+                  builder.field("emps", "name"))
+
+              .transientScan("#DELTA#");
+          RelNode transientScan = builder.build(); // pop the transientScan to use it later
+
+          builder
+              .scan("s", "hierarchies")
+              .push(transientScan) // use the transientScan as right input of the join
+              .join(
+                  JoinRelType.INNER,
+                  builder.equals(
+                      builder.field(2, "#DELTA#", "empid"),
+                      builder.field(2, "hierarchies", "managerid")))
+
+              .scan("s", "emps")
+              .join(
+                  JoinRelType.INNER,
+                  builder.equals(
+                      builder.field(2, "hierarchies", "subordinateid"),
+                      builder.field(2, "emps", "empid")))
+              .project(
+                  builder.field("emps", "empid"),
+                  builder.field("emps", "name"))
+              .repeatUnion("#DELTA#", true);
+          return builder.build();
+        })
+        .explainHookMatches(""
+            + "EnumerableRepeatUnion(all=[true])\n"
+            + "  EnumerableTableSpool(readType=[LAZY], writeType=[LAZY], table=[[#DELTA#]])\n"
+            + "    EnumerableCalc(expr#0..4=[{inputs}], expr#5=[2], expr#6=[=($t0, $t5)], empid=[$t0], name=[$t2], $condition=[$t6])\n"
+            + "      EnumerableTableScan(table=[[s, emps]])\n"
+            + "  EnumerableTableSpool(readType=[LAZY], writeType=[LAZY], table=[[#DELTA#]])\n"
+            + "    EnumerableCalc(expr#0..8=[{inputs}], empid=[$t4], name=[$t6])\n"
+            + "      EnumerableCorrelate(correlation=[$cor1], joinType=[inner], requiredColumns=[{1}])\n"
+            // It is important to have EnumerableCorrelate + #DELTA# table scan on its right

Review comment:
       An explanation of **why** it is important would be helpful.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.