You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@calcite.apache.org by GitBox <gi...@apache.org> on 2021/05/21 18:44:40 UTC

[GitHub] [calcite] scrozon opened a new pull request #2417: [CALCITE-4617] fix offset in SortJoinTransposeRule

scrozon opened a new pull request #2417:
URL: https://github.com/apache/calcite/pull/2417


   The sort node added on top of the join shouldn't re-apply the offset of the original sort node, otherwise we end up skipping rows twice


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] rubenada commented on a change in pull request #2417: [CALCITE-4617] Wrong offset when SortJoinTransposeRule pushes a sort node with an offset

Posted by GitBox <gi...@apache.org>.
rubenada commented on a change in pull request #2417:
URL: https://github.com/apache/calcite/pull/2417#discussion_r638801193



##########
File path: core/src/test/resources/org/apache/calcite/test/RelOptRulesTest.xml
##########
@@ -11751,7 +11751,7 @@ LogicalProject(DEPTNO=[$0], EMPNO=[$2])
     <Resource name="planAfter">
       <![CDATA[
 LogicalProject(DEPTNO=[$0], EMPNO=[$2])
-  LogicalSort(offset=[2], fetch=[10])
+  LogicalSort(fetch=[10])

Review comment:
       I think e.g. adding the following test at the end of `core\src\test\resources\sql\join.iq` (right before the `# End join.iq` line) should show the issue:
   ```
   ### [CALCITE-4617] Wrong offset when SortJoinTransposeRule pushes a sort node with an offset
   select * from (select * from "scott".emp) e left join (
     select * from "scott".dept d) using (deptno)
   order by empno limit 3 offset 1;
   +--------+-------+--------+-----------+------+------------+---------+---------+------------+----------+
   | DEPTNO | EMPNO | ENAME  | JOB       | MGR  | HIREDATE   | SAL     | COMM    | DNAME      | LOC      |
   +--------+-------+--------+-----------+------+------------+---------+---------+------------+----------+
   |     30 |  7499 | ALLEN  | SALESMAN  | 7698 | 1981-02-20 | 1600.00 |  300.00 | SALES      | CHICAGO  |
   |     30 |  7521 | WARD   | SALESMAN  | 7698 | 1981-02-22 | 1250.00 |  500.00 | SALES      | CHICAGO  |
   |     20 |  7566 | JONES  | MANAGER   | 7839 | 1981-02-04 | 2975.00 |         | RESEARCH   | DALLAS   |
   +--------+-------+--------+-----------+------+------------+---------+---------+------------+----------+
   (3 rows)
   
   !ok
   EnumerableCalc(expr#0..10=[{inputs}], expr#11=[COALESCE($t7, $t8)], DEPTNO=[$t11], EMPNO=[$t0], ENAME=[$t1], JOB=[$t2], MGR=[$t3], HIREDATE=[$t4], SAL=[$t5], COMM=[$t6], DNAME=[$t9], LOC=[$t10])
     EnumerableLimit(fetch=[3])
       EnumerableHashJoin(condition=[=($7, $8)], joinType=[left])
         EnumerableLimit(offset=[1], fetch=[3])
           EnumerableTableScan(table=[[scott, EMP]])
         EnumerableTableScan(table=[[scott, DEPT]])
   !plan
   ```
   
   The test fails since it does not return the expected result / plan. You can run this test via `CoreQuidemTest.java`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] scrozon commented on a change in pull request #2417: [CALCITE-4617] Wrong offset when SortJoinTransposeRule pushes a sort node with an offset

Posted by GitBox <gi...@apache.org>.
scrozon commented on a change in pull request #2417:
URL: https://github.com/apache/calcite/pull/2417#discussion_r638839071



##########
File path: core/src/test/resources/org/apache/calcite/test/RelOptRulesTest.xml
##########
@@ -11751,7 +11751,7 @@ LogicalProject(DEPTNO=[$0], EMPNO=[$2])
     <Resource name="planAfter">
       <![CDATA[
 LogicalProject(DEPTNO=[$0], EMPNO=[$2])
-  LogicalSort(offset=[2], fetch=[10])
+  LogicalSort(fetch=[10])

Review comment:
       thanks @rubenada I added the test and confirmed that before the change, the results were wrong:
   ```
   select * from (select * from "scott".emp) e left join (
     select * from "scott".dept d) using (deptno)
   order by empno limit 3 offset 1;
   +--------+-------+-------+----------+------+------------+---------+--------+----------+---------+
   | DEPTNO | EMPNO | ENAME | JOB      | MGR  | HIREDATE   | SAL     | COMM   | DNAME    | LOC     |
   +--------+-------+-------+----------+------+------------+---------+--------+----------+---------+
   |     30 |  7521 | WARD  | SALESMAN | 7698 | 1981-02-22 | 1250.00 | 500.00 | SALES    | CHICAGO |
   |     20 |  7566 | JONES | MANAGER  | 7839 | 1981-02-04 | 2975.00 |        | RESEARCH | DALLAS  |
   +--------+-------+-------+----------+------+------------+---------+--------+----------+---------+
   (2 rows)
   
   !ok
   EnumerableCalc(expr#0..10=[{inputs}], expr#11=[COALESCE($t7, $t8)], DEPTNO=[$t11], EMPNO=[$t0], ENAME=[$t1], JOB=[$t2], MGR=[$t3], HIREDATE=[$t4], SAL=[$t5], COMM=[$t6], DNAME=[$t9], LOC=[$t10])
     EnumerableLimit(offset=[1], fetch=[3])
       EnumerableHashJoin(condition=[=($7, $8)], joinType=[left])
         EnumerableLimit(offset=[1], fetch=[3])
           EnumerableTableScan(table=[[scott, EMP]])
         EnumerableTableScan(table=[[scott, DEPT]])
   !plan
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] rubenada commented on a change in pull request #2417: [CALCITE-4617] Wrong offset when SortJoinTransposeRule pushes a sort node with an offset

Posted by GitBox <gi...@apache.org>.
rubenada commented on a change in pull request #2417:
URL: https://github.com/apache/calcite/pull/2417#discussion_r637856356



##########
File path: core/src/test/resources/org/apache/calcite/test/RelOptRulesTest.xml
##########
@@ -11751,7 +11751,7 @@ LogicalProject(DEPTNO=[$0], EMPNO=[$2])
     <Resource name="planAfter">
       <![CDATA[
 LogicalProject(DEPTNO=[$0], EMPNO=[$2])
-  LogicalSort(offset=[2], fetch=[10])
+  LogicalSort(fetch=[10])

Review comment:
       I meant a unit test that verified the query result, not the query plan.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] rubenada commented on a change in pull request #2417: [CALCITE-4617] Wrong offset when SortJoinTransposeRule pushes a sort node with an offset

Posted by GitBox <gi...@apache.org>.
rubenada commented on a change in pull request #2417:
URL: https://github.com/apache/calcite/pull/2417#discussion_r638801193



##########
File path: core/src/test/resources/org/apache/calcite/test/RelOptRulesTest.xml
##########
@@ -11751,7 +11751,7 @@ LogicalProject(DEPTNO=[$0], EMPNO=[$2])
     <Resource name="planAfter">
       <![CDATA[
 LogicalProject(DEPTNO=[$0], EMPNO=[$2])
-  LogicalSort(offset=[2], fetch=[10])
+  LogicalSort(fetch=[10])

Review comment:
       I think adding the following test at the end of `core\src\test\resources\sql\join.iq` (right before the `# End join.iq` line) should show the issue:
   ```
   ### [CALCITE-4617] Wrong offset when SortJoinTransposeRule pushes a sort node with an offset
   select * from (select * from "scott".emp) e left join (
     select * from "scott".dept d) using (deptno)
   order by empno limit 3 offset 1;
   +--------+-------+--------+-----------+------+------------+---------+---------+------------+----------+
   | DEPTNO | EMPNO | ENAME  | JOB       | MGR  | HIREDATE   | SAL     | COMM    | DNAME      | LOC      |
   +--------+-------+--------+-----------+------+------------+---------+---------+------------+----------+
   |     30 |  7499 | ALLEN  | SALESMAN  | 7698 | 1981-02-20 | 1600.00 |  300.00 | SALES      | CHICAGO  |
   |     30 |  7521 | WARD   | SALESMAN  | 7698 | 1981-02-22 | 1250.00 |  500.00 | SALES      | CHICAGO  |
   |     20 |  7566 | JONES  | MANAGER   | 7839 | 1981-02-04 | 2975.00 |         | RESEARCH   | DALLAS   |
   +--------+-------+--------+-----------+------+------------+---------+---------+------------+----------+
   (3 rows)
   
   !ok
   EnumerableCalc(expr#0..10=[{inputs}], expr#11=[COALESCE($t7, $t8)], DEPTNO=[$t11], EMPNO=[$t0], ENAME=[$t1], JOB=[$t2], MGR=[$t3], HIREDATE=[$t4], SAL=[$t5], COMM=[$t6], DNAME=[$t9], LOC=[$t10])
     EnumerableLimit(fetch=[3])
       EnumerableHashJoin(condition=[=($7, $8)], joinType=[left])
         EnumerableLimit(offset=[1], fetch=[3])
           EnumerableTableScan(table=[[scott, EMP]])
         EnumerableTableScan(table=[[scott, DEPT]])
   !plan
   ```
   
   The test fails since it does not return the expected result / plan. You can run this test via `CoreQuidemTest.java`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] rubenada commented on pull request #2417: [CALCITE-4617] Wrong offset when SortJoinTransposeRule pushes a sort node with an offset

Posted by GitBox <gi...@apache.org>.
rubenada commented on pull request #2417:
URL: https://github.com/apache/calcite/pull/2417#issuecomment-850798010


   @scrozon as discussed in the Jira, could you please provide more unit tests for this rule?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] scrozon commented on a change in pull request #2417: [CALCITE-4617] Wrong offset when SortJoinTransposeRule pushes a sort node with an offset

Posted by GitBox <gi...@apache.org>.
scrozon commented on a change in pull request #2417:
URL: https://github.com/apache/calcite/pull/2417#discussion_r638763790



##########
File path: core/src/test/resources/org/apache/calcite/test/RelOptRulesTest.xml
##########
@@ -11751,7 +11751,7 @@ LogicalProject(DEPTNO=[$0], EMPNO=[$2])
     <Resource name="planAfter">
       <![CDATA[
 LogicalProject(DEPTNO=[$0], EMPNO=[$2])
-  LogicalSort(offset=[2], fetch=[10])
+  LogicalSort(fetch=[10])

Review comment:
       I'm not very familiar with the code base and unit/int tests available. Are there existing tests I can modify that already check the results of a query? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] scrozon commented on a change in pull request #2417: [CALCITE-4617] Wrong offset when SortJoinTransposeRule pushes a sort node with an offset

Posted by GitBox <gi...@apache.org>.
scrozon commented on a change in pull request #2417:
URL: https://github.com/apache/calcite/pull/2417#discussion_r641979050



##########
File path: core/src/main/java/org/apache/calcite/rel/rules/SortJoinTransposeRule.java
##########
@@ -181,4 +190,17 @@ default Config withOperandFor(Class<? extends Sort> sortClass,
           .as(Config.class);
     }
   }
+
+  @Nullable private RexNode newLimit(RexBuilder rexBuilder,
+      @Nullable RexNode offset, @Nullable RexNode limit) {
+    if (offset == null || limit == null) {
+      return limit;
+    }
+    final BigDecimal offsetValue = ((RexLiteral) offset).getValueAs(BigDecimal.class);
+    final BigDecimal limitValue = ((RexLiteral) limit).getValueAs(BigDecimal.class);

Review comment:
       is it safe to cast to `RexLiteral` or should I add a `instanceof` check? Are there cases where `offset`/`fetch` may not be a `RexLiteral`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] rubenada commented on pull request #2417: [CALCITE-4617] Wrong offset when SortJoinTransposeRule pushes a sort node with an offset

Posted by GitBox <gi...@apache.org>.
rubenada commented on pull request #2417:
URL: https://github.com/apache/calcite/pull/2417#issuecomment-850265910


   LGTM


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] rubenada commented on pull request #2417: [CALCITE-4617] Wrong offset when SortJoinTransposeRule pushes a sort node with an offset

Posted by GitBox <gi...@apache.org>.
rubenada commented on pull request #2417:
URL: https://github.com/apache/calcite/pull/2417#issuecomment-846895756


   Would it be possible to have a unit test "showing" the error? (i.e. the test returns an incorrect result with current code, and correct result with the patch)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2417: [CALCITE-4617] Wrong offset when SortJoinTransposeRule pushes a sort node with an offset

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2417:
URL: https://github.com/apache/calcite/pull/2417#discussion_r637850397



##########
File path: core/src/test/resources/org/apache/calcite/test/RelOptRulesTest.xml
##########
@@ -11751,7 +11751,7 @@ LogicalProject(DEPTNO=[$0], EMPNO=[$2])
     <Resource name="planAfter">
       <![CDATA[
 LogicalProject(DEPTNO=[$0], EMPNO=[$2])
-  LogicalSort(offset=[2], fetch=[10])
+  LogicalSort(fetch=[10])

Review comment:
       @rubenada , I believe this is the unit test that shows the error. The plan was wrong before the batch, and the fix results in the right plan.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org