You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@calcite.apache.org by GitBox <gi...@apache.org> on 2021/11/18 16:37:40 UTC

[GitHub] [calcite] asolimando opened a new pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

asolimando opened a new pull request #2613:
URL: https://github.com/apache/calcite/pull/2613


   ### Overview
   MV rewrite fails when at least one expression in the project of either the view or the query references, directly or indirectly, to more than one field.
   
   ### Changes
   - Added unit tests covering such "complex" expressions in both views and queries (highlighting the limitation)
   - Improved the implementation to handle such cases as well
   - Improved the error message for (possibly remaining) unhandled cases where a single expression can have multiple lineages, turned assertions into exceptions.
   
   Regarding the tests, the generated plans look fine, I was wondering if we could add tests with some actual data to double check that the rewriting is working as expected. So far I could not find the right place to add that, I will keep digging, not sure it's strictly needed, it does not seem to be done for any other case, no matter how complex.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] asolimando commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
asolimando commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r764732068



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2482,6 +2482,41 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(inputRef1.getIdentifier(), is(inputRef2.getIdentifier()));
   }
 
+  @Test void testExpressionLineageConjuntiveExpression() {
+    // empno is column 0 in catalog.sales.emp
+    // ename is column 1 in catalog.sales.emp
+    // deptno is column 7 in catalog.sales.emp
+    final RelNode rel = convertSql("select (empno = 1 or ename = 'abc') and deptno > 1 from emp");
+    final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
+
+    final RexNode ref = RexInputRef.of(0, rel.getRowType().getFieldList());
+    final Set<RexNode> r = mq.getExpressionLineage(rel, ref);
+    assertThat(r.size(), is(1));
+
+    final String actualExp = r.iterator().next().toString();
+    assertEquals("AND(OR(=([CATALOG, SALES, EMP].#0.$0, 1),"
+        + " =([CATALOG, SALES, EMP].#0.$1, 'abc')),"
+        + " >([CATALOG, SALES, EMP].#0.$7, 1))", actualExp);
+  }
+
+  @Test void testExpressionLineageBetweenExpressionWithJoin() {
+    // empno is column 0 in catalog.sales.emp
+    // deptno is column 0 in catalog.sales.dept
+    final RelNode rel = convertSql("select dept.deptno + empno between 1 and 2"
+        + " from emp join dept on emp.deptno = dept.deptno");
+    final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
+
+    final RexNode ref = RexInputRef.of(0, rel.getRowType().getFieldList());
+    final Set<RexNode> r = mq.getExpressionLineage(rel, ref);
+    assertThat(r.size(), is(1));
+
+    // 'dept.deptno + empno between 1 and 2' is translated into
+    // 'dept.deptno + empno >= 1 and dept.deptno + empno <= 2'
+    final String actualExp = r.iterator().next().toString();
+    assertEquals("AND(>=(+([CATALOG, SALES, DEPT].#0.$0, [CATALOG, SALES, EMP].#0.$0), 1),"

Review comment:
       I have added an assertion message, failures now look like this:
   ```
   For query 'select dept.deptno + empno between 1 and 2 from emp join dept on emp.deptno = dept.deptno':
   'empno' is column 0 in 'catalog.sales.emp', 'deptno' is column 0 in 'catalog.sales.dept', and 'dept.deptno + empno between 1 and 2' is translated into 'dept.deptno + empno >= 1 and dept.deptno + empno <= 2' ==> expected: <AND(>=(+([CATALOG, SALES, DEPT].#0.$0, [CATALOG, SALES, EMP].#0.$0), 1), <=(+([CATALOG, SALES, DEPT].#0.$0, [CATALOG, SALES, EMP].#0.$0), 2)> but was: <AND(>=(+([CATALOG, SALES, DEPT].#0.$0, [CATALOG, SALES, EMP].#0.$0), 1), <=(+([CATALOG, SALES, DEPT].#0.$0, [CATALOG, SALES, EMP].#0.$0), 2))>
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r764703289



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2482,6 +2482,41 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(inputRef1.getIdentifier(), is(inputRef2.getIdentifier()));
   }
 
+  @Test void testExpressionLineageConjuntiveExpression() {
+    // empno is column 0 in catalog.sales.emp
+    // ename is column 1 in catalog.sales.emp
+    // deptno is column 7 in catalog.sales.emp
+    final RelNode rel = convertSql("select (empno = 1 or ename = 'abc') and deptno > 1 from emp");

Review comment:
       It would be great if you could move those comments and the SQL in question to the last `assertEquals` in the test.
   
   Then the assertion call would be paired with the intention, and the failure message would include the intended behavior as well.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] asolimando commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
asolimando commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r763227407



##########
File path: core/src/main/java/org/apache/calcite/rel/rules/materialize/MaterializedViewRule.java
##########
@@ -1036,21 +1037,36 @@ protected NodeLineage generateSwapTableColumnReferencesLineage(
     final Map<RexNode, Integer> exprsLineage = new HashMap<>();
     final Map<RexNode, Integer> exprsLineageLosslessCasts = new HashMap<>();
     for (int i = 0; i < nodeExprs.size(); i++) {
-      final Set<RexNode> s = mq.getExpressionLineage(node, nodeExprs.get(i));
-      if (s == null) {
+      final Set<RexNode> lineages = mq.getExpressionLineage(node, nodeExprs.get(i));
+      if (lineages == null) {
         // Next expression
         continue;
       }
-      // We only support project - filter - join, thus it should map to
-      // a single expression
-      assert s.size() == 1;
-      // Rewrite expr. First we swap the table references following the table
-      // mapping, then we take first element from the corresponding equivalence class
-      final RexNode e = RexUtil.swapTableColumnReferences(rexBuilder,
-          s.iterator().next(), tableMapping, ec.getEquivalenceClassesMap());
-      exprsLineage.put(e, i);
-      if (RexUtil.isLosslessCast(e)) {
-        exprsLineageLosslessCasts.put(((RexCall) e).getOperands().get(0), i);
+      final RexNode expr = nodeExprs.get(i);
+      if (SqlKind.AND == expr.getKind() || SqlKind.OR == expr.getKind()) {

Review comment:
       Finally my diagnosis is that `mq.expressionsLineage` was indeed broken, it was decomposing top `AND` expressions for no apparent good reason.
   
   Luckily there were unit tests, just not where I expected them to be, I have managed to reproduce the issue in one of them, and double checked that the fix did handle it, and at the same time that nothing else broke.
   
   Please have another look @zabetak if you have some spare cycles, thanks a lot for the tip, I was heading down a slippery road :)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r764773927



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2483,38 +2484,43 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
   }
 
   @Test void testExpressionLineageConjuntiveExpression() {
-    // empno is column 0 in catalog.sales.emp
-    // ename is column 1 in catalog.sales.emp
-    // deptno is column 7 in catalog.sales.emp
-    final RelNode rel = convertSql("select (empno = 1 or ename = 'abc') and deptno > 1 from emp");
+    final String sql = "select (empno = 1 or ename = 'abc') and deptno > 1 from emp";
+    final RelNode rel = convertSql(sql);
     final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
 
     final RexNode ref = RexInputRef.of(0, rel.getRowType().getFieldList());
-    final Set<RexNode> r = mq.getExpressionLineage(rel, ref);
-    assertThat(r.size(), is(1));
+    final Set<RexNode> r = Objects.requireNonNull(mq.getExpressionLineage(rel, ref));

Review comment:
       The motivation is exactly the same: test failure should include meaningful information.
   If you add `requireNonNull`, `getOnlyElement`, etc, etc, then you make it harder to understand the nature of the failure.
   
   This case is trivial: the test expects a set with a single element. The assertion message is more or less OK.
   Why adding more checks before the final `assertEquals`?
   I believe, the final `assertEquals` should be more than enough to tell that "expected ... got null".




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] asolimando commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
asolimando commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r764989274



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2399,6 +2399,21 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(RelMdUtil.linear(12, 0, 10, 100, 200), is(200d));
   }
 
+  // ----------------------------------------------------------------------
+  // Tests for getExpressionLineage
+  // ----------------------------------------------------------------------
+
+  private void assertExpressionLineage(
+      String sql, int columnIndex, String expected, String comment) {
+    RelNode rel = convertSql(sql);
+    RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
+    RexNode ref = RexInputRef.of(columnIndex, rel.getRowType().getFieldList());
+    Set<RexNode> r = mq.getExpressionLineage(rel, ref);
+
+    assertEquals(expected, String.valueOf(r), "Lineage for expr '" + ref + "' in node '"
+        + rel + "'" + " for query '" + sql + "':n" + comment);

Review comment:
       Sorry about that, I lost a character in the refactoring and I did not notice, it used to be a `\n`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r764702164



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2482,6 +2482,41 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(inputRef1.getIdentifier(), is(inputRef2.getIdentifier()));
   }
 
+  @Test void testExpressionLineageConjuntiveExpression() {
+    // empno is column 0 in catalog.sales.emp
+    // ename is column 1 in catalog.sales.emp
+    // deptno is column 7 in catalog.sales.emp
+    final RelNode rel = convertSql("select (empno = 1 or ename = 'abc') and deptno > 1 from emp");
+    final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
+
+    final RexNode ref = RexInputRef.of(0, rel.getRowType().getFieldList());
+    final Set<RexNode> r = mq.getExpressionLineage(rel, ref);
+    assertThat(r.size(), is(1));

Review comment:
       It is sad when an assertion like this fails. It says like "expected 1 got 2", and you have absolutely no clue why 1 was expected, what caused 2 and so on.
   
   `assert(size)` before asserting collection contents is almost always a bad idea. There are corner cases like `assertAll`, soft asserts, etc, however, you are not using them here.
   
   In this case, I would suggest removing `assert(size` and use `set.toString()` with the relevant assertion message.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] asolimando commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
asolimando commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r764772388



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2483,38 +2484,43 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
   }
 
   @Test void testExpressionLineageConjuntiveExpression() {
-    // empno is column 0 in catalog.sales.emp
-    // ename is column 1 in catalog.sales.emp
-    // deptno is column 7 in catalog.sales.emp
-    final RelNode rel = convertSql("select (empno = 1 or ename = 'abc') and deptno > 1 from emp");
+    final String sql = "select (empno = 1 or ename = 'abc') and deptno > 1 from emp";
+    final RelNode rel = convertSql(sql);
     final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
 
     final RexNode ref = RexInputRef.of(0, rel.getRowType().getFieldList());
-    final Set<RexNode> r = mq.getExpressionLineage(rel, ref);
-    assertThat(r.size(), is(1));
+    final Set<RexNode> r = Objects.requireNonNull(mq.getExpressionLineage(rel, ref));

Review comment:
       No, it's not strictly needed but SonarLint reminded me that `getExpressionLineage` can return `null`, I thought it could be more explicit than a NPE later on in case of regression. But I don't mind removing it if you prefer.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r764929219



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2399,6 +2400,20 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(RelMdUtil.linear(12, 0, 10, 100, 200), is(200d));
   }
 
+  // ----------------------------------------------------------------------
+  // Tests for getExpressionLineage
+  // ----------------------------------------------------------------------
+
+  private void assertExpressionLineage(String sql, int columnIndex, String expected,
+      BiFunction<RexNode, RelNode, String> message) {
+    RelNode rel = convertSql(sql);
+    RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
+    RexNode ref = RexInputRef.of(columnIndex, rel.getRowType().getFieldList());
+    Set<RexNode> r = mq.getExpressionLineage(rel, ref);
+
+    assertEquals(expected, r.toString(), message.apply(ref, rel));

Review comment:
       `r.toString()` -> `String.valueOf(r)`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] asolimando commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
asolimando commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r764735775



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2482,6 +2482,41 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(inputRef1.getIdentifier(), is(inputRef2.getIdentifier()));
   }
 
+  @Test void testExpressionLineageConjuntiveExpression() {
+    // empno is column 0 in catalog.sales.emp
+    // ename is column 1 in catalog.sales.emp
+    // deptno is column 7 in catalog.sales.emp
+    final RelNode rel = convertSql("select (empno = 1 or ename = 'abc') and deptno > 1 from emp");
+    final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
+
+    final RexNode ref = RexInputRef.of(0, rel.getRowType().getFieldList());
+    final Set<RexNode> r = mq.getExpressionLineage(rel, ref);
+    assertThat(r.size(), is(1));

Review comment:
       I have replaced the assertion with the use of `Iterables.getOnlyElement`, that is: `String actualExp = Iterables.getOnlyElement(r).toString();`
   
   `Iterables.getOnlyElement` states explicitly that we were expecting a single element and we got a different number, I think it conveys the intention in case of failures and it keeps the test compact.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r764706775



##########
File path: core/src/main/java/org/apache/calcite/rel/rules/materialize/MaterializedViewJoinRule.java
##########
@@ -260,17 +260,20 @@
         : topProject.getProjects();
     List<RexNode> exprsLineage = new ArrayList<>(exprs.size());
     for (RexNode expr : exprs) {
-      Set<RexNode> s = mq.getExpressionLineage(node, expr);
-      if (s == null) {
+      Set<RexNode> lineages = mq.getExpressionLineage(node, expr);
+      if (lineages == null) {
         // Bail out
         return null;
       }
-      assert s.size() == 1;
+      if (lineages.size() != 1) {
+        throw new IllegalStateException("We only support project - filter - join, "
+            + "thus expression lineage should map to a single expression, got: " + lineages);

Review comment:
       Does it make sense to include the input `expr` and or `node` into the message as well?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r765012600



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2399,6 +2399,21 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(RelMdUtil.linear(12, 0, 10, 100, 200), is(200d));
   }
 
+  // ----------------------------------------------------------------------
+  // Tests for getExpressionLineage
+  // ----------------------------------------------------------------------
+
+  private void assertExpressionLineage(
+      String sql, int columnIndex, String expected, String comment) {
+    RelNode rel = convertSql(sql);
+    RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
+    RexNode ref = RexInputRef.of(columnIndex, rel.getRowType().getFieldList());
+    Set<RexNode> r = mq.getExpressionLineage(rel, ref);
+
+    assertEquals(expected, String.valueOf(r), "Lineage for expr '" + ref + "' in node '"
+        + rel + "'" + " for query '" + sql + "':n" + comment);

Review comment:
       https://github.com/apache/calcite/blob/fd3c1d0bdf907d192d42c18b3aff5306eec650e7/build.gradle.kts#L542 might be relevant




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r764752425



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2482,6 +2482,41 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(inputRef1.getIdentifier(), is(inputRef2.getIdentifier()));
   }
 
+  @Test void testExpressionLineageConjuntiveExpression() {
+    // empno is column 0 in catalog.sales.emp
+    // ename is column 1 in catalog.sales.emp
+    // deptno is column 7 in catalog.sales.emp
+    final RelNode rel = convertSql("select (empno = 1 or ename = 'abc') and deptno > 1 from emp");
+    final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
+
+    final RexNode ref = RexInputRef.of(0, rel.getRowType().getFieldList());
+    final Set<RexNode> r = mq.getExpressionLineage(rel, ref);
+    assertThat(r.size(), is(1));

Review comment:
       `Iterables.getOnlyElement` would be slightly better, however, I believe it would not convey the reason for "only one element expected" (==input SQL would be missing, etc, etc)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r764764003



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2483,38 +2484,43 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
   }
 
   @Test void testExpressionLineageConjuntiveExpression() {
-    // empno is column 0 in catalog.sales.emp
-    // ename is column 1 in catalog.sales.emp
-    // deptno is column 7 in catalog.sales.emp
-    final RelNode rel = convertSql("select (empno = 1 or ename = 'abc') and deptno > 1 from emp");
+    final String sql = "select (empno = 1 or ename = 'abc') and deptno > 1 from emp";
+    final RelNode rel = convertSql(sql);
     final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
 
     final RexNode ref = RexInputRef.of(0, rel.getRowType().getFieldList());
-    final Set<RexNode> r = mq.getExpressionLineage(rel, ref);
-    assertThat(r.size(), is(1));
+    final Set<RexNode> r = Objects.requireNonNull(mq.getExpressionLineage(rel, ref));

Review comment:
       is `requireNonNull` needed here?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] zabetak closed pull request #2613: [CALCITE-4894] MV rewriting fails for conjunctive top expressions in SELECT clause

Posted by GitBox <gi...@apache.org>.
zabetak closed pull request #2613:
URL: https://github.com/apache/calcite/pull/2613


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] asolimando commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
asolimando commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r765018998



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2399,6 +2399,21 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(RelMdUtil.linear(12, 0, 10, 100, 200), is(200d));
   }
 
+  // ----------------------------------------------------------------------
+  // Tests for getExpressionLineage
+  // ----------------------------------------------------------------------
+
+  private void assertExpressionLineage(
+      String sql, int columnIndex, String expected, String comment) {
+    RelNode rel = convertSql(sql);
+    RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
+    RexNode ref = RexInputRef.of(columnIndex, rel.getRowType().getFieldList());
+    Set<RexNode> r = mq.getExpressionLineage(rel, ref);
+
+    assertEquals(expected, String.valueOf(r), "Lineage for expr '" + ref + "' in node '"
+        + rel + "'" + " for query '" + sql + "':n" + comment);

Review comment:
       I have removed the `\n` to avoid issues and filed https://github.com/autostyle/autostyle/issues/30




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] asolimando commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
asolimando commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r764989274



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2399,6 +2399,21 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(RelMdUtil.linear(12, 0, 10, 100, 200), is(200d));
   }
 
+  // ----------------------------------------------------------------------
+  // Tests for getExpressionLineage
+  // ----------------------------------------------------------------------
+
+  private void assertExpressionLineage(
+      String sql, int columnIndex, String expected, String comment) {
+    RelNode rel = convertSql(sql);
+    RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
+    RexNode ref = RexInputRef.of(columnIndex, rel.getRowType().getFieldList());
+    Set<RexNode> r = mq.getExpressionLineage(rel, ref);
+
+    assertEquals(expected, String.valueOf(r), "Lineage for expr '" + ref + "' in node '"
+        + rel + "'" + " for query '" + sql + "':n" + comment);

Review comment:
       Sorry about that, I lost a character in the refactoring and I did not notice.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r764984852



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2399,6 +2399,21 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(RelMdUtil.linear(12, 0, 10, 100, 200), is(200d));
   }
 
+  // ----------------------------------------------------------------------
+  // Tests for getExpressionLineage
+  // ----------------------------------------------------------------------
+
+  private void assertExpressionLineage(
+      String sql, int columnIndex, String expected, String comment) {
+    RelNode rel = convertSql(sql);
+    RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
+    RexNode ref = RexInputRef.of(columnIndex, rel.getRowType().getFieldList());
+    Set<RexNode> r = mq.getExpressionLineage(rel, ref);
+
+    assertEquals(expected, String.valueOf(r), "Lineage for expr '" + ref + "' in node '"
+        + rel + "'" + " for query '" + sql + "':n" + comment);

Review comment:
       `:n`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] zabetak commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
zabetak commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r764397014



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2482,6 +2483,115 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(inputRef1.getIdentifier(), is(inputRef2.getIdentifier()));
   }
 
+  @Test void testExpressionLineageComplexExpression() {
+    // empno is column 0 in catalog.sales.emp
+    // ename is column 1 in catalog.sales.emp
+    // deptno is column 7 in catalog.sales.emp
+    final RelNode rel = convertSql("select (empno = 1 or ename = 'abc') and deptno > 1 from emp");
+    final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
+
+    final RexNode ref = RexInputRef.of(0, rel.getRowType().getFieldList());
+    final Set<RexNode> r = mq.getExpressionLineage(rel, ref);
+
+    // check '(empno = 1 or ename = 'abc') and deptno > 1'
+    assertThat(r.size(), is(1));
+    final RexNode result = r.iterator().next();
+    assertThat(result.getKind(), is(SqlKind.AND));
+    final RexCall and = (RexCall) result;
+
+    // check '(empno = 1 or ename = 'abc')'
+    assertThat(and.getOperands().size(), is(2));
+    final RexCall or = (RexCall) and.getOperands().get(0);
+    assertThat(or.getKind(), is(SqlKind.OR));
+    assertThat(or.getOperands().size(), is(2));
+
+    // check 'empno = 1'
+    final RexCall eq1 = (RexCall) or.getOperands().get(0);
+    assertThat(eq1.getKind(), is(SqlKind.EQUALS));
+    final RexTableInputRef inputRef1 = (RexTableInputRef) eq1.getOperands().get(0);
+    assertThat(inputRef1.getQualifiedName(), is(EMP_QNAME));
+    assertThat(inputRef1.getIndex(), is(0));
+    final RexLiteral literal1 = (RexLiteral) eq1.getOperands().get(1);
+    assertThat(literal1.getValueAs(Integer.class), is(1));
+
+    // check 'ename = 'abc''
+    final RexCall eq2 = (RexCall) or.getOperands().get(1);
+    assertThat(eq2.getKind(), is(SqlKind.EQUALS));
+    final RexTableInputRef inputRef2 = (RexTableInputRef) eq2.getOperands().get(0);
+    assertThat(inputRef2.getQualifiedName(), is(EMP_QNAME));
+    assertThat(inputRef2.getIndex(), is(1));
+    final RexLiteral literal2 = (RexLiteral) eq2.getOperands().get(1);
+    assertThat(literal2.getValueAs(String.class), is("abc"));
+
+    // check 'deptno > 1'
+    final RexCall gt = (RexCall) and.getOperands().get(1);
+    assertThat(gt.getKind(), is(SqlKind.GREATER_THAN));
+    final RexTableInputRef inputRef3 = (RexTableInputRef) gt.getOperands().get(0);
+    assertThat(inputRef3.getQualifiedName(), is(EMP_QNAME));
+    assertThat(inputRef3.getIndex(), is(7));
+    final RexLiteral literal3 = (RexLiteral) gt.getOperands().get(1);
+    assertThat(literal3.getValueAs(Integer.class), is(1));

Review comment:
       I know that you are following the style of previous tests in this class and that is good but I find a bit verbose for what we really need. Instead of having 17 `assertThat` can't we just have only one based on the string representation of the expression?
   
   Something like the following:
   ```
   String actualExp = r.iterator().next().toString();
   assertEquals("AND([CATALOG, SALES, EMP].#0.$0...", actualExp);
   ```
   What do you think?

##########
File path: core/src/main/java/org/apache/calcite/rel/metadata/RelMdExpressionLineage.java
##########
@@ -493,9 +493,8 @@ private static void createExpressions(RexBuilder rexBuilder,
       Map<RexInputRef, RexNode> singleMapping, Set<RexNode> result) {
     if (mapping.isEmpty()) {
       final RexReplacer replacer = new RexReplacer(singleMapping);
-      final List<RexNode> updatedPreds = new ArrayList<>(
-          RelOptUtil.conjunctions(
-              rexBuilder.copy(expr)));
+      final List<RexNode> updatedPreds = new ArrayList<>();
+      updatedPreds.add(rexBuilder.copy(expr));

Review comment:
       Micro optim, probably insignificant but since the list is always gonna have one element so let's keep it like that:
   
   ```suggestion
         final List<RexNode> updatedPreds = new ArrayList<>(1);
         updatedPreds.add(rexBuilder.copy(expr));
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r764919901



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2482,6 +2497,37 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(inputRef1.getIdentifier(), is(inputRef2.getIdentifier()));
   }
 
+  @Test void testExpressionLineageConjuntiveExpression() {
+    String sql = "select (empno = 1 or ename = 'abc') and deptno > 1 from emp";
+    String expected = "[AND(OR(=([CATALOG, SALES, EMP].#0.$0, 1), "
+        + "=([CATALOG, SALES, EMP].#0.$1, 'abc')), "
+        + ">([CATALOG, SALES, EMP].#0.$7, 1))]";
+
+    BiFunction<RexNode, RelNode, String> supplier = (ref, rel) -> "Lineage for expr '"
+        + ref + "' in node '" + rel + "'" + " for query '" + sql + "':\n"
+        + "'empno' is column 0 in 'catalog.sales.emp', "
+        + "'ename' is column 1 in 'catalog.sales.emp', and "
+        + "'deptno' is column 7 in 'catalog.sales.emp'";
+
+    assertExpressionLineage(sql, 0, expected, supplier);
+  }
+
+  @Test void testExpressionLineageBetweenExpressionWithJoin() {
+    String sql = "select dept.deptno + empno between 1 and 2"
+        + " from emp join dept on emp.deptno = dept.deptno";
+    String expected = "[AND(>=(+([CATALOG, SALES, DEPT].#0.$0, [CATALOG, SALES, EMP].#0.$0), 1),"
+        + " <=(+([CATALOG, SALES, DEPT].#0.$0, [CATALOG, SALES, EMP].#0.$0), 2))]";
+
+    BiFunction<RexNode, RelNode, String> supplier = (ref, rel) -> "Lineage for expr '"
+        + ref + "' in node '" + rel + "'" + " for query '" + sql + "':\n"

Review comment:
       Ah, it looks like `BiFunction` and `Supplier` is not realy needed here. Just `String comment` argument is enough for `assertExpressionLineage`, and add the prefix in `assertExpressionLineage`

##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2482,6 +2497,37 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(inputRef1.getIdentifier(), is(inputRef2.getIdentifier()));
   }
 
+  @Test void testExpressionLineageConjuntiveExpression() {
+    String sql = "select (empno = 1 or ename = 'abc') and deptno > 1 from emp";
+    String expected = "[AND(OR(=([CATALOG, SALES, EMP].#0.$0, 1), "
+        + "=([CATALOG, SALES, EMP].#0.$1, 'abc')), "
+        + ">([CATALOG, SALES, EMP].#0.$7, 1))]";
+
+    BiFunction<RexNode, RelNode, String> supplier = (ref, rel) -> "Lineage for expr '"
+        + ref + "' in node '" + rel + "'" + " for query '" + sql + "':\n"
+        + "'empno' is column 0 in 'catalog.sales.emp', "
+        + "'ename' is column 1 in 'catalog.sales.emp', and "
+        + "'deptno' is column 7 in 'catalog.sales.emp'";
+
+    assertExpressionLineage(sql, 0, expected, supplier);
+  }
+
+  @Test void testExpressionLineageBetweenExpressionWithJoin() {
+    String sql = "select dept.deptno + empno between 1 and 2"
+        + " from emp join dept on emp.deptno = dept.deptno";
+    String expected = "[AND(>=(+([CATALOG, SALES, DEPT].#0.$0, [CATALOG, SALES, EMP].#0.$0), 1),"
+        + " <=(+([CATALOG, SALES, DEPT].#0.$0, [CATALOG, SALES, EMP].#0.$0), 2))]";
+
+    BiFunction<RexNode, RelNode, String> supplier = (ref, rel) -> "Lineage for expr '"
+        + ref + "' in node '" + rel + "'" + " for query '" + sql + "':\n"

Review comment:
       Ah, it looks like `BiFunction` and `Supplier` is not really needed here. Just `String comment` argument is enough for `assertExpressionLineage`, and add the prefix in `assertExpressionLineage`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r764708109



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2482,6 +2482,41 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(inputRef1.getIdentifier(), is(inputRef2.getIdentifier()));
   }
 
+  @Test void testExpressionLineageConjuntiveExpression() {
+    // empno is column 0 in catalog.sales.emp
+    // ename is column 1 in catalog.sales.emp
+    // deptno is column 7 in catalog.sales.emp
+    final RelNode rel = convertSql("select (empno = 1 or ename = 'abc') and deptno > 1 from emp");
+    final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
+
+    final RexNode ref = RexInputRef.of(0, rel.getRowType().getFieldList());
+    final Set<RexNode> r = mq.getExpressionLineage(rel, ref);
+    assertThat(r.size(), is(1));
+
+    final String actualExp = r.iterator().next().toString();
+    assertEquals("AND(OR(=([CATALOG, SALES, EMP].#0.$0, 1),"
+        + " =([CATALOG, SALES, EMP].#0.$1, 'abc')),"
+        + " >([CATALOG, SALES, EMP].#0.$7, 1))", actualExp);
+  }
+
+  @Test void testExpressionLineageBetweenExpressionWithJoin() {
+    // empno is column 0 in catalog.sales.emp
+    // deptno is column 0 in catalog.sales.dept
+    final RelNode rel = convertSql("select dept.deptno + empno between 1 and 2"
+        + " from emp join dept on emp.deptno = dept.deptno");
+    final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
+
+    final RexNode ref = RexInputRef.of(0, rel.getRowType().getFieldList());
+    final Set<RexNode> r = mq.getExpressionLineage(rel, ref);
+    assertThat(r.size(), is(1));
+
+    // 'dept.deptno + empno between 1 and 2' is translated into
+    // 'dept.deptno + empno >= 1 and dept.deptno + empno <= 2'
+    final String actualExp = r.iterator().next().toString();
+    assertEquals("AND(>=(+([CATALOG, SALES, DEPT].#0.$0, [CATALOG, SALES, EMP].#0.$0), 1),"

Review comment:
       The comments are cool, however, they won't be visible in assertion failure, and it would be better to put them into the assertion message. Then the person that faces the failure would understand the reasons better.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r765019272



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2399,6 +2399,21 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(RelMdUtil.linear(12, 0, 10, 100, 200), is(200d));
   }
 
+  // ----------------------------------------------------------------------
+  // Tests for getExpressionLineage
+  // ----------------------------------------------------------------------
+
+  private void assertExpressionLineage(
+      String sql, int columnIndex, String expected, String comment) {
+    RelNode rel = convertSql(sql);
+    RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
+    RexNode ref = RexInputRef.of(columnIndex, rel.getRowType().getFieldList());
+    Set<RexNode> r = mq.getExpressionLineage(rel, ref);
+
+    assertEquals(expected, String.valueOf(r), "Lineage for expr '" + ref + "' in node '"
+        + rel + "'" + " for query '" + sql + "':n" + comment);

Review comment:
       `"\\n\"\n  +"` should probably be `"\\\\n\"\n  +"`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] asolimando commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
asolimando commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r762077681



##########
File path: core/src/main/java/org/apache/calcite/rel/rules/materialize/MaterializedViewRule.java
##########
@@ -1036,21 +1037,36 @@ protected NodeLineage generateSwapTableColumnReferencesLineage(
     final Map<RexNode, Integer> exprsLineage = new HashMap<>();
     final Map<RexNode, Integer> exprsLineageLosslessCasts = new HashMap<>();
     for (int i = 0; i < nodeExprs.size(); i++) {
-      final Set<RexNode> s = mq.getExpressionLineage(node, nodeExprs.get(i));
-      if (s == null) {
+      final Set<RexNode> lineages = mq.getExpressionLineage(node, nodeExprs.get(i));
+      if (lineages == null) {
         // Next expression
         continue;
       }
-      // We only support project - filter - join, thus it should map to
-      // a single expression
-      assert s.size() == 1;
-      // Rewrite expr. First we swap the table references following the table
-      // mapping, then we take first element from the corresponding equivalence class
-      final RexNode e = RexUtil.swapTableColumnReferences(rexBuilder,
-          s.iterator().next(), tableMapping, ec.getEquivalenceClassesMap());
-      exprsLineage.put(e, i);
-      if (RexUtil.isLosslessCast(e)) {
-        exprsLineageLosslessCasts.put(((RexCall) e).getOperands().get(0), i);
+      final RexNode expr = nodeExprs.get(i);
+      if (SqlKind.AND == expr.getKind() || SqlKind.OR == expr.getKind()) {

Review comment:
       You are right, we can go for a more general solution and support all complex expressions like the one you mentioned. I will also add one or more tests for that.
   
   Regarding `mq.expressionsLineage` I tend to agree with you, it speaks of returning the `set of resulting expressions equivalent to the input expression`, but then it splits it into multiple expressions.
   
   If `mq.expressionsLineage` is broken then maybe we don't have an issue here at all in the first place. I can't see any unit tests for this part of the code, it's hard to tell the original intention behind the method.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] asolimando commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
asolimando commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r764650446



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2482,6 +2483,115 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(inputRef1.getIdentifier(), is(inputRef2.getIdentifier()));
   }
 
+  @Test void testExpressionLineageComplexExpression() {
+    // empno is column 0 in catalog.sales.emp
+    // ename is column 1 in catalog.sales.emp
+    // deptno is column 7 in catalog.sales.emp
+    final RelNode rel = convertSql("select (empno = 1 or ename = 'abc') and deptno > 1 from emp");
+    final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
+
+    final RexNode ref = RexInputRef.of(0, rel.getRowType().getFieldList());
+    final Set<RexNode> r = mq.getExpressionLineage(rel, ref);
+
+    // check '(empno = 1 or ename = 'abc') and deptno > 1'
+    assertThat(r.size(), is(1));
+    final RexNode result = r.iterator().next();
+    assertThat(result.getKind(), is(SqlKind.AND));
+    final RexCall and = (RexCall) result;
+
+    // check '(empno = 1 or ename = 'abc')'
+    assertThat(and.getOperands().size(), is(2));
+    final RexCall or = (RexCall) and.getOperands().get(0);
+    assertThat(or.getKind(), is(SqlKind.OR));
+    assertThat(or.getOperands().size(), is(2));
+
+    // check 'empno = 1'
+    final RexCall eq1 = (RexCall) or.getOperands().get(0);
+    assertThat(eq1.getKind(), is(SqlKind.EQUALS));
+    final RexTableInputRef inputRef1 = (RexTableInputRef) eq1.getOperands().get(0);
+    assertThat(inputRef1.getQualifiedName(), is(EMP_QNAME));
+    assertThat(inputRef1.getIndex(), is(0));
+    final RexLiteral literal1 = (RexLiteral) eq1.getOperands().get(1);
+    assertThat(literal1.getValueAs(Integer.class), is(1));
+
+    // check 'ename = 'abc''
+    final RexCall eq2 = (RexCall) or.getOperands().get(1);
+    assertThat(eq2.getKind(), is(SqlKind.EQUALS));
+    final RexTableInputRef inputRef2 = (RexTableInputRef) eq2.getOperands().get(0);
+    assertThat(inputRef2.getQualifiedName(), is(EMP_QNAME));
+    assertThat(inputRef2.getIndex(), is(1));
+    final RexLiteral literal2 = (RexLiteral) eq2.getOperands().get(1);
+    assertThat(literal2.getValueAs(String.class), is("abc"));
+
+    // check 'deptno > 1'
+    final RexCall gt = (RexCall) and.getOperands().get(1);
+    assertThat(gt.getKind(), is(SqlKind.GREATER_THAN));
+    final RexTableInputRef inputRef3 = (RexTableInputRef) gt.getOperands().get(0);
+    assertThat(inputRef3.getQualifiedName(), is(EMP_QNAME));
+    assertThat(inputRef3.getIndex(), is(7));
+    final RexLiteral literal3 = (RexLiteral) gt.getOperands().get(1);
+    assertThat(literal3.getValueAs(Integer.class), is(1));

Review comment:
       As you say, I have just adapted to the style of the tests, but if breaking uniformity is an option I am more than happy to switch to string-based tests. The other format is hard to write, read and maintain, for no real added value IMO.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] asolimando commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
asolimando commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r765001237



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2399,6 +2399,21 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(RelMdUtil.linear(12, 0, 10, 100, 200), is(200d));
   }
 
+  // ----------------------------------------------------------------------
+  // Tests for getExpressionLineage
+  // ----------------------------------------------------------------------
+
+  private void assertExpressionLineage(
+      String sql, int columnIndex, String expected, String comment) {
+    RelNode rel = convertSql(sql);
+    RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
+    RexNode ref = RexInputRef.of(columnIndex, rel.getRowType().getFieldList());
+    Set<RexNode> r = mq.getExpressionLineage(rel, ref);
+
+    assertEquals(expected, String.valueOf(r), "Lineage for expr '" + ref + "' in node '"
+        + rel + "'" + " for query '" + sql + "':n" + comment);

Review comment:
       Actually it was not my doing, it was `./gradlew autostyleApply` doing that:
   
   ```
        assertEquals(expected, String.valueOf(r), "Lineage for expr '" + ref + "' in node '"
   -        + rel + "'" + " for query '" + sql + "':\n" + comment);
   +        + rel + "'" + " for query '" + sql + "':n"
   +  + comment);
      }
   ```
   
   Looks like a bug to me.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r765023357



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2399,6 +2399,21 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(RelMdUtil.linear(12, 0, 10, 100, 200), is(200d));
   }
 
+  // ----------------------------------------------------------------------
+  // Tests for getExpressionLineage
+  // ----------------------------------------------------------------------
+
+  private void assertExpressionLineage(
+      String sql, int columnIndex, String expected, String comment) {
+    RelNode rel = convertSql(sql);
+    RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
+    RexNode ref = RexInputRef.of(columnIndex, rel.getRowType().getFieldList());
+    Set<RexNode> r = mq.getExpressionLineage(rel, ref);
+
+    assertEquals(expected, String.valueOf(r), "Lineage for expr '" + ref + "' in node '"
+        + rel + "'" + " for query '" + sql + "':n" + comment);

Review comment:
       @asolimando , I believe the issue has little to do with autostyle, and it seems to be the issue with Calcite's customization.
   
   Please check Calcite-specific regex replacements (see build.gradle.kts#L542)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] asolimando commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
asolimando commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r763227407



##########
File path: core/src/main/java/org/apache/calcite/rel/rules/materialize/MaterializedViewRule.java
##########
@@ -1036,21 +1037,36 @@ protected NodeLineage generateSwapTableColumnReferencesLineage(
     final Map<RexNode, Integer> exprsLineage = new HashMap<>();
     final Map<RexNode, Integer> exprsLineageLosslessCasts = new HashMap<>();
     for (int i = 0; i < nodeExprs.size(); i++) {
-      final Set<RexNode> s = mq.getExpressionLineage(node, nodeExprs.get(i));
-      if (s == null) {
+      final Set<RexNode> lineages = mq.getExpressionLineage(node, nodeExprs.get(i));
+      if (lineages == null) {
         // Next expression
         continue;
       }
-      // We only support project - filter - join, thus it should map to
-      // a single expression
-      assert s.size() == 1;
-      // Rewrite expr. First we swap the table references following the table
-      // mapping, then we take first element from the corresponding equivalence class
-      final RexNode e = RexUtil.swapTableColumnReferences(rexBuilder,
-          s.iterator().next(), tableMapping, ec.getEquivalenceClassesMap());
-      exprsLineage.put(e, i);
-      if (RexUtil.isLosslessCast(e)) {
-        exprsLineageLosslessCasts.put(((RexCall) e).getOperands().get(0), i);
+      final RexNode expr = nodeExprs.get(i);
+      if (SqlKind.AND == expr.getKind() || SqlKind.OR == expr.getKind()) {

Review comment:
       Finally my diagnosis is that `mq.expressionsLineage` was indeed broken, it was decomposing top `AND` expressions for no apparent good reason, see [RelMdExpressionLineage.java#L497](https://github.com/apache/calcite/blob/2317f4ea7779a38d128a756baa5ad21797f2eb6c/core/src/main/java/org/apache/calcite/rel/metadata/RelMdExpressionLineage.java#L497). After fixing that, no more need for additional handling in the caller.
   
   Luckily there were unit tests, just not where I expected them to be, I have managed to reproduce the issue in one of them, and double checked that the fix did handle it, and at the same time that nothing else broke.
   
   There was a unit test on "complex expressions", but it had a `+`, while the issue only occurs on `AND` as top expression.
   
   Please have another look @zabetak if you have some spare cycles, thanks a lot for the tip, I was heading down a slippery road :)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] asolimando commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
asolimando commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r764734047



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2482,6 +2482,41 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(inputRef1.getIdentifier(), is(inputRef2.getIdentifier()));
   }
 
+  @Test void testExpressionLineageConjuntiveExpression() {
+    // empno is column 0 in catalog.sales.emp
+    // ename is column 1 in catalog.sales.emp
+    // deptno is column 7 in catalog.sales.emp
+    final RelNode rel = convertSql("select (empno = 1 or ename = 'abc') and deptno > 1 from emp");

Review comment:
       I have added an assertion message, failure now look like this:
   ```
   For query 'select (empno = 1 or ename = 'abc') and deptno > 1 from emp':
   'empno' is column 0 in 'catalog.sales.emp', 'ename' is column 1 in 'catalog.sales.emp', and 'deptno' is column 7 in 'catalog.sales.emp' ==> expected: <AND(OR(=([CATALOG, SALES, EMP].#0.$0, 1), =([CATALOG, SALES, EMP].#0.$1, 'abc')), >([CATALOG, SALES, EMP].#0.$7, 1)> but was: <AND(OR(=([CATALOG, SALES, EMP].#0.$0, 1), =([CATALOG, SALES, EMP].#0.$1, 'abc')), >([CATALOG, SALES, EMP].#0.$7, 1))>
   Expected :AND(OR(=([CATALOG, SALES, EMP].#0.$0, 1), =([CATALOG, SALES, EMP].#0.$1, 'abc')), >([CATALOG, SALES, EMP].#0.$7, 1)
   Actual   :AND(OR(=([CATALOG, SALES, EMP].#0.$0, 1), =([CATALOG, SALES, EMP].#0.$1, 'abc')), >([CATALOG, SALES, EMP].#0.$7, 1))
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r764775481



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2482,6 +2483,46 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(inputRef1.getIdentifier(), is(inputRef2.getIdentifier()));
   }
 
+  @Test void testExpressionLineageConjuntiveExpression() {
+    final String sql = "select (empno = 1 or ename = 'abc') and deptno > 1 from emp";
+    final RelNode rel = convertSql(sql);
+    final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
+
+    final RexNode ref = RexInputRef.of(0, rel.getRowType().getFieldList());
+    final Set<RexNode> r = Objects.requireNonNull(mq.getExpressionLineage(rel, ref));
+
+    final String assertionMessage = "For query '" + sql + "':\n"
+        + "'empno' is column 0 in 'catalog.sales.emp', "
+        + "'ename' is column 1 in 'catalog.sales.emp', and "
+        + "'deptno' is column 7 in 'catalog.sales.emp'";
+
+    final String actualExp = Iterables.getOnlyElement(r).toString();
+    assertEquals("AND(OR(=([CATALOG, SALES, EMP].#0.$0, 1),"
+        + " =([CATALOG, SALES, EMP].#0.$1, 'abc')),"
+        + " >([CATALOG, SALES, EMP].#0.$7, 1))", actualExp, assertionMessage);
+  }
+
+  @Test void testExpressionLineageBetweenExpressionWithJoin() {
+    final String sql = "select dept.deptno + empno between 1 and 2"
+        + " from emp join dept on emp.deptno = dept.deptno";
+    final RelNode rel = convertSql(sql);
+    final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
+
+    final RexNode ref = RexInputRef.of(0, rel.getRowType().getFieldList());
+    final Set<RexNode> r = Objects.requireNonNull(mq.getExpressionLineage(rel, ref));

Review comment:
       This seems to be duplicated across methods, so it would be nice to extract it into `assertExpressionLineage(sql, columnIndex, Supplier<String> message)`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] asolimando commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
asolimando commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r764773164



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2482,6 +2482,41 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(inputRef1.getIdentifier(), is(inputRef2.getIdentifier()));
   }
 
+  @Test void testExpressionLineageConjuntiveExpression() {
+    // empno is column 0 in catalog.sales.emp
+    // ename is column 1 in catalog.sales.emp
+    // deptno is column 7 in catalog.sales.emp
+    final RelNode rel = convertSql("select (empno = 1 or ename = 'abc') and deptno > 1 from emp");
+    final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
+
+    final RexNode ref = RexInputRef.of(0, rel.getRowType().getFieldList());
+    final Set<RexNode> r = mq.getExpressionLineage(rel, ref);
+    assertThat(r.size(), is(1));

Review comment:
       I have merged the two assertions since they would be overlapping and provided a more comprehensive error message:
   
   ```
   Lineage for expr '$0' in node 'rel#2:LogicalProject.(input=LogicalTableScan#1,exprs=[AND(OR(=($0, 1), =($1, 'abc')), >($7, 1))])' for query 'select (empno = 1 or ename = 'abc') and deptno > 1 from emp':
   'empno' is column 0 in 'catalog.sales.emp', 'ename' is column 1 in 'catalog.sales.emp', and 'deptno' is column 7 in 'catalog.sales.emp' ==> expected: <AND(OR(=([CATALOG, SALES, EMP].#0.$0, 1), =([CATALOG, SALES, EMP].#0.$1, 'abc')), >([CATALOG, SALES, EMP].#0.$7, 1))]> but was: <[AND(OR(=([CATALOG, SALES, EMP].#0.$0, 1), =([CATALOG, SALES, EMP].#0.$1, 'abc')), >([CATALOG, SALES, EMP].#0.$7, 1))]>
   Expected :AND(OR(=([CATALOG, SALES, EMP].#0.$0, 1), =([CATALOG, SALES, EMP].#0.$1, 'abc')), >([CATALOG, SALES, EMP].#0.$7, 1))]
   Actual   :[AND(OR(=([CATALOG, SALES, EMP].#0.$0, 1), =([CATALOG, SALES, EMP].#0.$1, 'abc')), >([CATALOG, SALES, EMP].#0.$7, 1))]
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r764919901



##########
File path: core/src/test/java/org/apache/calcite/test/RelMetadataTest.java
##########
@@ -2482,6 +2497,37 @@ private void checkPredicates(RelOptCluster cluster, RelOptTable empTable,
     assertThat(inputRef1.getIdentifier(), is(inputRef2.getIdentifier()));
   }
 
+  @Test void testExpressionLineageConjuntiveExpression() {
+    String sql = "select (empno = 1 or ename = 'abc') and deptno > 1 from emp";
+    String expected = "[AND(OR(=([CATALOG, SALES, EMP].#0.$0, 1), "
+        + "=([CATALOG, SALES, EMP].#0.$1, 'abc')), "
+        + ">([CATALOG, SALES, EMP].#0.$7, 1))]";
+
+    BiFunction<RexNode, RelNode, String> supplier = (ref, rel) -> "Lineage for expr '"
+        + ref + "' in node '" + rel + "'" + " for query '" + sql + "':\n"
+        + "'empno' is column 0 in 'catalog.sales.emp', "
+        + "'ename' is column 1 in 'catalog.sales.emp', and "
+        + "'deptno' is column 7 in 'catalog.sales.emp'";
+
+    assertExpressionLineage(sql, 0, expected, supplier);
+  }
+
+  @Test void testExpressionLineageBetweenExpressionWithJoin() {
+    String sql = "select dept.deptno + empno between 1 and 2"
+        + " from emp join dept on emp.deptno = dept.deptno";
+    String expected = "[AND(>=(+([CATALOG, SALES, DEPT].#0.$0, [CATALOG, SALES, EMP].#0.$0), 1),"
+        + " <=(+([CATALOG, SALES, DEPT].#0.$0, [CATALOG, SALES, EMP].#0.$0), 2))]";
+
+    BiFunction<RexNode, RelNode, String> supplier = (ref, rel) -> "Lineage for expr '"
+        + ref + "' in node '" + rel + "'" + " for query '" + sql + "':\n"

Review comment:
       Ah, it looks like `BiFunction` and `Supplier` is not really needed here. Just `String comment` argument is enough for `assertExpressionLineage`, and `assertExpressionLineage` can add the prefix with ref, rel, and sql.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] zabetak commented on a change in pull request #2613: [CALCITE-4894] MV rewriting fails for expressions with more than a field reference in the view/query projection

Posted by GitBox <gi...@apache.org>.
zabetak commented on a change in pull request #2613:
URL: https://github.com/apache/calcite/pull/2613#discussion_r762056563



##########
File path: core/src/main/java/org/apache/calcite/rel/rules/materialize/MaterializedViewRule.java
##########
@@ -1036,21 +1037,36 @@ protected NodeLineage generateSwapTableColumnReferencesLineage(
     final Map<RexNode, Integer> exprsLineage = new HashMap<>();
     final Map<RexNode, Integer> exprsLineageLosslessCasts = new HashMap<>();
     for (int i = 0; i < nodeExprs.size(); i++) {
-      final Set<RexNode> s = mq.getExpressionLineage(node, nodeExprs.get(i));
-      if (s == null) {
+      final Set<RexNode> lineages = mq.getExpressionLineage(node, nodeExprs.get(i));
+      if (lineages == null) {
         // Next expression
         continue;
       }
-      // We only support project - filter - join, thus it should map to
-      // a single expression
-      assert s.size() == 1;
-      // Rewrite expr. First we swap the table references following the table
-      // mapping, then we take first element from the corresponding equivalence class
-      final RexNode e = RexUtil.swapTableColumnReferences(rexBuilder,
-          s.iterator().next(), tableMapping, ec.getEquivalenceClassesMap());
-      exprsLineage.put(e, i);
-      if (RexUtil.isLosslessCast(e)) {
-        exprsLineageLosslessCasts.put(((RexCall) e).getOperands().get(0), i);
+      final RexNode expr = nodeExprs.get(i);
+      if (SqlKind.AND == expr.getKind() || SqlKind.OR == expr.getKind()) {

Review comment:
       Why do we need to restrict ourselves to AND/OR expressions? I suppose the same problem can re-appear easily if we have other kind of expressions e.g., `s.time_id + s.time_id`. Can't we apply a generic solution here?
   
   Moreover, I am not sure that `mq.expressionsLineage` works as expected. I checked the javadoc of `RelMdExpressionLineage#createAllPossibleCombinations` and it is supposed to return an equivalent expression with the one given as input. If the input is `A > 1 AND A <3 ` why in the output  we have `A > 1`, `A<3` decomposed. Is this normal?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org