You are viewing a plain text version of this content. The canonical link for it is here.

Posted to commits@calcite.apache.org by GitBox <gi...@apache.org> on 2019/12/28 21:58:45 UTC

[GitHub] [calcite] vlsi opened a new pull request #1702: Use concurrent test execution by default

vlsi opened a new pull request #1702: Use concurrent test execution by default
URL: https://github.com/apache/calcite/pull/1702
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] danny0405 commented on a change in pull request #1702: Patch queue: refine optimizer estimations, refine rules, normalize RexNodes, ...

Posted by GitBox <gi...@apache.org>.

danny0405 commented on a change in pull request #1702: Patch queue: refine optimizer estimations, refine rules, normalize RexNodes, ...
URL: https://github.com/apache/calcite/pull/1702#discussion_r365120817
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableTableScan.java
 ##########
 @@ -84,15 +90,59 @@ public static EnumerableTableScan create(RelOptCluster cluster,
   }
 
   /** Returns whether EnumerableTableScan can generate code to handle a
-   * particular variant of the Table SPI. */
+   * particular variant of the Table SPI.
+   * @deprecated
+   **/
+  @Deprecated
   public static boolean canHandle(Table table) {
-    // FilterableTable and ProjectableFilterableTable cannot be handled in
-    // enumerable convention because they might reject filters and those filters
-    // would need to be handled dynamically.
+    if (table instanceof TransientTable) {
+      // CALCITE-3673: TransientTable can't be implemented with Enumerable
+      return false;
+    }
+    // See org.apache.calcite.prepare.RelOptTableImpl.getClassExpressionFunction
     return table instanceof QueryableTable
+        || table instanceof FilterableTable
+        || table instanceof ProjectableFilterableTable
         || table instanceof ScannableTable;
   }
 
+  /** Returns whether EnumerableTableScan can generate code to handle a
+   * particular variant of the Table SPI.
+   **/
+  public static boolean canHandle(RelOptTable relOptTable) {
+    Table table = relOptTable.unwrap(Table.class);
+    if (table != null && !canHandle(table)) {
+      return false;
+    }
+    boolean supportArray = CalciteSystemProperty.ENUMERABLE_ENABLE_TABLESCAN_ARRAY.value();
+    boolean supportMap = CalciteSystemProperty.ENUMERABLE_ENABLE_TABLESCAN_MAP.value();
+    boolean supportMultiset = CalciteSystemProperty.ENUMERABLE_ENABLE_TABLESCAN_MULTISET.value();
+    if (supportArray && supportMap && supportMultiset) {
+      return true;
+    }
+    // Struct fields are not supported in EnumerableTableScan
+    for (RelDataTypeField field : relOptTable.getRowType().getFieldList()) {
+      boolean unsupportedType = false;
+      switch (field.getType().getSqlTypeName()) {
+      case ARRAY:
+        unsupportedType = supportArray;
+        break;
+      case MAP:
 
 Review comment:
   Better move the type decision to `SqlTypeUtil`.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] rubenada commented on a change in pull request #1702: Use concurrent test execution by default

Posted by GitBox <gi...@apache.org>.

rubenada commented on a change in pull request #1702: Use concurrent test execution by default
URL: https://github.com/apache/calcite/pull/1702#discussion_r363195740
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableHashJoin.java
 ##########
 @@ -101,44 +101,21 @@ public static EnumerableHashJoin create(
 
   @Override public RelOptCost computeSelfCost(RelOptPlanner planner,
       RelMetadataQuery mq) {
-    double rowCount = mq.getRowCount(this);
-
-    // Joins can be flipped, and for many algorithms, both versions are viable
-    // and have the same cost. To make the results stable between versions of
-    // the planner, make one of the versions slightly more expensive.
-    switch (joinType) {
-    case SEMI:
-    case ANTI:
-      // SEMI and ANTI join cannot be flipped
-      break;
-    case RIGHT:
-      rowCount = RelMdUtil.addEpsilon(rowCount);
-      break;
-    default:
-      if (RelNodes.COMPARATOR.compare(left, right) > 0) {
-        rowCount = RelMdUtil.addEpsilon(rowCount);
-      }
-    }
-
-    // Cheaper if the smaller number of rows is coming from the LHS.
-    // Model this by adding L log L to the cost.
+    RelOptCostFactory costFactory = planner.getCostFactory();
     final double rightRowCount = right.estimateRowCount(mq);
 
 Review comment:
   First of all, thanks for you huge effort on this PR, @vlsi 
   I have a small comment here, according to `RelNode#estimateRowCount` javadoc: _"don't call this method directly. Instead, use RelMetadataQuery#getRowCount, which gives plugins a chance to override the rel's default ideas about row count"._ I wonder if we could use this PR to replace:
   ```
   final double rightRowCount = right.estimateRowCount(mq);
   final double leftRowCount = left.estimateRowCount(mq);
   ```
   with
   ```
   final double rightRowCount = mq.getRowCount(right);
   final double leftRowCount = mq.getRowCount(left);
   ```
   which seems more correct, according to that javadoc comment. We could do that in `computeSelfCost` method of EnumerableHashJoin, EnumerableMergeJoin, EnumerableNestedLoopJoin, EnumerableBatchNestedLoopJoin and Correlate (which were already impacted by the PR). What do you think?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] vlsi commented on a change in pull request #1702: Use concurrent test execution by default

Posted by GitBox <gi...@apache.org>.

vlsi commented on a change in pull request #1702: Use concurrent test execution by default
URL: https://github.com/apache/calcite/pull/1702#discussion_r363213605
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableHashJoin.java
 ##########
 @@ -101,44 +101,21 @@ public static EnumerableHashJoin create(
 
   @Override public RelOptCost computeSelfCost(RelOptPlanner planner,
       RelMetadataQuery mq) {
-    double rowCount = mq.getRowCount(this);
-
-    // Joins can be flipped, and for many algorithms, both versions are viable
-    // and have the same cost. To make the results stable between versions of
-    // the planner, make one of the versions slightly more expensive.
-    switch (joinType) {
-    case SEMI:
-    case ANTI:
-      // SEMI and ANTI join cannot be flipped
-      break;
-    case RIGHT:
-      rowCount = RelMdUtil.addEpsilon(rowCount);
-      break;
-    default:
-      if (RelNodes.COMPARATOR.compare(left, right) > 0) {
-        rowCount = RelMdUtil.addEpsilon(rowCount);
-      }
-    }
-
-    // Cheaper if the smaller number of rows is coming from the LHS.
-    // Model this by adding L log L to the cost.
+    RelOptCostFactory costFactory = planner.getCostFactory();
     final double rightRowCount = right.estimateRowCount(mq);
 
 Review comment:
   Thanks. It makes sense indeed. Will replace shortly

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] rubenada commented on a change in pull request #1702: Use concurrent test execution by default

Posted by GitBox <gi...@apache.org>.

rubenada commented on a change in pull request #1702: Use concurrent test execution by default
URL: https://github.com/apache/calcite/pull/1702#discussion_r363195740
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableHashJoin.java
 ##########
 @@ -101,44 +101,21 @@ public static EnumerableHashJoin create(
 
   @Override public RelOptCost computeSelfCost(RelOptPlanner planner,
       RelMetadataQuery mq) {
-    double rowCount = mq.getRowCount(this);
-
-    // Joins can be flipped, and for many algorithms, both versions are viable
-    // and have the same cost. To make the results stable between versions of
-    // the planner, make one of the versions slightly more expensive.
-    switch (joinType) {
-    case SEMI:
-    case ANTI:
-      // SEMI and ANTI join cannot be flipped
-      break;
-    case RIGHT:
-      rowCount = RelMdUtil.addEpsilon(rowCount);
-      break;
-    default:
-      if (RelNodes.COMPARATOR.compare(left, right) > 0) {
-        rowCount = RelMdUtil.addEpsilon(rowCount);
-      }
-    }
-
-    // Cheaper if the smaller number of rows is coming from the LHS.
-    // Model this by adding L log L to the cost.
+    RelOptCostFactory costFactory = planner.getCostFactory();
     final double rightRowCount = right.estimateRowCount(mq);
 
 Review comment:
   First of all, thanks for you huge effort on this PR, @vlsi 
   I have a small comment here, according to RelNode#estimateRowCount javadoc: _"don't call this method directly. Instead, use RelMetadataQuery#getRowCount, which gives plugins a chance to override the rel's default ideas about row count"._ I wonder if we could use this PR to replace:
   ```
   final double rightRowCount = right.estimateRowCount(mq);
   final double leftRowCount = left.estimateRowCount(mq);
   ```
   with
   ```
   final double rightRowCount = mq.getRowCount(right);
   final double leftRowCount = mq.getRowCount(left);
   ```
   which seems more correct, according to that javadoc comment. We could do that in EnumerableHashJoin, EnumerableMergeJoin, EnumerableNestedLoopJoin, EnumerableBatchNestedLoopJoin and Correlate (which were already impacted by the PR). What do you think?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] vlsi commented on a change in pull request #1702: Patch queue: refine optimizer estimations, refine rules, normalize RexNodes, ...

Posted by GitBox <gi...@apache.org>.

vlsi commented on a change in pull request #1702: Patch queue: refine optimizer estimations, refine rules, normalize RexNodes, ...
URL: https://github.com/apache/calcite/pull/1702#discussion_r365123653
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableTableScan.java
 ##########
 @@ -84,15 +90,59 @@ public static EnumerableTableScan create(RelOptCluster cluster,
   }
 
   /** Returns whether EnumerableTableScan can generate code to handle a
-   * particular variant of the Table SPI. */
+   * particular variant of the Table SPI.
+   * @deprecated
+   **/
+  @Deprecated
   public static boolean canHandle(Table table) {
-    // FilterableTable and ProjectableFilterableTable cannot be handled in
-    // enumerable convention because they might reject filters and those filters
-    // would need to be handled dynamically.
+    if (table instanceof TransientTable) {
+      // CALCITE-3673: TransientTable can't be implemented with Enumerable
+      return false;
+    }
+    // See org.apache.calcite.prepare.RelOptTableImpl.getClassExpressionFunction
     return table instanceof QueryableTable
+        || table instanceof FilterableTable
+        || table instanceof ProjectableFilterableTable
         || table instanceof ScannableTable;
   }
 
+  /** Returns whether EnumerableTableScan can generate code to handle a
+   * particular variant of the Table SPI.
+   **/
+  public static boolean canHandle(RelOptTable relOptTable) {
+    Table table = relOptTable.unwrap(Table.class);
+    if (table != null && !canHandle(table)) {
+      return false;
+    }
+    boolean supportArray = CalciteSystemProperty.ENUMERABLE_ENABLE_TABLESCAN_ARRAY.value();
+    boolean supportMap = CalciteSystemProperty.ENUMERABLE_ENABLE_TABLESCAN_MAP.value();
+    boolean supportMultiset = CalciteSystemProperty.ENUMERABLE_ENABLE_TABLESCAN_MULTISET.value();
+    if (supportArray && supportMap && supportMultiset) {
+      return true;
+    }
+    // Struct fields are not supported in EnumerableTableScan
+    for (RelDataTypeField field : relOptTable.getRowType().getFieldList()) {
+      boolean unsupportedType = false;
+      switch (field.getType().getSqlTypeName()) {
+      case ARRAY:
+        unsupportedType = supportArray;
+        break;
+      case MAP:
 
 Review comment:
   I guess this is already in master.
   Will rebase this PR.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services