You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@calcite.apache.org by GitBox <gi...@apache.org> on 2020/08/16 20:10:05 UTC

[GitHub] [calcite] julianhyde opened a new pull request #2111: [CALCITE-1861] Spatial index, based on Hilbert space-filling curve

julianhyde opened a new pull request #2111:
URL: https://github.com/apache/calcite/pull/2111


   Add SQL function Hilbert. It uses Google's uzaygezen library,
   and our wraper code is copied from LocationTech's SFCurve
   project.
   
   Rewrite calls to ST_DWithin to use a range of on a Hilbert
   column, plus the call to ST_DWithin for safety.
   
   Implement function ST_MakeEnvelope.
   
   Use constant reduction to recognize constant geometry
   expressions before we apply SpatialRules.
   
   Move interface Geom, and other classes, from GeoFunctions into
   new utility class Geometries.
   
   Geometry literals (not in the SQL parser (yet), but in
   RexNode space).
   
   Make Geom implement Comparable, so that it can be a literal
   value.
   
   Move SqlStdOperatorTables to new package, and rename to
   SqlOperatorTables.
   
   Add RelOptTestBase.Sql.withCatalogReaderFactory and
   withConformance, to make it easier to run planner tests with
   a different schema or SQL dialect.
   
   Deprecate RelReferentialConstraint.getNumColumns().


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] julianhyde commented on a change in pull request #2111: [CALCITE-1861] Spatial index, based on Hilbert space-filling curve

Posted by GitBox <gi...@apache.org>.
julianhyde commented on a change in pull request #2111:
URL: https://github.com/apache/calcite/pull/2111#discussion_r474198197



##########
File path: core/src/main/java/org/apache/calcite/rel/rules/SpatialRules.java
##########
@@ -0,0 +1,322 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rel.rules;
+
+import org.apache.calcite.plan.RelOptPredicateList;
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptUtil;
+import org.apache.calcite.plan.RelRule;
+import org.apache.calcite.rel.core.Filter;
+import org.apache.calcite.rex.RexBuilder;
+import org.apache.calcite.rex.RexCall;
+import org.apache.calcite.rex.RexInputRef;
+import org.apache.calcite.rex.RexLiteral;
+import org.apache.calcite.rex.RexNode;
+import org.apache.calcite.rex.RexUtil;
+import org.apache.calcite.runtime.GeoFunctions;
+import org.apache.calcite.runtime.Geometries;
+import org.apache.calcite.runtime.HilbertCurve2D;
+import org.apache.calcite.runtime.SpaceFillingCurve2D;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.fun.SqlStdOperatorTable;
+import org.apache.calcite.tools.RelBuilder;
+
+import com.esri.core.geometry.Envelope;
+import com.esri.core.geometry.Point;
+import com.google.common.collect.ImmutableList;
+
+import java.math.BigDecimal;
+import java.util.ArrayList;
+import java.util.EnumSet;
+import java.util.List;
+
+/**
+ * Collection of planner rules that convert
+ * calls to spatial functions into more efficient expressions.
+ *
+ * <p>The rules allow Calcite to use spatial indexes. For example the following
+ * query:
+ *
+ * <blockquote>SELECT ...
+ * FROM Restaurants AS r
+ * WHERE ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)
+ * </blockquote>
+ *
+ * <p>is rewritten to
+ *
+ * <blockquote>SELECT ...
+ * FROM Restaurants AS r
+ * WHERE (r.h BETWEEN 100 AND 150
+ *        OR r.h BETWEEN 170 AND 185)
+ * AND ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)
+ * </blockquote>
+ *
+ * <p>if there is the constraint
+ *
+ * <blockquote>CHECK (h = Hilbert(8, r.longitude, r.latitude))</blockquote>
+ *
+ * <p>If the {@code Restaurants} table is sorted on {@code h} then the latter
+ * query can be answered using two limited range-scans, and so is much more
+ * efficient.
+ *
+ * <p>Note that the original predicate
+ * {@code ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)}
+ * is still present, but is evaluated after the approximate predicate has
+ * eliminated many potential matches.
+ */
+public abstract class SpatialRules {
+
+  private SpatialRules() {}
+
+  private static final RexUtil.RexFinder DWITHIN_FINDER =
+      RexUtil.find(EnumSet.of(SqlKind.ST_DWITHIN, SqlKind.ST_CONTAINS));
+
+  private static final RexUtil.RexFinder HILBERT_FINDER =
+      RexUtil.find(SqlKind.HILBERT);
+
+  public static final RelOptRule INSTANCE =
+      FilterHilbertRule.Config.DEFAULT.toRule();
+
+  /** Returns a geometry if an expression is constant, null otherwise. */
+  private static Geometries.Geom constantGeom(RexNode e) {
+    switch (e.getKind()) {
+    case CAST:
+      return constantGeom(((RexCall) e).getOperands().get(0));
+    case LITERAL:
+      return (Geometries.Geom) ((RexLiteral) e).getValue();
+    default:
+      return null;
+    }
+  }
+
+  /** Rule that converts ST_DWithin in a Filter condition into a predicate on
+   * a Hilbert curve. */
+  @SuppressWarnings("WeakerAccess")
+  public static class FilterHilbertRule
+      extends RelRule<FilterHilbertRule.Config> {
+    protected FilterHilbertRule(Config config) {
+      super(config);
+    }
+
+    @Override public void onMatch(RelOptRuleCall call) {
+      final Filter filter = call.rel(0);
+      final List<RexNode> conjunctions = new ArrayList<>();
+      RelOptUtil.decomposeConjunction(filter.getCondition(), conjunctions);
+
+      // Match a predicate
+      //   r.hilbert = hilbert(r.longitude, r.latitude)
+      // to one of the conjunctions
+      //   ST_DWithin(ST_Point(x, y), ST_Point(r.longitude, r.latitude), d)
+      // and if it matches add a new conjunction before it,
+      //   r.hilbert between h1 and h2
+      //   or r.hilbert between h3 and h4
+      // where {[h1, h2], [h3, h4]} are the ranges of the Hilbert curve
+      // intersecting the square
+      //   (r.longitude - d, r.latitude - d, r.longitude + d, r.latitude + d)
+      final RelOptPredicateList predicates =
+          call.getMetadataQuery().getAllPredicates(filter.getInput());
+      int changeCount = 0;
+      for (RexNode predicate : predicates.pulledUpPredicates) {
+        final RelBuilder builder = call.builder();
+        if (predicate.getKind() == SqlKind.EQUALS) {
+          final RexCall eqCall = (RexCall) predicate;
+          if (eqCall.operands.get(0) instanceof RexInputRef
+              && eqCall.operands.get(1).getKind() == SqlKind.HILBERT) {
+            final RexInputRef ref  = (RexInputRef) eqCall.operands.get(0);

Review comment:
       However, this check is crucial to the algorithm. Without it, the algorithm loops. The algorithm is looking for artifacts that it created, so looking for a very particular pattern isn't so bad.
   
   Maybe in future we can make more use of predicates rather than looking for Hilbert specifically. My work on sargs should allow us to identify whether the relation has already been filtered.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] michaelmior commented on a change in pull request #2111: [CALCITE-1861] Spatial index, based on Hilbert space-filling curve

Posted by GitBox <gi...@apache.org>.
michaelmior commented on a change in pull request #2111:
URL: https://github.com/apache/calcite/pull/2111#discussion_r474268042



##########
File path: core/src/main/java/org/apache/calcite/rel/rules/SpatialRules.java
##########
@@ -0,0 +1,322 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rel.rules;
+
+import org.apache.calcite.plan.RelOptPredicateList;
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptUtil;
+import org.apache.calcite.plan.RelRule;
+import org.apache.calcite.rel.core.Filter;
+import org.apache.calcite.rex.RexBuilder;
+import org.apache.calcite.rex.RexCall;
+import org.apache.calcite.rex.RexInputRef;
+import org.apache.calcite.rex.RexLiteral;
+import org.apache.calcite.rex.RexNode;
+import org.apache.calcite.rex.RexUtil;
+import org.apache.calcite.runtime.GeoFunctions;
+import org.apache.calcite.runtime.Geometries;
+import org.apache.calcite.runtime.HilbertCurve2D;
+import org.apache.calcite.runtime.SpaceFillingCurve2D;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.fun.SqlStdOperatorTable;
+import org.apache.calcite.tools.RelBuilder;
+
+import com.esri.core.geometry.Envelope;
+import com.esri.core.geometry.Point;
+import com.google.common.collect.ImmutableList;
+
+import java.math.BigDecimal;
+import java.util.ArrayList;
+import java.util.EnumSet;
+import java.util.List;
+
+/**
+ * Collection of planner rules that convert
+ * calls to spatial functions into more efficient expressions.
+ *
+ * <p>The rules allow Calcite to use spatial indexes. For example the following
+ * query:
+ *
+ * <blockquote>SELECT ...
+ * FROM Restaurants AS r
+ * WHERE ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)
+ * </blockquote>
+ *
+ * <p>is rewritten to
+ *
+ * <blockquote>SELECT ...
+ * FROM Restaurants AS r
+ * WHERE (r.h BETWEEN 100 AND 150
+ *        OR r.h BETWEEN 170 AND 185)
+ * AND ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)
+ * </blockquote>
+ *
+ * <p>if there is the constraint
+ *
+ * <blockquote>CHECK (h = Hilbert(8, r.longitude, r.latitude))</blockquote>
+ *
+ * <p>If the {@code Restaurants} table is sorted on {@code h} then the latter
+ * query can be answered using two limited range-scans, and so is much more
+ * efficient.
+ *
+ * <p>Note that the original predicate
+ * {@code ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)}
+ * is still present, but is evaluated after the approximate predicate has
+ * eliminated many potential matches.
+ */
+public abstract class SpatialRules {
+
+  private SpatialRules() {}
+
+  private static final RexUtil.RexFinder DWITHIN_FINDER =
+      RexUtil.find(EnumSet.of(SqlKind.ST_DWITHIN, SqlKind.ST_CONTAINS));
+
+  private static final RexUtil.RexFinder HILBERT_FINDER =
+      RexUtil.find(SqlKind.HILBERT);
+
+  public static final RelOptRule INSTANCE =
+      FilterHilbertRule.Config.DEFAULT.toRule();
+
+  /** Returns a geometry if an expression is constant, null otherwise. */
+  private static Geometries.Geom constantGeom(RexNode e) {
+    switch (e.getKind()) {
+    case CAST:
+      return constantGeom(((RexCall) e).getOperands().get(0));
+    case LITERAL:
+      return (Geometries.Geom) ((RexLiteral) e).getValue();
+    default:
+      return null;
+    }
+  }
+
+  /** Rule that converts ST_DWithin in a Filter condition into a predicate on
+   * a Hilbert curve. */
+  @SuppressWarnings("WeakerAccess")
+  public static class FilterHilbertRule
+      extends RelRule<FilterHilbertRule.Config> {
+    protected FilterHilbertRule(Config config) {
+      super(config);
+    }
+
+    @Override public void onMatch(RelOptRuleCall call) {
+      final Filter filter = call.rel(0);
+      final List<RexNode> conjunctions = new ArrayList<>();
+      RelOptUtil.decomposeConjunction(filter.getCondition(), conjunctions);
+
+      // Match a predicate
+      //   r.hilbert = hilbert(r.longitude, r.latitude)
+      // to one of the conjunctions
+      //   ST_DWithin(ST_Point(x, y), ST_Point(r.longitude, r.latitude), d)
+      // and if it matches add a new conjunction before it,
+      //   r.hilbert between h1 and h2
+      //   or r.hilbert between h3 and h4
+      // where {[h1, h2], [h3, h4]} are the ranges of the Hilbert curve
+      // intersecting the square
+      //   (r.longitude - d, r.latitude - d, r.longitude + d, r.latitude + d)
+      final RelOptPredicateList predicates =
+          call.getMetadataQuery().getAllPredicates(filter.getInput());
+      int changeCount = 0;
+      for (RexNode predicate : predicates.pulledUpPredicates) {
+        final RelBuilder builder = call.builder();
+        if (predicate.getKind() == SqlKind.EQUALS) {
+          final RexCall eqCall = (RexCall) predicate;
+          if (eqCall.operands.get(0) instanceof RexInputRef
+              && eqCall.operands.get(1).getKind() == SqlKind.HILBERT) {
+            final RexInputRef ref  = (RexInputRef) eqCall.operands.get(0);

Review comment:
       I think we're talking about different checks. I understand the check for the Hilbert predicate is necessary to avoid infinite loops. (My student ran into the same problem.) My point was that it would be nice to be able to get the "Hilbert column" to use during rewriting without relying on the predicate with the Hilbert function already existing in the query.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] michaelmior commented on a change in pull request #2111: [CALCITE-1861] Spatial index, based on Hilbert space-filling curve

Posted by GitBox <gi...@apache.org>.
michaelmior commented on a change in pull request #2111:
URL: https://github.com/apache/calcite/pull/2111#discussion_r476411802



##########
File path: core/src/main/java/org/apache/calcite/prepare/CalciteCatalogReader.java
##########
@@ -285,66 +286,95 @@ public static SqlOperatorTable operatorTable(String... classNames) {
           className, "*", true);
     }
 
-    // The following is technical debt; see [CALCITE-2082] Remove
-    // RelDataTypeFactory argument from SqlUserDefinedAggFunction constructor
-    final SqlTypeFactoryImpl typeFactory =
-        new SqlTypeFactoryImpl(RelDataTypeSystem.DEFAULT);
-
     final ListSqlOperatorTable table = new ListSqlOperatorTable();
     for (String name : schema.getFunctionNames()) {
-      for (Function function : schema.getFunctions(name, true)) {
+      schema.getFunctions(name, true).forEach(function -> {
         final SqlIdentifier id = new SqlIdentifier(name, SqlParserPos.ZERO);
-        table.add(
-            toOp(typeFactory, id, function));
-      }
+        table.add(toOp(id, function));
+      });
     }
     return table;
   }
 
-  private SqlOperator toOp(SqlIdentifier name, final Function function) {
-    return toOp(typeFactory, name, function);
-  }
-
-  /** Converts a function to a {@link org.apache.calcite.sql.SqlOperator}.
-   *
-   * <p>The {@code typeFactory} argument is technical debt; see [CALCITE-2082]
-   * Remove RelDataTypeFactory argument from SqlUserDefinedAggFunction
-   * constructor. */
-  private static SqlOperator toOp(RelDataTypeFactory typeFactory,
-      SqlIdentifier name, final Function function) {
-    List<RelDataType> argTypes = new ArrayList<>();
-    List<SqlTypeFamily> typeFamilies = new ArrayList<>();
-    for (FunctionParameter o : function.getParameters()) {
-      final RelDataType type = o.getType(typeFactory);
-      argTypes.add(type);
-      typeFamilies.add(
-          Util.first(type.getSqlTypeName().getFamily(), SqlTypeFamily.ANY));
-    }
-    final FamilyOperandTypeChecker typeChecker =
-        OperandTypes.family(typeFamilies, i ->
-            function.getParameters().get(i).isOptional());
-    final List<RelDataType> paramTypes = toSql(typeFactory, argTypes);
+  /** Converts a function to a {@link org.apache.calcite.sql.SqlOperator}. */
+  private static SqlOperator toOp(SqlIdentifier name,
+      final org.apache.calcite.schema.Function function) {
+    final Function<RelDataTypeFactory, List<RelDataType>> argTypesFactory =
+        typeFactory -> function.getParameters()
+            .stream()
+            .map(o -> o.getType(typeFactory))
+            .collect(Util.toImmutableList());
+    final Function<RelDataTypeFactory, List<SqlTypeFamily>> typeFamiliesFactory =
+        typeFactory -> argTypesFactory.apply(typeFactory)
+            .stream()
+            .map(type ->
+                Util.first(type.getSqlTypeName().getFamily(),
+                    SqlTypeFamily.ANY))
+            .collect(Util.toImmutableList());
+    final Function<RelDataTypeFactory, List<RelDataType>> paramTypesFactory =
+        typeFactory ->
+            argTypesFactory.apply(typeFactory)
+                .stream()
+                .map(type -> toSql(typeFactory, type))
+                .collect(Util.toImmutableList());
+
+    // Use a short-lived type factory to populate "typeFamilies" and "argTypes".
+    // SqlOperandMetadata.paramTypes will use the real type factory, during
+    // validation.
+    final RelDataTypeFactory dummyTypeFactory = new JavaTypeFactoryImpl();
+    final List<RelDataType> argTypes = argTypesFactory.apply(dummyTypeFactory);
+    final List<SqlTypeFamily> typeFamilies =
+        typeFamiliesFactory.apply(dummyTypeFactory);
+
+    final SqlOperandTypeInference operandTypeInference =
+        InferTypes.explicit(argTypes);
+
+    final SqlOperandMetadata operandMetadata =
+        OperandTypes.operandMetadata(typeFamilies, paramTypesFactory,
+            i -> function.getParameters().get(i).getName(),
+            i -> function.getParameters().get(i).isOptional());
+
+    final SqlKind kind = kind(function);
     if (function instanceof ScalarFunction) {
-      return new SqlUserDefinedFunction(name, infer((ScalarFunction) function),
-          InferTypes.explicit(argTypes), typeChecker, paramTypes, function);
+      final SqlReturnTypeInference returnTypeInference =
+          infer((ScalarFunction) function);
+      return new SqlUserDefinedFunction(name, kind, returnTypeInference,
+          operandTypeInference, operandMetadata, function);
     } else if (function instanceof AggregateFunction) {
-      return new SqlUserDefinedAggFunction(name,
-          infer((AggregateFunction) function), InferTypes.explicit(argTypes),
-          typeChecker, (AggregateFunction) function, false, false,
-          Optionality.FORBIDDEN, typeFactory);
+      final SqlReturnTypeInference returnTypeInference =
+          infer((AggregateFunction) function);
+      return new SqlUserDefinedAggFunction(name, kind,
+          returnTypeInference, operandTypeInference,
+          operandMetadata, (AggregateFunction) function, false, false,
+          Optionality.FORBIDDEN);
     } else if (function instanceof TableMacro) {
-      return new SqlUserDefinedTableMacro(name, ReturnTypes.CURSOR,
-          InferTypes.explicit(argTypes), typeChecker, paramTypes,
-          (TableMacro) function);
+      return new SqlUserDefinedTableMacro(name, kind, ReturnTypes.CURSOR,
+          operandTypeInference, operandMetadata, (TableMacro) function);
     } else if (function instanceof TableFunction) {
-      return new SqlUserDefinedTableFunction(name, ReturnTypes.CURSOR,
-          InferTypes.explicit(argTypes), typeChecker, paramTypes,
-          (TableFunction) function);
+      return new SqlUserDefinedTableFunction(name, kind, ReturnTypes.CURSOR,
+          operandTypeInference, operandMetadata, (TableFunction) function);
     } else {
       throw new AssertionError("unknown function type " + function);
     }
   }
 
+  /** Deduces the {@link org.apache.calcite.sql.SqlKind} of a user-defined
+   * function based on a {@link Hints} annotation, if present. */
+  private static SqlKind kind(org.apache.calcite.schema.Function function) {
+    if (function instanceof ScalarFunctionImpl) {
+      Hints hints =
+          ((ScalarFunctionImpl) function).method.getAnnotation(Hints.class);
+      if (hints != null) {
+        for (String hint : hints.value()) {
+          if (hint.startsWith("SqlKind:")) {
+            return SqlKind.valueOf(hint.substring("SqlKind:".length()));

Review comment:
       That sounds reasonable to me.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] julianhyde commented on a change in pull request #2111: [CALCITE-1861] Spatial index, based on Hilbert space-filling curve

Posted by GitBox <gi...@apache.org>.
julianhyde commented on a change in pull request #2111:
URL: https://github.com/apache/calcite/pull/2111#discussion_r475008814



##########
File path: core/src/main/java/org/apache/calcite/rel/rules/SpatialRules.java
##########
@@ -0,0 +1,322 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rel.rules;
+
+import org.apache.calcite.plan.RelOptPredicateList;
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptUtil;
+import org.apache.calcite.plan.RelRule;
+import org.apache.calcite.rel.core.Filter;
+import org.apache.calcite.rex.RexBuilder;
+import org.apache.calcite.rex.RexCall;
+import org.apache.calcite.rex.RexInputRef;
+import org.apache.calcite.rex.RexLiteral;
+import org.apache.calcite.rex.RexNode;
+import org.apache.calcite.rex.RexUtil;
+import org.apache.calcite.runtime.GeoFunctions;
+import org.apache.calcite.runtime.Geometries;
+import org.apache.calcite.runtime.HilbertCurve2D;
+import org.apache.calcite.runtime.SpaceFillingCurve2D;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.fun.SqlStdOperatorTable;
+import org.apache.calcite.tools.RelBuilder;
+
+import com.esri.core.geometry.Envelope;
+import com.esri.core.geometry.Point;
+import com.google.common.collect.ImmutableList;
+
+import java.math.BigDecimal;
+import java.util.ArrayList;
+import java.util.EnumSet;
+import java.util.List;
+
+/**
+ * Collection of planner rules that convert
+ * calls to spatial functions into more efficient expressions.
+ *
+ * <p>The rules allow Calcite to use spatial indexes. For example the following
+ * query:
+ *
+ * <blockquote>SELECT ...
+ * FROM Restaurants AS r
+ * WHERE ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)
+ * </blockquote>
+ *
+ * <p>is rewritten to
+ *
+ * <blockquote>SELECT ...
+ * FROM Restaurants AS r
+ * WHERE (r.h BETWEEN 100 AND 150
+ *        OR r.h BETWEEN 170 AND 185)
+ * AND ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)
+ * </blockquote>
+ *
+ * <p>if there is the constraint
+ *
+ * <blockquote>CHECK (h = Hilbert(8, r.longitude, r.latitude))</blockquote>
+ *
+ * <p>If the {@code Restaurants} table is sorted on {@code h} then the latter
+ * query can be answered using two limited range-scans, and so is much more
+ * efficient.
+ *
+ * <p>Note that the original predicate
+ * {@code ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)}
+ * is still present, but is evaluated after the approximate predicate has
+ * eliminated many potential matches.
+ */
+public abstract class SpatialRules {
+
+  private SpatialRules() {}
+
+  private static final RexUtil.RexFinder DWITHIN_FINDER =
+      RexUtil.find(EnumSet.of(SqlKind.ST_DWITHIN, SqlKind.ST_CONTAINS));
+
+  private static final RexUtil.RexFinder HILBERT_FINDER =
+      RexUtil.find(SqlKind.HILBERT);
+
+  public static final RelOptRule INSTANCE =
+      FilterHilbertRule.Config.DEFAULT.toRule();
+
+  /** Returns a geometry if an expression is constant, null otherwise. */
+  private static Geometries.Geom constantGeom(RexNode e) {
+    switch (e.getKind()) {
+    case CAST:
+      return constantGeom(((RexCall) e).getOperands().get(0));
+    case LITERAL:
+      return (Geometries.Geom) ((RexLiteral) e).getValue();
+    default:
+      return null;
+    }
+  }
+
+  /** Rule that converts ST_DWithin in a Filter condition into a predicate on
+   * a Hilbert curve. */
+  @SuppressWarnings("WeakerAccess")
+  public static class FilterHilbertRule
+      extends RelRule<FilterHilbertRule.Config> {
+    protected FilterHilbertRule(Config config) {
+      super(config);
+    }
+
+    @Override public void onMatch(RelOptRuleCall call) {
+      final Filter filter = call.rel(0);
+      final List<RexNode> conjunctions = new ArrayList<>();
+      RelOptUtil.decomposeConjunction(filter.getCondition(), conjunctions);
+
+      // Match a predicate
+      //   r.hilbert = hilbert(r.longitude, r.latitude)
+      // to one of the conjunctions
+      //   ST_DWithin(ST_Point(x, y), ST_Point(r.longitude, r.latitude), d)
+      // and if it matches add a new conjunction before it,
+      //   r.hilbert between h1 and h2
+      //   or r.hilbert between h3 and h4
+      // where {[h1, h2], [h3, h4]} are the ranges of the Hilbert curve
+      // intersecting the square
+      //   (r.longitude - d, r.latitude - d, r.longitude + d, r.latitude + d)
+      final RelOptPredicateList predicates =
+          call.getMetadataQuery().getAllPredicates(filter.getInput());
+      int changeCount = 0;
+      for (RexNode predicate : predicates.pulledUpPredicates) {
+        final RelBuilder builder = call.builder();
+        if (predicate.getKind() == SqlKind.EQUALS) {
+          final RexCall eqCall = (RexCall) predicate;
+          if (eqCall.operands.get(0) instanceof RexInputRef
+              && eqCall.operands.get(1).getKind() == SqlKind.HILBERT) {
+            final RexInputRef ref  = (RexInputRef) eqCall.operands.get(0);

Review comment:
       No, this check is not whether Hilbert is called in the query. It is whether a column `column = Hilbert(x, y)` is in the constraints. Which is what happens when you have defined a calculated column in your `CREATE TABLE` statement. (It will also happen if you have previously applied `Filter(i = Hilbert(x, y))` but that's less interesting.)
   
   Remember, this rewrite requires a table that has a Hilbert computed column and is sorted on that column. The query that uses, say, `ST_DWithin`, will be rewritten to use it, even if it does not mention the `Hilbert` function.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] julianhyde commented on a change in pull request #2111: [CALCITE-1861] Spatial index, based on Hilbert space-filling curve

Posted by GitBox <gi...@apache.org>.
julianhyde commented on a change in pull request #2111:
URL: https://github.com/apache/calcite/pull/2111#discussion_r474202291



##########
File path: core/src/main/java/org/apache/calcite/prepare/CalciteCatalogReader.java
##########
@@ -285,66 +286,95 @@ public static SqlOperatorTable operatorTable(String... classNames) {
           className, "*", true);
     }
 
-    // The following is technical debt; see [CALCITE-2082] Remove
-    // RelDataTypeFactory argument from SqlUserDefinedAggFunction constructor
-    final SqlTypeFactoryImpl typeFactory =
-        new SqlTypeFactoryImpl(RelDataTypeSystem.DEFAULT);
-
     final ListSqlOperatorTable table = new ListSqlOperatorTable();
     for (String name : schema.getFunctionNames()) {
-      for (Function function : schema.getFunctions(name, true)) {
+      schema.getFunctions(name, true).forEach(function -> {
         final SqlIdentifier id = new SqlIdentifier(name, SqlParserPos.ZERO);
-        table.add(
-            toOp(typeFactory, id, function));
-      }
+        table.add(toOp(id, function));
+      });
     }
     return table;
   }
 
-  private SqlOperator toOp(SqlIdentifier name, final Function function) {
-    return toOp(typeFactory, name, function);
-  }
-
-  /** Converts a function to a {@link org.apache.calcite.sql.SqlOperator}.
-   *
-   * <p>The {@code typeFactory} argument is technical debt; see [CALCITE-2082]
-   * Remove RelDataTypeFactory argument from SqlUserDefinedAggFunction
-   * constructor. */
-  private static SqlOperator toOp(RelDataTypeFactory typeFactory,
-      SqlIdentifier name, final Function function) {
-    List<RelDataType> argTypes = new ArrayList<>();
-    List<SqlTypeFamily> typeFamilies = new ArrayList<>();
-    for (FunctionParameter o : function.getParameters()) {
-      final RelDataType type = o.getType(typeFactory);
-      argTypes.add(type);
-      typeFamilies.add(
-          Util.first(type.getSqlTypeName().getFamily(), SqlTypeFamily.ANY));
-    }
-    final FamilyOperandTypeChecker typeChecker =
-        OperandTypes.family(typeFamilies, i ->
-            function.getParameters().get(i).isOptional());
-    final List<RelDataType> paramTypes = toSql(typeFactory, argTypes);
+  /** Converts a function to a {@link org.apache.calcite.sql.SqlOperator}. */
+  private static SqlOperator toOp(SqlIdentifier name,
+      final org.apache.calcite.schema.Function function) {
+    final Function<RelDataTypeFactory, List<RelDataType>> argTypesFactory =
+        typeFactory -> function.getParameters()
+            .stream()
+            .map(o -> o.getType(typeFactory))
+            .collect(Util.toImmutableList());
+    final Function<RelDataTypeFactory, List<SqlTypeFamily>> typeFamiliesFactory =
+        typeFactory -> argTypesFactory.apply(typeFactory)
+            .stream()
+            .map(type ->
+                Util.first(type.getSqlTypeName().getFamily(),
+                    SqlTypeFamily.ANY))
+            .collect(Util.toImmutableList());
+    final Function<RelDataTypeFactory, List<RelDataType>> paramTypesFactory =
+        typeFactory ->
+            argTypesFactory.apply(typeFactory)
+                .stream()
+                .map(type -> toSql(typeFactory, type))
+                .collect(Util.toImmutableList());
+
+    // Use a short-lived type factory to populate "typeFamilies" and "argTypes".
+    // SqlOperandMetadata.paramTypes will use the real type factory, during
+    // validation.
+    final RelDataTypeFactory dummyTypeFactory = new JavaTypeFactoryImpl();
+    final List<RelDataType> argTypes = argTypesFactory.apply(dummyTypeFactory);
+    final List<SqlTypeFamily> typeFamilies =
+        typeFamiliesFactory.apply(dummyTypeFactory);
+
+    final SqlOperandTypeInference operandTypeInference =
+        InferTypes.explicit(argTypes);
+
+    final SqlOperandMetadata operandMetadata =
+        OperandTypes.operandMetadata(typeFamilies, paramTypesFactory,
+            i -> function.getParameters().get(i).getName(),
+            i -> function.getParameters().get(i).isOptional());
+
+    final SqlKind kind = kind(function);
     if (function instanceof ScalarFunction) {
-      return new SqlUserDefinedFunction(name, infer((ScalarFunction) function),
-          InferTypes.explicit(argTypes), typeChecker, paramTypes, function);
+      final SqlReturnTypeInference returnTypeInference =
+          infer((ScalarFunction) function);
+      return new SqlUserDefinedFunction(name, kind, returnTypeInference,
+          operandTypeInference, operandMetadata, function);
     } else if (function instanceof AggregateFunction) {
-      return new SqlUserDefinedAggFunction(name,
-          infer((AggregateFunction) function), InferTypes.explicit(argTypes),
-          typeChecker, (AggregateFunction) function, false, false,
-          Optionality.FORBIDDEN, typeFactory);
+      final SqlReturnTypeInference returnTypeInference =
+          infer((AggregateFunction) function);
+      return new SqlUserDefinedAggFunction(name, kind,
+          returnTypeInference, operandTypeInference,
+          operandMetadata, (AggregateFunction) function, false, false,
+          Optionality.FORBIDDEN);
     } else if (function instanceof TableMacro) {
-      return new SqlUserDefinedTableMacro(name, ReturnTypes.CURSOR,
-          InferTypes.explicit(argTypes), typeChecker, paramTypes,
-          (TableMacro) function);
+      return new SqlUserDefinedTableMacro(name, kind, ReturnTypes.CURSOR,
+          operandTypeInference, operandMetadata, (TableMacro) function);
     } else if (function instanceof TableFunction) {
-      return new SqlUserDefinedTableFunction(name, ReturnTypes.CURSOR,
-          InferTypes.explicit(argTypes), typeChecker, paramTypes,
-          (TableFunction) function);
+      return new SqlUserDefinedTableFunction(name, kind, ReturnTypes.CURSOR,
+          operandTypeInference, operandMetadata, (TableFunction) function);
     } else {
       throw new AssertionError("unknown function type " + function);
     }
   }
 
+  /** Deduces the {@link org.apache.calcite.sql.SqlKind} of a user-defined
+   * function based on a {@link Hints} annotation, if present. */
+  private static SqlKind kind(org.apache.calcite.schema.Function function) {
+    if (function instanceof ScalarFunctionImpl) {
+      Hints hints =
+          ((ScalarFunctionImpl) function).method.getAnnotation(Hints.class);
+      if (hints != null) {
+        for (String hint : hints.value()) {
+          if (hint.startsWith("SqlKind:")) {
+            return SqlKind.valueOf(hint.substring("SqlKind:".length()));

Review comment:
       The `Hints` annotation is marked experimental, so we can change it without notice if it is not fit for purpose. For now, the free-format string is good for the mostly-internal current uses.
   
   If we made it more rigidly typed it would increase coupling.
   
   The purpose here was to allow functions such as `Hilbert` to be defined using the UDF framework but still recognized by internal code. In other words, looser coupling than we could do previously.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] asfgit closed pull request #2111: [CALCITE-1861] Spatial index, based on Hilbert space-filling curve

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #2111:
URL: https://github.com/apache/calcite/pull/2111


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] julianhyde commented on a change in pull request #2111: [CALCITE-1861] Spatial index, based on Hilbert space-filling curve

Posted by GitBox <gi...@apache.org>.
julianhyde commented on a change in pull request #2111:
URL: https://github.com/apache/calcite/pull/2111#discussion_r476237132



##########
File path: core/src/main/java/org/apache/calcite/prepare/CalciteCatalogReader.java
##########
@@ -285,66 +286,95 @@ public static SqlOperatorTable operatorTable(String... classNames) {
           className, "*", true);
     }
 
-    // The following is technical debt; see [CALCITE-2082] Remove
-    // RelDataTypeFactory argument from SqlUserDefinedAggFunction constructor
-    final SqlTypeFactoryImpl typeFactory =
-        new SqlTypeFactoryImpl(RelDataTypeSystem.DEFAULT);
-
     final ListSqlOperatorTable table = new ListSqlOperatorTable();
     for (String name : schema.getFunctionNames()) {
-      for (Function function : schema.getFunctions(name, true)) {
+      schema.getFunctions(name, true).forEach(function -> {
         final SqlIdentifier id = new SqlIdentifier(name, SqlParserPos.ZERO);
-        table.add(
-            toOp(typeFactory, id, function));
-      }
+        table.add(toOp(id, function));
+      });
     }
     return table;
   }
 
-  private SqlOperator toOp(SqlIdentifier name, final Function function) {
-    return toOp(typeFactory, name, function);
-  }
-
-  /** Converts a function to a {@link org.apache.calcite.sql.SqlOperator}.
-   *
-   * <p>The {@code typeFactory} argument is technical debt; see [CALCITE-2082]
-   * Remove RelDataTypeFactory argument from SqlUserDefinedAggFunction
-   * constructor. */
-  private static SqlOperator toOp(RelDataTypeFactory typeFactory,
-      SqlIdentifier name, final Function function) {
-    List<RelDataType> argTypes = new ArrayList<>();
-    List<SqlTypeFamily> typeFamilies = new ArrayList<>();
-    for (FunctionParameter o : function.getParameters()) {
-      final RelDataType type = o.getType(typeFactory);
-      argTypes.add(type);
-      typeFamilies.add(
-          Util.first(type.getSqlTypeName().getFamily(), SqlTypeFamily.ANY));
-    }
-    final FamilyOperandTypeChecker typeChecker =
-        OperandTypes.family(typeFamilies, i ->
-            function.getParameters().get(i).isOptional());
-    final List<RelDataType> paramTypes = toSql(typeFactory, argTypes);
+  /** Converts a function to a {@link org.apache.calcite.sql.SqlOperator}. */
+  private static SqlOperator toOp(SqlIdentifier name,
+      final org.apache.calcite.schema.Function function) {
+    final Function<RelDataTypeFactory, List<RelDataType>> argTypesFactory =
+        typeFactory -> function.getParameters()
+            .stream()
+            .map(o -> o.getType(typeFactory))
+            .collect(Util.toImmutableList());
+    final Function<RelDataTypeFactory, List<SqlTypeFamily>> typeFamiliesFactory =
+        typeFactory -> argTypesFactory.apply(typeFactory)
+            .stream()
+            .map(type ->
+                Util.first(type.getSqlTypeName().getFamily(),
+                    SqlTypeFamily.ANY))
+            .collect(Util.toImmutableList());
+    final Function<RelDataTypeFactory, List<RelDataType>> paramTypesFactory =
+        typeFactory ->
+            argTypesFactory.apply(typeFactory)
+                .stream()
+                .map(type -> toSql(typeFactory, type))
+                .collect(Util.toImmutableList());
+
+    // Use a short-lived type factory to populate "typeFamilies" and "argTypes".
+    // SqlOperandMetadata.paramTypes will use the real type factory, during
+    // validation.
+    final RelDataTypeFactory dummyTypeFactory = new JavaTypeFactoryImpl();
+    final List<RelDataType> argTypes = argTypesFactory.apply(dummyTypeFactory);
+    final List<SqlTypeFamily> typeFamilies =
+        typeFamiliesFactory.apply(dummyTypeFactory);
+
+    final SqlOperandTypeInference operandTypeInference =
+        InferTypes.explicit(argTypes);
+
+    final SqlOperandMetadata operandMetadata =
+        OperandTypes.operandMetadata(typeFamilies, paramTypesFactory,
+            i -> function.getParameters().get(i).getName(),
+            i -> function.getParameters().get(i).isOptional());
+
+    final SqlKind kind = kind(function);
     if (function instanceof ScalarFunction) {
-      return new SqlUserDefinedFunction(name, infer((ScalarFunction) function),
-          InferTypes.explicit(argTypes), typeChecker, paramTypes, function);
+      final SqlReturnTypeInference returnTypeInference =
+          infer((ScalarFunction) function);
+      return new SqlUserDefinedFunction(name, kind, returnTypeInference,
+          operandTypeInference, operandMetadata, function);
     } else if (function instanceof AggregateFunction) {
-      return new SqlUserDefinedAggFunction(name,
-          infer((AggregateFunction) function), InferTypes.explicit(argTypes),
-          typeChecker, (AggregateFunction) function, false, false,
-          Optionality.FORBIDDEN, typeFactory);
+      final SqlReturnTypeInference returnTypeInference =
+          infer((AggregateFunction) function);
+      return new SqlUserDefinedAggFunction(name, kind,
+          returnTypeInference, operandTypeInference,
+          operandMetadata, (AggregateFunction) function, false, false,
+          Optionality.FORBIDDEN);
     } else if (function instanceof TableMacro) {
-      return new SqlUserDefinedTableMacro(name, ReturnTypes.CURSOR,
-          InferTypes.explicit(argTypes), typeChecker, paramTypes,
-          (TableMacro) function);
+      return new SqlUserDefinedTableMacro(name, kind, ReturnTypes.CURSOR,
+          operandTypeInference, operandMetadata, (TableMacro) function);
     } else if (function instanceof TableFunction) {
-      return new SqlUserDefinedTableFunction(name, ReturnTypes.CURSOR,
-          InferTypes.explicit(argTypes), typeChecker, paramTypes,
-          (TableFunction) function);
+      return new SqlUserDefinedTableFunction(name, kind, ReturnTypes.CURSOR,
+          operandTypeInference, operandMetadata, (TableFunction) function);
     } else {
       throw new AssertionError("unknown function type " + function);
     }
   }
 
+  /** Deduces the {@link org.apache.calcite.sql.SqlKind} of a user-defined
+   * function based on a {@link Hints} annotation, if present. */
+  private static SqlKind kind(org.apache.calcite.schema.Function function) {
+    if (function instanceof ScalarFunctionImpl) {
+      Hints hints =
+          ((ScalarFunctionImpl) function).method.getAnnotation(Hints.class);
+      if (hints != null) {
+        for (String hint : hints.value()) {
+          if (hint.startsWith("SqlKind:")) {
+            return SqlKind.valueOf(hint.substring("SqlKind:".length()));

Review comment:
       Yeah, fair point. I didn't want to overthink this, so I marked it 'experimental'.
   
   Maybe you're right that we should more strongly type the hint. If so, we will iterate in that direction.
   
   I think you'll agree that if someone wants a function with a particular SqlKind, it's better to allow them to write it as a UDF (and have some means, such as annotation, to link the function to the SqlKind) than to require them to sub-class SqlFunction. That's all I meant by 'less coupling'.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] michaelmior commented on a change in pull request #2111: [CALCITE-1861] Spatial index, based on Hilbert space-filling curve

Posted by GitBox <gi...@apache.org>.
michaelmior commented on a change in pull request #2111:
URL: https://github.com/apache/calcite/pull/2111#discussion_r474268598



##########
File path: core/src/main/java/org/apache/calcite/rel/rules/SpatialRules.java
##########
@@ -0,0 +1,322 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rel.rules;
+
+import org.apache.calcite.plan.RelOptPredicateList;
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptUtil;
+import org.apache.calcite.plan.RelRule;
+import org.apache.calcite.rel.core.Filter;
+import org.apache.calcite.rex.RexBuilder;
+import org.apache.calcite.rex.RexCall;
+import org.apache.calcite.rex.RexInputRef;
+import org.apache.calcite.rex.RexLiteral;
+import org.apache.calcite.rex.RexNode;
+import org.apache.calcite.rex.RexUtil;
+import org.apache.calcite.runtime.GeoFunctions;
+import org.apache.calcite.runtime.Geometries;
+import org.apache.calcite.runtime.HilbertCurve2D;
+import org.apache.calcite.runtime.SpaceFillingCurve2D;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.fun.SqlStdOperatorTable;
+import org.apache.calcite.tools.RelBuilder;
+
+import com.esri.core.geometry.Envelope;
+import com.esri.core.geometry.Point;
+import com.google.common.collect.ImmutableList;
+
+import java.math.BigDecimal;
+import java.util.ArrayList;
+import java.util.EnumSet;
+import java.util.List;
+
+/**
+ * Collection of planner rules that convert
+ * calls to spatial functions into more efficient expressions.
+ *
+ * <p>The rules allow Calcite to use spatial indexes. For example the following
+ * query:
+ *
+ * <blockquote>SELECT ...
+ * FROM Restaurants AS r
+ * WHERE ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)
+ * </blockquote>
+ *
+ * <p>is rewritten to
+ *
+ * <blockquote>SELECT ...
+ * FROM Restaurants AS r
+ * WHERE (r.h BETWEEN 100 AND 150
+ *        OR r.h BETWEEN 170 AND 185)
+ * AND ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)
+ * </blockquote>
+ *
+ * <p>if there is the constraint
+ *
+ * <blockquote>CHECK (h = Hilbert(8, r.longitude, r.latitude))</blockquote>
+ *
+ * <p>If the {@code Restaurants} table is sorted on {@code h} then the latter
+ * query can be answered using two limited range-scans, and so is much more
+ * efficient.
+ *
+ * <p>Note that the original predicate
+ * {@code ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)}
+ * is still present, but is evaluated after the approximate predicate has
+ * eliminated many potential matches.
+ */
+public abstract class SpatialRules {
+
+  private SpatialRules() {}
+
+  private static final RexUtil.RexFinder DWITHIN_FINDER =
+      RexUtil.find(EnumSet.of(SqlKind.ST_DWITHIN, SqlKind.ST_CONTAINS));
+
+  private static final RexUtil.RexFinder HILBERT_FINDER =
+      RexUtil.find(SqlKind.HILBERT);
+
+  public static final RelOptRule INSTANCE =
+      FilterHilbertRule.Config.DEFAULT.toRule();
+
+  /** Returns a geometry if an expression is constant, null otherwise. */
+  private static Geometries.Geom constantGeom(RexNode e) {
+    switch (e.getKind()) {
+    case CAST:
+      return constantGeom(((RexCall) e).getOperands().get(0));
+    case LITERAL:
+      return (Geometries.Geom) ((RexLiteral) e).getValue();
+    default:
+      return null;
+    }
+  }
+
+  /** Rule that converts ST_DWithin in a Filter condition into a predicate on
+   * a Hilbert curve. */
+  @SuppressWarnings("WeakerAccess")
+  public static class FilterHilbertRule
+      extends RelRule<FilterHilbertRule.Config> {
+    protected FilterHilbertRule(Config config) {
+      super(config);
+    }
+
+    @Override public void onMatch(RelOptRuleCall call) {
+      final Filter filter = call.rel(0);
+      final List<RexNode> conjunctions = new ArrayList<>();
+      RelOptUtil.decomposeConjunction(filter.getCondition(), conjunctions);
+
+      // Match a predicate
+      //   r.hilbert = hilbert(r.longitude, r.latitude)
+      // to one of the conjunctions
+      //   ST_DWithin(ST_Point(x, y), ST_Point(r.longitude, r.latitude), d)
+      // and if it matches add a new conjunction before it,
+      //   r.hilbert between h1 and h2
+      //   or r.hilbert between h3 and h4
+      // where {[h1, h2], [h3, h4]} are the ranges of the Hilbert curve
+      // intersecting the square
+      //   (r.longitude - d, r.latitude - d, r.longitude + d, r.latitude + d)
+      final RelOptPredicateList predicates =
+          call.getMetadataQuery().getAllPredicates(filter.getInput());
+      int changeCount = 0;
+      for (RexNode predicate : predicates.pulledUpPredicates) {
+        final RelBuilder builder = call.builder();
+        if (predicate.getKind() == SqlKind.EQUALS) {
+          final RexCall eqCall = (RexCall) predicate;
+          if (eqCall.operands.get(0) instanceof RexInputRef
+              && eqCall.operands.get(1).getKind() == SqlKind.HILBERT) {
+            final RexInputRef ref  = (RexInputRef) eqCall.operands.get(0);
+            final RexCall hilbert = (RexCall) eqCall.operands.get(1);
+            final RexUtil.RexFinder finder = RexUtil.find(ref);
+            if (finder.anyContain(conjunctions)) {
+              // If the condition already contains "ref", it is probable that
+              // this rule has already fired once.
+              continue;
+            }
+            for (int i = 0; i < conjunctions.size();) {
+              final List<RexNode> replacements =
+                  replaceSpatial(conjunctions.get(i), builder, ref, hilbert);
+              if (replacements != null) {
+                conjunctions.remove(i);
+                conjunctions.addAll(i, replacements);
+                i += replacements.size();
+                ++changeCount;
+              } else {
+                ++i;
+              }
+            }
+          }
+        }
+        if (changeCount > 0) {
+          call.transformTo(
+              builder.push(filter.getInput())
+                  .filter(conjunctions)
+                  .build());
+          return; // we found one useful constraint; don't look for more

Review comment:
       The end goal makes sense but won't the check for an existing predicate stop the optimization from being applied multiple times in this case?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] julianhyde commented on a change in pull request #2111: [CALCITE-1861] Spatial index, based on Hilbert space-filling curve

Posted by GitBox <gi...@apache.org>.
julianhyde commented on a change in pull request #2111:
URL: https://github.com/apache/calcite/pull/2111#discussion_r474196095



##########
File path: core/src/main/java/org/apache/calcite/rel/rules/SpatialRules.java
##########
@@ -0,0 +1,322 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rel.rules;
+
+import org.apache.calcite.plan.RelOptPredicateList;
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptUtil;
+import org.apache.calcite.plan.RelRule;
+import org.apache.calcite.rel.core.Filter;
+import org.apache.calcite.rex.RexBuilder;
+import org.apache.calcite.rex.RexCall;
+import org.apache.calcite.rex.RexInputRef;
+import org.apache.calcite.rex.RexLiteral;
+import org.apache.calcite.rex.RexNode;
+import org.apache.calcite.rex.RexUtil;
+import org.apache.calcite.runtime.GeoFunctions;
+import org.apache.calcite.runtime.Geometries;
+import org.apache.calcite.runtime.HilbertCurve2D;
+import org.apache.calcite.runtime.SpaceFillingCurve2D;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.fun.SqlStdOperatorTable;
+import org.apache.calcite.tools.RelBuilder;
+
+import com.esri.core.geometry.Envelope;
+import com.esri.core.geometry.Point;
+import com.google.common.collect.ImmutableList;
+
+import java.math.BigDecimal;
+import java.util.ArrayList;
+import java.util.EnumSet;
+import java.util.List;
+
+/**
+ * Collection of planner rules that convert
+ * calls to spatial functions into more efficient expressions.
+ *
+ * <p>The rules allow Calcite to use spatial indexes. For example the following
+ * query:
+ *
+ * <blockquote>SELECT ...
+ * FROM Restaurants AS r
+ * WHERE ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)
+ * </blockquote>
+ *
+ * <p>is rewritten to
+ *
+ * <blockquote>SELECT ...
+ * FROM Restaurants AS r
+ * WHERE (r.h BETWEEN 100 AND 150
+ *        OR r.h BETWEEN 170 AND 185)
+ * AND ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)
+ * </blockquote>
+ *
+ * <p>if there is the constraint
+ *
+ * <blockquote>CHECK (h = Hilbert(8, r.longitude, r.latitude))</blockquote>
+ *
+ * <p>If the {@code Restaurants} table is sorted on {@code h} then the latter
+ * query can be answered using two limited range-scans, and so is much more
+ * efficient.
+ *
+ * <p>Note that the original predicate
+ * {@code ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)}
+ * is still present, but is evaluated after the approximate predicate has
+ * eliminated many potential matches.
+ */
+public abstract class SpatialRules {
+
+  private SpatialRules() {}
+
+  private static final RexUtil.RexFinder DWITHIN_FINDER =
+      RexUtil.find(EnumSet.of(SqlKind.ST_DWITHIN, SqlKind.ST_CONTAINS));
+
+  private static final RexUtil.RexFinder HILBERT_FINDER =
+      RexUtil.find(SqlKind.HILBERT);
+
+  public static final RelOptRule INSTANCE =
+      FilterHilbertRule.Config.DEFAULT.toRule();
+
+  /** Returns a geometry if an expression is constant, null otherwise. */
+  private static Geometries.Geom constantGeom(RexNode e) {
+    switch (e.getKind()) {
+    case CAST:
+      return constantGeom(((RexCall) e).getOperands().get(0));
+    case LITERAL:
+      return (Geometries.Geom) ((RexLiteral) e).getValue();
+    default:
+      return null;
+    }
+  }
+
+  /** Rule that converts ST_DWithin in a Filter condition into a predicate on
+   * a Hilbert curve. */
+  @SuppressWarnings("WeakerAccess")
+  public static class FilterHilbertRule
+      extends RelRule<FilterHilbertRule.Config> {
+    protected FilterHilbertRule(Config config) {
+      super(config);
+    }
+
+    @Override public void onMatch(RelOptRuleCall call) {
+      final Filter filter = call.rel(0);
+      final List<RexNode> conjunctions = new ArrayList<>();
+      RelOptUtil.decomposeConjunction(filter.getCondition(), conjunctions);
+
+      // Match a predicate
+      //   r.hilbert = hilbert(r.longitude, r.latitude)
+      // to one of the conjunctions
+      //   ST_DWithin(ST_Point(x, y), ST_Point(r.longitude, r.latitude), d)
+      // and if it matches add a new conjunction before it,
+      //   r.hilbert between h1 and h2
+      //   or r.hilbert between h3 and h4
+      // where {[h1, h2], [h3, h4]} are the ranges of the Hilbert curve
+      // intersecting the square
+      //   (r.longitude - d, r.latitude - d, r.longitude + d, r.latitude + d)
+      final RelOptPredicateList predicates =
+          call.getMetadataQuery().getAllPredicates(filter.getInput());
+      int changeCount = 0;
+      for (RexNode predicate : predicates.pulledUpPredicates) {
+        final RelBuilder builder = call.builder();
+        if (predicate.getKind() == SqlKind.EQUALS) {
+          final RexCall eqCall = (RexCall) predicate;
+          if (eqCall.operands.get(0) instanceof RexInputRef
+              && eqCall.operands.get(1).getKind() == SqlKind.HILBERT) {
+            final RexInputRef ref  = (RexInputRef) eqCall.operands.get(0);
+            final RexCall hilbert = (RexCall) eqCall.operands.get(1);
+            final RexUtil.RexFinder finder = RexUtil.find(ref);
+            if (finder.anyContain(conjunctions)) {
+              // If the condition already contains "ref", it is probable that
+              // this rule has already fired once.
+              continue;
+            }
+            for (int i = 0; i < conjunctions.size();) {
+              final List<RexNode> replacements =
+                  replaceSpatial(conjunctions.get(i), builder, ref, hilbert);
+              if (replacements != null) {
+                conjunctions.remove(i);
+                conjunctions.addAll(i, replacements);
+                i += replacements.size();
+                ++changeCount;
+              } else {
+                ++i;
+              }
+            }
+          }
+        }
+        if (changeCount > 0) {
+          call.transformTo(
+              builder.push(filter.getInput())
+                  .filter(conjunctions)
+                  .build());
+          return; // we found one useful constraint; don't look for more

Review comment:
       As a general principle, if there are multiple function calls that can be optimized, we'd like the rule once for each, rather than trying to do too much in one shot. 
   
   We haven't really thought through complex cases for spatial optimizations. Maybe that can be follow-up work. For this change, I'm trying to make small rewrites that will (hopefully) compose.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] julianhyde commented on a change in pull request #2111: [CALCITE-1861] Spatial index, based on Hilbert space-filling curve

Posted by GitBox <gi...@apache.org>.
julianhyde commented on a change in pull request #2111:
URL: https://github.com/apache/calcite/pull/2111#discussion_r475008814



##########
File path: core/src/main/java/org/apache/calcite/rel/rules/SpatialRules.java
##########
@@ -0,0 +1,322 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rel.rules;
+
+import org.apache.calcite.plan.RelOptPredicateList;
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptUtil;
+import org.apache.calcite.plan.RelRule;
+import org.apache.calcite.rel.core.Filter;
+import org.apache.calcite.rex.RexBuilder;
+import org.apache.calcite.rex.RexCall;
+import org.apache.calcite.rex.RexInputRef;
+import org.apache.calcite.rex.RexLiteral;
+import org.apache.calcite.rex.RexNode;
+import org.apache.calcite.rex.RexUtil;
+import org.apache.calcite.runtime.GeoFunctions;
+import org.apache.calcite.runtime.Geometries;
+import org.apache.calcite.runtime.HilbertCurve2D;
+import org.apache.calcite.runtime.SpaceFillingCurve2D;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.fun.SqlStdOperatorTable;
+import org.apache.calcite.tools.RelBuilder;
+
+import com.esri.core.geometry.Envelope;
+import com.esri.core.geometry.Point;
+import com.google.common.collect.ImmutableList;
+
+import java.math.BigDecimal;
+import java.util.ArrayList;
+import java.util.EnumSet;
+import java.util.List;
+
+/**
+ * Collection of planner rules that convert
+ * calls to spatial functions into more efficient expressions.
+ *
+ * <p>The rules allow Calcite to use spatial indexes. For example the following
+ * query:
+ *
+ * <blockquote>SELECT ...
+ * FROM Restaurants AS r
+ * WHERE ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)
+ * </blockquote>
+ *
+ * <p>is rewritten to
+ *
+ * <blockquote>SELECT ...
+ * FROM Restaurants AS r
+ * WHERE (r.h BETWEEN 100 AND 150
+ *        OR r.h BETWEEN 170 AND 185)
+ * AND ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)
+ * </blockquote>
+ *
+ * <p>if there is the constraint
+ *
+ * <blockquote>CHECK (h = Hilbert(8, r.longitude, r.latitude))</blockquote>
+ *
+ * <p>If the {@code Restaurants} table is sorted on {@code h} then the latter
+ * query can be answered using two limited range-scans, and so is much more
+ * efficient.
+ *
+ * <p>Note that the original predicate
+ * {@code ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)}
+ * is still present, but is evaluated after the approximate predicate has
+ * eliminated many potential matches.
+ */
+public abstract class SpatialRules {
+
+  private SpatialRules() {}
+
+  private static final RexUtil.RexFinder DWITHIN_FINDER =
+      RexUtil.find(EnumSet.of(SqlKind.ST_DWITHIN, SqlKind.ST_CONTAINS));
+
+  private static final RexUtil.RexFinder HILBERT_FINDER =
+      RexUtil.find(SqlKind.HILBERT);
+
+  public static final RelOptRule INSTANCE =
+      FilterHilbertRule.Config.DEFAULT.toRule();
+
+  /** Returns a geometry if an expression is constant, null otherwise. */
+  private static Geometries.Geom constantGeom(RexNode e) {
+    switch (e.getKind()) {
+    case CAST:
+      return constantGeom(((RexCall) e).getOperands().get(0));
+    case LITERAL:
+      return (Geometries.Geom) ((RexLiteral) e).getValue();
+    default:
+      return null;
+    }
+  }
+
+  /** Rule that converts ST_DWithin in a Filter condition into a predicate on
+   * a Hilbert curve. */
+  @SuppressWarnings("WeakerAccess")
+  public static class FilterHilbertRule
+      extends RelRule<FilterHilbertRule.Config> {
+    protected FilterHilbertRule(Config config) {
+      super(config);
+    }
+
+    @Override public void onMatch(RelOptRuleCall call) {
+      final Filter filter = call.rel(0);
+      final List<RexNode> conjunctions = new ArrayList<>();
+      RelOptUtil.decomposeConjunction(filter.getCondition(), conjunctions);
+
+      // Match a predicate
+      //   r.hilbert = hilbert(r.longitude, r.latitude)
+      // to one of the conjunctions
+      //   ST_DWithin(ST_Point(x, y), ST_Point(r.longitude, r.latitude), d)
+      // and if it matches add a new conjunction before it,
+      //   r.hilbert between h1 and h2
+      //   or r.hilbert between h3 and h4
+      // where {[h1, h2], [h3, h4]} are the ranges of the Hilbert curve
+      // intersecting the square
+      //   (r.longitude - d, r.latitude - d, r.longitude + d, r.latitude + d)
+      final RelOptPredicateList predicates =
+          call.getMetadataQuery().getAllPredicates(filter.getInput());
+      int changeCount = 0;
+      for (RexNode predicate : predicates.pulledUpPredicates) {
+        final RelBuilder builder = call.builder();
+        if (predicate.getKind() == SqlKind.EQUALS) {
+          final RexCall eqCall = (RexCall) predicate;
+          if (eqCall.operands.get(0) instanceof RexInputRef
+              && eqCall.operands.get(1).getKind() == SqlKind.HILBERT) {
+            final RexInputRef ref  = (RexInputRef) eqCall.operands.get(0);

Review comment:
       No, this check is not whether Hilbert is called in the query. It is whether a term `column = Hilbert(x, y)` is in the constraints. Which is what happens when you have defined a calculated column in your `CREATE TABLE` statement. (It will also happen if you have previously applied `Filter(i = Hilbert(x, y))` but that's less interesting.)
   
   Remember, this rewrite requires a table that has a Hilbert computed column and is sorted on that column. The query that uses, say, `ST_DWithin`, will be rewritten to use it, even if it does not mention the `Hilbert` function.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] michaelmior commented on a change in pull request #2111: [CALCITE-1861] Spatial index, based on Hilbert space-filling curve

Posted by GitBox <gi...@apache.org>.
michaelmior commented on a change in pull request #2111:
URL: https://github.com/apache/calcite/pull/2111#discussion_r475101004



##########
File path: core/src/main/java/org/apache/calcite/prepare/CalciteCatalogReader.java
##########
@@ -285,66 +286,95 @@ public static SqlOperatorTable operatorTable(String... classNames) {
           className, "*", true);
     }
 
-    // The following is technical debt; see [CALCITE-2082] Remove
-    // RelDataTypeFactory argument from SqlUserDefinedAggFunction constructor
-    final SqlTypeFactoryImpl typeFactory =
-        new SqlTypeFactoryImpl(RelDataTypeSystem.DEFAULT);
-
     final ListSqlOperatorTable table = new ListSqlOperatorTable();
     for (String name : schema.getFunctionNames()) {
-      for (Function function : schema.getFunctions(name, true)) {
+      schema.getFunctions(name, true).forEach(function -> {
         final SqlIdentifier id = new SqlIdentifier(name, SqlParserPos.ZERO);
-        table.add(
-            toOp(typeFactory, id, function));
-      }
+        table.add(toOp(id, function));
+      });
     }
     return table;
   }
 
-  private SqlOperator toOp(SqlIdentifier name, final Function function) {
-    return toOp(typeFactory, name, function);
-  }
-
-  /** Converts a function to a {@link org.apache.calcite.sql.SqlOperator}.
-   *
-   * <p>The {@code typeFactory} argument is technical debt; see [CALCITE-2082]
-   * Remove RelDataTypeFactory argument from SqlUserDefinedAggFunction
-   * constructor. */
-  private static SqlOperator toOp(RelDataTypeFactory typeFactory,
-      SqlIdentifier name, final Function function) {
-    List<RelDataType> argTypes = new ArrayList<>();
-    List<SqlTypeFamily> typeFamilies = new ArrayList<>();
-    for (FunctionParameter o : function.getParameters()) {
-      final RelDataType type = o.getType(typeFactory);
-      argTypes.add(type);
-      typeFamilies.add(
-          Util.first(type.getSqlTypeName().getFamily(), SqlTypeFamily.ANY));
-    }
-    final FamilyOperandTypeChecker typeChecker =
-        OperandTypes.family(typeFamilies, i ->
-            function.getParameters().get(i).isOptional());
-    final List<RelDataType> paramTypes = toSql(typeFactory, argTypes);
+  /** Converts a function to a {@link org.apache.calcite.sql.SqlOperator}. */
+  private static SqlOperator toOp(SqlIdentifier name,
+      final org.apache.calcite.schema.Function function) {
+    final Function<RelDataTypeFactory, List<RelDataType>> argTypesFactory =
+        typeFactory -> function.getParameters()
+            .stream()
+            .map(o -> o.getType(typeFactory))
+            .collect(Util.toImmutableList());
+    final Function<RelDataTypeFactory, List<SqlTypeFamily>> typeFamiliesFactory =
+        typeFactory -> argTypesFactory.apply(typeFactory)
+            .stream()
+            .map(type ->
+                Util.first(type.getSqlTypeName().getFamily(),
+                    SqlTypeFamily.ANY))
+            .collect(Util.toImmutableList());
+    final Function<RelDataTypeFactory, List<RelDataType>> paramTypesFactory =
+        typeFactory ->
+            argTypesFactory.apply(typeFactory)
+                .stream()
+                .map(type -> toSql(typeFactory, type))
+                .collect(Util.toImmutableList());
+
+    // Use a short-lived type factory to populate "typeFamilies" and "argTypes".
+    // SqlOperandMetadata.paramTypes will use the real type factory, during
+    // validation.
+    final RelDataTypeFactory dummyTypeFactory = new JavaTypeFactoryImpl();
+    final List<RelDataType> argTypes = argTypesFactory.apply(dummyTypeFactory);
+    final List<SqlTypeFamily> typeFamilies =
+        typeFamiliesFactory.apply(dummyTypeFactory);
+
+    final SqlOperandTypeInference operandTypeInference =
+        InferTypes.explicit(argTypes);
+
+    final SqlOperandMetadata operandMetadata =
+        OperandTypes.operandMetadata(typeFamilies, paramTypesFactory,
+            i -> function.getParameters().get(i).getName(),
+            i -> function.getParameters().get(i).isOptional());
+
+    final SqlKind kind = kind(function);
     if (function instanceof ScalarFunction) {
-      return new SqlUserDefinedFunction(name, infer((ScalarFunction) function),
-          InferTypes.explicit(argTypes), typeChecker, paramTypes, function);
+      final SqlReturnTypeInference returnTypeInference =
+          infer((ScalarFunction) function);
+      return new SqlUserDefinedFunction(name, kind, returnTypeInference,
+          operandTypeInference, operandMetadata, function);
     } else if (function instanceof AggregateFunction) {
-      return new SqlUserDefinedAggFunction(name,
-          infer((AggregateFunction) function), InferTypes.explicit(argTypes),
-          typeChecker, (AggregateFunction) function, false, false,
-          Optionality.FORBIDDEN, typeFactory);
+      final SqlReturnTypeInference returnTypeInference =
+          infer((AggregateFunction) function);
+      return new SqlUserDefinedAggFunction(name, kind,
+          returnTypeInference, operandTypeInference,
+          operandMetadata, (AggregateFunction) function, false, false,
+          Optionality.FORBIDDEN);
     } else if (function instanceof TableMacro) {
-      return new SqlUserDefinedTableMacro(name, ReturnTypes.CURSOR,
-          InferTypes.explicit(argTypes), typeChecker, paramTypes,
-          (TableMacro) function);
+      return new SqlUserDefinedTableMacro(name, kind, ReturnTypes.CURSOR,
+          operandTypeInference, operandMetadata, (TableMacro) function);
     } else if (function instanceof TableFunction) {
-      return new SqlUserDefinedTableFunction(name, ReturnTypes.CURSOR,
-          InferTypes.explicit(argTypes), typeChecker, paramTypes,
-          (TableFunction) function);
+      return new SqlUserDefinedTableFunction(name, kind, ReturnTypes.CURSOR,
+          operandTypeInference, operandMetadata, (TableFunction) function);
     } else {
       throw new AssertionError("unknown function type " + function);
     }
   }
 
+  /** Deduces the {@link org.apache.calcite.sql.SqlKind} of a user-defined
+   * function based on a {@link Hints} annotation, if present. */
+  private static SqlKind kind(org.apache.calcite.schema.Function function) {
+    if (function instanceof ScalarFunctionImpl) {
+      Hints hints =
+          ((ScalarFunctionImpl) function).method.getAnnotation(Hints.class);
+      if (hints != null) {
+        for (String hint : hints.value()) {
+          if (hint.startsWith("SqlKind:")) {
+            return SqlKind.valueOf(hint.substring("SqlKind:".length()));

Review comment:
       I'm not sure I see how the coupling would be increased. I'm really just talking about having a type for the hint.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] michaelmior commented on a change in pull request #2111: [CALCITE-1861] Spatial index, based on Hilbert space-filling curve

Posted by GitBox <gi...@apache.org>.
michaelmior commented on a change in pull request #2111:
URL: https://github.com/apache/calcite/pull/2111#discussion_r475101063



##########
File path: core/src/main/java/org/apache/calcite/rel/rules/SpatialRules.java
##########
@@ -0,0 +1,322 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rel.rules;
+
+import org.apache.calcite.plan.RelOptPredicateList;
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptUtil;
+import org.apache.calcite.plan.RelRule;
+import org.apache.calcite.rel.core.Filter;
+import org.apache.calcite.rex.RexBuilder;
+import org.apache.calcite.rex.RexCall;
+import org.apache.calcite.rex.RexInputRef;
+import org.apache.calcite.rex.RexLiteral;
+import org.apache.calcite.rex.RexNode;
+import org.apache.calcite.rex.RexUtil;
+import org.apache.calcite.runtime.GeoFunctions;
+import org.apache.calcite.runtime.Geometries;
+import org.apache.calcite.runtime.HilbertCurve2D;
+import org.apache.calcite.runtime.SpaceFillingCurve2D;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.fun.SqlStdOperatorTable;
+import org.apache.calcite.tools.RelBuilder;
+
+import com.esri.core.geometry.Envelope;
+import com.esri.core.geometry.Point;
+import com.google.common.collect.ImmutableList;
+
+import java.math.BigDecimal;
+import java.util.ArrayList;
+import java.util.EnumSet;
+import java.util.List;
+
+/**
+ * Collection of planner rules that convert
+ * calls to spatial functions into more efficient expressions.
+ *
+ * <p>The rules allow Calcite to use spatial indexes. For example the following
+ * query:
+ *
+ * <blockquote>SELECT ...
+ * FROM Restaurants AS r
+ * WHERE ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)
+ * </blockquote>
+ *
+ * <p>is rewritten to
+ *
+ * <blockquote>SELECT ...
+ * FROM Restaurants AS r
+ * WHERE (r.h BETWEEN 100 AND 150
+ *        OR r.h BETWEEN 170 AND 185)
+ * AND ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)
+ * </blockquote>
+ *
+ * <p>if there is the constraint
+ *
+ * <blockquote>CHECK (h = Hilbert(8, r.longitude, r.latitude))</blockquote>
+ *
+ * <p>If the {@code Restaurants} table is sorted on {@code h} then the latter
+ * query can be answered using two limited range-scans, and so is much more
+ * efficient.
+ *
+ * <p>Note that the original predicate
+ * {@code ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)}
+ * is still present, but is evaluated after the approximate predicate has
+ * eliminated many potential matches.
+ */
+public abstract class SpatialRules {
+
+  private SpatialRules() {}
+
+  private static final RexUtil.RexFinder DWITHIN_FINDER =
+      RexUtil.find(EnumSet.of(SqlKind.ST_DWITHIN, SqlKind.ST_CONTAINS));
+
+  private static final RexUtil.RexFinder HILBERT_FINDER =
+      RexUtil.find(SqlKind.HILBERT);
+
+  public static final RelOptRule INSTANCE =
+      FilterHilbertRule.Config.DEFAULT.toRule();
+
+  /** Returns a geometry if an expression is constant, null otherwise. */
+  private static Geometries.Geom constantGeom(RexNode e) {
+    switch (e.getKind()) {
+    case CAST:
+      return constantGeom(((RexCall) e).getOperands().get(0));
+    case LITERAL:
+      return (Geometries.Geom) ((RexLiteral) e).getValue();
+    default:
+      return null;
+    }
+  }
+
+  /** Rule that converts ST_DWithin in a Filter condition into a predicate on
+   * a Hilbert curve. */
+  @SuppressWarnings("WeakerAccess")
+  public static class FilterHilbertRule
+      extends RelRule<FilterHilbertRule.Config> {
+    protected FilterHilbertRule(Config config) {
+      super(config);
+    }
+
+    @Override public void onMatch(RelOptRuleCall call) {
+      final Filter filter = call.rel(0);
+      final List<RexNode> conjunctions = new ArrayList<>();
+      RelOptUtil.decomposeConjunction(filter.getCondition(), conjunctions);
+
+      // Match a predicate
+      //   r.hilbert = hilbert(r.longitude, r.latitude)
+      // to one of the conjunctions
+      //   ST_DWithin(ST_Point(x, y), ST_Point(r.longitude, r.latitude), d)
+      // and if it matches add a new conjunction before it,
+      //   r.hilbert between h1 and h2
+      //   or r.hilbert between h3 and h4
+      // where {[h1, h2], [h3, h4]} are the ranges of the Hilbert curve
+      // intersecting the square
+      //   (r.longitude - d, r.latitude - d, r.longitude + d, r.latitude + d)
+      final RelOptPredicateList predicates =
+          call.getMetadataQuery().getAllPredicates(filter.getInput());
+      int changeCount = 0;
+      for (RexNode predicate : predicates.pulledUpPredicates) {
+        final RelBuilder builder = call.builder();
+        if (predicate.getKind() == SqlKind.EQUALS) {
+          final RexCall eqCall = (RexCall) predicate;
+          if (eqCall.operands.get(0) instanceof RexInputRef
+              && eqCall.operands.get(1).getKind() == SqlKind.HILBERT) {
+            final RexInputRef ref  = (RexInputRef) eqCall.operands.get(0);

Review comment:
       My mistake. I didn't realize this was checking in the constraint. My point was that eventually we would ideally want to read metadata from the database to populate those constraints automatically.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] michaelmior commented on a change in pull request #2111: [CALCITE-1861] Spatial index, based on Hilbert space-filling curve

Posted by GitBox <gi...@apache.org>.
michaelmior commented on a change in pull request #2111:
URL: https://github.com/apache/calcite/pull/2111#discussion_r473361767



##########
File path: core/src/main/java/org/apache/calcite/prepare/CalciteCatalogReader.java
##########
@@ -385,7 +410,18 @@ private static RelDataType toSql(RelDataTypeFactory typeFactory,
   }
 
   public List<SqlOperator> getOperatorList() {
-    return null;
+    final ImmutableList.Builder<SqlOperator> b = ImmutableList.builder();

Review comment:
       Can this be called `builder` instead of `b`?

##########
File path: core/src/main/java/org/apache/calcite/prepare/CalciteCatalogReader.java
##########
@@ -285,66 +286,95 @@ public static SqlOperatorTable operatorTable(String... classNames) {
           className, "*", true);
     }
 
-    // The following is technical debt; see [CALCITE-2082] Remove
-    // RelDataTypeFactory argument from SqlUserDefinedAggFunction constructor
-    final SqlTypeFactoryImpl typeFactory =
-        new SqlTypeFactoryImpl(RelDataTypeSystem.DEFAULT);
-
     final ListSqlOperatorTable table = new ListSqlOperatorTable();
     for (String name : schema.getFunctionNames()) {
-      for (Function function : schema.getFunctions(name, true)) {
+      schema.getFunctions(name, true).forEach(function -> {
         final SqlIdentifier id = new SqlIdentifier(name, SqlParserPos.ZERO);
-        table.add(
-            toOp(typeFactory, id, function));
-      }
+        table.add(toOp(id, function));
+      });
     }
     return table;
   }
 
-  private SqlOperator toOp(SqlIdentifier name, final Function function) {
-    return toOp(typeFactory, name, function);
-  }
-
-  /** Converts a function to a {@link org.apache.calcite.sql.SqlOperator}.
-   *
-   * <p>The {@code typeFactory} argument is technical debt; see [CALCITE-2082]
-   * Remove RelDataTypeFactory argument from SqlUserDefinedAggFunction
-   * constructor. */
-  private static SqlOperator toOp(RelDataTypeFactory typeFactory,
-      SqlIdentifier name, final Function function) {
-    List<RelDataType> argTypes = new ArrayList<>();
-    List<SqlTypeFamily> typeFamilies = new ArrayList<>();
-    for (FunctionParameter o : function.getParameters()) {
-      final RelDataType type = o.getType(typeFactory);
-      argTypes.add(type);
-      typeFamilies.add(
-          Util.first(type.getSqlTypeName().getFamily(), SqlTypeFamily.ANY));
-    }
-    final FamilyOperandTypeChecker typeChecker =
-        OperandTypes.family(typeFamilies, i ->
-            function.getParameters().get(i).isOptional());
-    final List<RelDataType> paramTypes = toSql(typeFactory, argTypes);
+  /** Converts a function to a {@link org.apache.calcite.sql.SqlOperator}. */
+  private static SqlOperator toOp(SqlIdentifier name,
+      final org.apache.calcite.schema.Function function) {
+    final Function<RelDataTypeFactory, List<RelDataType>> argTypesFactory =
+        typeFactory -> function.getParameters()
+            .stream()
+            .map(o -> o.getType(typeFactory))
+            .collect(Util.toImmutableList());
+    final Function<RelDataTypeFactory, List<SqlTypeFamily>> typeFamiliesFactory =
+        typeFactory -> argTypesFactory.apply(typeFactory)
+            .stream()
+            .map(type ->
+                Util.first(type.getSqlTypeName().getFamily(),
+                    SqlTypeFamily.ANY))
+            .collect(Util.toImmutableList());
+    final Function<RelDataTypeFactory, List<RelDataType>> paramTypesFactory =
+        typeFactory ->
+            argTypesFactory.apply(typeFactory)
+                .stream()
+                .map(type -> toSql(typeFactory, type))
+                .collect(Util.toImmutableList());
+
+    // Use a short-lived type factory to populate "typeFamilies" and "argTypes".
+    // SqlOperandMetadata.paramTypes will use the real type factory, during
+    // validation.
+    final RelDataTypeFactory dummyTypeFactory = new JavaTypeFactoryImpl();
+    final List<RelDataType> argTypes = argTypesFactory.apply(dummyTypeFactory);
+    final List<SqlTypeFamily> typeFamilies =
+        typeFamiliesFactory.apply(dummyTypeFactory);
+
+    final SqlOperandTypeInference operandTypeInference =
+        InferTypes.explicit(argTypes);
+
+    final SqlOperandMetadata operandMetadata =
+        OperandTypes.operandMetadata(typeFamilies, paramTypesFactory,
+            i -> function.getParameters().get(i).getName(),
+            i -> function.getParameters().get(i).isOptional());
+
+    final SqlKind kind = kind(function);
     if (function instanceof ScalarFunction) {
-      return new SqlUserDefinedFunction(name, infer((ScalarFunction) function),
-          InferTypes.explicit(argTypes), typeChecker, paramTypes, function);
+      final SqlReturnTypeInference returnTypeInference =
+          infer((ScalarFunction) function);
+      return new SqlUserDefinedFunction(name, kind, returnTypeInference,
+          operandTypeInference, operandMetadata, function);
     } else if (function instanceof AggregateFunction) {
-      return new SqlUserDefinedAggFunction(name,
-          infer((AggregateFunction) function), InferTypes.explicit(argTypes),
-          typeChecker, (AggregateFunction) function, false, false,
-          Optionality.FORBIDDEN, typeFactory);
+      final SqlReturnTypeInference returnTypeInference =
+          infer((AggregateFunction) function);
+      return new SqlUserDefinedAggFunction(name, kind,
+          returnTypeInference, operandTypeInference,
+          operandMetadata, (AggregateFunction) function, false, false,
+          Optionality.FORBIDDEN);
     } else if (function instanceof TableMacro) {
-      return new SqlUserDefinedTableMacro(name, ReturnTypes.CURSOR,
-          InferTypes.explicit(argTypes), typeChecker, paramTypes,
-          (TableMacro) function);
+      return new SqlUserDefinedTableMacro(name, kind, ReturnTypes.CURSOR,
+          operandTypeInference, operandMetadata, (TableMacro) function);
     } else if (function instanceof TableFunction) {
-      return new SqlUserDefinedTableFunction(name, ReturnTypes.CURSOR,
-          InferTypes.explicit(argTypes), typeChecker, paramTypes,
-          (TableFunction) function);
+      return new SqlUserDefinedTableFunction(name, kind, ReturnTypes.CURSOR,
+          operandTypeInference, operandMetadata, (TableFunction) function);
     } else {
       throw new AssertionError("unknown function type " + function);
     }
   }
 
+  /** Deduces the {@link org.apache.calcite.sql.SqlKind} of a user-defined
+   * function based on a {@link Hints} annotation, if present. */
+  private static SqlKind kind(org.apache.calcite.schema.Function function) {
+    if (function instanceof ScalarFunctionImpl) {
+      Hints hints =
+          ((ScalarFunctionImpl) function).method.getAnnotation(Hints.class);
+      if (hints != null) {
+        for (String hint : hints.value()) {
+          if (hint.startsWith("SqlKind:")) {
+            return SqlKind.valueOf(hint.substring("SqlKind:".length()));

Review comment:
       Having hints just be plain strings seems like it could up creating a bit of a mess if they start getting used more. As a simple extension, what if a hint was just a pair with the first part being an enum defined in the hints class specifying the type?

##########
File path: core/src/main/java/org/apache/calcite/rel/rules/SpatialRules.java
##########
@@ -0,0 +1,322 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rel.rules;
+
+import org.apache.calcite.plan.RelOptPredicateList;
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptUtil;
+import org.apache.calcite.plan.RelRule;
+import org.apache.calcite.rel.core.Filter;
+import org.apache.calcite.rex.RexBuilder;
+import org.apache.calcite.rex.RexCall;
+import org.apache.calcite.rex.RexInputRef;
+import org.apache.calcite.rex.RexLiteral;
+import org.apache.calcite.rex.RexNode;
+import org.apache.calcite.rex.RexUtil;
+import org.apache.calcite.runtime.GeoFunctions;
+import org.apache.calcite.runtime.Geometries;
+import org.apache.calcite.runtime.HilbertCurve2D;
+import org.apache.calcite.runtime.SpaceFillingCurve2D;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.fun.SqlStdOperatorTable;
+import org.apache.calcite.tools.RelBuilder;
+
+import com.esri.core.geometry.Envelope;
+import com.esri.core.geometry.Point;
+import com.google.common.collect.ImmutableList;
+
+import java.math.BigDecimal;
+import java.util.ArrayList;
+import java.util.EnumSet;
+import java.util.List;
+
+/**
+ * Collection of planner rules that convert
+ * calls to spatial functions into more efficient expressions.
+ *
+ * <p>The rules allow Calcite to use spatial indexes. For example the following
+ * query:
+ *
+ * <blockquote>SELECT ...
+ * FROM Restaurants AS r
+ * WHERE ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)
+ * </blockquote>
+ *
+ * <p>is rewritten to
+ *
+ * <blockquote>SELECT ...
+ * FROM Restaurants AS r
+ * WHERE (r.h BETWEEN 100 AND 150
+ *        OR r.h BETWEEN 170 AND 185)
+ * AND ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)
+ * </blockquote>
+ *
+ * <p>if there is the constraint
+ *
+ * <blockquote>CHECK (h = Hilbert(8, r.longitude, r.latitude))</blockquote>
+ *
+ * <p>If the {@code Restaurants} table is sorted on {@code h} then the latter
+ * query can be answered using two limited range-scans, and so is much more
+ * efficient.
+ *
+ * <p>Note that the original predicate
+ * {@code ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)}
+ * is still present, but is evaluated after the approximate predicate has
+ * eliminated many potential matches.
+ */
+public abstract class SpatialRules {
+
+  private SpatialRules() {}
+
+  private static final RexUtil.RexFinder DWITHIN_FINDER =
+      RexUtil.find(EnumSet.of(SqlKind.ST_DWITHIN, SqlKind.ST_CONTAINS));
+
+  private static final RexUtil.RexFinder HILBERT_FINDER =
+      RexUtil.find(SqlKind.HILBERT);
+
+  public static final RelOptRule INSTANCE =
+      FilterHilbertRule.Config.DEFAULT.toRule();
+
+  /** Returns a geometry if an expression is constant, null otherwise. */
+  private static Geometries.Geom constantGeom(RexNode e) {
+    switch (e.getKind()) {
+    case CAST:
+      return constantGeom(((RexCall) e).getOperands().get(0));
+    case LITERAL:
+      return (Geometries.Geom) ((RexLiteral) e).getValue();
+    default:
+      return null;
+    }
+  }
+
+  /** Rule that converts ST_DWithin in a Filter condition into a predicate on
+   * a Hilbert curve. */
+  @SuppressWarnings("WeakerAccess")
+  public static class FilterHilbertRule
+      extends RelRule<FilterHilbertRule.Config> {
+    protected FilterHilbertRule(Config config) {
+      super(config);
+    }
+
+    @Override public void onMatch(RelOptRuleCall call) {
+      final Filter filter = call.rel(0);
+      final List<RexNode> conjunctions = new ArrayList<>();
+      RelOptUtil.decomposeConjunction(filter.getCondition(), conjunctions);
+
+      // Match a predicate
+      //   r.hilbert = hilbert(r.longitude, r.latitude)
+      // to one of the conjunctions
+      //   ST_DWithin(ST_Point(x, y), ST_Point(r.longitude, r.latitude), d)
+      // and if it matches add a new conjunction before it,
+      //   r.hilbert between h1 and h2
+      //   or r.hilbert between h3 and h4
+      // where {[h1, h2], [h3, h4]} are the ranges of the Hilbert curve
+      // intersecting the square
+      //   (r.longitude - d, r.latitude - d, r.longitude + d, r.latitude + d)
+      final RelOptPredicateList predicates =
+          call.getMetadataQuery().getAllPredicates(filter.getInput());
+      int changeCount = 0;
+      for (RexNode predicate : predicates.pulledUpPredicates) {
+        final RelBuilder builder = call.builder();
+        if (predicate.getKind() == SqlKind.EQUALS) {
+          final RexCall eqCall = (RexCall) predicate;
+          if (eqCall.operands.get(0) instanceof RexInputRef
+              && eqCall.operands.get(1).getKind() == SqlKind.HILBERT) {
+            final RexInputRef ref  = (RexInputRef) eqCall.operands.get(0);

Review comment:
       In the future, it would be nice if this equality condition was not necessary, but I understand this is likely to require DB-specific implementation so it's a reasonable starting point.

##########
File path: core/src/main/java/org/apache/calcite/rel/rules/SpatialRules.java
##########
@@ -0,0 +1,322 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rel.rules;
+
+import org.apache.calcite.plan.RelOptPredicateList;
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptUtil;
+import org.apache.calcite.plan.RelRule;
+import org.apache.calcite.rel.core.Filter;
+import org.apache.calcite.rex.RexBuilder;
+import org.apache.calcite.rex.RexCall;
+import org.apache.calcite.rex.RexInputRef;
+import org.apache.calcite.rex.RexLiteral;
+import org.apache.calcite.rex.RexNode;
+import org.apache.calcite.rex.RexUtil;
+import org.apache.calcite.runtime.GeoFunctions;
+import org.apache.calcite.runtime.Geometries;
+import org.apache.calcite.runtime.HilbertCurve2D;
+import org.apache.calcite.runtime.SpaceFillingCurve2D;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.fun.SqlStdOperatorTable;
+import org.apache.calcite.tools.RelBuilder;
+
+import com.esri.core.geometry.Envelope;
+import com.esri.core.geometry.Point;
+import com.google.common.collect.ImmutableList;
+
+import java.math.BigDecimal;
+import java.util.ArrayList;
+import java.util.EnumSet;
+import java.util.List;
+
+/**
+ * Collection of planner rules that convert
+ * calls to spatial functions into more efficient expressions.
+ *
+ * <p>The rules allow Calcite to use spatial indexes. For example the following
+ * query:
+ *
+ * <blockquote>SELECT ...
+ * FROM Restaurants AS r
+ * WHERE ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)
+ * </blockquote>
+ *
+ * <p>is rewritten to
+ *
+ * <blockquote>SELECT ...
+ * FROM Restaurants AS r
+ * WHERE (r.h BETWEEN 100 AND 150
+ *        OR r.h BETWEEN 170 AND 185)
+ * AND ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)
+ * </blockquote>
+ *
+ * <p>if there is the constraint
+ *
+ * <blockquote>CHECK (h = Hilbert(8, r.longitude, r.latitude))</blockquote>
+ *
+ * <p>If the {@code Restaurants} table is sorted on {@code h} then the latter
+ * query can be answered using two limited range-scans, and so is much more
+ * efficient.
+ *
+ * <p>Note that the original predicate
+ * {@code ST_DWithin(ST_Point(10, 20), ST_Point(r.longitude, r.latitude), 5)}
+ * is still present, but is evaluated after the approximate predicate has
+ * eliminated many potential matches.
+ */
+public abstract class SpatialRules {
+
+  private SpatialRules() {}
+
+  private static final RexUtil.RexFinder DWITHIN_FINDER =
+      RexUtil.find(EnumSet.of(SqlKind.ST_DWITHIN, SqlKind.ST_CONTAINS));
+
+  private static final RexUtil.RexFinder HILBERT_FINDER =
+      RexUtil.find(SqlKind.HILBERT);
+
+  public static final RelOptRule INSTANCE =
+      FilterHilbertRule.Config.DEFAULT.toRule();
+
+  /** Returns a geometry if an expression is constant, null otherwise. */
+  private static Geometries.Geom constantGeom(RexNode e) {
+    switch (e.getKind()) {
+    case CAST:
+      return constantGeom(((RexCall) e).getOperands().get(0));
+    case LITERAL:
+      return (Geometries.Geom) ((RexLiteral) e).getValue();
+    default:
+      return null;
+    }
+  }
+
+  /** Rule that converts ST_DWithin in a Filter condition into a predicate on
+   * a Hilbert curve. */
+  @SuppressWarnings("WeakerAccess")
+  public static class FilterHilbertRule
+      extends RelRule<FilterHilbertRule.Config> {
+    protected FilterHilbertRule(Config config) {
+      super(config);
+    }
+
+    @Override public void onMatch(RelOptRuleCall call) {
+      final Filter filter = call.rel(0);
+      final List<RexNode> conjunctions = new ArrayList<>();
+      RelOptUtil.decomposeConjunction(filter.getCondition(), conjunctions);
+
+      // Match a predicate
+      //   r.hilbert = hilbert(r.longitude, r.latitude)
+      // to one of the conjunctions
+      //   ST_DWithin(ST_Point(x, y), ST_Point(r.longitude, r.latitude), d)
+      // and if it matches add a new conjunction before it,
+      //   r.hilbert between h1 and h2
+      //   or r.hilbert between h3 and h4
+      // where {[h1, h2], [h3, h4]} are the ranges of the Hilbert curve
+      // intersecting the square
+      //   (r.longitude - d, r.latitude - d, r.longitude + d, r.latitude + d)
+      final RelOptPredicateList predicates =
+          call.getMetadataQuery().getAllPredicates(filter.getInput());
+      int changeCount = 0;
+      for (RexNode predicate : predicates.pulledUpPredicates) {
+        final RelBuilder builder = call.builder();
+        if (predicate.getKind() == SqlKind.EQUALS) {
+          final RexCall eqCall = (RexCall) predicate;
+          if (eqCall.operands.get(0) instanceof RexInputRef
+              && eqCall.operands.get(1).getKind() == SqlKind.HILBERT) {
+            final RexInputRef ref  = (RexInputRef) eqCall.operands.get(0);
+            final RexCall hilbert = (RexCall) eqCall.operands.get(1);
+            final RexUtil.RexFinder finder = RexUtil.find(ref);
+            if (finder.anyContain(conjunctions)) {
+              // If the condition already contains "ref", it is probable that
+              // this rule has already fired once.
+              continue;
+            }
+            for (int i = 0; i < conjunctions.size();) {
+              final List<RexNode> replacements =
+                  replaceSpatial(conjunctions.get(i), builder, ref, hilbert);
+              if (replacements != null) {
+                conjunctions.remove(i);
+                conjunctions.addAll(i, replacements);
+                i += replacements.size();
+                ++changeCount;
+              } else {
+                ++i;
+              }
+            }
+          }
+        }
+        if (changeCount > 0) {
+          call.transformTo(
+              builder.push(filter.getInput())
+                  .filter(conjunctions)
+                  .build());
+          return; // we found one useful constraint; don't look for more

Review comment:
       I'm sure there's a good reason, but it's not immediately obvious to me why you wouldn't want to keep looking.

##########
File path: core/src/main/java/org/apache/calcite/rel/RelReferentialConstraint.java
##########
@@ -27,10 +27,13 @@
 public interface RelReferentialConstraint {
   //~ Methods ----------------------------------------------------------------
 
-  /**
-   * Returns the number of columns in the keys.
-   */
-  int getNumColumns();
+  /** Returns the number of columns in the keys.
+   *
+   * @deprecated Use {@code getColumnPairs().size()} */
+  @Deprecated // to be removed before 2.0
+  default int getNumColumns() {
+    return getColumnPairs().size();
+  }

Review comment:
       This deprecation isn't really part of this PR and putting a deprecation in here probably means it won't end up in the release notes. I don't imagine this is widely used though, so maybe it doesn't really matter.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] julianhyde commented on a change in pull request #2111: [CALCITE-1861] Spatial index, based on Hilbert space-filling curve

Posted by GitBox <gi...@apache.org>.
julianhyde commented on a change in pull request #2111:
URL: https://github.com/apache/calcite/pull/2111#discussion_r474183958



##########
File path: core/src/main/java/org/apache/calcite/prepare/CalciteCatalogReader.java
##########
@@ -385,7 +410,18 @@ private static RelDataType toSql(RelDataTypeFactory typeFactory,
   }
 
   public List<SqlOperator> getOperatorList() {
-    return null;
+    final ImmutableList.Builder<SqlOperator> b = ImmutableList.builder();

Review comment:
       Done. I thought `b` would be OK because its uses are all near to its declaration.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] julianhyde commented on a change in pull request #2111: [CALCITE-1861] Spatial index, based on Hilbert space-filling curve

Posted by GitBox <gi...@apache.org>.
julianhyde commented on a change in pull request #2111:
URL: https://github.com/apache/calcite/pull/2111#discussion_r474186421



##########
File path: core/src/main/java/org/apache/calcite/rel/RelReferentialConstraint.java
##########
@@ -27,10 +27,13 @@
 public interface RelReferentialConstraint {
   //~ Methods ----------------------------------------------------------------
 
-  /**
-   * Returns the number of columns in the keys.
-   */
-  int getNumColumns();
+  /** Returns the number of columns in the keys.
+   *
+   * @deprecated Use {@code getColumnPairs().size()} */
+  @Deprecated // to be removed before 2.0
+  default int getNumColumns() {
+    return getColumnPairs().size();
+  }

Review comment:
       We need to put breaking changes in the release notes but I don't regard deprecations as breaking changes.
   
   I've moved this change (and related changes in `MaterializedViewRule`) into a 'refactor' commit.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] michaelmior commented on a change in pull request #2111: [CALCITE-1861] Spatial index, based on Hilbert space-filling curve

Posted by GitBox <gi...@apache.org>.
michaelmior commented on a change in pull request #2111:
URL: https://github.com/apache/calcite/pull/2111#discussion_r474193124



##########
File path: core/src/main/java/org/apache/calcite/prepare/CalciteCatalogReader.java
##########
@@ -385,7 +410,18 @@ private static RelDataType toSql(RelDataTypeFactory typeFactory,
   }
 
   public List<SqlOperator> getOperatorList() {
-    return null;
+    final ImmutableList.Builder<SqlOperator> b = ImmutableList.builder();

Review comment:
       I assumed as much, but IMO it's used few enough times that it's worth being explicit.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org