You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@calcite.apache.org by GitBox <gi...@apache.org> on 2020/11/18 22:17:51 UTC

[GitHub] [calcite] jamesstarr opened a new pull request #2271: CALCITE-4408 Adding an operand type checker for the Oracle substring …

jamesstarr opened a new pull request #2271:
URL: https://github.com/apache/calcite/pull/2271


   …opereator(James Starr)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] jamesstarr commented on pull request #2271: CALCITE-4408 Adding an operand type checker for the Oracle substring …

Posted by GitBox <gi...@apache.org>.
jamesstarr commented on pull request #2271:
URL: https://github.com/apache/calcite/pull/2271#issuecomment-738236166


   If this looks good, can I get this merged?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] julianhyde commented on pull request #2271: CALCITE-4408 Adding an operand type checker for the Oracle substring …

Posted by GitBox <gi...@apache.org>.
julianhyde commented on pull request #2271:
URL: https://github.com/apache/calcite/pull/2271#issuecomment-730682073


   If Oracle's `SUBSTR` has very different semantics, treat it as a different function. Consider calling it `ORACLE_SUBSTR` or something so that no one gets confused. Adding a method to `SqlFunctions` may make things a lot more concise.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] jamesstarr commented on pull request #2271: CALCITE-4408 Adding an operand type checker for the Oracle substring …

Posted by GitBox <gi...@apache.org>.
jamesstarr commented on pull request #2271:
URL: https://github.com/apache/calcite/pull/2271#issuecomment-730704266


   @julianhyde, I renamed SqlLibraryOperators.SUBSTR to SqlLibraryOperators.ORACLE_SUBSTR and I have updated the comment.
   
   If I understand you other comments, you are saying the implementation would be straight forward to add a function SqlFunctions and then bind it in BuiltInMethod, so it works for the unit tests.  And the benefit of doing this would be that it is more concise and if you translated it back sql, the native function would still used.  The disadvantages are that non-oracle targets would have to implement the function themselves and simplification/optimization could not occur since it is not part of the calcites canonical form. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] julianhyde commented on a change in pull request #2271: CALCITE-4408 Adding an operand type checker for the Oracle substring …

Posted by GitBox <gi...@apache.org>.
julianhyde commented on a change in pull request #2271:
URL: https://github.com/apache/calcite/pull/2271#discussion_r535566439



##########
File path: core/src/main/java/org/apache/calcite/sql2rel/StandardConvertletTable.java
##########
@@ -1506,6 +1508,83 @@ private SqlNode getCastedSqlNode(SqlNode argInput, RelDataType varType,
     }
   }
 
+  /** Convertlet that handles Oracle's {@code SUBSTR} function. */
+  private static class OracleSubstrConvertlet implements SqlRexConvertlet {
+    @Override public RexNode convertCall(SqlRexContext cx, SqlCall call) {
+      // Translate
+      //   SUBSTR(value, start, length)
+      // to
+      //   CASE
+      //   WHEN length < 0 THEN ""
+      //   WHEN start + length(value) < 0 THEN ""
+      //   WHEN start = 0 THEN substring(value, 1, length)
+      //   WHEN start < 0 THEN substring(value, start + length(value) + 1, length)
+      //   ELSE substring(value, start, length)
+      //   END
+      //
+      // Translate
+      //   SUBSTR(value, start)
+      // to
+      //   CASE
+      //   WHEN start + length(value) < 0 THEN ""
+      //   WHEN start = 0 THEN substring(value, 1)
+      //   WHEN start < 0 THEN substring(value, start + length(value) + 1)
+      //   ELSE substring(value, start)
+      //   END
+
+      final RexBuilder rexBuilder = cx.getRexBuilder();
+      final List<RexNode> exprs =
+          convertOperands(cx, call, SqlOperandTypeChecker.Consistency.NONE);
+      final RexNode value = exprs.get(0);
+      final RexNode start = exprs.get(1);
+      final RexNode zeroLiteral = rexBuilder.makeLiteral(0, start.getType(), false);
+      final RexNode oneLiteral = rexBuilder.makeLiteral(1, start.getType(), false);
+      final RexNode emptyStrLiteral = rexBuilder.makeLiteral("", value.getType(), false);
+
+      final RexNode valueLength =
+          rexBuilder.makeCall(SqlStdOperatorTable.CHAR_LENGTH, value);
+      final RexNode negativeIndexStart = rexBuilder.makeCall(SqlStdOperatorTable.PLUS,
+          rexBuilder.makeCall(SqlStdOperatorTable.PLUS, valueLength, start),
+          oneLiteral);
+
+      final RexNode startIndexOutOfBound = rexBuilder.makeCall(SqlStdOperatorTable.LESS_THAN,
+          rexBuilder.makeCall(SqlStdOperatorTable.PLUS, start, valueLength),
+          zeroLiteral);
+      final RexNode startLessThanZero =
+          rexBuilder.makeCall(SqlStdOperatorTable.LESS_THAN, start, zeroLiteral);
+      final RexNode startIsZero =
+          rexBuilder.makeCall(SqlStdOperatorTable.EQUALS, start, zeroLiteral);
+
+      final Function<RexNode, RexNode> startToSubstr;
+      switch (call.operandCount()) {
+      case 2:
+        startToSubstr = startRex ->
+            rexBuilder.makeCall(SqlStdOperatorTable.SUBSTRING, value, startRex);
+        return rexBuilder.makeCall(SqlStdOperatorTable.CASE,
+            startIndexOutOfBound, emptyStrLiteral,
+            startIsZero, startToSubstr.apply(oneLiteral),
+            startLessThanZero, startToSubstr.apply(negativeIndexStart),
+            startToSubstr.apply(start));
+      case 3:
+        final RexNode length = exprs.get(2);
+        final RexNode lengthLessThanZero =
+            rexBuilder.makeCall(SqlStdOperatorTable.LESS_THAN, length, zeroLiteral);
+
+        startToSubstr = startRex ->
+            rexBuilder.makeCall(SqlStdOperatorTable.SUBSTRING, value, startRex, length);
+
+        return rexBuilder.makeCall(SqlStdOperatorTable.CASE,
+            lengthLessThanZero, emptyStrLiteral,
+            startIndexOutOfBound, emptyStrLiteral,
+            startIsZero, startToSubstr.apply(oneLiteral),
+            startLessThanZero, startToSubstr.apply(negativeIndexStart),
+            startToSubstr.apply(start));
+      default:
+        throw new AssertionError();

Review comment:
       I don't think that's necessary.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] julianhyde commented on pull request #2271: CALCITE-4408 Adding an operand type checker for the Oracle substring …

Posted by GitBox <gi...@apache.org>.
julianhyde commented on pull request #2271:
URL: https://github.com/apache/calcite/pull/2271#issuecomment-738985305


   @jamesstarr The purpose of this bug (as I see it, and as evidenced by the test cases in this PR) is to make sure that the Oracle `SUBSTR` function returns the correct results when run in Calcite's native mode.
   
   I acknowledge that I ignored most of your most recent commit. I don't think that it improved things. Your change was based based on the idea that Calcite's standard `SUBSTRING` operator would have different semantics depending on the SQL engine that it is implemented on. That is a deeply flawed assumption.
   
   Let's log a follow-up bug (or bugs) to make sure that `SUBSTRING`, and Oracle `SUBSTR`, function correctly when we generate SQL against other engines, and make sure that the fixes include test cases.
   
   I have already logged https://issues.apache.org/jira/browse/CALCITE-4427 to restore standards-compliant behavior to the SUBSTRING operator and I am working on it.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] julianhyde commented on a change in pull request #2271: CALCITE-4408 Adding an operand type checker for the Oracle substring …

Posted by GitBox <gi...@apache.org>.
julianhyde commented on a change in pull request #2271:
URL: https://github.com/apache/calcite/pull/2271#discussion_r526469745



##########
File path: core/src/test/java/org/apache/calcite/sql/test/SqlOperatorBaseTest.java
##########
@@ -2186,6 +2186,13 @@ protected static Calendar getCalendarNotTooNear(int timeUnit) {
     checkConcat2Func(tester(SqlLibrary.ORACLE));
   }
 
+  @Test void testSubString() {

Review comment:
       It looks as if this function was never tested. Thanks for fixing the hole.
   
   Rename the test method to `testOracleSubstrFunction` and move it next to `testSubstringFunction`.
   
   Test a few more cases - e.g. indexes out of bounds, and null values, and CHAR argument.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] julianhyde commented on pull request #2271: CALCITE-4408 Adding an operand type checker for the Oracle substring …

Posted by GitBox <gi...@apache.org>.
julianhyde commented on pull request #2271:
URL: https://github.com/apache/calcite/pull/2271#issuecomment-730680731


   Let's just suppose for a minute that Oracle's SUBSTR function had the same semantics as standard SUBSTRING, just different syntax. Then it is not a goal to keep it intact through Calcite's planning process. Calcite's philosophy is to convert to a minimal language. That creates opportunities for simplification/optimization.
   
   If Oracle is the target, then we should translate SUBSTRING back to SUBSTR when we generate Oracle SQL. But it is not necessarily the target.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on pull request #2271: CALCITE-4408 Adding an operand type checker for the Oracle substring …

Posted by GitBox <gi...@apache.org>.
vlsi commented on pull request #2271:
URL: https://github.com/apache/calcite/pull/2271#issuecomment-730653973


   > I.e. treat Oracle's function as syntactic sugar
   
   That would not work if downstream Oracle DB has a functional-based index with `SUBSTRING`.
   If Calcite transforms `SUBSTRING` into `SUBSTR`, then the query won't use the index.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] julianhyde commented on pull request #2271: CALCITE-4408 Adding an operand type checker for the Oracle substring …

Posted by GitBox <gi...@apache.org>.
julianhyde commented on pull request #2271:
URL: https://github.com/apache/calcite/pull/2271#issuecomment-730682951


   Also, correct the javadoc "It has similar semantics to standard SQL's {@link SqlStdOperatorTable#SUBSTRING} function but different syntax". That is a wrong assumption.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] jamesstarr commented on pull request #2271: CALCITE-4408 Adding an operand type checker for the Oracle substring …

Posted by GitBox <gi...@apache.org>.
jamesstarr commented on pull request #2271:
URL: https://github.com/apache/calcite/pull/2271#issuecomment-738933784


   @julianhyde @vlsi Why was commit that allowed the substr to be correctly executed on different sql engines not included?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] jamesstarr commented on a change in pull request #2271: CALCITE-4408 Adding an operand type checker for the Oracle substring …

Posted by GitBox <gi...@apache.org>.
jamesstarr commented on a change in pull request #2271:
URL: https://github.com/apache/calcite/pull/2271#discussion_r535574727



##########
File path: core/src/main/java/org/apache/calcite/sql2rel/StandardConvertletTable.java
##########
@@ -1506,6 +1508,83 @@ private SqlNode getCastedSqlNode(SqlNode argInput, RelDataType varType,
     }
   }
 
+  /** Convertlet that handles Oracle's {@code SUBSTR} function. */
+  private static class OracleSubstrConvertlet implements SqlRexConvertlet {
+    @Override public RexNode convertCall(SqlRexContext cx, SqlCall call) {
+      // Translate
+      //   SUBSTR(value, start, length)
+      // to
+      //   CASE
+      //   WHEN length < 0 THEN ""
+      //   WHEN start + length(value) < 0 THEN ""
+      //   WHEN start = 0 THEN substring(value, 1, length)
+      //   WHEN start < 0 THEN substring(value, start + length(value) + 1, length)
+      //   ELSE substring(value, start, length)
+      //   END
+      //
+      // Translate
+      //   SUBSTR(value, start)
+      // to
+      //   CASE
+      //   WHEN start + length(value) < 0 THEN ""
+      //   WHEN start = 0 THEN substring(value, 1)
+      //   WHEN start < 0 THEN substring(value, start + length(value) + 1)
+      //   ELSE substring(value, start)
+      //   END
+
+      final RexBuilder rexBuilder = cx.getRexBuilder();
+      final List<RexNode> exprs =
+          convertOperands(cx, call, SqlOperandTypeChecker.Consistency.NONE);
+      final RexNode value = exprs.get(0);
+      final RexNode start = exprs.get(1);
+      final RexNode zeroLiteral = rexBuilder.makeLiteral(0, start.getType(), false);
+      final RexNode oneLiteral = rexBuilder.makeLiteral(1, start.getType(), false);
+      final RexNode emptyStrLiteral = rexBuilder.makeLiteral("", value.getType(), false);
+
+      final RexNode valueLength =
+          rexBuilder.makeCall(SqlStdOperatorTable.CHAR_LENGTH, value);
+      final RexNode negativeIndexStart = rexBuilder.makeCall(SqlStdOperatorTable.PLUS,
+          rexBuilder.makeCall(SqlStdOperatorTable.PLUS, valueLength, start),
+          oneLiteral);
+
+      final RexNode startIndexOutOfBound = rexBuilder.makeCall(SqlStdOperatorTable.LESS_THAN,
+          rexBuilder.makeCall(SqlStdOperatorTable.PLUS, start, valueLength),
+          zeroLiteral);
+      final RexNode startLessThanZero =
+          rexBuilder.makeCall(SqlStdOperatorTable.LESS_THAN, start, zeroLiteral);
+      final RexNode startIsZero =
+          rexBuilder.makeCall(SqlStdOperatorTable.EQUALS, start, zeroLiteral);
+
+      final Function<RexNode, RexNode> startToSubstr;
+      switch (call.operandCount()) {
+      case 2:
+        startToSubstr = startRex ->
+            rexBuilder.makeCall(SqlStdOperatorTable.SUBSTRING, value, startRex);
+        return rexBuilder.makeCall(SqlStdOperatorTable.CASE,
+            startIndexOutOfBound, emptyStrLiteral,
+            startIsZero, startToSubstr.apply(oneLiteral),
+            startLessThanZero, startToSubstr.apply(negativeIndexStart),
+            startToSubstr.apply(start));
+      case 3:
+        final RexNode length = exprs.get(2);
+        final RexNode lengthLessThanZero =
+            rexBuilder.makeCall(SqlStdOperatorTable.LESS_THAN, length, zeroLiteral);
+
+        startToSubstr = startRex ->
+            rexBuilder.makeCall(SqlStdOperatorTable.SUBSTRING, value, startRex, length);
+
+        return rexBuilder.makeCall(SqlStdOperatorTable.CASE,
+            lengthLessThanZero, emptyStrLiteral,
+            startIndexOutOfBound, emptyStrLiteral,
+            startIsZero, startToSubstr.apply(oneLiteral),
+            startLessThanZero, startToSubstr.apply(negativeIndexStart),
+            startToSubstr.apply(start));
+      default:
+        throw new AssertionError();

Review comment:
       Done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] asfgit closed pull request #2271: CALCITE-4408 Adding an operand type checker for the Oracle substring …

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #2271:
URL: https://github.com/apache/calcite/pull/2271


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2271: CALCITE-4408 Adding an operand type checker for the Oracle substring …

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2271:
URL: https://github.com/apache/calcite/pull/2271#discussion_r535559413



##########
File path: core/src/main/java/org/apache/calcite/sql2rel/StandardConvertletTable.java
##########
@@ -1506,6 +1508,83 @@ private SqlNode getCastedSqlNode(SqlNode argInput, RelDataType varType,
     }
   }
 
+  /** Convertlet that handles Oracle's {@code SUBSTR} function. */
+  private static class OracleSubstrConvertlet implements SqlRexConvertlet {
+    @Override public RexNode convertCall(SqlRexContext cx, SqlCall call) {
+      // Translate
+      //   SUBSTR(value, start, length)
+      // to
+      //   CASE
+      //   WHEN length < 0 THEN ""
+      //   WHEN start + length(value) < 0 THEN ""
+      //   WHEN start = 0 THEN substring(value, 1, length)
+      //   WHEN start < 0 THEN substring(value, start + length(value) + 1, length)
+      //   ELSE substring(value, start, length)
+      //   END
+      //
+      // Translate
+      //   SUBSTR(value, start)
+      // to
+      //   CASE
+      //   WHEN start + length(value) < 0 THEN ""
+      //   WHEN start = 0 THEN substring(value, 1)
+      //   WHEN start < 0 THEN substring(value, start + length(value) + 1)
+      //   ELSE substring(value, start)
+      //   END
+
+      final RexBuilder rexBuilder = cx.getRexBuilder();
+      final List<RexNode> exprs =
+          convertOperands(cx, call, SqlOperandTypeChecker.Consistency.NONE);
+      final RexNode value = exprs.get(0);
+      final RexNode start = exprs.get(1);
+      final RexNode zeroLiteral = rexBuilder.makeLiteral(0, start.getType(), false);
+      final RexNode oneLiteral = rexBuilder.makeLiteral(1, start.getType(), false);
+      final RexNode emptyStrLiteral = rexBuilder.makeLiteral("", value.getType(), false);
+
+      final RexNode valueLength =
+          rexBuilder.makeCall(SqlStdOperatorTable.CHAR_LENGTH, value);
+      final RexNode negativeIndexStart = rexBuilder.makeCall(SqlStdOperatorTable.PLUS,
+          rexBuilder.makeCall(SqlStdOperatorTable.PLUS, valueLength, start),
+          oneLiteral);
+
+      final RexNode startIndexOutOfBound = rexBuilder.makeCall(SqlStdOperatorTable.LESS_THAN,
+          rexBuilder.makeCall(SqlStdOperatorTable.PLUS, start, valueLength),
+          zeroLiteral);
+      final RexNode startLessThanZero =
+          rexBuilder.makeCall(SqlStdOperatorTable.LESS_THAN, start, zeroLiteral);
+      final RexNode startIsZero =
+          rexBuilder.makeCall(SqlStdOperatorTable.EQUALS, start, zeroLiteral);
+
+      final Function<RexNode, RexNode> startToSubstr;
+      switch (call.operandCount()) {
+      case 2:
+        startToSubstr = startRex ->
+            rexBuilder.makeCall(SqlStdOperatorTable.SUBSTRING, value, startRex);
+        return rexBuilder.makeCall(SqlStdOperatorTable.CASE,
+            startIndexOutOfBound, emptyStrLiteral,
+            startIsZero, startToSubstr.apply(oneLiteral),
+            startLessThanZero, startToSubstr.apply(negativeIndexStart),
+            startToSubstr.apply(start));
+      case 3:
+        final RexNode length = exprs.get(2);
+        final RexNode lengthLessThanZero =
+            rexBuilder.makeCall(SqlStdOperatorTable.LESS_THAN, length, zeroLiteral);
+
+        startToSubstr = startRex ->
+            rexBuilder.makeCall(SqlStdOperatorTable.SUBSTRING, value, startRex, length);
+
+        return rexBuilder.makeCall(SqlStdOperatorTable.CASE,
+            lengthLessThanZero, emptyStrLiteral,
+            startIndexOutOfBound, emptyStrLiteral,
+            startIsZero, startToSubstr.apply(oneLiteral),
+            startLessThanZero, startToSubstr.apply(negativeIndexStart),
+            startToSubstr.apply(start));
+      default:
+        throw new AssertionError();

Review comment:
       Please add the relevant message to clarify why this branch must not be reachable.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] julianhyde commented on pull request #2271: CALCITE-4408 Adding an operand type checker for the Oracle substring …

Posted by GitBox <gi...@apache.org>.
julianhyde commented on pull request #2271:
URL: https://github.com/apache/calcite/pull/2271#issuecomment-731458932


   @jamesstarr I've thought about this some more, and I now think your approach of expanding Oracle SUBSTR calls to standard SUBSTRING calls + CASE is better than adding a new method to SqlFunctions.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] jamesstarr commented on pull request #2271: CALCITE-4408 Adding an operand type checker for the Oracle substring …

Posted by GitBox <gi...@apache.org>.
jamesstarr commented on pull request #2271:
URL: https://github.com/apache/calcite/pull/2271#issuecomment-730663893


   From what I have read, oracles substr has a different behavior then the standard sql substr.  Particularly, that is can return null if a 0 or a negative length is passed in, it translates the 0 start index to 1, and if the start index is negative then the start indexes from the end.
   
   @julianhyde, are you suggesting simply aliasing substr to substring?  Because that will not work due to the above reasons.   If you are suggesting using an api that would intercept the arguments at compile time, that would not work due to function arguments, as @vlsi has already stated.  Are you suggesting using a different API to implement the above logic, then could you please point me to class, so I can implement it in the correct place. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org