You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2022/03/03 04:47:10 UTC

[GitHub] [drill] paul-rogers commented on a change in pull request #2461: DRILL-8136: Overload existing Math UDFs to allow for VARCHAR input

paul-rogers commented on a change in pull request #2461:
URL: https://github.com/apache/drill/pull/2461#discussion_r818306299



##########
File path: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/MathFunctions.java
##########
@@ -33,22 +33,40 @@
 import org.apache.drill.exec.expr.holders.VarCharHolder;
 
 public class MathFunctions {
-
+/*
   @FunctionTemplate(name = "power", scope = FunctionScope.SIMPLE, nulls = NullHandling.NULL_IF_NULL)
   public static class Power implements DrillSimpleFunc{
 
-    @Param Float8Holder a;
-    @Param Float8Holder b;
+    @Param VarCharHolder a;
+    @Param VarCharHolder b;
     @Output  Float8Holder out;
 
-    public void setup(){}
+    @Workspace org.apache.drill.exec.expr.fn.impl.MathFunctionsVarcharUtils mathUtils;
 
-    public void eval(){
-      out.value = java.lang.Math.pow(a.value, b.value);
+    public void setup(){
+      mathUtils = new MathFunctionsVarcharUtils();

Review comment:
       Would recommend using static functions rather an instance and member variables.

##########
File path: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/MathFunctions.java
##########
@@ -33,22 +33,40 @@
 import org.apache.drill.exec.expr.holders.VarCharHolder;
 
 public class MathFunctions {
-
+/*

Review comment:
       Why is this commented out and why does it replace the existing Float8 implementation?

##########
File path: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/MathFunctionsVarcharInput.java
##########
@@ -0,0 +1,569 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.expr.fn.impl;
+
+
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.annotations.Workspace;
+import org.apache.drill.exec.expr.holders.Float8Holder;
+import org.apache.drill.exec.expr.holders.IntHolder;
+import org.apache.drill.exec.expr.holders.NullableFloat8Holder;
+import org.apache.drill.exec.expr.holders.NullableVarCharHolder;
+import org.apache.drill.exec.expr.holders.VarCharHolder;
+
+public class MathFunctionsVarcharInput {
+  /*

Review comment:
       Why are we adding new code which is commented out?

##########
File path: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/MathFunctionsVarcharInput.java
##########
@@ -0,0 +1,569 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.expr.fn.impl;
+
+
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.annotations.Workspace;
+import org.apache.drill.exec.expr.holders.Float8Holder;
+import org.apache.drill.exec.expr.holders.IntHolder;
+import org.apache.drill.exec.expr.holders.NullableFloat8Holder;
+import org.apache.drill.exec.expr.holders.NullableVarCharHolder;
+import org.apache.drill.exec.expr.holders.VarCharHolder;
+
+public class MathFunctionsVarcharInput {
+  /*
+  @FunctionTemplate(name = "rand", isRandom = true,
+    scope = FunctionTemplate.FunctionScope.SIMPLE, nulls = FunctionTemplate.NullHandling.NULL_IF_NULL)
+  public static class RandomWithSeed implements DrillSimpleFunc{
+    @Param
+    VarCharHolder seed;
+    @Workspace
+    java.util.Random rand;
+    @Output  Float8Holder out;
+
+    @Workspace org.apache.drill.exec.expr.fn.impl.MathFunctionsVarcharUtils mathUtils;
+
+    public void setup(){
+      mathUtils = new MathFunctionsVarcharUtils();
+    }
+
+    public void eval(){
+      String seedStr = org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.getStringFromVarCharHolder(seed);
+      Double seedDbl = mathUtils.validateInput(seedStr);
+
+      if (seedDbl.isNaN()) {
+        out.value = java.lang.Float.NaN;
+      }
+      else {
+        BigInteger seedBigInt = new java.math.BigInteger(String.valueOf(seedDbl));
+        rand = new java.util.Random(seedBigInt.intValue());
+        out.value = rand.nextDouble();
+      }
+    }
+  }
+  /*
+    // The preexisting power() method accepts VARCHAR input, but this method adds a check for invalid input
+    @FunctionTemplate(name = "power", scope = FunctionTemplate.FunctionScope.SIMPLE, nulls = FunctionTemplate.NullHandling.NULL_IF_NULL)
+    public static class Power implements DrillSimpleFunc{
+
+      @Param VarCharHolder a;
+      @Param VarCharHolder b;
+      @Output  Float8Holder out;
+
+      @Workspace org.apache.drill.exec.expr.fn.impl.MathFunctionsVarcharUtils mathUtils;
+
+      public void setup(){
+        mathUtils = new MathFunctionsVarcharUtils();
+      }
+
+      public void eval(){
+        String aStr = org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.getStringFromVarCharHolder(a);
+        Double aDbl = mathUtils.validateInput(aStr);
+
+        String bStr = org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.getStringFromVarCharHolder(b);
+        Double bDbl = mathUtils.validateInput(bStr);
+
+        if (aDbl.isNaN() || bDbl.isNaN()) {
+          out.value = java.lang.Float.NaN;
+        }
+        else {
+          out.value = java.lang.Math.pow(aDbl, bDbl);
+        }
+
+        System.out.println("out.value: " + out.value);
+      }
+
+    }
+
+    // The preexisting power() method accepts VARCHAR input, but this method adds a check for invalid input
+    @FunctionTemplate(name = "power", scope = FunctionTemplate.FunctionScope.SIMPLE, nulls = FunctionTemplate.NullHandling.NULL_IF_NULL)
+    public static class Power implements DrillSimpleFunc{
+
+      @Param VarCharHolder a;
+      @Param Float8Holder b;
+      @Output  Float8Holder out;
+
+      @Workspace org.apache.drill.exec.expr.fn.impl.MathFunctionsVarcharUtils mathUtils;
+
+      public void setup(){
+        mathUtils = new MathFunctionsVarcharUtils();
+      }
+
+      public void eval(){
+        String aStr = org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.getStringFromVarCharHolder(a);
+        Double aDbl = mathUtils.validateInput(aStr);
+
+        if (aDbl.isNaN()) {
+          out.value = java.lang.Float.NaN;
+        }
+        else {
+          out.value = java.lang.Math.pow(aDbl, b);
+        }
+
+        System.out.println("out.value: " + out.value);
+      }
+
+    }
+
+   */
+  @FunctionTemplate(name = "mod",
+    scope = FunctionTemplate.FunctionScope.SIMPLE,
+    nulls = FunctionTemplate.NullHandling.NULL_IF_NULL)
+  public static class Mod implements DrillSimpleFunc {
+
+    @Param
+    VarCharHolder a;

Review comment:
       These functions are a bit tricky: I wonder if there is a bit of misunderstanding. Drill uses a typed system at execution time. Each math function has declared input types: BIGINT, DOUBLE, VARDECIMAL, etc. The type promotion system will pick the "narrowest" type that is compatible with both arguments, then will cast both arguments, as needed to widen them to the target type. So MOD(BIGINT, BIGINT) is compatible with MOD(B, I) where B is a BIGINT, and I is INT. I is then cast to BIGINT.
   
   As a result, we generate MANY overrides of the same function: one for each of the numeric "families": BIGINT, DOUBLE, VARDECIMAL, with sometimes others thrown in as well, such as maybe an INT version.
   
   So, let's consider this improvement. We want a MOD(VARCHAR, X). What is X? In this bit of code, X is VARCHAR, which matches MOD("10.3", "20"). Cool. But, what about MOD("10.3", 20)? I don't think Drill automatically promotes from numeric types to VARCHAR. (Didn't test, but few SQL systems do.) So, you'd also want MOD(VARCHAR, VARDECIMAL), so the second argument can be cast from the integer and floating point types. To heavy-weight, then add MOD(VARCHAR, BIGINT) and MOD(VARCHAR, DOUBLE). Now the bases are covered.
   
   Except, not yet. The above only handles the second argument. What about the first? So, we also need MOD(BIGINT, VARCHAR), MOD(DOUBLE, VARCHAR), MOD(VARDECIMAL, VARCHAR).
   
   That's a total of 7 implementations for each two-argument math function!
   
   If we did that, I'd almost suggest we break SQL type casting rules and define automatic conversions from VARCHAR to any numeric type. We pick the type based on the auto-cast of VARCHAR (to DOUBLE, say) and any numeric argument. In this case, you get to reuse all the existing math functions for free: no need to change the casting system the hard way.
   
   The result would no longer be SQL compatible. So, maybe a system/session option to enable "Python mode" when needed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org