You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2021/05/19 13:12:21 UTC

[GitHub] [drill] cgivre commented on a change in pull request #2227: DRILL-7928: Add fourth parameter for split_part udf

cgivre commented on a change in pull request #2227:
URL: https://github.com/apache/drill/pull/2227#discussion_r635212683



##########
File path: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/StringFunctions.java
##########
@@ -434,6 +434,64 @@ public void eval() {
 
   }
 
+  @FunctionTemplate(name = "split_part", scope = FunctionScope.SIMPLE, nulls = NullHandling.NULL_IF_NULL,

Review comment:
       Could you please add a javadoc describing the function and usage?  Copying from the other PR with the docs is fine.

##########
File path: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/StringFunctions.java
##########
@@ -434,6 +434,64 @@ public void eval() {
 
   }
 
+  @FunctionTemplate(name = "split_part", scope = FunctionScope.SIMPLE, nulls = NullHandling.NULL_IF_NULL,
+    outputWidthCalculatorType = OutputWidthCalculatorType.CUSTOM_FIXED_WIDTH_DEFAULT)
+  public static class SplitPartStartEnd implements DrillSimpleFunc {
+    @Param
+    VarCharHolder in;
+    @Param
+    VarCharHolder delimiter;
+    @Param
+    IntHolder start;
+    @Param
+    IntHolder end;
+
+    @Workspace
+    com.google.common.base.Splitter splitter;
+
+    @Workspace
+    com.google.common.base.Joiner joiner;
+
+    @Inject
+    DrillBuf buffer;
+
+    @Output
+    VarCharHolder out;
+
+    @Override
+    public void setup() {
+      if (start.value < 1) {
+        throw org.apache.drill.common.exceptions.UserException.functionError()
+          .message("Start index in split_part must be positive, value provided was " + start.value).build();
+      }
+      if (end.value < start.value) {
+        throw org.apache.drill.common.exceptions.UserException.functionError()
+          .message("End index in split_part must be greater or equal to start index, value provided was start:" + start.value + ",end:" + end.value).build();
+      }
+      String split = org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.
+        toStringFromUTF8(delimiter.start, delimiter.end, delimiter.buffer);
+      splitter = com.google.common.base.Splitter.on(split);
+      joiner = com.google.common.base.Joiner.on(split);
+    }
+
+    @Override
+    public void eval() {
+      String inputString =
+        org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.toStringFromUTF8(in.start, in.end, in.buffer);

Review comment:
       Nit:  Could you please break this line up a bit so that it fits in the github window?
   Also, there is a function called `getStringFromVarcharHolder` or something similar in that class. You might consider using that.

##########
File path: exec/java-exec/src/test/java/org/apache/drill/exec/expr/fn/impl/TestStringFunctions.java
##########
@@ -113,6 +113,75 @@ public void testSplitPart() throws Exception {
         .go();
   }
 
+  @Test
+  public void testSplitPartStartEnd() throws Exception {
+    testBuilder()
+      .sqlQuery("select split_part(a, '~@~', 1, 2) res1 from (values('abc~@~def~@~ghi'), ('qwe~@~rty~@~uio')) as t(a)")
+      .ordered()
+      .baselineColumns("res1")
+      .baselineValues("abc~@~def")
+      .baselineValues("qwe~@~rty")
+      .go();
+
+    testBuilder()
+      .sqlQuery("select split_part(a, '~@~', 2, 3) res1 from (values('abc~@~def~@~ghi'), ('qwe~@~rty~@~uio')) as t(a)")
+      .ordered()
+      .baselineColumns("res1")
+      .baselineValues("def~@~ghi")
+      .baselineValues("rty~@~uio")
+      .go();
+
+    // invalid index
+    boolean expectedErrorEncountered;

Review comment:
       Could you break some of these out into separate unit tests?  Particularly, the failing tests?
   Here's an example if you need some reference code:
   
   https://github.com/apache/drill/blob/9a9005aa5eb41e07c11a935521f8635f7b79b6c5/contrib/format-excel/src/test/java/org/apache/drill/exec/store/excel/TestExcelFormat.java#L204-L217

##########
File path: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/StringFunctions.java
##########
@@ -434,6 +434,64 @@ public void eval() {
 
   }
 
+  @FunctionTemplate(name = "split_part", scope = FunctionScope.SIMPLE, nulls = NullHandling.NULL_IF_NULL,
+    outputWidthCalculatorType = OutputWidthCalculatorType.CUSTOM_FIXED_WIDTH_DEFAULT)
+  public static class SplitPartStartEnd implements DrillSimpleFunc {
+    @Param
+    VarCharHolder in;
+    @Param
+    VarCharHolder delimiter;
+    @Param
+    IntHolder start;
+    @Param
+    IntHolder end;
+
+    @Workspace
+    com.google.common.base.Splitter splitter;
+
+    @Workspace
+    com.google.common.base.Joiner joiner;
+
+    @Inject
+    DrillBuf buffer;
+
+    @Output
+    VarCharHolder out;
+
+    @Override
+    public void setup() {
+      if (start.value < 1) {

Review comment:
       What happens if the `start` and `end` values are dynamic?  I'd think you'd want this logic to be in the `eval()` function.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org