Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2022/10/11 06:46:03 UTC

[GitHub] [druid] clintropolis opened a new pull request, #13209: use object[] instead of string[] for vector expressions to be consistent with vector object selectors

clintropolis opened a new pull request, #13209:
URL: https://github.com/apache/druid/pull/13209

   ### Description
   Changes vectorized string expressions (such as concat) to deal in `Object[]` instead of `String[]`, to be consistent with the behavior of `VectorObjectSelector`, as well as with a similar change I recently made to array expressions in #12914.
   
   This fixes a class of bugs, such as the one in the added test example, which would fail with an error of the form:
   ```
       Caused by: java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [Ljava.lang.String;
   	at org.apache.druid.math.expr.IdentifierExpr$4.evalVector(IdentifierExpr.java:198) ~[classes/:?]
   	at org.apache.druid.math.expr.vector.StringOutMultiStringInVectorProcessor.evalVector(StringOutMultiStringInVectorProcessor.java:58) ~[classes/:?]
   	at org.apache.druid.segment.virtual.ExpressionVectorObjectSelector.getObjectVector(ExpressionVectorObjectSelector.java:49) ~[classes/:?]
   ```
   
   before the changes in this patch.
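   
   For readers unfamiliar with this failure mode, here is a minimal standalone sketch (plain Java, not Druid code; the class and method names are hypothetical). It shows that an array created as `Object[]` cannot be cast to `String[]` even when every element is a `String`, while casting the individual elements works:
   
   ```java
   final class ObjectArrayCastSketch
   {
     // Returns true if casting the whole array to String[] succeeds.
     static boolean castsToStringArray(Object[] vector)
     {
       try {
         String[] ignored = (String[]) vector;
         return true;
       }
       catch (ClassCastException e) {
         return false;
       }
     }
   
     // Casting element by element works as long as each element is a String (or null).
     static String concatElement(Object[] left, Object[] right, int i)
     {
       return (String) left[i] + (String) right[i];
     }
   
     public static void main(String[] args)
     {
       final Object[] createdAsObjects = new Object[]{"a", "b"};
       final Object[] createdAsStrings = new String[]{"c", "d"};
       System.out.println(castsToStringArray(createdAsObjects)); // false
       System.out.println(castsToStringArray(createdAsStrings)); // true
       System.out.println(concatElement(createdAsObjects, createdAsStrings, 0)); // ac
     }
   }
   ```
   
   The cast is checked against the runtime type of the array, not the types of its elements, which is why the first call reports a failure.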
   
   <hr>
   This PR has:
   
   - [x] been self-reviewed.
   - [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [x] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis commented on a diff in pull request #13209: use object[] instead of string[] for vector expressions to be consistent with vector object selectors

Posted by GitBox <gi...@apache.org>.
clintropolis commented on code in PR #13209:
URL: https://github.com/apache/druid/pull/13209#discussion_r992901960


##########
core/src/main/java/org/apache/druid/math/expr/vector/VectorStringProcessors.java:
##########
@@ -32,32 +33,35 @@ public static <T> ExprVectorProcessor<T> concat(Expr.VectorInputBindingInspector
   {
     final ExprVectorProcessor processor;
     if (NullHandling.sqlCompatible()) {
-      processor = new StringOutStringsInFunctionVectorProcessor(
+      processor = new ObjectOutObjectsInFunctionVectorProcessor(
           left.buildVectorized(inspector),
           right.buildVectorized(inspector),
-          inspector.getMaxVectorSize()
+          inspector.getMaxVectorSize(),
+          ExpressionType.STRING
       )
       {
         @Nullable
         @Override
-        protected String processValue(@Nullable String leftVal, @Nullable String rightVal)
+        protected String processValue(@Nullable Object leftVal, @Nullable Object rightVal)
         {
           // in sql compatible mode, nulls are handled by super class and never make it here...
-          return leftVal + rightVal;
+          return leftVal + (String) rightVal;

Review Comment:
   Same comment as https://github.com/apache/druid/pull/13209#discussion_r992901606 about why this is safe.





[GitHub] [druid] cheddar commented on a diff in pull request #13209: use object[] instead of string[] for vector expressions to be consistent with vector object selectors

Posted by GitBox <gi...@apache.org>.
cheddar commented on code in PR #13209:
URL: https://github.com/apache/druid/pull/13209#discussion_r992870933


##########
processing/src/main/java/org/apache/druid/segment/virtual/SingleStringInputDeferredEvaluationExpressionDimensionVectorSelector.java:
##########
@@ -118,7 +119,7 @@ public int getCurrentVectorSize()
    */
   private static final class StringLookupVectorInputBindings implements Expr.VectorInputBinding
   {
-    private final String[] currentValue = new String[1];
+    private final Object[] currentValue = new Object[1];

Review Comment:
   Nit-ish: the naming of this private static class doesn't really line up with its implementation anymore?



##########
core/src/main/java/org/apache/druid/math/expr/vector/VectorProcessors.java:
##########
@@ -851,17 +861,19 @@ public void processIndex(
                 outputNulls[i] = true;
                 return;
               }
-              final boolean bool = Evals.asBoolean(rightInput[i]);
+              final boolean bool = Evals.asBoolean((String) rightInput[i]);
               output[i] = Evals.asLong(bool);
               outputNulls[i] = bool;
               return;
             } else if (rightNull) {
-              final boolean bool = Evals.asBoolean(leftInput[i]);
+              final boolean bool = Evals.asBoolean((String) leftInput[i]);

Review Comment:
   Here too, why is it safe to cast? Do we need to check the type first? Or maybe check the type of the array that we are given, so we can amortize the cost of the check?
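   
   One possible reading of the "check the array once" idea, as a rough sketch only (plain Java; `checkAllStrings` and `countNonEmpty` are hypothetical names, not Druid APIs). The whole vector is validated before the per-row loop, with a fast path when the array's runtime type is already `String[]`:
   
   ```java
   final class AmortizedTypeCheckSketch
   {
     // Hypothetical helper: validates a vector up front so the per-row loop can cast without re-checking.
     static void checkAllStrings(Object[] vector, int size)
     {
       // Fast path: an array whose runtime type is String[] can only hold Strings (or nulls).
       if (vector instanceof String[]) {
         return;
       }
       for (int i = 0; i < size; i++) {
         final Object value = vector[i];
         if (value != null && !(value instanceof String)) {
           throw new IllegalStateException(
               "expected STRING at row [" + i + "] but got [" + value.getClass().getName() + "]"
           );
         }
       }
     }
   
     // Example consumer: the casts in the loop are safe because of the check above.
     static int countNonEmpty(Object[] vector, int size)
     {
       checkAllStrings(vector, size);
       int count = 0;
       for (int i = 0; i < size; i++) {
         final String s = (String) vector[i];
         if (s != null && !s.isEmpty()) {
           count++;
         }
       }
       return count;
     }
   }
   ```
   
   Note that for a plain `Object[]` this still scans every element once, so the cost is only truly amortized on the `String[]` fast path.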



##########
core/src/main/java/org/apache/druid/math/expr/vector/VectorStringProcessors.java:
##########
@@ -32,32 +33,35 @@ public static <T> ExprVectorProcessor<T> concat(Expr.VectorInputBindingInspector
   {
     final ExprVectorProcessor processor;
     if (NullHandling.sqlCompatible()) {
-      processor = new StringOutStringsInFunctionVectorProcessor(
+      processor = new ObjectOutObjectsInFunctionVectorProcessor(
           left.buildVectorized(inspector),
           right.buildVectorized(inspector),
-          inspector.getMaxVectorSize()
+          inspector.getMaxVectorSize(),
+          ExpressionType.STRING
       )
       {
         @Nullable
         @Override
-        protected String processValue(@Nullable String leftVal, @Nullable String rightVal)
+        protected String processValue(@Nullable Object leftVal, @Nullable Object rightVal)
         {
           // in sql compatible mode, nulls are handled by super class and never make it here...
-          return leftVal + rightVal;
+          return leftVal + (String) rightVal;

Review Comment:
   You could call `.toString()` and avoid ClassCastExceptions here?  Or, if it's important to do the cast, we should really do an instanceof check first so that we can generate a better error message than just `ClassCastException`
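   
   To make the two options concrete, a small sketch (plain Java; the class, helper names, and error message are hypothetical and not taken from the patch). The first variant does an `instanceof` check so the failure names the offending type, the second stringifies whatever it is given:
   
   ```java
   final class CheckedStringCastSketch
   {
     // Hypothetical helper: casts to String, or fails with a message that names the actual type.
     static String asString(Object value, String argName)
     {
       if (value == null || value instanceof String) {
         return (String) value;
       }
       throw new IllegalArgumentException(
           "concat expected a STRING for [" + argName + "] but got [" + value.getClass().getName() + "]"
       );
     }
   
     // Strict variant: a non-String input produces a descriptive error.
     static String concat(Object leftVal, Object rightVal)
     {
       return asString(leftVal, "left") + asString(rightVal, "right");
     }
   
     // Lenient variant: never throws, but silently stringifies non-String inputs.
     static String concatLenient(Object leftVal, Object rightVal)
     {
       return String.valueOf(leftVal) + String.valueOf(rightVal);
     }
   }
   ```
   
   The lenient variant trades the exception for possibly surprising output when an upstream processor produces the wrong type, so which one is preferable depends on whether such inputs can actually occur.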





[GitHub] [druid] clintropolis commented on a diff in pull request #13209: use object[] instead of string[] for vector expressions to be consistent with vector object selectors

Posted by GitBox <gi...@apache.org>.
clintropolis commented on code in PR #13209:
URL: https://github.com/apache/druid/pull/13209#discussion_r992909147


##########
processing/src/main/java/org/apache/druid/segment/virtual/SingleStringInputDeferredEvaluationExpressionDimensionVectorSelector.java:
##########
@@ -118,7 +119,7 @@ public int getCurrentVectorSize()
    */
   private static final class StringLookupVectorInputBindings implements Expr.VectorInputBinding
   {
-    private final String[] currentValue = new String[1];
+    private final Object[] currentValue = new Object[1];

Review Comment:
   I guess it is still used exclusively for 'deferred' single string expressions, so maybe it is ok?





[GitHub] [druid] clintropolis commented on a diff in pull request #13209: use object[] instead of string[] for vector expressions to be consistent with vector object selectors

Posted by GitBox <gi...@apache.org>.
clintropolis commented on code in PR #13209:
URL: https://github.com/apache/druid/pull/13209#discussion_r992901606


##########
core/src/main/java/org/apache/druid/math/expr/vector/VectorProcessors.java:
##########
@@ -851,17 +861,19 @@ public void processIndex(
                 outputNulls[i] = true;
                 return;
               }
-              final boolean bool = Evals.asBoolean(rightInput[i]);
+              final boolean bool = Evals.asBoolean((String) rightInput[i]);
               output[i] = Evals.asLong(bool);
               outputNulls[i] = bool;
               return;
             } else if (rightNull) {
-              final boolean bool = Evals.asBoolean(leftInput[i]);
+              final boolean bool = Evals.asBoolean((String) leftInput[i]);

Review Comment:
   This is safe because (for better or for worse) the base classes explicitly wrap their input processors with `CastToTypeVectorProcessor.cast` (which is a no-op if the processor is already the correct type) to ensure that each one is used with the correct type. I don't remember exactly why I did this, presumably because it's important for vector processing that everything is the right type so we don't get ugly class cast exceptions, but I can look into whether this is actually necessary or not.
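   
   As a rough illustration of that wrapping pattern, here is a heavily simplified standalone sketch. The `Processor` interface, `Type` enum, and `castToString` helper are hypothetical stand-ins and do not mirror the real signatures of `ExprVectorProcessor` or `CastToTypeVectorProcessor.cast`; the sketch also assumes both inputs produce vectors of equal length:
   
   ```java
   final class CastWrappingSketch
   {
     enum Type { STRING, LONG }
   
     // Hypothetical, much-simplified stand-in for a vector processor.
     interface Processor
     {
       Type outputType();
   
       Object[] evalVector();
     }
   
     // Fixture: a processor that always returns the given values.
     static Processor constant(final Type type, final Object... values)
     {
       return new Processor()
       {
         @Override
         public Type outputType()
         {
           return type;
         }
   
         @Override
         public Object[] evalVector()
         {
           return values;
         }
       };
     }
   
     // Returns the delegate itself when it already outputs strings (the no-op case),
     // otherwise wraps it in a processor that converts every non-null value to a String.
     static Processor castToString(final Processor delegate)
     {
       if (delegate.outputType() == Type.STRING) {
         return delegate;
       }
       return new Processor()
       {
         @Override
         public Type outputType()
         {
           return Type.STRING;
         }
   
         @Override
         public Object[] evalVector()
         {
           final Object[] in = delegate.evalVector();
           final Object[] out = new Object[in.length];
           for (int i = 0; i < in.length; i++) {
             out[i] = in[i] == null ? null : String.valueOf(in[i]);
           }
           return out;
         }
       };
     }
   
     // Inputs are wrapped up front, so the element casts in the loop cannot fail.
     static Object[] concat(Processor left, Processor right)
     {
       final Object[] l = castToString(left).evalVector();
       final Object[] r = castToString(right).evalVector();
       final Object[] out = new Object[l.length];
       for (int i = 0; i < l.length; i++) {
         out[i] = (l[i] == null || r[i] == null) ? null : (String) l[i] + (String) r[i];
       }
       return out;
     }
   }
   ```
   
   With this shape, the type guarantee is established once when the processor is built rather than on every row, which is the property the element casts rely on.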





[GitHub] [druid] clintropolis merged pull request #13209: use object[] instead of string[] for vector expressions to be consistent with vector object selectors

Posted by GitBox <gi...@apache.org>.
clintropolis merged PR #13209:
URL: https://github.com/apache/druid/pull/13209

