You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/12/04 04:27:23 UTC

[GitHub] [druid] clintropolis opened a new pull request #12025: fix issues with multi-value string constant expressions

clintropolis opened a new pull request #12025:
URL: https://github.com/apache/druid/pull/12025


   ### Description
   This PR fixes two issues with multi-value string handling in SQL, both involving constant expressions on string columns which produce array types.
   
   The first issue is at the SQL layer, where an array producing expression such as `STRING_TO_MV` in an SQL that could be reduced to a constant would cause a casting exception because the `ARRAY<STRING>` typed output of the constant can not be cast to a string for the `VARCHAR` literal it is trying to create.
   
   Fixing this issue uncovered a second problem, this one down in the native engine, where dim selectors on constant expressions would end up using `ConstantDimensionSelector` which converts the expression result into a single string value, which is incorrect for an array typed expression. To handle this, a new `ConstantMultiValueDimensionSelector` has been added that accepts a `List<String>` as input and makes a constant selector on that, allowing array expression constants to work with dimension selectors.
   
   <hr>
   
   This PR has:
   - [x] been self-reviewed.
   - [x] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [x] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm commented on a change in pull request #12025: fix issues with multi-value string constant expressions

Posted by GitBox <gi...@apache.org>.
gianm commented on a change in pull request #12025:
URL: https://github.com/apache/druid/pull/12025#discussion_r763307166



##########
File path: processing/src/main/java/org/apache/druid/segment/DimensionSelector.java
##########
@@ -164,6 +166,27 @@ static DimensionSelector constant(@Nullable final String value, @Nullable final
     }
   }
 
+  static DimensionSelector multiConstant(@Nullable final List<String> values)
+  {
+    if (values == null || values.isEmpty()) {
+      return NullDimensionSelectorHolder.NULL_DIMENSION_SELECTOR;
+    } else {
+      return new ConstantMultiValueDimensionSelector(values);
+    }
+  }
+
+  static DimensionSelector multiConstant(@Nullable final List<String> values, @Nullable final ExtractionFn extractionFn)
+  {
+    if (extractionFn == null) {
+      return multiConstant(values);
+    } else {
+      if (values == null) {
+        return constant(extractionFn.apply(null));

Review comment:
       What's the rationale behind doing this instead of `multiConstant(Collections.singletonList(extractionFn.apply(null))`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis commented on a change in pull request #12025: fix issues with multi-value string constant expressions

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #12025:
URL: https://github.com/apache/druid/pull/12025#discussion_r764237349



##########
File path: processing/src/main/java/org/apache/druid/segment/DimensionSelector.java
##########
@@ -164,6 +166,27 @@ static DimensionSelector constant(@Nullable final String value, @Nullable final
     }
   }
 
+  static DimensionSelector multiConstant(@Nullable final List<String> values)
+  {
+    if (values == null || values.isEmpty()) {
+      return NullDimensionSelectorHolder.NULL_DIMENSION_SELECTOR;
+    } else {
+      return new ConstantMultiValueDimensionSelector(values);
+    }
+  }
+
+  static DimensionSelector multiConstant(@Nullable final List<String> values, @Nullable final ExtractionFn extractionFn)
+  {
+    if (extractionFn == null) {
+      return multiConstant(values);
+    } else {
+      if (values == null) {
+        return constant(extractionFn.apply(null));

Review comment:
       added comments to explain everything and made a few behavior modifications to make sure `multiConstant` has consistent behavior with other multi-valued string selectors




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis commented on a change in pull request #12025: fix issues with multi-value string constant expressions

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #12025:
URL: https://github.com/apache/druid/pull/12025#discussion_r763383844



##########
File path: processing/src/main/java/org/apache/druid/segment/DimensionSelector.java
##########
@@ -164,6 +166,27 @@ static DimensionSelector constant(@Nullable final String value, @Nullable final
     }
   }
 
+  static DimensionSelector multiConstant(@Nullable final List<String> values)
+  {
+    if (values == null || values.isEmpty()) {
+      return NullDimensionSelectorHolder.NULL_DIMENSION_SELECTOR;
+    } else {
+      return new ConstantMultiValueDimensionSelector(values);
+    }
+  }
+
+  static DimensionSelector multiConstant(@Nullable final List<String> values, @Nullable final ExtractionFn extractionFn)
+  {
+    if (extractionFn == null) {
+      return multiConstant(values);
+    } else {
+      if (values == null) {
+        return constant(extractionFn.apply(null));

Review comment:
       to add to this, `constant` is a bit better than `multiConstant` because it reports its cardinality, which I guess I could do for `multiConstant` if i build and sort dictionary and do it that way, but didn't because i was lazy. So, unless I rework `multiConstant` to be a bit cooler then i think is better to use `constant`. Should I make `multiConstant` better? I guess it still doesn't answer if `null` is equivalent to `[null]` in all cases or not.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis commented on a change in pull request #12025: fix issues with multi-value string constant expressions

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #12025:
URL: https://github.com/apache/druid/pull/12025#discussion_r763320122



##########
File path: processing/src/main/java/org/apache/druid/segment/DimensionSelector.java
##########
@@ -164,6 +166,27 @@ static DimensionSelector constant(@Nullable final String value, @Nullable final
     }
   }
 
+  static DimensionSelector multiConstant(@Nullable final List<String> values)
+  {
+    if (values == null || values.isEmpty()) {
+      return NullDimensionSelectorHolder.NULL_DIMENSION_SELECTOR;
+    } else {
+      return new ConstantMultiValueDimensionSelector(values);
+    }
+  }
+
+  static DimensionSelector multiConstant(@Nullable final List<String> values, @Nullable final ExtractionFn extractionFn)
+  {
+    if (extractionFn == null) {
+      return multiConstant(values);
+    } else {
+      if (values == null) {
+        return constant(extractionFn.apply(null));

Review comment:
       i originally had it like that, but am unsure if `null` and `[null]` are _actually_ the same thing. From the perspective of grouping, they are identical I guess so it probably doesn't matter. To me the most correct thing would be to _not_ apply the extraction at all and just make it a constant null and not a list with a null, but those are sort of the same thing for string columns currently...




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis merged pull request #12025: fix issues with multi-value string constant expressions

Posted by GitBox <gi...@apache.org>.
clintropolis merged pull request #12025:
URL: https://github.com/apache/druid/pull/12025


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on a change in pull request #12025: fix issues with multi-value string constant expressions

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #12025:
URL: https://github.com/apache/druid/pull/12025#discussion_r764436410



##########
File path: processing/src/main/java/org/apache/druid/segment/virtual/ExpressionSelectors.java
##########
@@ -201,6 +201,9 @@ public static DimensionSelector makeDimensionSelector(
 
     if (baseSelector instanceof ConstantExprEvalSelector) {
       // Optimization for dimension selectors on constants.
+      if (plan.is(ExpressionPlan.Trait.NON_SCALAR_OUTPUT)) {
+        return DimensionSelector.multiConstant(Arrays.asList(baseSelector.getObject().asStringArray()), extractionFn);

Review comment:
       Should it pass null instead of a singleton list of null when `baseSelector.getObject().asStringArray()` is null?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org