Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/08/24 06:07:33 UTC

[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10312: Optimize large InDimFilters

abhishekagarwal87 commented on a change in pull request #10312:
URL: https://github.com/apache/druid/pull/10312#discussion_r475361509



##########
File path: processing/src/main/java/org/apache/druid/query/filter/InDimFilter.java
##########
@@ -143,10 +142,11 @@ private InDimFilter(
 
     // The values set can be huge. Try to avoid copying the set if possible.
     // Note that we may still need to copy values to a list for caching. See getCacheKey().
-    if ((NullHandling.sqlCompatible() || values.stream().noneMatch(NullHandling::needsEmptyToNull))) {
+    if (NullHandling.sqlCompatible() || !values.remove("")) {
       this.values = values;
     } else {
-      this.values = values.stream().map(NullHandling::emptyToNullIfNeeded).collect(Collectors.toSet());
+      values.add(null);

Review comment:
       Most likely, `values` is a `HashSet`, which is backed internally by a `HashMap`, so `remove` and `contains` run in expected constant time and will be much faster than a linear scan over the set.
   
   ```suggestion
       if (NullHandling.sqlCompatible()) {
         this.values = values;
       } else {
         // remove("") returns true only if the empty string was present
         if (values.remove("")) {
           values.add(null);
         }
         this.values = values;
       }
   ```
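
To illustrate the point about `remove`, here is a minimal, self-contained sketch of the in-place swap of `""` for `null` on a `HashSet`. The class `EmptyToNullSketch`, the method `normalize`, and the `sqlCompatible` boolean parameter are hypothetical names used only for this example; the boolean stands in for `NullHandling.sqlCompatible()` and is not Druid's API.

```java
import java.util.HashSet;
import java.util.Set;

public class EmptyToNullSketch {
    // Hypothetical helper: sqlCompatible stands in for NullHandling.sqlCompatible().
    static Set<String> normalize(Set<String> values, boolean sqlCompatible) {
        // Set.remove returns true only if the element was present, so a single
        // hashed lookup both detects and removes the empty string.
        if (!sqlCompatible && values.remove("")) {
            values.add(null); // HashSet permits one null element
        }
        return values; // same instance is returned; nothing is copied
    }

    public static void main(String[] args) {
        Set<String> values = new HashSet<>();
        values.add("a");
        values.add("");

        Set<String> result = normalize(values, false);
        System.out.println(result.contains(null)); // prints "true"
        System.out.println(result.contains(""));   // prints "false"
        System.out.println(result == values);      // prints "true"
    }
}
```

Note that this approach mutates the set passed in, so it assumes the caller hands over a mutable set; an unmodifiable set would throw `UnsupportedOperationException` on `remove`.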



