You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/03/09 18:42:14 UTC

[GitHub] [druid] jihoonson opened a new pull request #9487: Convert array_contains() and array_overlaps() into native filters if possible

jihoonson opened a new pull request #9487: Convert array_contains() and array_overlaps() into native filters if possible
URL: https://github.com/apache/druid/pull/9487
 
 
   ### Description
   
   The native filter is way faster than the expression filter. Here is a `ExpressionFilterBenchmark` result.
   
   ```
   Benchmark                                   (rowsPerSegment)  Mode  Cnt    Score   Error  Units
   ExpressionFilterBenchmark.expressionFilter           1000000  avgt   30  255.291 ± 1.034  ms/op
   ExpressionFilterBenchmark.nativeFilter               1000000  avgt   30    1.868 ± 0.005  ms/op
   ```
   
   This PR adds an optimization that transforms `array_contains()` and `array_overlaps()` into native filters if possible. For now, the optimization will be applied only when their parameters are a simple extraction and a literal. This is because the facility that traverses an `Expr` tree and converts it to a tree of native filters is missing. I think we could possibly add an optimization layer on the native query or between the sql layer and the native query layer, but it's not in the scope of this PR.
   
   I also added the behavior of the `IN` filter on multi-valued dimensions in the doc.
   
   <hr>
   
   This PR has:
   - [x] been self-reviewed.
   - [x] added documentation for new or modified features or behaviors.
   - [x] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [x] added unit tests or modified existing tests to cover new code paths.
   
   <hr>
   
   ##### Key changed/added classes in this PR
    * `ArrayOverlapOperatorConversion`
    * `ArrayContainsOperatorConversion`

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis commented on a change in pull request #9487: Convert array_contains() and array_overlaps() into native filters if possible

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #9487: Convert array_contains() and array_overlaps() into native filters if possible
URL: https://github.com/apache/druid/pull/9487#discussion_r389995694
 
 

 ##########
 File path: sql/src/main/java/org/apache/druid/sql/calcite/expression/builtin/ArrayOverlapOperatorConversion.java
 ##########
 @@ -51,4 +67,66 @@ public ArrayOverlapOperatorConversion()
   {
     super(SQL_FUNCTION, EXPR_FUNCTION);
   }
+
+  @Nullable
+  @Override
+  public DimFilter toDruidFilter(
+      final PlannerContext plannerContext,
+      RowSignature rowSignature,
+      @Nullable VirtualColumnRegistry virtualColumnRegistry,
+      final RexNode rexNode
+  )
+  {
+    final List<RexNode> operands = ((RexCall) rexNode).getOperands();
+    final List<DruidExpression> druidExpressions = Expressions.toDruidExpressions(
+        plannerContext,
+        rowSignature,
+        operands
+    );
+    if (druidExpressions == null) {
+      return null;
+    }
+
+    // Converts array_overlaps() function into an OR of Selector filters if possible.
+    final boolean leftSimpleExtractionExpr = druidExpressions.get(0).isSimpleExtraction();
+    final boolean rightSimpleExtractionExpr = druidExpressions.get(1).isSimpleExtraction();
+    final DruidExpression simpleExtractionExpr;
+    final DruidExpression complexExpr;
+
+    if (leftSimpleExtractionExpr ^ rightSimpleExtractionExpr) {
+      if (leftSimpleExtractionExpr) {
+        simpleExtractionExpr = druidExpressions.get(0);
+        complexExpr = druidExpressions.get(1);
+      } else {
+        simpleExtractionExpr = druidExpressions.get(1);
+        complexExpr = druidExpressions.get(0);
+      }
+    } else {
+      return toExpressionFilter(plannerContext, getDruidFunctionName(), druidExpressions);
+    }
+
+    Expr expr = Parser.parse(complexExpr.getExpression(), plannerContext.getExprMacroTable());
+    if (expr.isLiteral()) {
 
 Review comment:
   ah, else is different, carry on

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis commented on a change in pull request #9487: Convert array_contains() and array_overlaps() into native filters if possible

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #9487: Convert array_contains() and array_overlaps() into native filters if possible
URL: https://github.com/apache/druid/pull/9487#discussion_r389969227
 
 

 ##########
 File path: sql/src/main/java/org/apache/druid/sql/calcite/expression/builtin/ArrayOverlapOperatorConversion.java
 ##########
 @@ -51,4 +67,66 @@ public ArrayOverlapOperatorConversion()
   {
     super(SQL_FUNCTION, EXPR_FUNCTION);
   }
+
+  @Nullable
+  @Override
+  public DimFilter toDruidFilter(
+      final PlannerContext plannerContext,
+      RowSignature rowSignature,
+      @Nullable VirtualColumnRegistry virtualColumnRegistry,
+      final RexNode rexNode
+  )
+  {
+    final List<RexNode> operands = ((RexCall) rexNode).getOperands();
+    final List<DruidExpression> druidExpressions = Expressions.toDruidExpressions(
+        plannerContext,
+        rowSignature,
+        operands
+    );
+    if (druidExpressions == null) {
+      return null;
+    }
+
+    // Converts array_overlaps() function into an OR of Selector filters if possible.
+    final boolean leftSimpleExtractionExpr = druidExpressions.get(0).isSimpleExtraction();
+    final boolean rightSimpleExtractionExpr = druidExpressions.get(1).isSimpleExtraction();
+    final DruidExpression simpleExtractionExpr;
+    final DruidExpression complexExpr;
+
+    if (leftSimpleExtractionExpr ^ rightSimpleExtractionExpr) {
+      if (leftSimpleExtractionExpr) {
+        simpleExtractionExpr = druidExpressions.get(0);
+        complexExpr = druidExpressions.get(1);
+      } else {
+        simpleExtractionExpr = druidExpressions.get(1);
+        complexExpr = druidExpressions.get(0);
+      }
+    } else {
+      return toExpressionFilter(plannerContext, getDruidFunctionName(), druidExpressions);
+    }
+
+    Expr expr = Parser.parse(complexExpr.getExpression(), plannerContext.getExprMacroTable());
+    if (expr.isLiteral()) {
 
 Review comment:
   nit: is it worth splitting this out into a static method on one of these classes (or a utility method somewhere) since this block looks basically shared with `array_contains`?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson merged pull request #9487: Convert array_contains() and array_overlaps() into native filters if possible

Posted by GitBox <gi...@apache.org>.
jihoonson merged pull request #9487: Convert array_contains() and array_overlaps() into native filters if possible
URL: https://github.com/apache/druid/pull/9487
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org