Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/05/21 04:34:29 UTC

[GitHub] [druid] clintropolis opened a new pull request #11280: bitwise aggregators, better null handling options for expression agg

clintropolis opened a new pull request #11280:
URL: https://github.com/apache/druid/pull/11280


   ### Description
   Builds on top of #11104 and #10605 to add bitwise aggregator functions:
   
   |Function|Notes|Default|
   |--------|-----|-------|
   |`BIT_AND(expr)`|Performs a bitwise AND operation on all input values.|`null` if `druid.generic.useDefaultValueForNull=false`, otherwise `0`|
   |`BIT_OR(expr)`|Performs a bitwise OR operation on all input values.|`null` if `druid.generic.useDefaultValueForNull=false`, otherwise `0`|
   |`BIT_XOR(expr)`|Performs a bitwise XOR operation on all input values.|`null` if `druid.generic.useDefaultValueForNull=false`, otherwise `0`|
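
To make the table concrete, here is a minimal Java sketch of the fold each function performs and the default it returns when nothing was aggregated. It is not the Druid implementation: the class `BitwiseFoldSketch` and its methods are hypothetical, the `useDefaultValueForNull` parameter only mirrors the config property named in the table, and skipping `null` inputs is an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.LongBinaryOperator;

// Hypothetical illustration only; not part of Druid.
final class BitwiseFoldSketch
{
  // Folds all non-null inputs with `op`. If no value was aggregated, returns
  // 0 when useDefaultValueForNull is true and null otherwise, as in the table.
  static Long fold(List<Long> values, LongBinaryOperator op, boolean useDefaultValueForNull)
  {
    long acc = 0L;
    boolean aggregated = false;
    for (Long v : values) {
      if (v == null) {
        continue; // assumption: null inputs are skipped
      }
      // The first value seeds the accumulator; later values are combined into it.
      acc = aggregated ? op.applyAsLong(acc, v) : v;
      aggregated = true;
    }
    if (aggregated) {
      return acc;
    }
    return useDefaultValueForNull ? Long.valueOf(0L) : null;
  }

  static Long bitAnd(List<Long> values, boolean useDefaultValueForNull)
  {
    return fold(values, (a, b) -> a & b, useDefaultValueForNull);
  }

  static Long bitOr(List<Long> values, boolean useDefaultValueForNull)
  {
    return fold(values, (a, b) -> a | b, useDefaultValueForNull);
  }

  static Long bitXor(List<Long> values, boolean useDefaultValueForNull)
  {
    return fold(values, (a, b) -> a ^ b, useDefaultValueForNull);
  }

  public static void main(String[] args)
  {
    List<Long> values = Arrays.asList(0b1100L, 0b1010L);
    System.out.println(bitAnd(values, false)); // 0b1000
    System.out.println(bitOr(values, false));  // 0b1110
    System.out.println(bitXor(values, false)); // 0b0110
    System.out.println(bitAnd(Collections.<Long>emptyList(), false)); // no rows aggregated
  }
}
```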
   
   In the process of adding this, I've also modified `ExpressionLambdaAggregatorFactory` to have an additional JSON property, `initiallyNull`, which determines whether the aggregator produces a `null` value or `initialValue`/`initialCombineValue` when no rows have been aggregated. For example, a SQL-compatible count aggregator would have `initiallyNull` set to `false` and `initialValue` set to `0`, so that it always returns 0 even if no rows were aggregated, while a sum would have it set to `true` so that it returns `null` in the same case. For the buffer aggregator, this is tracked by setting a bit in the expression type byte that prefixes each serialized expression; the bit is cleared whenever the aggregate function is called. This change also simplifies `ARRAY_AGG`: it previously used a finalize expression to coerce empty results back to null, but now it can simply be initialized to null.
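
The flag-bit idea described above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the class `NullUnlessAggregatedSketch`, the `LONG_TYPE_ID` value, and the fixed "type byte followed by an 8-byte long" layout are all assumptions, and the fold is hard-coded to addition to keep the example short.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; layout is | type byte | 8-byte long value |.
final class NullUnlessAggregatedSketch
{
  static final int NOT_AGGREGATED_BIT = 1 << 7; // high bit of the type byte
  static final int LONG_TYPE_ID = 2;            // assumed type id, for illustration

  private final boolean initiallyNull;

  NullUnlessAggregatedSketch(boolean initiallyNull)
  {
    this.initiallyNull = initiallyNull;
  }

  // Writes the initial value and, if configured, marks the slot as "not yet aggregated".
  void init(ByteBuffer buf, int position, long initialValue)
  {
    int typeByte = LONG_TYPE_ID;
    if (initiallyNull) {
      typeByte |= NOT_AGGREGATED_BIT;
    }
    buf.put(position, (byte) typeByte);
    buf.putLong(position + 1, initialValue);
  }

  // Adds `delta` to the stored value and clears the flag bit.
  void aggregate(ByteBuffer buf, int position, long delta)
  {
    long acc = buf.getLong(position + 1);
    buf.put(position, (byte) (buf.get(position) & ~NOT_AGGREGATED_BIT));
    buf.putLong(position + 1, acc + delta);
  }

  // Returns null while the flag bit is still set, otherwise the stored value.
  Long get(ByteBuffer buf, int position)
  {
    if ((buf.get(position) & NOT_AGGREGATED_BIT) != 0) {
      return null;
    }
    return buf.getLong(position + 1);
  }

  public static void main(String[] args)
  {
    ByteBuffer buf = ByteBuffer.allocate(9);

    NullUnlessAggregatedSketch sum = new NullUnlessAggregatedSketch(true);
    sum.init(buf, 0, 0L);
    System.out.println(sum.get(buf, 0)); // sum-like: no rows aggregated yet
    sum.aggregate(buf, 0, 5L);
    System.out.println(sum.get(buf, 0)); // flag cleared, value visible

    NullUnlessAggregatedSketch count = new NullUnlessAggregatedSketch(false);
    count.init(buf, 0, 0L);
    System.out.println(count.get(buf, 0)); // count-like: initial value even with no rows
  }
}
```

With `initiallyNull` set to `true` the slot reads as `null` until the first `aggregate` call; with `false` it reads as the initial value immediately, which matches the count and sum examples in the paragraph above.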
   
   This PR has:
   - [x] been self-reviewed.
   - [x] added documentation for new or modified features or behaviors.
   - [x] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [x] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] rohangarg commented on a change in pull request #11280: bitwise aggregators, better null handling options for expression agg

Posted by GitBox <gi...@apache.org>.
rohangarg commented on a change in pull request #11280:
URL: https://github.com/apache/druid/pull/11280#discussion_r652594161



##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/ExpressionLambdaAggregator.java
##########
@@ -29,12 +29,19 @@
   private final Expr lambda;
   private final ExpressionLambdaAggregatorInputBindings bindings;
   private final int maxSizeBytes;
+  private boolean uninitializedNullValue;
 
-  public ExpressionLambdaAggregator(Expr lambda, ExpressionLambdaAggregatorInputBindings bindings, int maxSizeBytes)
+  public ExpressionLambdaAggregator(
+      final Expr lambda,
+      final ExpressionLambdaAggregatorInputBindings bindings,
+      final boolean initiallyNull,
+      final int maxSizeBytes
+  )
   {
     this.lambda = lambda;
     this.bindings = bindings;
     this.maxSizeBytes = maxSizeBytes;
+    this.uninitializedNullValue = initiallyNull;

Review comment:
       I would suggest naming this `useNullIfUninitialized` for clarity; the current name didn't read like a boolean question to me. I would also be happy with any other, better name.






[GitHub] [druid] clintropolis commented on pull request #11280: bitwise aggregators, better null handling options for expression agg

Posted by GitBox <gi...@apache.org>.
clintropolis commented on pull request #11280:
URL: https://github.com/apache/druid/pull/11280#issuecomment-868887647


   thanks for review @jihoonson and @rohangarg :metal:




[GitHub] [druid] rohangarg commented on a change in pull request #11280: bitwise aggregators, better null handling options for expression agg

Posted by GitBox <gi...@apache.org>.
rohangarg commented on a change in pull request #11280:
URL: https://github.com/apache/druid/pull/11280#discussion_r657163570



##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/ExpressionLambdaAggregatorFactory.java
##########
@@ -195,6 +199,12 @@ public String getInitialCombineValueExpressionString()
     return initialCombineValueExpressionString;
   }
 
+  @JsonProperty("initiallyNull")

Review comment:
       this should also be changed to `isNullUnlessAggregated`

##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/ExpressionLambdaBufferAggregator.java
##########
@@ -21,73 +21,95 @@
 
 import org.apache.druid.math.expr.Expr;
 import org.apache.druid.math.expr.ExprEval;
+import org.apache.druid.math.expr.ExprType;
 
 import javax.annotation.Nullable;
 import java.nio.ByteBuffer;
 
 public class ExpressionLambdaBufferAggregator implements BufferAggregator
 {
+  private static final short NOT_AGGREGATED_BIT = 1 << 7;
+  private static final short IS_AGGREGATED_MASK = 0x3F;
+  private static final byte TYPE_MASK = 0x0F;

Review comment:
       Is it possible to drop either `TYPE_MASK` or `IS_AGGREGATED_MASK` and use a common mask whose value is `0x0F`?
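
For reference, the arithmetic behind this question can be checked with a small Java sketch. The three constants are copied from the diff above (widened to `int` for simplicity); the class `TypeByteMaskSketch`, its methods, and the sample type id `0x02` are hypothetical. Both masks clear the `NOT_AGGREGATED_BIT`, and they give the same result whenever the type id fits in the low four bits, which is an assumption about the type ids rather than something stated in this thread.

```java
// Hypothetical illustration only; constants taken from the diff above.
final class TypeByteMaskSketch
{
  static final int NOT_AGGREGATED_BIT = 1 << 7; // 0x80
  static final int IS_AGGREGATED_MASK = 0x3F;   // keeps bits 0-5
  static final int TYPE_MASK = 0x0F;            // keeps bits 0-3

  static int applyMask(byte typeByte, int mask)
  {
    return typeByte & mask;
  }

  // True when both masks produce the same value for this byte,
  // i.e. when bits 4 and 5 of the byte are not set.
  static boolean masksAgree(byte typeByte)
  {
    return applyMask(typeByte, IS_AGGREGATED_MASK) == applyMask(typeByte, TYPE_MASK);
  }

  public static void main(String[] args)
  {
    byte flagged = (byte) (0x02 | NOT_AGGREGATED_BIT);
    System.out.println(applyMask(flagged, IS_AGGREGATED_MASK)); // flag bit cleared
    System.out.println(applyMask(flagged, TYPE_MASK));          // flag bit cleared
    System.out.println(masksAgree(flagged));                    // low-nibble type id
    System.out.println(masksAgree((byte) 0x12));                // bit 4 set
  }
}
```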






[GitHub] [druid] clintropolis commented on a change in pull request #11280: bitwise aggregators, better null handling options for expression agg

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #11280:
URL: https://github.com/apache/druid/pull/11280#discussion_r656999544



##########
File path: processing/src/main/java/org/apache/druid/query/aggregation/ExpressionLambdaAggregatorFactory.java
##########
@@ -121,6 +124,7 @@ public ExpressionLambdaAggregatorFactory(
 
     this.initialValueExpressionString = initialValue;
     this.initialCombineValueExpressionString = initialCombineValue == null ? initialValue : initialCombineValue;
+    this.initiallyNull = initiallyNull == null ? NullHandling.sqlCompatible() : initiallyNull;

Review comment:
       renamed to `isNullUnlessAggregated` to more clearly indicate that it is a boolean and hopefully indicate its main role in determining aggregator behavior. `initiallyNull` seemed confusing alongside `initialValue`.






[GitHub] [druid] clintropolis merged pull request #11280: bitwise aggregators, better null handling options for expression agg

Posted by GitBox <gi...@apache.org>.
clintropolis merged pull request #11280:
URL: https://github.com/apache/druid/pull/11280


   




[GitHub] [druid] clintropolis commented on a change in pull request #11280: bitwise aggregators, better null handling options for expression agg

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #11280:
URL: https://github.com/apache/druid/pull/11280#discussion_r656997310



##########
File path: core/src/main/java/org/apache/druid/math/expr/ExprEval.java
##########
@@ -48,7 +49,7 @@
   public static ExprEval deserialize(ByteBuffer buffer, int position)
   {
     // | expression type (byte) | expression bytes |
-    ExprType type = ExprType.fromByte(buffer.get(position));
+    ExprType type = ExprType.fromByte((byte) (buffer.get(position) & TYPE_MASK));

Review comment:
       I reworked this to do the masking in the buffer aggregator instead of here



