Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/03/15 17:36:57 UTC

[GitHub] [pinot] Jackie-Jiang commented on a change in pull request #8351: Add more aggregations that can be solved with dictionary

Jackie-Jiang commented on a change in pull request #8351:
URL: https://github.com/apache/pinot/pull/8351#discussion_r827247517



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/query/DictionaryBasedAggregationOperator.java
##########
@@ -76,18 +80,52 @@ protected IntermediateResultsBlock getNextBlock() {
       int dictionarySize = dictionary.length();
       switch (aggregationFunction.getType()) {
         case MIN:
+        case MINMV:
           aggregationResults.add(toDouble(dictionary.getMinVal()));
           break;
         case MAX:
+        case MAXMV:
           aggregationResults.add(toDouble(dictionary.getMaxVal()));
           break;
         case MINMAXRANGE:
+        case MINMAXRANGEMV:
           aggregationResults.add(
               new MinMaxRangePair(toDouble(dictionary.getMinVal()), toDouble(dictionary.getMaxVal())));
           break;
         case DISTINCTCOUNT:
+        case DISTINCTCOUNTMV:
           aggregationResults.add(getDistinctValueSet(dictionary));
           break;
+        case DISTINCTCOUNTHLL:
+        case DISTINCTCOUNTHLLMV:
+        case DISTINCTCOUNTRAWHLL:
+        case DISTINCTCOUNTRAWHLLMV: {
+          HyperLogLog hll;
+          if (dictionary.getValueType() == FieldSpec.DataType.BYTES) {
+            // Treat BYTES value as serialized HyperLogLog
+            try {
+              hll = ObjectSerDeUtils.HYPER_LOG_LOG_SER_DE.deserialize(dictionary.getBytesValue(0));
+              for (int dictId = 1; dictId < dictionarySize; dictId++) {
+                hll.addAll(ObjectSerDeUtils.HYPER_LOG_LOG_SER_DE.deserialize(dictionary.getBytesValue(dictId)));
+              }
+            } catch (Exception e) {
+              throw new RuntimeException("Caught exception while merging HyperLogLogs", e);
+            }
+          } else {
+            int log2m;
+            if (aggregationFunction instanceof DistinctCountHLLAggregationFunction) {
+              log2m = ((DistinctCountHLLAggregationFunction) aggregationFunction).getLog2m();
+            } else {
+              log2m = ((DistinctCountRawHLLAggregationFunction) aggregationFunction).getLog2m();
+            }
+            hll = new HyperLogLog(log2m);
+            for (int dictId = 0; dictId < dictionarySize; dictId++) {
+              hll.offer(dictionary.get(dictId));
+            }
+          }
+          aggregationResults.add(hll);
+          break;
+        }

Review comment:
       Good point. I have extracted the logic into a separate function.
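
       For readers of the archive, here is a rough sketch of the shape such an extraction might take. It is not the code from the pull request. `ExactSketch` and `SimpleDictionary` are hypothetical stand-ins for `HyperLogLog` and Pinot's `Dictionary`, and the stand-in counts exactly with a `HashSet` instead of estimating, so the example runs on the JDK alone. The helper names `buildSketch` and `mergeSketches` are also assumptions.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class HllExtractionSketch {
  /** Hypothetical stand-in for HyperLogLog: exact counting, same offer/addAll/cardinality shape. */
  static final class ExactSketch {
    private final Set<Object> _values = new HashSet<>();

    void offer(Object value) {
      _values.add(value);
    }

    void addAll(ExactSketch other) {
      _values.addAll(other._values);
    }

    long cardinality() {
      return _values.size();
    }
  }

  /** Hypothetical stand-in for the column dictionary: distinct values addressed by dictId. */
  interface SimpleDictionary {
    int length();

    Object get(int dictId);
  }

  /** Mirrors the non-BYTES branch of the diff: offer every dictionary value to a fresh sketch. */
  static ExactSketch buildSketch(SimpleDictionary dictionary) {
    ExactSketch sketch = new ExactSketch();
    int dictionarySize = dictionary.length();
    for (int dictId = 0; dictId < dictionarySize; dictId++) {
      sketch.offer(dictionary.get(dictId));
    }
    return sketch;
  }

  /** Mirrors the BYTES branch: each entry is already a sketch, so merge them into a new one. */
  static ExactSketch mergeSketches(List<ExactSketch> sketches) {
    ExactSketch merged = new ExactSketch();
    for (ExactSketch sketch : sketches) {
      merged.addAll(sketch);
    }
    return merged;
  }

  /** Convenience wrapper (an assumption) that exposes a list as a dictionary. */
  static SimpleDictionary of(List<?> values) {
    return new SimpleDictionary() {
      @Override
      public int length() {
        return values.size();
      }

      @Override
      public Object get(int dictId) {
        return values.get(dictId);
      }
    };
  }

  public static void main(String[] args) {
    ExactSketch fromValues = buildSketch(of(List.of("a", "b", "c")));
    ExactSketch fromOthers = buildSketch(of(List.of("c", "d")));
    ExactSketch merged = mergeSketches(List.of(fromValues, fromOthers));
    System.out.println(fromValues.cardinality() + " " + merged.cardinality());
  }
}
```

       With helpers along these lines, the `switch` case in the operator would shrink to a single call followed by `aggregationResults.add(...)`, which is the usual motivation for this kind of extraction.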




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


