Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/03/04 17:13:28 UTC

[GitHub] [incubator-pinot] kkrugler opened a new issue #6642: Add multi-valued field support to DistinctCountHLL

kkrugler opened a new issue #6642:
URL: https://github.com/apache/incubator-pinot/issues/6642


   Currently the code in `DistinctCountHLLAggregationFunction.aggregate()` looks like:
   
   ``` java
     public void aggregate(int length, AggregationResultHolder aggregationResultHolder,
         Map<ExpressionContext, BlockValSet> blockValSetMap) {
       BlockValSet blockValSet = blockValSetMap.get(_expression);
       DataType valueType = blockValSet.getValueType();
   
       if (valueType != DataType.BYTES) {
         HyperLogLog hyperLogLog = getDefaultHyperLogLog(aggregationResultHolder);
         switch (valueType) {
           case INT:
             int[] intValues = blockValSet.getIntValuesSV();
             for (int i = 0; i < length; i++) {
               hyperLogLog.offer(intValues[i]);
             }
             break;
           case LONG:
            ...
   ```
   
   Note the call to `blockValSet.getIntValuesSV()`. The code should first check whether `blockValSet.isSingleValue()` returns false and, if so, call `blockValSet.getIntValuesMV()` instead. That method returns a two-dimensional array (`int[][]`) whose inner arrays each need to be iterated.
   
   The same change is needed for all of the other value types.
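
   A minimal sketch of the proposed branching for the `INT` case follows. It does not use Pinot classes: `IntBlock` and `ExactDistinct` are hypothetical stand-ins for `BlockValSet` and `HyperLogLog` (an exact set replaces the probabilistic sketch), so the example runs on its own. Only the three method names `isSingleValue()`, `getIntValuesSV()` and `getIntValuesMV()` come from the issue text.

   ```java
   import java.util.HashSet;
   import java.util.Set;

   public class SvMvAggregationSketch {
     /** Hypothetical stand-in for the part of BlockValSet used in this issue. */
     interface IntBlock {
       boolean isSingleValue();
       int[] getIntValuesSV();
       int[][] getIntValuesMV();
     }

     /** Hypothetical stand-in for HyperLogLog: counts exactly with a set. */
     static final class ExactDistinct {
       private final Set<Integer> _values = new HashSet<>();

       void offer(int value) {
         _values.add(value);
       }

       int cardinality() {
         return _values.size();
       }
     }

     /** Offers every value of the first {@code length} entries, SV or MV. */
     static void aggregateInts(int length, IntBlock block, ExactDistinct distinct) {
       if (block.isSingleValue()) {
         int[] values = block.getIntValuesSV();
         for (int i = 0; i < length; i++) {
           distinct.offer(values[i]);
         }
       } else {
         // One inner array per entry; an entry may hold zero or more values.
         int[][] valuesArray = block.getIntValuesMV();
         for (int i = 0; i < length; i++) {
           for (int value : valuesArray[i]) {
             distinct.offer(value);
           }
         }
       }
     }

     static IntBlock singleValued(int[] values) {
       return new IntBlock() {
         public boolean isSingleValue() { return true; }
         public int[] getIntValuesSV() { return values; }
         public int[][] getIntValuesMV() { throw new UnsupportedOperationException(); }
       };
     }

     static IntBlock multiValued(int[][] values) {
       return new IntBlock() {
         public boolean isSingleValue() { return false; }
         public int[] getIntValuesSV() { throw new UnsupportedOperationException(); }
         public int[][] getIntValuesMV() { return values; }
       };
     }

     public static void main(String[] args) {
       ExactDistinct distinct = new ExactDistinct();
       aggregateInts(3, multiValued(new int[][] {{1, 2}, {2, 3}, {}}), distinct);
       System.out.println(distinct.cardinality()); // prints 3
     }
   }
   ```

   The other value types would presumably follow the same shape, with the inner loop added to each `case`.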
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on issue #6642: Merge multi-valued field support from DistinctCountHLLMV into DistinctCountHLL

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on issue #6642:
URL: https://github.com/apache/incubator-pinot/issues/6642#issuecomment-791025513


   @fx19880617 We don't really need to rewrite. We have access to the column metadata during aggregation, so we know whether the column is SV or MV.




[GitHub] [incubator-pinot] fx19880617 commented on issue #6642: Merge multi-valued field support from DistinctCountHLLMV into DistinctCountHLL

Posted by GitBox <gi...@apache.org>.
fx19880617 commented on issue #6642:
URL: https://github.com/apache/incubator-pinot/issues/6642#issuecomment-791031667


   Right, from the implementation side, we will eventually merge the behavior of `<function>` and `<function>MV`. Then we can rewrite all `<function>MV` to `<function>`.




[GitHub] [incubator-pinot] fx19880617 commented on issue #6642: Merge multi-valued field support from DistinctCountHLLMV into DistinctCountHLL

Posted by GitBox <gi...@apache.org>.
fx19880617 commented on issue #6642:
URL: https://github.com/apache/incubator-pinot/issues/6642#issuecomment-790975586


   Right, we should be able to infer the column type from the table schema during the query parsing phase and then rewrite the query if it's a multi-value column.
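
   As a rough illustration of that idea (not Pinot's actual parser or schema API), the rewrite could look like the sketch below. The `isSingleValueByColumn` map, the class name, and the method name are hypothetical; the only assumption taken from the thread is the `<function>` / `<function>MV` naming pattern.

   ```java
   import java.util.Locale;
   import java.util.Map;

   public class MvRewriteSketch {
     /**
      * Returns the function name to use for the given column. Hypothetical helper:
      * appends "MV" when the schema lookup says the column is multi-valued.
      */
     static String rewriteFunctionName(String functionName, String column,
         Map<String, Boolean> isSingleValueByColumn) {
       Boolean singleValue = isSingleValueByColumn.get(column);
       if (singleValue == null || singleValue) {
         return functionName; // unknown or single-value column: leave as is
       }
       if (functionName.toUpperCase(Locale.ROOT).endsWith("MV")) {
         return functionName; // already the MV variant
       }
       return functionName + "MV";
     }

     public static void main(String[] args) {
       Map<String, Boolean> schema = Map.of("userId", true, "tags", false);
       System.out.println(rewriteFunctionName("DISTINCTCOUNTHLL", "tags", schema));
       System.out.println(rewriteFunctionName("DISTINCTCOUNTHLL", "userId", schema));
     }
   }
   ```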




[GitHub] [incubator-pinot] kkrugler commented on issue #6642: Merge multi-valued field support from DistinctCountHLLMV into DistinctCountHLL

Posted by GitBox <gi...@apache.org>.
kkrugler commented on issue #6642:
URL: https://github.com/apache/incubator-pinot/issues/6642#issuecomment-790857926


   I imagine similar merging could happen for many of the other [xxxMV](https://docs.pinot.apache.org/users/user-guide-query/supported-aggregations#multi-value-column-functions) multi-value column functions.




[GitHub] [incubator-pinot] Jackie-Jiang commented on issue #6642: Add multi-valued field support to DistinctCountHLL

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on issue #6642:
URL: https://github.com/apache/incubator-pinot/issues/6642#issuecomment-790838990


   Currently, in order to aggregate on multi-valued fields, you need to use the MV version of the function: `DistinctCountHLLMVAggregationFunction`.
   In the future, we might merge these 2 versions to make it easier to use.




[GitHub] [incubator-pinot] kkrugler commented on issue #6642: Merge multi-valued field support from DistinctCountHLLMV into DistinctCountHLL

Posted by GitBox <gi...@apache.org>.
kkrugler commented on issue #6642:
URL: https://github.com/apache/incubator-pinot/issues/6642#issuecomment-790853089


   Hi @Jackie-Jiang, I updated the issue to cover merging the two.

