Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/02/18 11:07:59 UTC

[GitHub] [pinot] geeknarrator opened a new issue #8225: Multiple argument support for DISTINCTCOUNTHLL/DISTINCTCOUNTTHETASKETCH

geeknarrator opened a new issue #8225:
URL: https://github.com/apache/pinot/issues/8225


   Currently Pinot supports only a single argument for the DISTINCTCOUNTHLL and DISTINCTCOUNTTHETASKETCH functions. For use cases like ours, where we need a distinct count over two columns (an `id` and a `date`), support for multiple arguments would be very useful.
   
   https://docs.pinot.apache.org/users/user-guide-query/how-to-handle-unique-counting


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on issue #8225: Multiple argument support for DISTINCTCOUNTHLL/DISTINCTCOUNTTHETASKETCH

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on issue #8225:
URL: https://github.com/apache/pinot/issues/8225#issuecomment-1045099301


   One workaround is to concatenate the two columns into a single string, e.g. `select distinctcounthll(concat(id, date, ',')) from myTable`. Performance may be sub-optimal because the dictionary-based and type-based optimizations cannot be applied to the concatenated value.
   
   To improve performance, one solution is to combine the values from the different columns into a single `byte[]` and feed that into the HLL/theta sketch. This should perform better than representing numbers as strings.
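
   A minimal sketch of the `byte[]` idea, for illustration only. It assumes `id` is a `long` and `date` is stored as an `int` day number; the issue does not state the column types. The class `CompositeKey` and its `encode` method are hypothetical names, not part of Pinot. The fixed-width layout means no separator is needed, and the resulting bytes are what a sketch would hash in place of a concatenated string.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper, not a Pinot API: packs two column values into one key.
final class CompositeKey {

    // Assumption: id is a long and the date is an int day number.
    // The fixed-width layout (8 bytes + 4 bytes) keeps distinct pairs distinct
    // without any separator character.
    static byte[] encode(long id, int dayNumber) {
        return ByteBuffer.allocate(Long.BYTES + Integer.BYTES)
                .putLong(id)
                .putInt(dayNumber)
                .array();
    }

    public static void main(String[] args) {
        byte[] first = encode(42L, 19041);
        byte[] second = encode(42L, 19042);

        // Every key has the same length regardless of the values.
        System.out.println(first.length);                 // prints 12
        // Same id on a different day yields a different key.
        System.out.println(Arrays.equals(first, second)); // prints false
    }
}
```

   Whether this beats the `concat` workaround in practice would depend on how the aggregation function consumes the bytes, so it should be measured rather than assumed.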

