Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/04 13:59:52 UTC

[GitHub] [arrow-datafusion] alamb opened a new pull request #256: Implement count distinct for dictionary arrays

alamb opened a new pull request #256:
URL: https://github.com/apache/arrow-datafusion/pull/256


   # Which issue does this PR close?
   
   Closes https://github.com/apache/arrow-datafusion/issues/249
   
   # Rationale for this change
   I have dictionary-encoded data on which I want to be able to compute distinct counts.
   
   # What changes are included in this PR?
   This implementation is the basic "get it working" version. A more optimized implementation (e.g. one that only looks at the dictionary contents) is left for a subsequent PR, and I will file another ticket to track the performance work.
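
To illustrate the difference between the two approaches, here is a minimal, self-contained sketch. It is not the code in this PR and does not use the Arrow or DataFusion APIs: `DictColumn` and both functions are hypothetical stand-ins, assuming a dictionary column is a list of nullable integer keys plus a values array that may contain unused or duplicate entries.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a dictionary-encoded string column:
/// `keys[i]` indexes into `values`, and `None` marks a null row.
struct DictColumn {
    keys: Vec<Option<usize>>,
    values: Vec<String>,
}

/// Basic approach: decode every non-null row to its value and
/// collect the values into a set.
fn count_distinct_basic(col: &DictColumn) -> usize {
    let mut seen: HashSet<&str> = HashSet::new();
    for key in col.keys.iter().flatten() {
        seen.insert(col.values[*key].as_str());
    }
    seen.len()
}

/// Dictionary-aware approach: deduplicate the integer keys first,
/// then look up each referenced dictionary entry only once.
fn count_distinct_via_dictionary(col: &DictColumn) -> usize {
    let used_keys: HashSet<usize> = col.keys.iter().flatten().copied().collect();
    let distinct: HashSet<&str> = used_keys
        .iter()
        .map(|&k| col.values[k].as_str())
        .collect();
    distinct.len()
}

fn main() {
    // Entry 2 ("c") is never referenced, and entries 0 and 3 hold the same value.
    let col = DictColumn {
        keys: vec![Some(0), Some(1), None, Some(0), Some(3)],
        values: vec!["a".into(), "b".into(), "c".into(), "a".into()],
    };
    println!("basic: {}", count_distinct_basic(&col));
    println!("via dictionary: {}", count_distinct_via_dictionary(&col));
}
```

Both functions skip null rows, matching the usual SQL behavior of `count(distinct ...)`. The second one still deduplicates the looked-up values, because in this sketch the dictionary is not assumed to hold unique entries.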
   
   # Are there any user-facing changes?
   Queries like `select count(distinct tag_col) from ...` are now supported when `tag_col` is a dictionary-encoded column.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb merged pull request #256: Implement count distinct for dictionary arrays

Posted by GitBox <gi...@apache.org>.
alamb merged pull request #256:
URL: https://github.com/apache/arrow-datafusion/pull/256


   

