Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/11/06 18:55:29 UTC

[GitHub] [arrow] drusso opened a new pull request #8606: ARROW-10510: [Rust] [DataFusion] Benchmark COUNT(DISTINCT) queries.

drusso opened a new pull request #8606:
URL: https://github.com/apache/arrow/pull/8606


   [ARROW-10510](https://issues.apache.org/jira/browse/ARROW-10510)
   
   This change adds benchmarks for `COUNT(DISTINCT)` queries. This is a small follow-up to [ARROW-10043](https://issues.apache.org/jira/browse/ARROW-10043) / #8222. In that PR, a number of implementation ideas were discussed for follow-ups, and having benchmarks will help evaluate them. 
   
   ---
   
   Two benchmarks are added:
   
   * wide: every value is distinct; this measures worst-case performance
   * narrow: only a handful of distinct values; this is closer to best-case performance
   
   The wide benchmark runs ~7x slower than the narrow benchmark.
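
To make the wide/narrow distinction concrete, here is a minimal sketch of the two input shapes and a naive hash-set distinct count. It is not the benchmark code from this PR and does not use the DataFusion API; the function names (`wide_values`, `narrow_values`, `count_distinct`) and the `u64` value type are assumptions chosen for illustration only.

```rust
use std::collections::HashSet;

/// Hypothetical "wide" input: every one of the `n` values is distinct,
/// so the set of seen values grows with every row.
fn wide_values(n: usize) -> Vec<u64> {
    (0..n as u64).collect()
}

/// Hypothetical "narrow" input: `n` values that cycle through only
/// `distinct` different values, so the set stays small.
/// `distinct` must be non-zero.
fn narrow_values(n: usize, distinct: u64) -> Vec<u64> {
    (0..n as u64).map(|i| i % distinct).collect()
}

/// Naive COUNT(DISTINCT): collect every value into a hash set and
/// report the size of the set.
fn count_distinct(values: &[u64]) -> usize {
    values.iter().collect::<HashSet<_>>().len()
}
```

In the wide case the set ends up holding as many entries as there are rows, while in the narrow case it holds only a handful, which is one plausible reason for a gap between the two benchmarks.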
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #8606: ARROW-10510: [Rust] [DataFusion] Benchmark COUNT(DISTINCT) queries.

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8606:
URL: https://github.com/apache/arrow/pull/8606#issuecomment-723248595


   https://issues.apache.org/jira/browse/ARROW-10510





[GitHub] [arrow] nevi-me closed pull request #8606: ARROW-10510: [Rust] [DataFusion] Benchmark COUNT(DISTINCT) queries.

Posted by GitBox <gi...@apache.org>.
nevi-me closed pull request #8606:
URL: https://github.com/apache/arrow/pull/8606


   

