Posted to issues@arrow.apache.org by "Andy Grove (Jira)" <ji...@apache.org> on 2019/09/25 13:59:00 UTC
[jira] [Created] (ARROW-6690) [Rust] [DataFusion] HashAggregate without GROUP BY should use SIMD
Andy Grove created ARROW-6690:
---------------------------------
Summary: [Rust] [DataFusion] HashAggregate without GROUP BY should use SIMD
Key: ARROW-6690
URL: https://issues.apache.org/jira/browse/ARROW-6690
Project: Apache Arrow
Issue Type: Sub-task
Components: Rust, Rust - DataFusion
Reporter: Andy Grove
Fix For: 1.0.0
Currently the implementation of HashAggregate in the new physical plan uses the same logic regardless of whether a grouping expression is used.
For the case where there is no grouping expression, such as "SELECT SUM(a) FROM b", we can use the compute kernels to perform an aggregate operation on each batch rather than iterating over each row and accumulating individual values.
This optimization already exists in the original implementation of aggregate queries, which executes directly from the logical plan.
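
To illustrate the difference between the two approaches, here is a minimal sketch in plain Rust. It does not use the Arrow or DataFusion APIs: the Batch type and the functions sum_row_by_row, sum_batch_kernel and sum_per_batch are hypothetical names invented for this example, nulls are ignored, and the lane-wise loop only stands in for a real SIMD compute kernel (it is written so that the compiler has a chance to auto-vectorize it).

```rust
// Hypothetical stand-in for one column of a record batch (no null handling).
type Batch = Vec<i64>;

/// Row-at-a-time accumulation: every value is folded into a single
/// accumulator, one row after another, across all batches.
fn sum_row_by_row(batches: &[Batch]) -> i64 {
    let mut acc = 0i64;
    for batch in batches {
        for value in batch {
            acc = acc.wrapping_add(*value);
        }
    }
    acc
}

/// Stand-in for a per-batch compute kernel: values are summed into
/// independent lanes, a pattern the compiler can map to SIMD adds.
fn sum_batch_kernel(values: &[i64]) -> i64 {
    const LANES: usize = 8; // assumed lane count, chosen for illustration only
    let mut lanes = [0i64; LANES];

    let chunks = values.chunks_exact(LANES);
    let remainder = chunks.remainder();
    for chunk in chunks {
        for i in 0..LANES {
            lanes[i] = lanes[i].wrapping_add(chunk[i]);
        }
    }

    // Combine the lane totals, then add the values that did not fill a chunk.
    let mut total = 0i64;
    for lane in lanes.iter() {
        total = total.wrapping_add(*lane);
    }
    for value in remainder {
        total = total.wrapping_add(*value);
    }
    total
}

/// Aggregate without GROUP BY: one kernel call per batch, then a cheap
/// combination of the per-batch partial results.
fn sum_per_batch(batches: &[Batch]) -> i64 {
    batches
        .iter()
        .map(|batch| sum_batch_kernel(batch))
        .fold(0i64, |acc, partial| acc.wrapping_add(partial))
}
```

Both functions return the same result for the same input. For example, with two batches holding the values 1 to 10 and 11 to 20, each returns 210. The per-batch version only touches the accumulator once per batch, which is the property this issue is asking for.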
--
This message was sent by Atlassian Jira
(v8.3.4#803005)