Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/07/13 14:50:30 UTC

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6954: Allow better vectorization in accumulate functions

alamb commented on code in PR #6954:
URL: https://github.com/apache/arrow-datafusion/pull/6954#discussion_r1262676664


##########
datafusion/physical-expr/src/aggregate/groups_accumulator/accumulate.rs:
##########
@@ -139,10 +139,14 @@ impl NullState {
             // no nulls, no filter,
             (false, None) => {
                 let iter = group_indices.iter().zip(data.iter());
+
                 for (&group_index, &new_value) in iter {
-                    seen_values.set_bit(group_index, true);
                     value_fn(group_index, new_value);
                 }
+                // update seen values in separate loop

Review Comment:
   The idea is to update `seen_values` in a separate loop, so that the loop calling `value_fn` stays simple and can vectorize better.
   
   Note that we can skip this update entirely once all the groups have been seen. I will explore how to do that next.
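
For illustration, here is a minimal sketch of the two-loop shape for the no-nulls, no-filter case. It is not the code from this PR: `accumulate_no_nulls` is a hypothetical name, and a plain `&mut [bool]` stands in for the real `seen_values` builder so that the example has no dependencies.

```rust
/// Hypothetical stand-in for the `(false, None)` branch shown in the diff.
/// `seen_values` is a plain bool slice here (an assumption), not the
/// bitmap builder used in the actual code.
pub fn accumulate_no_nulls<T: Copy>(
    group_indices: &[usize],
    values: &[T],
    seen_values: &mut [bool],
    mut value_fn: impl FnMut(usize, T),
) {
    assert_eq!(group_indices.len(), values.len());

    // First loop: only apply the values, with no bookkeeping in the body.
    for (&group_index, &new_value) in group_indices.iter().zip(values.iter()) {
        value_fn(group_index, new_value);
    }

    // Second loop: record which groups have received at least one value.
    for &group_index in group_indices {
        seen_values[group_index] = true;
    }
}
```

Calling it with group indices `[0, 2, 0, 2]`, values `[1, 10, 2, 20]`, and a summing closure marks groups 0 and 2 as seen and leaves group 1 untouched. Whether the first loop actually vectorizes depends on what `value_fn` does, so this only shows the structure, not a measured speedup.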


