Posted to github@arrow.apache.org by "Dandandan (via GitHub)" <gi...@apache.org> on 2023/04/07 14:45:24 UTC

[GitHub] [arrow-datafusion] Dandandan commented on pull request #5912: [Minor] Cleanup `SumAccumulator`

Dandandan commented on PR #5912:
URL: https://github.com/apache/arrow-datafusion/pull/5912#issuecomment-1500353430

   > Thanks @Dandandan for this PR. As far as I know, `count` stores the number of elements in the sum. When `count` is 0, we should produce `NULL`. Consider the expression `SUM(inc_col) OVER(ORDER BY ts ASC ROWS BETWEEN 2 FOLLOWING AND 3 FOLLOWING) as sum1`: for the last 2 results we should produce `NULL`. However, without storing `count`, we cannot differentiate `NULL` from `0`. This can also be seen in the failing test. (The executor produces 0 where it should have produced `NULL`.)
   > 
   > Maybe there is a way to accomplish this without storing `count`. I just wanted to mention the current use of it.
   
   Thanks. Yeah - I just saw this usage of it and the related failing test.
   
   One way to accomplish it would be differentiating between window and conventional aggregations, or maybe it could be handled inside the window function code itself 🤔
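
   To make the `NULL` vs `0` point concrete, here is a minimal, self-contained sketch. It is not DataFusion's actual `SumAccumulator`: the `SlidingSum` type and the `sliding_sums` helper are hypothetical names invented for illustration, `Option<i64>` stands in for a nullable column, and the `update`/`retract` methods only loosely mirror what a sliding-window accumulator does. It assumes `lo <= hi`.

```rust
/// Hypothetical, simplified stand-in for a sliding-window SUM accumulator.
/// `None` models SQL NULL. This is an illustration only.
#[derive(Debug, Default)]
struct SlidingSum {
    sum: i64,
    /// Number of non-NULL values currently contributing to `sum`.
    count: u64,
}

impl SlidingSum {
    /// Add values entering the window frame; NULLs are skipped.
    fn update(&mut self, values: &[Option<i64>]) {
        for v in values.iter().flatten() {
            self.sum += v;
            self.count += 1;
        }
    }

    /// Remove values leaving the window frame; NULLs are skipped.
    fn retract(&mut self, values: &[Option<i64>]) {
        for v in values.iter().flatten() {
            self.sum -= v;
            self.count -= 1;
        }
    }

    /// With no contributing values the result is NULL, not 0.
    /// Without `count`, `sum == 0` alone could not tell the two cases apart.
    fn evaluate(&self) -> Option<i64> {
        if self.count == 0 {
            None
        } else {
            Some(self.sum)
        }
    }
}

/// SUM over `ROWS BETWEEN lo FOLLOWING AND hi FOLLOWING` for every row,
/// computed incrementally. Assumes `lo <= hi`.
fn sliding_sums(col: &[Option<i64>], lo: usize, hi: usize) -> Vec<Option<i64>> {
    let n = col.len();
    let mut acc = SlidingSum::default();
    // Rows in col[start..end] are the current frame.
    let (mut start, mut end) = (0usize, 0usize);
    let mut out = Vec::with_capacity(n);
    for i in 0..n {
        // Clip the frame bounds to the end of the partition.
        let new_start = (i + lo).min(n);
        let new_end = (i + hi + 1).min(n);
        acc.update(&col[end..new_end]);
        acc.retract(&col[start..new_start]);
        start = new_start;
        end = new_end;
        out.push(acc.evaluate());
    }
    out
}
```

   For the column `[1, 2, 3, -3, 4]` with `lo = 2` and `hi = 3`, this yields `Some(0), Some(1), Some(4), None, None`: the first frame genuinely sums to 0, while the last two frames are empty and evaluate to `NULL`. A conventional (non-window) aggregation never retracts values, which is why the two cases might be handled differently.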

