Posted to github@arrow.apache.org by "westonpace (via GitHub)" <gi...@apache.org> on 2023/04/10 21:49:24 UTC

[GitHub] [arrow] westonpace commented on issue #34911: [C++] Add first and last aggregation

westonpace commented on issue #34911:
URL: https://github.com/apache/arrow/issues/34911#issuecomment-1502372242

   We spoke about this a bit externally but to record the conversation:
   
   `first` and `last` can be written in two different ways.  The unary version `first(expr)` is a window aggregate: an aggregate function whose result depends on the order of the rows.  There is also a binary variant (often called `arg_min` and `arg_max`) which returns the value in column X from the row where column Y is smallest (or largest).  The advantage of the binary variant is that it doesn't depend on the order of the data.  However, it is a little less flexible (e.g. there is no way to specify a custom sort function, not that we support that yet anyways :)).
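
   To make the difference concrete, here is a minimal sketch in plain Python (not Arrow's API).  The function names, the `(key, x, y)` row layout, and the sample data are all hypothetical and only meant to illustrate the two semantics described above.

```python
from typing import Dict, Hashable, Iterable, Tuple

# Hypothetical row layout: (group key, column X, column Y).
Row = Tuple[Hashable, int, int]


def first_by_key(rows: Iterable[Row]) -> Dict[Hashable, int]:
    """Unary first(x): keep the first x seen per key, so row order matters."""
    out: Dict[Hashable, int] = {}
    for key, x, _y in rows:
        out.setdefault(key, x)
    return out


def arg_min_by_key(rows: Iterable[Row]) -> Dict[Hashable, int]:
    """Binary arg_min(x, y): per key, the x from the row with the smallest y."""
    best: Dict[Hashable, Tuple[int, int]] = {}
    for key, x, y in rows:
        if key not in best or y < best[key][0]:
            best[key] = (y, x)
    return {key: x for key, (_y, x) in best.items()}


rows = [("a", 10, 3), ("a", 20, 1), ("b", 30, 2), ("a", 40, 2)]
shuffled = list(reversed(rows))

# The unary version changes when the rows are reordered...
first_original = first_by_key(rows)
first_shuffled = first_by_key(shuffled)

# ...while the binary version does not (as long as y has no ties within a key).
argmin_original = arg_min_by_key(rows)
argmin_shuffled = arg_min_by_key(shuffled)
```

   Note that if column Y contains ties within a group, even this sketch of the binary variant falls back to whichever tied row it sees first, so the order independence only holds when the ordering column is unique per group.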
   
   From our conversation, my understanding is that you are interested in the unary window-aggregate version.  We don't have any window aggregates yet but I do think it would be a good idea to start adding some.  We have pretty much all the building blocks we need to support a "window aggregate node" (or an extension to the current aggregate nodes).  Furthermore, even if we don't build in proper multithreaded support for window functions, it should be possible to use window aggregates today as long as your plan is single-threaded.
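
   As a rough illustration of why the single-threaded restriction matters, the sketch below (plain Python, not the Arrow execution engine; the function name and the batches are hypothetical) computes `first` over a stream of batches.  It assumes that a single-threaded plan delivers batches to the aggregate in their original order, and that a multithreaded plan may deliver them in a different order unless something puts them back in sequence.

```python
from typing import Iterable, List, Optional


def first_in_stream(batches: Iterable[List[int]]) -> Optional[int]:
    """Return the first value seen across batches, in arrival order."""
    for batch in batches:
        if batch:  # skip empty batches
            return batch[0]
    return None


batches = [[5, 6], [7], [8, 9]]

# Batches arrive in their original order: the answer is stable.
in_order = first_in_stream(batches)

# Batches arrive in a different order (simulated here by hand):
# the same aggregate now reports a different "first" value.
out_of_order = first_in_stream([batches[2], batches[0], batches[1]])
```

   In other words, the aggregate itself can stay simple; what it needs is a guarantee about arrival order.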

