Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/06/22 18:31:50 UTC

[GitHub] [arrow-datafusion] jorgecarleitao commented on issue #600: Allow User Defined Aggregates to return multiple values / structs

jorgecarleitao commented on issue #600:
URL: https://github.com/apache/arrow-datafusion/issues/600#issuecomment-866233559


   When I implemented today's aggs, I initially thought about multiple return values, and then concluded that the Struct is sufficient and desirable. What I like about the struct is that it enables named fields, which imo makes the statements rather expressive. E.g. 
   
   ```
   df = df.agg(udaf.call("a").alias("a"))
   df.select(df["a"]["min"], df["a"]["max"])
   ```
   
   vs 
   
   ```
   df = df.agg(udaf.call("a").alias("a"))
   df.select(df["a"][0], df["a"][1])
   ```
   
   The names "min" and "max" imo help the user read what they are extracting from the column.
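
To make the comparison concrete without depending on any DataFrame API, here is a minimal plain-Python sketch. `MinMax` and `min_max` are hypothetical names made up for illustration and are not part of DataFusion; a `NamedTuple` stands in for the struct only because it supports both access styles on the same value.

```python
from typing import NamedTuple, Sequence


class MinMax(NamedTuple):
    """Hypothetical struct-like aggregate result with named fields."""

    min: float
    max: float


def min_max(values: Sequence[float]) -> MinMax:
    """Aggregate a column of values into one struct-like result."""
    return MinMax(min=min(values), max=max(values))


result = min_max([3.0, 1.0, 2.0])

# Named access: the field name documents what is being extracted.
named = (result.min, result.max)

# Positional access: the reader has to remember the field order.
positional = (result[0], result[1])
```

Both forms return the same values; only the named one says which is which at the call site.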
   
   Would supporting structs for ScalarValues solve this nicely?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org