Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/24 23:43:02 UTC

[GitHub] [arrow-datafusion] Jimexist commented on a change in pull request #375: add window expression stream, delegated window aggregation, and a basic structure for row_number

Jimexist commented on a change in pull request #375:
URL: https://github.com/apache/arrow-datafusion/pull/375#discussion_r638357746



##########
File path: datafusion/src/physical_plan/mod.rs
##########
@@ -457,10 +457,41 @@ pub trait WindowExpr: Send + Sync + Debug {
     fn name(&self) -> &str {
         "WindowExpr: default name"
     }
+
+    /// the accumulator used to accumulate values from the expressions.
+    /// the accumulator expects the same number of arguments as `expressions` and must
+    /// return states with the same description as `state_fields`
+    fn create_accumulator(&self) -> Result<Box<dyn WindowAccumulator>>;
+
+    /// expressions that are passed to the WindowAccumulator.
+    /// Single-column aggregations such as `sum` return a single value, others (e.g. `cov`) return many.
+    fn expressions(&self) -> Vec<Arc<dyn PhysicalExpr>>;
+}
+
+/// A window expression that is a built-in window function
+pub trait BuiltInWindowFunctionExpr: Send + Sync + Debug {

Review comment:
   I admit the naming is really hard.
   
   For `WindowExpr` I'm trying to use delegation, so it would in fact cover three kinds of expression:
   1. `BuiltInWindowFunctionExpr`, such as `row_number`, `lag`, or `cume_dist`
   2. `AggregateExpr`, such as `sum`, `max`, or `min`
   3. UDAFs, in the future
   
   Case 2 is already implemented for GROUP BY aggregation, so I'd like to reuse it in window functions; that is why `BuiltInWindowFunctionExpr` only handles the remaining enum cases.
   
   I could probably move `BuiltInWindowFunctionExpr` into `windows.rs` to reduce its visibility. Or are there any other suggestions for names?
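
To make the delegation idea concrete, here is a minimal, self-contained sketch. It is not the code in this PR: the traits are heavily simplified stand-ins that work on `i64` slices instead of Arrow arrays, and the wrapper names `AggregateWindowExpr` and `BuiltInWindowExpr`, the `aggregate`/`evaluate` methods, and the `evaluate_all` helper are all hypothetical. It also assumes a window with no ordering or frame, so an aggregate yields the partition total on every row.

```rust
use std::fmt::Debug;
use std::sync::Arc;

/// Simplified stand-in for the existing aggregate trait (hypothetical signature).
pub trait AggregateExpr: Send + Sync + Debug {
    fn name(&self) -> &str;
    /// Fold a whole partition into a single value.
    fn aggregate(&self, values: &[i64]) -> i64;
}

/// Simplified stand-in for built-in window functions (hypothetical signature).
pub trait BuiltInWindowFunctionExpr: Send + Sync + Debug {
    fn name(&self) -> &str;
    /// Produce one output value per input row.
    fn evaluate(&self, num_rows: usize) -> Vec<i64>;
}

/// The common interface that the window operator would see.
pub trait WindowExpr: Send + Sync + Debug {
    fn name(&self) -> &str;
    fn evaluate(&self, values: &[i64]) -> Vec<i64>;
}

#[derive(Debug)]
pub struct Sum;

impl AggregateExpr for Sum {
    fn name(&self) -> &str {
        "sum"
    }
    fn aggregate(&self, values: &[i64]) -> i64 {
        values.iter().sum()
    }
}

#[derive(Debug)]
pub struct RowNumber;

impl BuiltInWindowFunctionExpr for RowNumber {
    fn name(&self) -> &str {
        "row_number"
    }
    fn evaluate(&self, num_rows: usize) -> Vec<i64> {
        // Row numbers start at 1 within the partition.
        (1..=num_rows as i64).collect()
    }
}

/// Hypothetical wrapper: a window expression that delegates to an aggregate.
#[derive(Debug)]
pub struct AggregateWindowExpr {
    pub aggregate: Arc<dyn AggregateExpr>,
}

impl WindowExpr for AggregateWindowExpr {
    fn name(&self) -> &str {
        self.aggregate.name()
    }
    fn evaluate(&self, values: &[i64]) -> Vec<i64> {
        // Assumption: no ORDER BY or frame, so every row gets the partition total.
        let total = self.aggregate.aggregate(values);
        vec![total; values.len()]
    }
}

/// Hypothetical wrapper: a window expression that delegates to a built-in function.
#[derive(Debug)]
pub struct BuiltInWindowExpr {
    pub fun: Arc<dyn BuiltInWindowFunctionExpr>,
}

impl WindowExpr for BuiltInWindowExpr {
    fn name(&self) -> &str {
        self.fun.name()
    }
    fn evaluate(&self, values: &[i64]) -> Vec<i64> {
        self.fun.evaluate(values.len())
    }
}

/// Evaluate every window expression over one partition, without caring
/// which kind of expression sits behind the `WindowExpr` interface.
pub fn evaluate_all(
    exprs: &[Arc<dyn WindowExpr>],
    partition: &[i64],
) -> Vec<(String, Vec<i64>)> {
    exprs
        .iter()
        .map(|e| (e.name().to_string(), e.evaluate(partition)))
        .collect()
}
```

With this shape a future UDAF would only need one more wrapper that implements `WindowExpr`, and the caller of `evaluate_all` would not change.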



