You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/06/07 20:55:14 UTC

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6592: Add additional docstrings to Window function implementations

alamb commented on code in PR #6592:
URL: https://github.com/apache/arrow-datafusion/pull/6592#discussion_r1222156771


##########
datafusion/physical-expr/src/window/mod.rs:
##########
@@ -32,6 +32,7 @@ mod window_frame_state;
 pub use aggregate::PlainAggregateWindowExpr;
 pub use built_in::BuiltInWindowExpr;
 pub use built_in_window_function_expr::BuiltInWindowFunctionExpr;
+pub use partition_evaluator::PartitionEvaluator;

Review Comment:
   `PartitionEvaluator` now appears in public docstrings (and I propose to make it part of the user defined window function API -- see https://github.com/apache/arrow-datafusion/issues/5781#issuecomment-1581428228)



##########
datafusion/physical-expr/src/window/built_in_window_function_expr.rs:
##########
@@ -69,6 +73,8 @@ pub trait BuiltInWindowFunctionExpr: Send + Sync + std::fmt::Debug {
         false
     }
 
+    /// If returns true, [`Self::create_evaluator`] must implement

Review Comment:
   I was a little unclear on why `use_window_frame` was part of `BuiltInWindowFunctionExpr` and not `PartitionEvaluator` but I haven't looked into it in more detail



##########
datafusion/physical-expr/src/window/sliding_aggregate.rs:
##########
@@ -36,12 +36,10 @@ use crate::{
     expressions::PhysicalSortExpr, reverse_order_bys, AggregateExpr, PhysicalExpr,
 };
 
-/// A window expr that takes the form of an aggregate function
-/// Aggregate Window Expressions that have the form
-/// `OVER({ROWS | RANGE| GROUPS} BETWEEN UNBOUNDED PRECEDING AND ...)`
-/// e.g cumulative window frames uses `PlainAggregateWindowExpr`. Where as Aggregate Window Expressions
-/// that have the form `OVER({ROWS | RANGE| GROUPS} BETWEEN M {PRECEDING| FOLLOWING} AND ...)`
-/// e.g sliding window frames uses `SlidingAggregateWindowExpr`.
+/// A window expr that takes the form of an aggregate function that
+/// can be incrementally computed over sliding windows.
+///
+/// See comments on [`WindowExpr`] for more details.

Review Comment:
   Consolidated this description into `WindowExpr`



##########
datafusion/physical-expr/src/window/partition_evaluator.rs:
##########
@@ -25,24 +25,70 @@ use datafusion_common::{DataFusionError, ScalarValue};
 use std::fmt::Debug;
 use std::ops::Range;
 
-/// Partition evaluator
+/// Partition evaluator for Window Functions
+///
+/// An implementation of this trait is created and used for each
+/// partition defined by the OVER clause.
+///
+/// For example, evaluating `window_func(val) OVER (PARTITION BY col)`
+/// on the following data:
+///
+/// ```text
+/// col | val
+/// --- + ----
+///  A  | 1
+///  A  | 1
+///  C  | 2
+///  D  | 3
+///  D  | 3
+/// ```
+///
+/// Will instantiate three `PartitionEvaluator`s, one each for the
+/// partitions defined by `col=A`, `col=B`, and `col=C`.
+///
+/// There are two types of `PartitionEvaluator`:
+///
+/// # Stateless `PartitionEvaluator`
+///

Review Comment:
   @mustafasrepo  / @ozankabak  if you have time to help me describe more clearly what Stateful and Stateless `PartitionEvaluator`s are , and specifically what is different between them, I would be most appreciative



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org