You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/06/27 18:31:40 UTC

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6769: Implement serialization for UDWF and UDAF in plan protobuf

alamb commented on code in PR #6769:
URL: https://github.com/apache/arrow-datafusion/pull/6769#discussion_r1244184033


##########
datafusion/proto/src/logical_plan/to_proto.rs:
##########
@@ -585,16 +585,16 @@ impl TryFrom<&Expr> for protobuf::LogicalExprNode {
                         )
                     }
                     // TODO: Tracked in https://github.com/apache/arrow-datafusion/issues/4584

Review Comment:
   Perhaps we can remove the TODO comments as well



##########
datafusion/proto/src/logical_plan/mod.rs:
##########
@@ -2786,12 +2786,66 @@ mod roundtrip_tests {
             vec![col("col1")],
             vec![col("col1")],
             vec![col("col2")],
+            row_number_frame.clone(),
+        ));
+        #[derive(Debug)]
+        struct Dummy {}
+
+        impl Accumulator for Dummy {
+            fn state(&self) -> datafusion::error::Result<Vec<ScalarValue>> {
+                Ok(vec![])
+            }
+
+            fn update_batch(
+                &mut self,
+                _values: &[ArrayRef],
+            ) -> datafusion::error::Result<()> {
+                Ok(())
+            }
+
+            fn merge_batch(
+                &mut self,
+                _states: &[ArrayRef],
+            ) -> datafusion::error::Result<()> {
+                Ok(())
+            }
+
+            fn evaluate(&self) -> datafusion::error::Result<ScalarValue> {
+                Ok(ScalarValue::Float64(None))
+            }
+
+            fn size(&self) -> usize {
+                std::mem::size_of_val(self)
+            }
+        }
+
+        let dummy_agg = create_udaf(
+            // the name; used to represent it in plan descriptions and in the registry, to use in SQL.
+            "dummy_agg",
+            // the input type; DataFusion guarantees that the first entry of `values` in `update` has this type.
+            DataType::Float64,
+            // the return type; DataFusion expects this to match the type returned by `evaluate`.
+            Arc::new(DataType::Float64),
+            Volatility::Immutable,
+            // This is the accumulator factory; DataFusion uses it to create new accumulators.
+            Arc::new(|_| Ok(Box::new(Dummy {}))),
+            // This is the description of the state. `state()` must match the types here.
+            Arc::new(vec![DataType::Float64, DataType::UInt32]),
+        );
+
+        let test_expr5 = Expr::WindowFunction(expr::WindowFunction::new(
+            WindowFunction::AggregateUDF(Arc::new(dummy_agg.clone())),
+            vec![col("col1")],
+            vec![col("col1")],
+            vec![col("col2")],
             row_number_frame,
         ));
+        ctx.register_udaf(dummy_agg);

Review Comment:
   THis is great!
   
   Now all we need is a test for the `WindowUDF` (aka `WindowFunction::WindowUDF`) and I think this PR is ready to go!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org