You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "icexelloss (via GitHub)" <gi...@apache.org> on 2023/06/01 15:21:36 UTC

[GitHub] [arrow] icexelloss commented on a diff in pull request #35514: GH-35515: [C++][Python] Add non decomposable aggregation UDF

icexelloss commented on code in PR #35514:
URL: https://github.com/apache/arrow/pull/35514#discussion_r1213322578


##########
python/pyarrow/src/arrow/python/udf.cc:
##########
@@ -234,6 +351,56 @@ Status RegisterTabularFunction(PyObject* user_function, UdfWrapperCallback wrapp
       wrapper, options, registry);
 }
 
+Status AddAggKernel(std::shared_ptr<compute::KernelSignature> sig, compute::KernelInit init,
+                           compute::ScalarAggregateFunction* func) {
+
+  compute::ScalarAggregateKernel kernel(std::move(sig), std::move(init), AggregateUdfConsume, AggregateUdfMerge, AggregateUdfFinalize, /*ordered=*/false);
+  RETURN_NOT_OK(func->AddKernel(std::move(kernel)));
+  return Status::OK();
+}
+
+Status RegisterAggregateFunction(PyObject* agg_function, UdfWrapperCallback agg_wrapper,
+						const UdfOptions& options,
+						compute::FunctionRegistry* registry) {
+  if (!PyCallable_Check(agg_function)) {
+    return Status::TypeError("Expected a callable Python object.");
+  }
+
+  if (registry == NULLPTR) {
+    registry = compute::GetFunctionRegistry();
+  }
+
+  static auto default_scalar_aggregate_options = compute::ScalarAggregateOptions::Defaults();
+  auto aggregate_func = std::make_shared<compute::ScalarAggregateFunction>(
+      options.func_name, options.arity, options.func_doc, &default_scalar_aggregate_options);
+
+  Py_INCREF(agg_function);

Review Comment:
   I am actually no sure about how Py reference counting should work so I think I could use some help here. Currently I wrote this by trial and error and found if I don't have the Py_INCREF here it would segfault. 
   
   @westonpace What is the best way for me to understand how Py_INCREF and OwnedRef are supposed to work here?



##########
python/pyarrow/src/arrow/python/udf.cc:
##########
@@ -234,6 +351,56 @@ Status RegisterTabularFunction(PyObject* user_function, UdfWrapperCallback wrapp
       wrapper, options, registry);
 }
 
+Status AddAggKernel(std::shared_ptr<compute::KernelSignature> sig, compute::KernelInit init,
+                           compute::ScalarAggregateFunction* func) {
+
+  compute::ScalarAggregateKernel kernel(std::move(sig), std::move(init), AggregateUdfConsume, AggregateUdfMerge, AggregateUdfFinalize, /*ordered=*/false);
+  RETURN_NOT_OK(func->AddKernel(std::move(kernel)));
+  return Status::OK();
+}
+
+Status RegisterAggregateFunction(PyObject* agg_function, UdfWrapperCallback agg_wrapper,
+						const UdfOptions& options,
+						compute::FunctionRegistry* registry) {
+  if (!PyCallable_Check(agg_function)) {
+    return Status::TypeError("Expected a callable Python object.");
+  }
+
+  if (registry == NULLPTR) {
+    registry = compute::GetFunctionRegistry();
+  }
+
+  static auto default_scalar_aggregate_options = compute::ScalarAggregateOptions::Defaults();
+  auto aggregate_func = std::make_shared<compute::ScalarAggregateFunction>(
+      options.func_name, options.arity, options.func_doc, &default_scalar_aggregate_options);
+
+  Py_INCREF(agg_function);

Review Comment:
   I am actually no sure about how Py reference counting should work so I think I could use some help here. Currently I wrote this by trial and error and found if I don't have the Py_INCREF here it would segfault. 
   
   @westonpace What is the best way for me to understand how Py_INCREF and OwnedRef are supposed to work here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org