You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/23 13:25:27 UTC

[GitHub] [arrow] nealrichardson commented on a diff in pull request #13150: ARROW-16549: [C++] Simplify AggregateNodeOptions aggregates/targets

nealrichardson commented on code in PR #13150:
URL: https://github.com/apache/arrow/pull/13150#discussion_r879450395


##########
r/R/query-engine.R:
##########
@@ -121,11 +119,13 @@ ExecPlan <- R6Class("ExecPlan",
             x
           })
         }
+        target_names <- names(.data$aggregations)
+        for (i in seq_len(length(target_names))) {
+          .data$aggregations[[i]][["name"]] <- .data$aggregations[[i]][["target"]] <- target_names[i]
+        }
 
         node <- node$Aggregate(
-          options = map(.data$aggregations, ~ .[c("fun", "options")]),
-          target_names = names(.data$aggregations),
-          out_field_names = names(.data$aggregations),
+          options = map(.data$aggregations, ~ .[c("fun", "options", "target", "name")]),

Review Comment:
   If we're accessing by name in the C++ bindings, we don't need to select/sort columns here and can just pass the list in.
   
   ```suggestion
             options = .data$aggregations,
   ```



##########
r/R/query-engine.R:
##########
@@ -91,7 +91,6 @@ ExecPlan <- R6Class("ExecPlan",
       grouped <- length(group_vars) > 0
 
       # Collect the target names first because we have to add back the group vars

Review Comment:
   Comment is no longer valid
   
   ```suggestion
   ```



##########
r/R/query-engine.R:
##########
@@ -121,11 +119,13 @@ ExecPlan <- R6Class("ExecPlan",
             x
           })
         }
+        target_names <- names(.data$aggregations)
+        for (i in seq_len(length(target_names))) {
+          .data$aggregations[[i]][["name"]] <- .data$aggregations[[i]][["target"]] <- target_names[i]

Review Comment:
   If "target" and "name" are always the same, why are we passing them twice? Can't we reuse the same `std::vector<std::string>` in the C++ bindings?
   
   Also, why put target names inside the aggregations elements when we're just going to pull them back out and make a vector in C++? If you pass `target_names` directly, you already have `std::vector<std::string>`.



##########
r/src/compute-exec.cpp:
##########
@@ -226,31 +226,27 @@ std::shared_ptr<compute::ExecNode> ExecNode_Project(
 // [[arrow::export]]
 std::shared_ptr<compute::ExecNode> ExecNode_Aggregate(
     const std::shared_ptr<compute::ExecNode>& input, cpp11::list options,
-    std::vector<std::string> target_names, std::vector<std::string> out_field_names,
     std::vector<std::string> key_names) {
   std::vector<arrow::compute::internal::Aggregate> aggregates;
   std::vector<std::shared_ptr<arrow::compute::FunctionOptions>> keep_alives;
-
   for (cpp11::list name_opts : options) {
-    auto name = cpp11::as_cpp<std::string>(name_opts[0]);
-    auto opts = make_compute_options(name, name_opts[1]);
+    auto function = cpp11::as_cpp<std::string>(name_opts[0]);
+    auto opts = make_compute_options(function, name_opts[1]);
+    auto target = cpp11::as_cpp<std::string>(name_opts[2]);
+    auto name = cpp11::as_cpp<std::string>(name_opts[3]);

Review Comment:
   I think we can access these by name instead of position
   
   ```suggestion
       auto function = cpp11::as_cpp<std::string>(name_opts["fun"]);
       auto opts = make_compute_options(function, name_opts["options"]);
       auto target = cpp11::as_cpp<std::string>(name_opts["target"]);
       auto name = cpp11::as_cpp<std::string>(name_opts["name"]);
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org