You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/12 14:09:09 UTC

[GitHub] [arrow] bkietz commented on a diff in pull request #13117: ARROW-16525: [C++] Tee node not properly marking node finished

bkietz commented on code in PR #13117:
URL: https://github.com/apache/arrow/pull/13117#discussion_r871418326


##########
cpp/src/arrow/compute/exec/exec_plan.cc:
##########
@@ -506,22 +506,49 @@ Result<ExecNode*> Declaration::AddToPlan(ExecPlan* plan,
   return node;
 }
 
+Declaration* Declaration::Root() {
+  Declaration* current = this;
+  while (!current->inputs.empty()) {
+    DCHECK_LE(current->inputs.size(), 1)
+        << "No clear root when a declaration has multiple inputs";
+    auto& input = current->inputs[0];
+    Declaration* maybe_next = input.get<Declaration>();
+    DCHECK(maybe_next) << "Attempt to get root when part of declaration is already built";
+    current = maybe_next;
+  }
+  return current;
+}
+
 Declaration Declaration::Sequence(std::vector<Declaration> decls) {
   DCHECK(!decls.empty());
 
   Declaration out = std::move(decls.back());
   decls.pop_back();
-  auto receiver = &out;
+  auto receiver = out.Root();

Review Comment:
   FWIW, the original intent of Sequence() wrt multiple inputs was:
   ```c++
   Declaration join = Declaration::Sequence({left_nodes...});
   join = Declaration::Sequence({right_nodes..., join});
   // appends the sequence right_nodes... as the second input of join
   ```
   it seems less than ideal to always resolve the root node.
   
   In cases where one does need the root node to be resolved that'd be easy enough with
   ```diff
   -Declaration decl = Declaration::Sequence({prefix..., has_root});
   +Declaration* root = has_root.Root();
   +*root = Declaration::Sequence({prefix..., std::move(*root)});
   ```



##########
cpp/src/arrow/compute/exec/exec_plan.cc:
##########
@@ -506,22 +506,49 @@ Result<ExecNode*> Declaration::AddToPlan(ExecPlan* plan,
   return node;
 }
 
+Declaration* Declaration::Root() {
+  Declaration* current = this;
+  while (!current->inputs.empty()) {
+    DCHECK_LE(current->inputs.size(), 1)
+        << "No clear root when a declaration has multiple inputs";

Review Comment:
   DCHECKing here seems overly aggressive since we don't have anything like `bool has_root()`. Instead could we have
   - Root() returns nullptr when no root can be found
   - Roots() returns a vector of all decls with no inputs



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org