Posted to github@arrow.apache.org by "metesynnada (via GitHub)" <gi...@apache.org> on 2023/04/24 08:32:02 UTC

[GitHub] [arrow-datafusion] metesynnada commented on pull request #6049: MemoryExec INSERT INTO refactor to use ExecutionPlan

metesynnada commented on PR #6049:
URL: https://github.com/apache/arrow-datafusion/pull/6049#issuecomment-1519624238

   Thank you for reviewing and providing valuable feedback. I appreciate your input on the change to `TableProvider::insert_into` to accept an `ExecutionPlan`. My intention behind this implementation was to align with the discussion in issue #5076, which suggests re-implementing the `DataFrame.write_*` methods to use a `LogicalPlan::Write` operation along with a physical operator that performs the write.
   
   The rationale behind this approach is to create a more unified and extensible architecture for handling write operations, similar to other major engines like Apache Spark, which incorporates write operations in both logical and physical plans.
   
   I understand your concerns about creating an `ExecutionPlan` as output, specifically regarding the methods that might not make sense in this context, like "output ordering" and "preserves input". In light of the related issue, I believe that integrating write operations into the `ExecutionPlan` can help address the need for a more generic `LogicalPlan::Write` operation. `INSERT INTO` support was just the beginning.
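   
   To make the shape of the idea concrete, here is a minimal, self-contained sketch. It is a toy model, not code from this PR and not DataFusion's actual API: `PhysicalOp`, `ValuesOp`, `MemTable`, `InsertOp`, and the free function `insert_into` are all hypothetical names invented for illustration. It only shows a write expressed as one more physical operator that wraps its input and reports the number of rows written.
   
   ```rust
   use std::sync::{Arc, Mutex};
   
   // Toy stand-ins for rows and record batches (assumption: not Arrow types).
   type Row = i64;
   type Batch = Vec<Row>;
   
   /// Hypothetical, heavily simplified physical operator.
   trait PhysicalOp {
       /// Run the operator and return the batches it produces.
       fn execute(&self) -> Vec<Batch>;
   }
   
   /// Leaf operator that yields fixed in-memory batches.
   struct ValuesOp {
       batches: Vec<Batch>,
   }
   
   impl PhysicalOp for ValuesOp {
       fn execute(&self) -> Vec<Batch> {
           self.batches.clone()
       }
   }
   
   /// Toy in-memory table that can be appended to.
   #[derive(Default)]
   struct MemTable {
       rows: Mutex<Vec<Row>>,
   }
   
   /// Write operator: consumes its input and appends it to the target table.
   struct InsertOp {
       input: Arc<dyn PhysicalOp>,
       target: Arc<MemTable>,
   }
   
   impl PhysicalOp for InsertOp {
       fn execute(&self) -> Vec<Batch> {
           // Run the child plan first, then take the lock for the append.
           let batches = self.input.execute();
           let mut rows = self.target.rows.lock().unwrap();
           let mut written: i64 = 0;
           for batch in batches {
               written += batch.len() as i64;
               rows.extend(batch);
           }
           // The operator's own output is a single batch holding the row count.
           vec![vec![written]]
       }
   }
   
   /// Builds the write plan; nothing is written until the plan is executed.
   fn insert_into(table: &Arc<MemTable>, input: Arc<dyn PhysicalOp>) -> Arc<dyn PhysicalOp> {
       Arc::new(InsertOp {
           input,
           target: Arc::clone(table),
       })
   }
   ```
   
   In this sketch, calling `insert_into` only constructs the plan; the table is modified when `execute` is called on the returned operator, so the write composes with the rest of the plan like any other node.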
   
   If you believe that simplifying the PR without losing functionality is the best course of action, I am willing to make the necessary changes. However, I would appreciate further discussion on how to align this PR with the goals outlined in issue #5076 while addressing your concerns about the resulting `ExecutionPlan`.
   
   Thank you again for your feedback, and I look forward to refining this PR with your assistance and in accordance with the project's roadmap.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.