Posted to jira@arrow.apache.org by "Dewey Dunnington (Jira)" <ji...@apache.org> on 2022/01/20 19:42:00 UTC
[jira] [Commented] (ARROW-15271) [R] Refactor do_exec_plan to return a RecordBatchReader
[ https://issues.apache.org/jira/browse/ARROW-15271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17479636#comment-17479636 ]
Dewey Dunnington commented on ARROW-15271:
------------------------------------------
Just collecting a few related code comments here:
- https://github.com/apache/arrow/blob/03219e21b42f17294fba3b3d2b22a9117fe0f080/r/R/dataset-scan.R#L89
- https://github.com/apache/arrow/blob/03219e21b42f17294fba3b3d2b22a9117fe0f080/r/R/query-engine.R#L23-L26
- https://github.com/apache/arrow/blob/03219e21b42f17294fba3b3d2b22a9117fe0f080/r/R/dataset-scan.R#L184
Related is the ability to write files directly in a query plan using the {{WriteNode}} that was added in ARROW-13542; for example, ARROW-14266 is an open ticket for using the {{WriteNode}} to write datasets. Writing files is useful but perhaps orthogonal to the ability to iterate over a {{RecordBatchReader}}, which is exemplified by the revamped {{map_batches()}} and the accompanying vignette addition.
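To illustrate why iterating over a reader matters for something like {{map_batches()}}, here is a minimal language-neutral sketch in plain Python. It does not use the Arrow API: every name in it ({{exec_plan_as_table}}, {{exec_plan_as_reader}}, {{make_source}}, and the list-of-ints stand-in for a record batch) is hypothetical and only models the difference between collecting a result eagerly and streaming it batch by batch.

```python
from typing import Callable, Iterator, List

# Hypothetical stand-in for a record batch; not an Arrow type.
Batch = List[int]


def make_source(n_batches: int, pulled: List[int]) -> Iterator[Batch]:
    """Hypothetical plan output; records the index of each batch as it is produced."""
    for i in range(n_batches):
        pulled.append(i)
        yield [i, i + 1]


def exec_plan_as_table(source: Iterator[Batch]) -> List[Batch]:
    """Eager: drain the plan and hold every batch in memory (models returning a table)."""
    return list(source)


def exec_plan_as_reader(source: Iterator[Batch]) -> Iterator[Batch]:
    """Lazy: hand batches to the caller one at a time (models returning a reader)."""
    for batch in source:
        yield batch


def map_batches(reader: Iterator[Batch], fn: Callable[[Batch], Batch]) -> Iterator[Batch]:
    """Apply fn to each batch as it arrives, without collecting the full result first."""
    for batch in reader:
        yield fn(batch)
```

In this sketch, calling {{next()}} on {{map_batches(exec_plan_as_reader(...), fn)}} pulls only one batch through the plan, whereas {{exec_plan_as_table}} produces every batch before the caller sees any of them.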
> [R] Refactor do_exec_plan to return a RecordBatchReader
> -------------------------------------------------------
>
> Key: ARROW-15271
> URL: https://issues.apache.org/jira/browse/ARROW-15271
> Project: Apache Arrow
> Issue Type: Improvement
> Components: R
> Affects Versions: 6.0.1
> Reporter: Will Jones
> Priority: Major
>
> Right now [{{do_exec_plan}}|https://github.com/apache/arrow/blob/master/r/R/query-engine.R#L18] returns an Arrow table because {{head}}, {{tail}}, and {{arrange}} each return one. If ARROW-14289 is completed and similar work is done for {{arrange}}, we may be able to alter {{do_exec_plan}} to return a {{RecordBatchReader}} (RBR) instead.
> The {{map_batches()}} implementation (ARROW-14029) could benefit from this refactor, and it might make ARROW-15040 more useful.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)