Posted to jira@arrow.apache.org by "Todd Farmer (Jira)" <ji...@apache.org> on 2022/09/26 16:51:00 UTC

[jira] [Commented] (ARROW-16915) [C++] Unify approaches to attach schemas on record batches exiting Acero

    [ https://issues.apache.org/jira/browse/ARROW-16915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609590#comment-17609590 ] 

Todd Farmer commented on ARROW-16915:
-------------------------------------

This issue was last updated over 90 days ago, which may indicate that it is no longer being actively worked on. To better reflect the current state, the issue is being unassigned per [project policy|https://arrow.apache.org/docs/dev/developers/bug_reports.html#issue-assignment]. Please feel free to reassign the issue to yourself if you are actively working on it or plan to start soon.

> [C++] Unify approaches to attach schemas on record batches exiting Acero
> ------------------------------------------------------------------------
>
>                 Key: ARROW-16915
>                 URL: https://issues.apache.org/jira/browse/ARROW-16915
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Weston Pace
>            Assignee: Vibhatha Lakmal Abeykoon
>            Priority: Major
>
> Internally, Acero uses ExecBatch everywhere, without schemas.  Originally, the various exit nodes would simply attach a default schema built from the output data types and inferred field names.
> However, as part of the Substrait integration and other improvements, the various sink nodes are being amended to support:
>  * Custom field names
>  * Custom metadata
> The current implementation is inconsistent across the sink nodes, however.
> SinkNode:
>  - Does not support custom field names or metadata
> ConsumingSinkNode:
>  - Supports custom names but not custom metadata
> WriteNode:
>  - Supports custom metadata but not custom names
> We should create a {{SinkNodeOptions}} base class that supports both custom names and custom metadata, and we should have a single place with utility methods for attaching a schema to an outgoing exec batch.  All of our sink nodes should then use this single tool for modifying outgoing batches.
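
A minimal sketch of the proposal in the description, under stated assumptions: Field, Schema and Metadata below are simplified stand-ins rather than Arrow's classes, the members of SinkNodeOptions and the helper name MakeOutputSchema are hypothetical, and the "f0", "f1", ... fallback names are only a placeholder for whatever name inference Acero actually performs. It is meant to show one options type and one helper that every sink node could share, not the eventual implementation.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Stand-in types for illustration only; these are NOT Arrow's classes.
struct Field {
  std::string name;
  std::string type;  // e.g. "int32"; real code would hold an Arrow data type
};

using Metadata = std::map<std::string, std::string>;

struct Schema {
  std::vector<Field> fields;
  Metadata metadata;
};

// Hypothetical shared base for the options of every sink node.
struct SinkNodeOptions {
  // If set, must contain exactly one name per output column.
  std::optional<std::vector<std::string>> field_names;
  // If set, copied onto the output schema unchanged.
  std::optional<Metadata> metadata;
  virtual ~SinkNodeOptions() = default;
};

// Hypothetical single helper that all sink nodes would call to build the
// schema attached to outgoing batches.
Schema MakeOutputSchema(const std::vector<std::string>& column_types,
                        const SinkNodeOptions& options) {
  if (options.field_names &&
      options.field_names->size() != column_types.size()) {
    throw std::invalid_argument(
        "field_names must match the number of output columns");
  }
  Schema schema;
  for (std::size_t i = 0; i < column_types.size(); ++i) {
    // Assumed fallback: positional placeholder names when none are supplied.
    std::string name = options.field_names ? (*options.field_names)[i]
                                           : "f" + std::to_string(i);
    schema.fields.push_back(Field{std::move(name), column_types[i]});
  }
  if (options.metadata) {
    schema.metadata = *options.metadata;
  }
  return schema;
}
```

With this shape, the options for SinkNode, ConsumingSinkNode and WriteNode could each derive from SinkNodeOptions and call the same helper, so support for custom names and custom metadata would no longer differ from node to node.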



--
This message was sent by Atlassian Jira
(v8.20.10#820010)