Posted to issues@arrow.apache.org by "Paul Rogers (JIRA)" <ji...@apache.org> on 2018/08/30 21:00:01 UTC

[jira] [Commented] (ARROW-3151) [C++] Create Protocol Buffers interface for iterating over the semantic "rows" of a record batch, and accessing the rows using the protobuf API

    [ https://issues.apache.org/jira/browse/ARROW-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597920#comment-16597920 ] 

Paul Rogers commented on ARROW-3151:
------------------------------------

Let's see if we can coordinate on this. I'm starting work on a proposal for a "RowSet" interface, to be ported over from Drill, that provides a simple row-based API to read from, and write to, vectors. On the write side, the mechanism also enforces memory limits, which is the key reason Drill created the "RowSet" abstraction.
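
To make the write-side idea concrete, here is a minimal sketch of a row-based writer that refuses a row once a byte budget would be exceeded. This is not Drill's RowSet API: the class name BoundedRowWriter, the two-column layout, and the accounting rule (8 bytes per id plus the string length) are all assumptions chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical row-based writer over two column vectors (id, name).
// It tracks an approximate byte count and rejects rows that would
// push the batch past a caller-supplied limit.
class BoundedRowWriter {
 public:
  explicit BoundedRowWriter(std::size_t byte_limit) : byte_limit_(byte_limit) {
    assert(byte_limit > 0);
  }

  // Appends one row. Returns false and writes nothing if the row
  // would exceed the byte budget (assumed cost: 8 bytes + name length).
  bool Append(std::int64_t id, const std::string& name) {
    const std::size_t row_bytes = sizeof(std::int64_t) + name.size();
    if (bytes_used_ + row_bytes > byte_limit_) return false;
    ids_.push_back(id);
    names_.push_back(name);
    bytes_used_ += row_bytes;
    return true;
  }

  std::size_t num_rows() const { return ids_.size(); }
  std::size_t bytes_used() const { return bytes_used_; }

 private:
  std::size_t byte_limit_;
  std::size_t bytes_used_ = 0;
  std::vector<std::int64_t> ids_;
  std::vector<std::string> names_;
};
```

A caller would keep calling Append until it returns false, then hand off the full batch and start a new one. The check happens before any column is touched, so a rejected row never leaves the columns at different lengths.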

Given that this project will need a way to assemble a row from a bundle of vectors, the "columnar-to-row" mechanism of RowSet might be a way to populate the row buffer.

On the other hand, the RowSet code from Drill is in Java, and this issue is C++. Still, it might make sense to port the mechanism to C++ so it can be used in multiple contexts.

Are there any background docs I could read to better understand the project context and judge whether the above makes sense here? Thanks.

> [C++] Create Protocol Buffers interface for iterating over the semantic "rows" of a record batch, and accessing the rows using the protobuf API
> -----------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-3151
>                 URL: https://issues.apache.org/jira/browse/ARROW-3151
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>            Reporter: Wes McKinney
>            Priority: Major
>
> The desired workflow:
> * User writes a .proto file describing the structure of a "row" as a Message
> * Given the generated pb.h bindings, an Arrow user can iterate over an {{arrow::RecordBatch}}, each iteration populating an instance of the Row message
> * The values of the row can then be accessed via the standard Protobuf APIs
> A corresponding interface could be developed to write a RecordBatch using protobufs as input, but that could be its own project
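
The workflow described above could look roughly like the sketch below. To stay self-contained it uses neither Arrow nor Protobuf: Row stands in for a protobuf-generated message, Batch stands in for an arrow::RecordBatch with two equal-length columns, and RowReader is a hypothetical name for the iteration interface. None of these names come from the issue or from either library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a message generated from a user's .proto file.
struct Row {
  std::int64_t id = 0;
  std::string name;
};

// Hypothetical stand-in for a record batch: two columns of equal length.
struct Batch {
  std::vector<std::int64_t> id;
  std::vector<std::string> name;
  std::size_t num_rows() const { return id.size(); }
};

// Walks the batch row by row, overwriting one caller-owned Row per step.
class RowReader {
 public:
  explicit RowReader(const Batch& batch) : batch_(batch) {
    assert(batch.id.size() == batch.name.size());
  }

  // Copies the next row's column values into *row.
  // Returns false once every row has been visited.
  bool Next(Row* row) {
    if (pos_ >= batch_.num_rows()) return false;
    row->id = batch_.id[pos_];
    row->name = batch_.name[pos_];
    ++pos_;
    return true;
  }

 private:
  const Batch& batch_;  // the batch must outlive the reader
  std::size_t pos_ = 0;
};
```

Typical use would be a loop of the form "Row row; while (reader.Next(&row)) { ... }", reading the fields of row inside the body. Reusing a single Row across iterations avoids one allocation per row, which is one plausible reading of "each iteration populating an instance of the Row message".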



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)