Posted to dev@parquet.apache.org by "Itai Incze (JIRA)" <ji...@apache.org> on 2017/04/18 10:09:41 UTC
[jira] [Commented] (PARQUET-911) C++: Support nested structs in parquet_arrow
[ https://issues.apache.org/jira/browse/PARQUET-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15972447#comment-15972447 ]
Itai Incze commented on PARQUET-911:
------------------------------------
I've been working on this a bit and have a working prototype that builds nested structs, though it does not yet handle repetition properly.
Before submitting a PR, I'm wondering what the API should look like.
In {{arrow/reader.h}} there is the {{ReadColumn}} method, which refers to the flattened schema. Does it make sense to keep such an API side by side with a nested "column" read method?
On one hand, in the context of reading Parquet, it might be useful. On the other hand, the whole notion of a "column" is not exactly Arrow terminology.
This bears on other questions as well. For example, in {{ReadTable}}, what do the {{column_indices}} mean: indices of the schema tree's leaves, or indices of the direct top-level fields of the schema?
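To make the two readings of {{column_indices}} concrete, here is a rough sketch on a toy schema tree. It does not use the parquet-cpp or Arrow API; the {{Node}} struct, the helper functions, and the example schema (id, point {x, y}, tag) are all hypothetical and exist only for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a schema node. This is NOT the parquet-cpp or Arrow API;
// every name in this sketch is hypothetical.
struct Node {
  std::string name;
  std::vector<Node> children;  // empty means the node is a primitive leaf
};

// Depth-first walk that records the dotted path of every leaf. A path's
// position in the output is its index in the flattened schema.
void CollectLeaves(const Node& node, const std::string& prefix,
                   std::vector<std::string>* out) {
  const std::string path =
      prefix.empty() ? node.name : prefix + "." + node.name;
  if (node.children.empty()) {
    out->push_back(path);
    return;
  }
  for (const Node& child : node.children) {
    CollectLeaves(child, path, out);
  }
}

// Reading A: an index addresses a leaf of the flattened schema.
std::vector<std::string> LeafPaths(const Node& root) {
  std::vector<std::string> out;
  for (const Node& field : root.children) {
    CollectLeaves(field, "", &out);
  }
  return out;
}

// Reading B: an index addresses a direct child of the schema root.
std::vector<std::string> TopLevelNames(const Node& root) {
  std::vector<std::string> out;
  for (const Node& field : root.children) {
    out.push_back(field.name);
  }
  return out;
}

// Hypothetical example schema: { id, point { x, y }, tag }
Node ExampleSchema() {
  Node point{"point", {Node{"x", {}}, Node{"y", {}}}};
  return Node{"schema", {Node{"id", {}}, point, Node{"tag", {}}}};
}
```

With this schema, reading A gives four addressable entries (id, point.x, point.y, tag) while reading B gives three (id, point, tag). Index 1 therefore selects only point.x under reading A but the whole point struct under reading B, and index 2 selects point.y under A but tag under B.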
cc [~wesmckinn]
> C++: Support nested structs in parquet_arrow
> --------------------------------------------
>
> Key: PARQUET-911
> URL: https://issues.apache.org/jira/browse/PARQUET-911
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-cpp
> Reporter: Uwe L. Korn
>
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)