Posted to issues@orc.apache.org by "Chunyang Wen (JIRA)" <ji...@apache.org> on 2016/08/11 04:13:20 UTC

[jira] [Commented] (ORC-92) Support column id and column name selection in ReaderOptions

    [ https://issues.apache.org/jira/browse/ORC-92?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416514#comment-15416514 ] 

Chunyang Wen commented on ORC-92:
---------------------------------

I have provided a pull request that partially solves this problem. It supports nested column id selection.

https://github.com/apache/orc/pull/54

For nested column name selection, we first have to decide how to describe a nested column.

struct<s1:struct<s2:struct<int1:int>>>

Should int1 be described as s1.s2.int1? That string would then be passed to the include function of ReaderOptions:

ReaderOptions.include(std::list<std::string> include)
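
To make the proposal concrete, here is a minimal sketch of how a dotted name such as s1.s2.int1 could be resolved to a column id. It assumes column ids are assigned in pre-order with the root as 0, which is how ORC numbers columns. TypeNode, assignIds, resolvePath and buildExample are hypothetical names used only for illustration; they are not part of the ORC C++ API and this is not the code in the pull request.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a schema node; this is NOT the orc::Type class.
struct TypeNode {
  std::string kind;       // e.g. "struct" or "int"
  uint64_t columnId = 0;  // filled in by assignIds
  std::vector<std::pair<std::string, std::unique_ptr<TypeNode>>> fields;

  explicit TypeNode(std::string k) : kind(std::move(k)) {}

  // Append a named child and return a reference to it so calls can be chained.
  TypeNode& addField(const std::string& name, const std::string& childKind) {
    assert(!name.empty());
    fields.emplace_back(name, std::unique_ptr<TypeNode>(new TypeNode(childKind)));
    return *fields.back().second;
  }
};

// Number the nodes in pre-order (node first, then its children left to right).
// Returns the next unused id.
uint64_t assignIds(TypeNode& node, uint64_t next = 0) {
  node.columnId = next++;
  for (auto& field : node.fields) {
    next = assignIds(*field.second, next);
  }
  return next;
}

// Resolve a dotted path such as "s1.s2.int1", starting at the root struct.
// Returns true and writes the column id on success; returns false if the
// path is empty or any segment does not name a field at that level.
bool resolvePath(const TypeNode& root, const std::string& path, uint64_t& idOut) {
  if (path.empty()) {
    return false;
  }
  const TypeNode* current = &root;
  std::istringstream segments(path);
  std::string segment;
  while (std::getline(segments, segment, '.')) {
    const TypeNode* match = nullptr;
    for (const auto& field : current->fields) {
      if (field.first == segment) {
        match = field.second.get();
        break;
      }
    }
    if (match == nullptr) {
      return false;
    }
    current = match;
  }
  idOut = current->columnId;
  return true;
}

// Build the example schema from this comment:
// struct<s1:struct<s2:struct<int1:int>>>
std::unique_ptr<TypeNode> buildExample() {
  std::unique_ptr<TypeNode> root(new TypeNode("struct"));
  root->addField("s1", "struct").addField("s2", "struct").addField("int1", "int");
  assignIds(*root);
  return root;
}
```

With this layout the root struct gets id 0, s1 gets 1, s2 gets 2 and int1 gets 3, so a name-based include could be translated into the id-based selection that already exists. One open design point this sketch does not address is field names that themselves contain a dot.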

> Support column id and column name selection in ReaderOptions
> ------------------------------------------------------------
>
>                 Key: ORC-92
>                 URL: https://issues.apache.org/jira/browse/ORC-92
>             Project: Orc
>          Issue Type: New Feature
>          Components: C++
>    Affects Versions: 1.2.0
>            Reporter: Chunyang Wen
>            Priority: Minor
>
> Currently, in the C++ version of ORC, we can only select top-level fields by field id or field name. This works fine when the data structure is flat, such as struct<int1:int, s1:string, list1:array<int>>. But when we have a nested structure, such as struct<int1:int, struct1:struct<int2:int, long2:long>>, we can still only select the fields int1 and struct1. We cannot directly select long2.
> We could select long2 by its column id. This can be achieved by updating the include function in ReaderOptions.


