Posted to issues@orc.apache.org by "Chunyang Wen (JIRA)" <ji...@apache.org> on 2016/08/22 08:00:37 UTC

[jira] [Assigned] (ORC-97) Support column name selection in ReaderOptions

     [ https://issues.apache.org/jira/browse/ORC-97?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chunyang Wen reassigned ORC-97:
-------------------------------

    Assignee: Chunyang Wen

> Support column name selection in ReaderOptions
> ----------------------------------------------
>
>                 Key: ORC-97
>                 URL: https://issues.apache.org/jira/browse/ORC-97
>             Project: Orc
>          Issue Type: New Feature
>          Components: C++
>    Affects Versions: 1.2.0
>            Reporter: Chunyang Wen
>            Assignee: Chunyang Wen
>
> After the ORC-92 patch, selecting columns by id is supported, but selecting a sub-type by name is more useful in practice.
> In my project, we use a period (.) to separate nested field names. For example, given
> <s1:struct<s2:struct<int1: int>>>
> we select int1 as s1.s2.int1, which is passed to include(std::list<std::string>).
> In my implementation, I first build a map from field name to column id and then forward the call to includeTypes. If this is an acceptable approach, I will provide a patch for review soon.
> When a sub-type is selected, all of its child types should be selected as well, as O'Malley pointed out in ORC-92.
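
The name-to-id approach described in the issue can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patch itself: TypeNode, buildNameMap, resolveNames, and makeExampleSchema are hypothetical names and do not come from the ORC C++ library. It assumes column ids are assigned in pre-order with the root as 0, so a sub-type and all of its children occupy a contiguous id range.

```cpp
#include <cstdint>
#include <list>
#include <map>
#include <memory>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a schema node; NOT the ORC C++ Type class.
struct TypeNode {
  std::string name;          // field name; empty for the root
  uint64_t columnId = 0;     // pre-order id (assumed: root is 0)
  uint64_t maxColumnId = 0;  // largest id inside this subtree
  std::vector<std::unique_ptr<TypeNode>> children;
};

// Assign pre-order column ids and record dotted name -> node.
// Returns the next unused column id.
uint64_t buildNameMap(TypeNode& node, const std::string& prefix,
                      uint64_t nextId,
                      std::map<std::string, const TypeNode*>& byName) {
  node.columnId = nextId++;
  for (auto& child : node.children) {
    std::string full =
        prefix.empty() ? child->name : prefix + "." + child->name;
    byName[full] = child.get();
    nextId = buildNameMap(*child, full, nextId, byName);
  }
  node.maxColumnId = nextId - 1;
  return nextId;
}

// Translate dotted names into column ids. Selecting a sub-type also
// selects every child type beneath it.
std::set<uint64_t> resolveNames(
    const std::map<std::string, const TypeNode*>& byName,
    const std::list<std::string>& names) {
  std::set<uint64_t> ids;
  for (const auto& name : names) {
    auto it = byName.find(name);
    if (it == byName.end()) {
      throw std::invalid_argument("unknown column name: " + name);
    }
    const TypeNode* node = it->second;
    for (uint64_t id = node->columnId; id <= node->maxColumnId; ++id) {
      ids.insert(id);
    }
  }
  return ids;
}

// Helper that builds the example schema from the issue:
// root -> s1 -> s2 -> int1.
std::unique_ptr<TypeNode> makeNode(const std::string& name) {
  std::unique_ptr<TypeNode> node(new TypeNode());
  node->name = name;
  return node;
}

std::unique_ptr<TypeNode> makeExampleSchema() {
  auto root = makeNode("");
  auto s1 = makeNode("s1");
  auto s2 = makeNode("s2");
  s2->children.push_back(makeNode("int1"));
  s1->children.push_back(std::move(s2));
  root->children.push_back(std::move(s1));
  return root;
}
```

Under these assumptions, resolving "s1.s2" on the example schema yields the ids of both s2 and int1, and the resulting id set is what would then be handed on to the id-based selection.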


