Posted to issues@orc.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2021/08/23 02:35:00 UTC

[jira] [Updated] (ORC-960) Create SearchArgument using column ids

     [ https://issues.apache.org/jira/browse/ORC-960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Quanlong Huang updated ORC-960:
-------------------------------
    Affects Version/s: 1.7.0

> Create SearchArgument using column ids
> --------------------------------------
>
>                 Key: ORC-960
>                 URL: https://issues.apache.org/jira/browse/ORC-960
>             Project: ORC
>          Issue Type: New Feature
>          Components: C++
>    Affects Versions: 1.7.0
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Major
>
> Currently, SearchArguments are created using column names, e.g. in orc/sargs/SearchArgument.hh
> {code:cpp}
> virtual SearchArgumentBuilder& lessThan(const std::string& column,
>                                         PredicateDataType type,
>                                         Literal literal) = 0;{code}
> The name string is the leaf field name, which can be duplicated when there are nested types, e.g.
> {code:sql}
> id int
> s1 struct<id:int,name:string>
> s2 struct<id:int,name:string>
> {code}
> There are 3 leaf columns named 'id'. The current code for resolving the column name can only find the first match:
> {code:cpp}
>   // find column id from column name
>   uint64_t SargsApplier::findColumn(const Type& type,
>                                     const std::string& colName) {
>     for (uint64_t i = 0; i != type.getSubtypeCount(); ++i) {
>       if (type.getFieldName(i) == colName) {
>         return type.getSubtype(i)->getColumnId();
>       } else {
>         uint64_t ret = findColumn(*type.getSubtype(i), colName);
>         if (ret != INVALID_COLUMN_ID) {
>           return ret;
>         }
>       }
>     }
>     return INVALID_COLUMN_ID;
>   }
> {code}
> [https://github.com/apache/orc/blob/2dcbd6281e2fbeeaf0ffe46aa3b78cd3df96ed62/c%2B%2B/src/sargs/SargsApplier.cc#L25]
> Since what we actually need is the column id, let's provide interfaces that take column ids directly.
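
To make the ambiguity concrete, here is a minimal, self-contained sketch that reproduces the first-match behavior on the example schema above. It does not use the ORC library: `ToyType`, `assignIds`, `findAllColumns`, and `buildExampleSchema` are hypothetical stand-ins introduced only for illustration, and the sketch assumes column ids are assigned in pre-order with the root struct as id 0.

```cpp
#include <cstdint>
#include <limits>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a schema node; NOT the real orc::Type.
struct ToyType {
  uint64_t columnId = 0;
  // Named children; empty for primitive (leaf) columns.
  std::vector<std::pair<std::string, std::unique_ptr<ToyType>>> fields;
};

const uint64_t INVALID_COLUMN_ID = std::numeric_limits<uint64_t>::max();

// Assign ids in pre-order starting at `next`; returns the next unused id.
uint64_t assignIds(ToyType& type, uint64_t next) {
  type.columnId = next++;
  for (auto& field : type.fields) {
    next = assignIds(*field.second, next);
  }
  return next;
}

// Same first-match search as the findColumn quoted in the issue.
uint64_t findColumn(const ToyType& type, const std::string& colName) {
  for (const auto& field : type.fields) {
    if (field.first == colName) {
      return field.second->columnId;
    }
    uint64_t ret = findColumn(*field.second, colName);
    if (ret != INVALID_COLUMN_ID) {
      return ret;
    }
  }
  return INVALID_COLUMN_ID;
}

// Hypothetical helper: collect every column id whose field name matches.
void findAllColumns(const ToyType& type, const std::string& colName,
                    std::vector<uint64_t>& out) {
  for (const auto& field : type.fields) {
    if (field.first == colName) {
      out.push_back(field.second->columnId);
    }
    findAllColumns(*field.second, colName, out);
  }
}

std::unique_ptr<ToyType> makeNode() {
  return std::unique_ptr<ToyType>(new ToyType());
}

// Builds a struct<id:int,name:string> node (types themselves are not modeled).
std::unique_ptr<ToyType> makeIdNameStruct() {
  std::unique_ptr<ToyType> s = makeNode();
  s->fields.emplace_back("id", makeNode());
  s->fields.emplace_back("name", makeNode());
  return s;
}

// Root struct with fields: id, s1 struct<id,name>, s2 struct<id,name>.
std::unique_ptr<ToyType> buildExampleSchema() {
  std::unique_ptr<ToyType> root = makeNode();
  root->fields.emplace_back("id", makeNode());
  root->fields.emplace_back("s1", makeIdNameStruct());
  root->fields.emplace_back("s2", makeIdNameStruct());
  assignIds(*root, 0);
  return root;
}
```

Under the pre-order assumption, the three 'id' leaves get column ids 1, 3, and 6. `findColumn(*root, "id")` returns only 1, so a predicate intended for `s1.id` or `s2.id` cannot be expressed by name, whereas a caller that already knows the column id can refer to any of the three without ambiguity.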


