Posted to dev@orc.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2021/08/22 09:16:00 UTC
[jira] [Created] (ORC-960) Create SearchArgument using column ids
Quanlong Huang created ORC-960:
----------------------------------
Summary: Create SearchArgument using column ids
Key: ORC-960
URL: https://issues.apache.org/jira/browse/ORC-960
Project: ORC
Issue Type: New Feature
Components: C++
Reporter: Quanlong Huang
Currently, SearchArguments are created using column names, e.g. in orc/sargs/SearchArgument.hh
{code:cpp}
virtual SearchArgumentBuilder& lessThan(const std::string& column,
                                        PredicateDataType type,
                                        Literal literal) = 0;
{code}
The name string is the leaf field name, which can be duplicated when there are nested types, e.g.
{code:sql}
id int
s1 struct<id:int,name:string>
s2 struct<id:int,name:string>
{code}
There are 3 leaf columns named 'id'. The current code that resolves a column name can only find the first match:
{code:cpp}
// find column id from column name
uint64_t SargsApplier::findColumn(const Type& type,
                                  const std::string& colName) {
  for (uint64_t i = 0; i != type.getSubtypeCount(); ++i) {
    if (type.getFieldName(i) == colName) {
      return type.getSubtype(i)->getColumnId();
    } else {
      uint64_t ret = findColumn(*type.getSubtype(i), colName);
      if (ret != INVALID_COLUMN_ID) {
        return ret;
      }
    }
  }
  return INVALID_COLUMN_ID;
}
{code}
[https://github.com/apache/orc/blob/2dcbd6281e2fbeeaf0ffe46aa3b78cd3df96ed62/c%2B%2B/src/sargs/SargsApplier.cc#L25]
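To make the first-match behaviour concrete, here is a minimal, self-contained sketch that replays the same lookup on the example schema. MockType, assignIds and buildExampleSchema are hypothetical stand-ins rather than ORC classes, the value chosen for INVALID_COLUMN_ID is an assumption, and the sketch assumes column ids are assigned in pre-order with the root struct as column 0.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for orc::Type: a column id plus named child fields.
struct MockType {
  uint64_t columnId = 0;
  std::vector<std::pair<std::string, std::unique_ptr<MockType>>> fields;

  // Appends a child field and returns a reference to it.
  MockType& addField(const std::string& name) {
    fields.emplace_back(name, std::make_unique<MockType>());
    return *fields.back().second;
  }
};

// Assumed sentinel value for "no such column".
const uint64_t INVALID_COLUMN_ID = std::numeric_limits<uint64_t>::max();

// Assigns column ids in pre-order (assumption: root is 0, then depth-first).
// Returns the next unused id.
uint64_t assignIds(MockType& type, uint64_t next = 0) {
  type.columnId = next++;
  for (auto& field : type.fields) {
    next = assignIds(*field.second, next);
  }
  return next;
}

// Same control flow as the quoted findColumn: return the first field whose
// name matches, searching depth-first.
uint64_t findColumn(const MockType& type, const std::string& colName) {
  for (const auto& field : type.fields) {
    if (field.first == colName) {
      return field.second->columnId;
    }
    uint64_t ret = findColumn(*field.second, colName);
    if (ret != INVALID_COLUMN_ID) {
      return ret;
    }
  }
  return INVALID_COLUMN_ID;
}

// Builds struct<id:int, s1:struct<id,name>, s2:struct<id,name>>.
MockType buildExampleSchema() {
  MockType root;
  root.addField("id");
  MockType& s1 = root.addField("s1");
  s1.addField("id");
  s1.addField("name");
  MockType& s2 = root.addField("s2");
  s2.addField("id");
  s2.addField("name");
  assignIds(root);
  return root;
}
```

Under these assumptions, looking up "id" always resolves to the top-level column, so s1.id and s2.id cannot be addressed by name at all.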
Since what we actually need is the column id, let's provide interfaces that take column ids directly.
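As a rough illustration of what such an interface could look like, the sketch below adds an id-based overload next to the name-based one. Everything here is hypothetical: SketchSearchArgumentBuilder, PredicateLeaf, and the simplified PredicateDataType and Literal are stand-ins written for this example, not the real ORC declarations, and the eventual signature may well differ.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Simplified, hypothetical stand-ins for the ORC types of the same name.
enum class PredicateDataType { LONG, STRING };

struct Literal {
  int64_t value;
};

// One recorded predicate; it is keyed either by column name or by column id.
struct PredicateLeaf {
  bool hasColumnId;
  std::string columnName;
  uint64_t columnId;
  PredicateDataType type;
  Literal literal;
};

// Hypothetical builder that only records leaves, to show the two overloads.
class SketchSearchArgumentBuilder {
 public:
  // Existing style: the column is identified by its (possibly ambiguous) name.
  SketchSearchArgumentBuilder& lessThan(const std::string& column,
                                        PredicateDataType type,
                                        Literal literal) {
    leaves_.push_back({false, column, 0, type, literal});
    return *this;
  }

  // Proposed style (assumed shape): the caller passes the column id directly,
  // so no name resolution is needed.
  SketchSearchArgumentBuilder& lessThan(uint64_t columnId,
                                        PredicateDataType type,
                                        Literal literal) {
    leaves_.push_back({true, "", columnId, type, literal});
    return *this;
  }

  const std::vector<PredicateLeaf>& leaves() const { return leaves_; }

 private:
  std::vector<PredicateLeaf> leaves_;
};
```

With an overload of this kind, a caller that already knows the id of s2.id could pass it as a uint64_t and bypass findColumn entirely, while existing name-based callers keep working.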
--
This message was sent by Atlassian Jira
(v8.3.4#803005)