Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2022/06/03 00:14:00 UTC

[jira] [Commented] (IMPALA-9410) Support resolving ORC file columns by names

    [ https://issues.apache.org/jira/browse/IMPALA-9410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545697#comment-17545697 ] 

ASF subversion and git services commented on IMPALA-9410:
---------------------------------------------------------

Commit decb46aa0dee43b735ca8b193bd1728dc42d702c in impala's branch refs/heads/master from Gergely Fürnstáhl
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=decb46aa0 ]

IMPALA-9410: Support resolving ORC file columns by names

Added a query option, and its implementation, to resolve ORC file
columns by name.

Changed the secondary resolution strategy for Iceberg ORC tables to
name-based resolution.

Testing:

Added a new test dimension for the ORC tests and added results to the
now-working Iceberg migrated table test.

Change-Id: I29562a059160c19eb58ccea76aa959d2e408f8de
Reviewed-on: http://gerrit.cloudera.org:8080/18397
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Support resolving ORC file columns by names
> -------------------------------------------
>
>                 Key: IMPALA-9410
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9410
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Backend
>            Reporter: Quanlong Huang
>            Assignee: Gergely Fürnstáhl
>            Priority: Major
>              Labels: orc
>
> Currently we resolve ORC file columns by indices. We should provide a query option, like PARQUET_FALLBACK_SCHEMA_RESOLUTION for Parquet (IMPALA-2835), to resolve ORC file columns by names.
> Note that Hive only writes column names to ORC files after Hive-2.x (HIVE-4243). For older versions of Hive, the column names in ORC files are something like _col0, _col1, ..., _col99. So this feature is only required when deployed with Hive 2+.
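
The difference between the two strategies described above can be illustrated with a small standalone sketch. This is not Impala's implementation: the function names, the sample schemas, and the case-insensitive name matching are all assumptions made for illustration only.

```python
from typing import Dict, List, Optional


def resolve_by_position(table_cols: List[str],
                        file_cols: List[str]) -> Dict[str, Optional[int]]:
    """Map each table column to the file column at the same ordinal position.

    Hypothetical helper: columns beyond the end of the file schema map to None.
    """
    return {name: (i if i < len(file_cols) else None)
            for i, name in enumerate(table_cols)}


def resolve_by_name(table_cols: List[str],
                    file_cols: List[str]) -> Dict[str, Optional[int]]:
    """Map each table column to the file column that has the same name.

    Hypothetical helper: matching is case-insensitive here (an assumption),
    and table columns missing from the file map to None.
    """
    index: Dict[str, int] = {}
    for i, name in enumerate(file_cols):
        index.setdefault(name.lower(), i)  # keep the first occurrence of a name
    return {name: index.get(name.lower()) for name in table_cols}


# Table schema versus a file written with a different column order.
table_cols = ["id", "name", "price"]
file_cols = ["name", "id"]

by_position = resolve_by_position(table_cols, file_cols)
by_name = resolve_by_name(table_cols, file_cols)

# A file from an older Hive writer carries only placeholder names,
# so name-based resolution finds no matching columns.
legacy = resolve_by_name(table_cols, ["_col0", "_col1", "_col2"])
```

In this sketch, position-based resolution silently pairs `id` with the file's `name` column, while name-based resolution pairs the columns correctly. The `legacy` case shows why name-based resolution is only useful for files that actually store column names.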



--
This message was sent by Atlassian Jira
(v8.20.7#820007)