Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/11/08 10:12:00 UTC

[jira] [Commented] (IMPALA-10974) Impala cannot resolve columns of converted Iceberg table

    [ https://issues.apache.org/jira/browse/IMPALA-10974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440347#comment-17440347 ] 

ASF subversion and git services commented on IMPALA-10974:
----------------------------------------------------------

Commit b02c003138388cb2546938682c53dbda19118fb8 in impala's branch refs/heads/master from Zoltan Borok-Nagy
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b02c003 ]

IMPALA-10974: Impala cannot resolve columns of converted Iceberg table

When a regular Parquet/ORC table is converted to Iceberg via Hive,
only the Iceberg metadata files need to be created. The data files
can stay in place.

This causes problems when the data files don't have field ids for
the schema elements. Currently Impala resolves columns in data
files based on Iceberg field ids, but since they are missing,
Impala raises an error or returns NULLs.

With this patch Impala falls back to the default column resolution
strategy when the data files lack field ids.
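
The fallback described above can be sketched roughly as follows. This is
an illustrative Python model, not Impala's actual backend code: the names
FileColumn, TableColumn, and resolve_columns are hypothetical, a file is
treated as "lacking field ids" when none of its columns carries one, and
position-based matching stands in for the default resolution strategy.
All three are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TableColumn:
    """A column in the Iceberg table schema (hypothetical model)."""
    name: str
    field_id: int


@dataclass
class FileColumn:
    """A column in a data file; field_id is None if the writer did not set one."""
    name: str
    field_id: Optional[int] = None


def resolve_columns(
    table_cols: List[TableColumn], file_cols: List[FileColumn]
) -> Dict[str, Optional[int]]:
    """Map each table column name to the index of the matching file column.

    A value of None means the column was not found in the file.
    """
    # Assumption: the file "has field ids" if at least one column carries one.
    has_field_ids = any(c.field_id is not None for c in file_cols)

    if has_field_ids:
        # Iceberg-style resolution: match on field id, ignoring position and name.
        index_by_id = {
            c.field_id: i for i, c in enumerate(file_cols) if c.field_id is not None
        }
        return {t.name: index_by_id.get(t.field_id) for t in table_cols}

    # Fallback for converted tables whose files carry no field ids.
    # Assumption: the default strategy is modeled as matching by position.
    return {
        t.name: (i if i < len(file_cols) else None)
        for i, t in enumerate(table_cols)
    }
```

Without the fallback branch, every lookup against a file that has no
field ids would miss, which corresponds to the error or NULL results
described above.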

Testing:
 * added e2e tests both for Parquet and ORC

Change-Id: I85881b09891c7bd101e7a96e92561b70bbe5af41
Reviewed-on: http://gerrit.cloudera.org:8080/17953
Reviewed-by: Csaba Ringhofer <cs...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Impala cannot resolve columns of converted Iceberg table
> --------------------------------------------------------
>
>                 Key: IMPALA-10974
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10974
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Zoltán Borók-Nagy
>            Priority: Major
>              Labels: impala-iceberg
>             Fix For: Impala 4.1.0
>
>
> When a regular Parquet/ORC table is converted to Iceberg via Hive, only the Iceberg metadata files need to be created. The data files can stay in place.
> This causes problems when the data files don't have field ids for the schema elements. Currently Impala resolves columns in data files based on Iceberg field ids, but since they are missing, Impala raises an error or returns NULLs.
> We could fall back to the default column resolution strategy when the data files lack field ids.


