Posted to issues-all@impala.apache.org by "Zoltán Borók-Nagy (Jira)" <ji...@apache.org> on 2023/01/02 14:02:00 UTC

[jira] [Commented] (IMPALA-10166) ALTER TABLE for Iceberg tables

    [ https://issues.apache.org/jira/browse/IMPALA-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17653609#comment-17653609 ] 

Zoltán Borók-Nagy commented on IMPALA-10166:
--------------------------------------------

[~davidxdh], I don't think we should add support for such statements. Data files should be added to Iceberg tables via the Iceberg API, not just by placing them in the table directory.

I guess the use case here is to write the data files via some tool and place them into an Iceberg table's partition directory, then invoke RECOVER PARTITIONS to add the new data files to the table.

The problem with such data files is that they might not conform to the Iceberg specification, e.g. they might be missing field ids. Because of this, even our LOAD DATA INPATH statement rewrites the data files (see IMPALA-11339).

With Iceberg's advanced partitioning features, such as partition transforms, it's not trivial for external tools to come up with the proper partition names and values. Also, the table might place its files outside the table location, e.g. see [ObjectStoreLocationProvider|https://github.com/apache/iceberg/blob/adecd8aa6f1e57974f115a2e33b4288eb044a4bd/core/src/main/java/org/apache/iceberg/LocationProviders.java#L106:16]
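
To illustrate the point about partition transforms, here is a minimal Python sketch of three transforms as I understand them from the Iceberg specification (day, month, and integer truncate). The row, the column names, and the partition field names (event_time_day, customer_id_trunc) are hypothetical and used only for illustration; this is not how Impala or Iceberg implements them internally. The bucket transform is left out because it additionally requires a specific hash function.

```python
from datetime import date, datetime, timezone

EPOCH = date(1970, 1, 1)


def day_transform(ts: datetime) -> int:
    """Day transform: whole days since 1970-01-01 for a timezone-aware timestamp."""
    return (ts.astimezone(timezone.utc).date() - EPOCH).days


def month_transform(ts: datetime) -> int:
    """Month transform: whole months since 1970-01 for a timezone-aware timestamp."""
    d = ts.astimezone(timezone.utc).date()
    return (d.year - 1970) * 12 + (d.month - 1)


def truncate_int(value: int, width: int) -> int:
    """Integer truncate transform: round down to a multiple of width."""
    # Python's % is non-negative for a positive width, so negatives are floored too.
    return value - (value % width)


# Hypothetical row, used only for illustration.
row = {
    "event_time": datetime(2023, 1, 2, 14, 2, tzinfo=timezone.utc),
    "customer_id": 1234,
}

# The partition values are derived from the row, not stored in it.
# An external tool would have to reproduce these rules exactly.
partition_values = {
    "event_time_day": day_transform(row["event_time"]),
    "customer_id_trunc": truncate_int(row["customer_id"], 100),
}
print(partition_values)
```

The takeaway is that the partition value is a function of the column value, so a tool that simply writes files into a directory named after the raw column value would not produce a layout that matches the table's partition spec.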

Deleted data files belonging to old snapshots might still remain under the table location. How are we going to differentiate between such files and the new files?
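
The ambiguity can be sketched with plain Python sets. This is a simplified model under stated assumptions: the file names are made up, and the two input sets stand in for a directory listing and for the data files referenced by the table's current snapshot. It does not call any Iceberg or Impala API.

```python
def classify_files(files_in_directory: set, files_in_current_snapshot: set):
    """Split a directory listing into files the current snapshot references
    and files whose status cannot be decided from the listing alone."""
    live = files_in_directory & files_in_current_snapshot
    undecidable = files_in_directory - files_in_current_snapshot
    return live, undecidable


# Hypothetical file names, used only for illustration.
directory_listing = {
    "a.parquet",  # referenced by the current snapshot
    "b.parquet",  # deleted in an earlier commit, still on disk for old snapshots
    "c.parquet",  # newly placed in the directory by an external tool
}
current_snapshot_files = {"a.parquet"}

live, undecidable = classify_files(directory_listing, current_snapshot_files)
print(sorted(live))
print(sorted(undecidable))
```

Both b.parquet and c.parquet end up in the same undecidable set: from the directory listing alone, a recovery statement has no way to tell a leftover file from a genuinely new one.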

All in all, I think the RECOVER PARTITIONS statement is problematic and not really compatible with Iceberg tables.

> ALTER TABLE for Iceberg tables
> ------------------------------
>
>                 Key: IMPALA-10166
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10166
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Frontend
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Sheng Wang
>            Priority: Major
>              Labels: impala-iceberg
>             Fix For: Impala 4.1.0
>
>
> Add support for ALTER TABLE operations for Iceberg tables.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
