Posted to issues@impala.apache.org by "Zoltán Borók-Nagy (Jira)" <ji...@apache.org> on 2023/12/07 15:00:00 UTC

[jira] [Created] (IMPALA-12605) ALTER TABLE SET PARTITION SPEC reuses field ids of old partition specs

Zoltán Borók-Nagy created IMPALA-12605:
------------------------------------------

             Summary: ALTER TABLE SET PARTITION SPEC reuses field ids of old partition specs
                 Key: IMPALA-12605
                 URL: https://issues.apache.org/jira/browse/IMPALA-12605
             Project: IMPALA
          Issue Type: Bug
            Reporter: Zoltán Borók-Nagy


Impala's ALTER TABLE SET PARTITION SPEC reuses field ids of old partition specs.

This can cause partition field id collisions: different partition fields in different specs end up sharing the same field id.

Repro:

{noformat}
CREATE TABLE ice_t (i int, p int) PARTITIONED BY SPEC (TRUNCATE(10, p)) STORED BY ICEBERG;

ALTER TABLE ice_t SET PARTITION SPEC (TRUNCATE(100, p));
{noformat}

The ALTER TABLE statement creates a new partition spec for the table, but its partition field gets the same field id as the field in the old partition spec.

A workaround is to use the VOID transform:
{noformat}
ALTER TABLE ice_t SET PARTITION SPEC (VOID(p), TRUNCATE(100, p));
{noformat}
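
To make the collision and the workaround concrete, here is a toy Python model. It is not Impala or Iceberg code. It assumes that a spec built from scratch gets its field ids by position, starting at 1000, and the names build_spec_from_scratch and colliding_ids are hypothetical helpers invented for this sketch.

```python
# Toy model only, not Impala or Iceberg code.
# Assumption: a spec built from scratch numbers its fields by position,
# starting at 1000.
PARTITION_ID_START = 1000


def build_spec_from_scratch(transforms):
    """Return {field_id: transform}, with ids assigned by position."""
    return {PARTITION_ID_START + pos: t for pos, t in enumerate(transforms)}


def colliding_ids(*specs):
    """Return ids that are bound to more than one non-void transform."""
    seen = {}
    for spec in specs:
        for field_id, transform in spec.items():
            # A void field only marks a dropped field, so it is ignored here.
            if not transform.startswith("void"):
                seen.setdefault(field_id, set()).add(transform)
    return sorted(fid for fid, ts in seen.items() if len(ts) > 1)


old_spec = build_spec_from_scratch(["truncate[10](p)"])
new_spec = build_spec_from_scratch(["truncate[100](p)"])
workaround_spec = build_spec_from_scratch(["void(p)", "truncate[100](p)"])

print(colliding_ids(old_spec, new_spec))         # [1000]
print(colliding_ids(old_spec, workaround_spec))  # []
```

In this model the VOID(p) entry occupies the first position, so TRUNCATE(100, p) is pushed to the next id and no longer shares an id with the old TRUNCATE(10, p) field.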

But Impala should automatically assign new partition field ids in the new spec. This is especially important for Iceberg V2 tables, where last-partition-id is a required field in the metadata. The Iceberg library should handle partition evolution correctly, so it seems we are using the wrong APIs for partition evolution.
For reference, Hive has the same ALTER TABLE SET PARTITION SPEC syntax, but it creates the new partition spec correctly.
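
The expected behavior can be sketched as a toy Python model of id assignment that takes the spec history and last-partition-id into account. This is an illustration under stated assumptions, not the Iceberg implementation: evolve_spec is a hypothetical helper, and the rule that an identical field seen in an older spec keeps its old id is an assumption of this sketch. (Iceberg's Java API does expose Table.updateSpec() for spec evolution; whether that is the call Impala should switch to is not stated in this report.)

```python
# Toy model only: evolution-aware id assignment.
# Assumptions: new fields get last_partition_id + 1, and a field that is
# identical to one in an older spec keeps the id it had there.


def evolve_spec(history, last_partition_id, transforms):
    """Return (new_spec, new_last_partition_id) for the requested transforms."""
    known = {t: fid for spec in history for fid, t in spec.items()}
    new_spec = {}
    for transform in transforms:
        if transform in known:
            field_id = known[transform]   # same field as before: keep its id
        else:
            last_partition_id += 1        # new field: allocate a fresh id
            field_id = last_partition_id
        new_spec[field_id] = transform
    return new_spec, last_partition_id


history = [{1000: "truncate[10](p)"}]
new_spec, last_id = evolve_spec(history, 1000, ["truncate[100](p)"])
print(new_spec, last_id)  # {1001: 'truncate[100](p)'} 1001
```

Because ids are allocated past last-partition-id rather than restarted at 1000, the new TRUNCATE(100, p) field cannot collide with the old TRUNCATE(10, p) field in this model.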
