Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/01/21 08:58:00 UTC

[jira] [Commented] (IMPALA-10432) INSERT INTO Iceberg tables with partition transforms

    [ https://issues.apache.org/jira/browse/IMPALA-10432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17269137#comment-17269137 ] 

ASF subversion and git services commented on IMPALA-10432:
----------------------------------------------------------

Commit 90f3b2f4918e6a91e369a97ce47816c9406ac71c in impala's branch refs/heads/master from Zoltan Borok-Nagy
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=90f3b2f ]

IMPALA-10432: INSERT INTO Iceberg tables with partition transforms

Add support for INSERT INTO Iceberg tables that use partition transforms.
Partition transforms are functions that calculate partition data from row
data.

Iceberg defines the following partition transforms:
https://iceberg.apache.org/spec/#partition-transforms

 * IDENTITY
 * BUCKET
 * TRUNCATE
 * YEAR
 * MONTH
 * DAY
 * HOUR

INSERT INTO for identity-partitioned Iceberg tables is already supported.
This patch adds support for the rest of the transforms.
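
As a rough illustration of what the transforms compute, the sketch below
follows the definitions in the Iceberg spec linked above for IDENTITY,
TRUNCATE and the temporal transforms. It is plain Python, not Impala code;
the function names are made up for this example, and BUCKET is left out
because it depends on a 32-bit Murmur3 hash.

```python
from datetime import date, datetime

# Temporal transforms count whole units elapsed since the Unix epoch.
EPOCH_DATE = date(1970, 1, 1)
EPOCH_TS = datetime(1970, 1, 1)


def identity(value):
    # IDENTITY: the partition value is the source value itself.
    return value


def truncate_int(value: int, width: int) -> int:
    # TRUNCATE for integers: round down to a multiple of width.
    # Python's % is a floor modulo for positive width, so negatives round down.
    return value - (value % width)


def truncate_str(value: str, length: int) -> str:
    # TRUNCATE for strings: keep at most `length` characters.
    return value[:length]


def year(d: date) -> int:
    # YEAR: years since 1970.
    return d.year - EPOCH_DATE.year


def month(d: date) -> int:
    # MONTH: months since 1970-01.
    return (d.year - EPOCH_DATE.year) * 12 + (d.month - 1)


def day(d: date) -> int:
    # DAY: days since 1970-01-01.
    return (d - EPOCH_DATE).days


def hour(ts: datetime) -> int:
    # HOUR: hours since 1970-01-01 00:00:00, rounded down.
    delta = ts - EPOCH_TS
    return delta.days * 24 + delta.seconds // 3600
```

For example, year(date(2021, 1, 21)) gives 51 and truncate_int(-1, 10)
gives -10.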

We create the partitioning expressions in InsertStmt. Based on these
expressions, the backend executors automatically shuffle and sort the
data before rows are given to the table sink operators. The table sink
operator writes the partitions one by one and creates a human-readable
partition path for each of them.
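
The sort-then-write flow can be pictured with the minimal Python sketch
below. It is not the Impala implementation: it runs in a single process
(so the shuffle between executors is not modeled), the helper names are
hypothetical, and the "name=value" path layout is an assumption made for
illustration.

```python
from itertools import groupby


def partition_path(names, values):
    # Build a path such as "id_trunc=10" from field names and partition values.
    # The "name=value" layout is an assumption of this sketch.
    return "/".join(f"{n}={v}" for n, v in zip(names, values))


def write_partitions(rows, names, key_fn):
    """Sort rows by partition key, then emit one (path, rows) group per partition."""
    written = []
    # After sorting, all rows of a partition are adjacent, so a sink only
    # needs to keep one partition open at a time.
    ordered = sorted(rows, key=key_fn)
    for key, group in groupby(ordered, key=key_fn):
        written.append((partition_path(names, key), list(group)))
    return written
```

Calling write_partitions([17, 3, 25, 12], ("id_trunc",), lambda r: (r - r % 10,))
groups the rows into three partitions: id_trunc=0, id_trunc=10 and
id_trunc=20.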

Finally, we convert the partition paths to partition data and create
Iceberg DataFiles with information about the files written.
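
A minimal sketch of turning a partition path back into typed partition
values is shown below. The "name=value" path layout, the field names and
both helper functions are assumptions made for this example; they are
not taken from the patch.

```python
def parse_partition_path(path):
    """Split 'a=1/b=2' into an ordered list of (field, raw string value)."""
    parts = []
    for segment in path.split("/"):
        # partition() splits on the first '=' only, so values may contain '='.
        name, _, raw = segment.partition("=")
        parts.append((name, raw))
    return parts


def to_partition_data(path, converters):
    # converters maps a field name to a function that turns the raw string
    # into a typed value; fields without a converter stay strings.
    return {
        name: converters.get(name, str)(raw)
        for name, raw in parse_partition_path(path)
    }
```

With these helpers, to_partition_data("id_trunc=10/name_trunc=ice", {"id_trunc": int})
yields an integer 10 for id_trunc and the string "ice" for name_trunc.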

Testing:
 * added planner test
 * added e2e tests

Change-Id: I3edf02048cea78703837b248c55219c22d512b78
Reviewed-on: http://gerrit.cloudera.org:8080/16939
Reviewed-by: wangsheng <sk...@163.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> INSERT INTO Iceberg tables with partition transforms
> ----------------------------------------------------
>
>                 Key: IMPALA-10432
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10432
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: Zoltán Borók-Nagy
>            Priority: Major
>              Labels: impala-iceberg
>
> INSERT INTO Iceberg tables that use partition transforms. Partition transforms are functions that calculate partition data from row data.
> There are the following partition transforms in Iceberg:
> [https://iceberg.apache.org/spec/#partition-transforms]
>  * IDENTITY
>  * BUCKET
>  * TRUNCATE
>  * YEAR
>  * MONTH
>  * DAY
>  * HOUR
> INSERT INTO for identity-partitioned Iceberg tables is already supported.
> We need to add support for the rest.


