Posted to issues-all@impala.apache.org by "Zoltán Borók-Nagy (Jira)" <ji...@apache.org> on 2022/07/13 16:00:00 UTC

[jira] [Updated] (IMPALA-11428) Honour Iceberg sort orders when writing a table

     [ https://issues.apache.org/jira/browse/IMPALA-11428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zoltán Borók-Nagy updated IMPALA-11428:
---------------------------------------
    Description: 
The Iceberg specification defines {{sort orders}}. We should honour them when writing to an Iceberg table through Impala.

See: [https://iceberg.apache.org/spec/#sort-orders]

We can ignore Z-order for now.

Currently we only add Iceberg partition expressions during INSERTs: [https://github.com/apache/impala/blob/26438d8e3e2cecfdab82643fcee7553df50198ca/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java#L866]

Impala's SORT BY clause always sorts the files in ascending order, whereas Iceberg allows finer control, e.g. a sort direction and a null order per sort field.
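
To illustrate the gap: in the Iceberg spec each sort field carries a direction (asc/desc) and a null order (nulls-first/nulls-last), in addition to a source column and a transform. The sketch below is not Impala or Iceberg code. It is a minimal, hypothetical comparator showing what honouring those two attributes would mean for row ordering. The names SortField, compare_rows and sort_rows are made up for illustration, columns are referenced by name rather than by Iceberg's source-id, only the identity transform is assumed, and null placement is assumed to be independent of the sort direction.

```python
from dataclasses import dataclass
from functools import cmp_to_key
from typing import Any, Dict, List

Row = Dict[str, Any]


@dataclass(frozen=True)
class SortField:
    """Hypothetical stand-in for one field of an Iceberg sort order."""
    column: str                      # assumption: column name instead of source-id
    direction: str = "asc"           # "asc" or "desc"
    null_order: str = "nulls-first"  # "nulls-first" or "nulls-last"


def compare_rows(a: Row, b: Row, fields: List[SortField]) -> int:
    """Compare two rows field by field; return -1, 0 or 1."""
    for f in fields:
        x, y = a.get(f.column), b.get(f.column)
        if x is None and y is None:
            continue
        if x is None or y is None:
            # Null placement follows null_order only, not direction.
            nulls_first = f.null_order == "nulls-first"
            if x is None:
                return -1 if nulls_first else 1
            return 1 if nulls_first else -1
        if x == y:
            continue
        c = -1 if x < y else 1
        # Flip the result for descending fields.
        return c if f.direction == "asc" else -c
    return 0


def sort_rows(rows: List[Row], fields: List[SortField]) -> List[Row]:
    """Return a new list of rows ordered by the given sort fields."""
    return sorted(rows, key=cmp_to_key(lambda a, b: compare_rows(a, b, fields)))


rows = [{"id": 2}, {"id": None}, {"id": 1}, {"id": 3}]
order = [SortField("id", direction="desc", null_order="nulls-last")]
print([r["id"] for r in sort_rows(rows, order)])  # [3, 2, 1, None]
```

An ascending-only sort, as described above for SORT BY, corresponds to every field using direction="asc" here. A "desc" direction or an explicit null order is what the writer would additionally have to respect.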

  was:
The Iceberg specification defines {{sort orders}}. We should honour them when writing to an Iceberg table through Impala.

See: [https://iceberg.apache.org/spec/#sort-orders]

Currently we only add Iceberg partition expressions during INSERTs: [https://github.com/apache/impala/blob/26438d8e3e2cecfdab82643fcee7553df50198ca/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java#L866]

There might be some discrepancy between Impala's and Iceberg's sort orders. E.g. Impala always sorts the files in ascending order, while Iceberg lets users specify the direction. It also lets users specify NULLS FIRST vs. NULLS LAST.

Z-order is not currently part of the Iceberg spec, but it will be in the future. There will be differences there as well.


> Honour Iceberg sort orders when writing a table
> -----------------------------------------------
>
>                 Key: IMPALA-11428
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11428
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>            Reporter: Zoltán Borók-Nagy
>            Priority: Major
>              Labels: impala-iceberg
>
> The Iceberg specification defines {{sort orders}}. We should honour them when writing to an Iceberg table through Impala.
> See: [https://iceberg.apache.org/spec/#sort-orders]
> We can ignore Z-order for now.
> Currently we only add Iceberg partition expressions during INSERTs: [https://github.com/apache/impala/blob/26438d8e3e2cecfdab82643fcee7553df50198ca/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java#L866]
> Impala's SORT BY clause always sorts the files in ascending order, whereas Iceberg allows finer control, e.g. a sort direction and a null order per sort field.


