Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2019/07/26 07:57:05 UTC

[GitHub] [incubator-iceberg] aokolnychyi opened a new issue #317: Extend Iceberg metadata with SortOrder

aokolnychyi opened a new issue #317: Extend Iceberg metadata with SortOrder
URL: https://github.com/apache/incubator-iceberg/issues/317
 
 
   Here is a short summary of the [discussion](https://lists.apache.org/thread.html/7692ce3c1714f1733fa11aa632da2f8c0696b4ae64d04c6563f17e50@%3Cdev.iceberg.apache.org%3E) on the dev list:
   - Iceberg should allow users to define a sort order in its metadata that applies to partitions.
   - We should never assume the sort order is actually applied to all files in the table, as that would require rewriting data immediately when we change the sort order.
   - Sort orders might evolve and change over time. When this happens, existing files will not be rewritten. Query engines should follow the updated sort order during subsequent writes. As a result, files within a table or partition can be sorted differently at a given point in time.
   - We should be able to define a sort order even for unpartitioned tables, as opposed to current Spark tables that allow a sort order only for bucketed tables.
   - `SortOrder` should be separate from `PartitionSpec`.
   - `SortOrder` will rely on transformations to define complex sort orders.
   - Files will be annotated with `sort_order_id` instead of `sort_columns`. We keep the question of `file_ordinal` open for now.
   - To begin with, we will support only ascending and descending natural sort orders (UTF-8 ordering for strings).
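
The points above can be sketched as a small data model. This is only an illustration of the idea, not Iceberg's API: every name here (`SortField`, `TableMetadata`, `replace_sort_order`, `append_file`, the `"identity"` transform string) is hypothetical, and `sort_rows` handles only an identity transform. It shows a `SortOrder` that lives apart from any partition spec, files that carry a `sort_order_id`, and an order change that leaves existing files untouched.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class SortField:
    column: str
    transform: str = "identity"  # hypothetical transform name
    ascending: bool = True


@dataclass(frozen=True)
class SortOrder:
    order_id: int
    fields: Tuple[SortField, ...]


@dataclass
class DataFile:
    path: str
    sort_order_id: Optional[int]  # None: written before any order was declared


@dataclass
class TableMetadata:
    # Sort orders are tracked on their own, independent of partitioning.
    sort_orders: Dict[int, SortOrder] = field(default_factory=dict)
    default_sort_order_id: Optional[int] = None
    files: List[DataFile] = field(default_factory=list)

    def replace_sort_order(self, fields):
        """Register a new order and make it the default for later writes."""
        new_id = max(self.sort_orders, default=0) + 1
        self.sort_orders[new_id] = SortOrder(new_id, tuple(fields))
        self.default_sort_order_id = new_id
        return self.sort_orders[new_id]

    def append_file(self, path):
        """Annotate a newly written file with the current default order id."""
        data_file = DataFile(path, self.default_sort_order_id)
        self.files.append(data_file)
        return data_file


def natural_key(value):
    # Strings compare by their UTF-8 bytes; other values use their own ordering.
    return value.encode("utf-8") if isinstance(value, str) else value


def sort_rows(rows, order):
    """Sort rows by an order; stable sorts run from the last field to the first."""
    result = list(rows)
    for f in reversed(order.fields):
        if f.transform != "identity":
            raise NotImplementedError(f.transform)
        result.sort(key=lambda r, c=f.column: natural_key(r[c]),
                    reverse=not f.ascending)
    return result


meta = TableMetadata()
meta.replace_sort_order([SortField("ts")])
meta.append_file("a.parquet")  # annotated with order id 1
meta.replace_sort_order([SortField("name", ascending=False)])
meta.append_file("b.parquet")  # annotated with order id 2; a.parquet is not rewritten
```

Because each file records only an id, a reader can look up how that particular file was sorted, and it cannot assume that the table's current default order holds for every file.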
