Posted to dev@iceberg.apache.org by Bhavyam Kamal <bh...@dremio.com> on 2021/07/21 11:49:48 UTC
Proposal: Z-Ordering in Iceberg
Hi Everyone,
I would like to discuss and get feedback on the following proposal for
Z-Ordering in the Iceberg Sync today:
https://docs.google.com/document/d/1UfGxaB7qlrGzzMk9pBm03oKPOkm-jk-NQVQQvHP-0Bc/edit?usp=sharing
Please let me know if you have any thoughts or suggestions by adding
comments in the doc.
Thanks and regards,
Bhavyam
Re: Proposal: Z-Ordering in Iceberg
Posted by Russell Spitzer <ru...@gmail.com>.
Yep! We discussed this yesterday.
The general plan going forward will be
Phase 1:
Merge Sort based compaction
Allow compaction/rewrite of data files using a space filling curve based sort. No planning or persisting of metrics.
Phase 2:
Support for Transforms with multiple arguments and possible parameterization
Store metrics for curve values in the data file metrics, along with the transform used when writing the file
Query planning using these metrics.
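To make the Phase 1 idea concrete, here is a minimal sketch of a Z-order (Morton) sort key, assuming non-negative integer columns. The names z_value and sort_rows_by_z are hypothetical and are not part of any Iceberg API; the actual compaction would also need to map other column types onto comparable integers first.

```python
def z_value(coords, bits=16):
    """Interleave the low `bits` bits of each non-negative integer coordinate.

    Hypothetical helper for illustration only, not an Iceberg function.
    """
    z = 0
    n = len(coords)
    for bit in range(bits):
        for dim, c in enumerate(coords):
            # Bit `bit` of dimension `dim` lands at interleaved position bit * n + dim.
            z |= ((c >> bit) & 1) << (bit * n + dim)
    return z


def sort_rows_by_z(rows, columns, bits=16):
    """Return rows (dicts) ordered by the Z-value of the given columns."""
    return sorted(rows, key=lambda r: z_value([r[c] for c in columns], bits))
```

Sorting rows by this key before writing files keeps rows that are close in both x and y close together on disk, which is what later makes per-file min/max metrics useful for more than one column.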
In my mind the final picture looks like
DataFileMetrics { zMax = ?, zMin = ?, sortOrder = 1 }
Table Metadata {
SortOrder 1 = "HilbertCurve(x, y, z) + Options { }"
SortOrder 2 = "ZOrder(x,y) + Options(y using 128 bytes)"
}
Or something like that. This way, for any given data file, we can generate filters based on the ordering function used for that particular file, and we can update our definitions of the functions over time.
I think the main spec change here is figuring out how to store these transforms with more information (and multiple arguments).
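As a rough sketch of how the Phase 2 metrics could drive planning: the code below assumes each data file records zMin, zMax and a sort order id, and that table metadata maps ids to an ordering function. Every name here (DataFileMetrics fields, SORT_ORDERS, may_contain, z_value) is hypothetical, and the layout is only an illustration, not a proposed spec. It relies on the fact that bit interleaving is monotone in each coordinate, so every point inside a query box [lo, hi] has a Z-value between z(lo) and z(hi).

```python
from dataclasses import dataclass


def z_value(coords, bits=16):
    """Interleave the low `bits` bits of each non-negative integer coordinate."""
    z = 0
    n = len(coords)
    for bit in range(bits):
        for dim, c in enumerate(coords):
            z |= ((c >> bit) & 1) << (bit * n + dim)
    return z


@dataclass
class DataFileMetrics:
    # Hypothetical per-file metrics, mirroring { zMax, zMin, sortOrder } above.
    path: str
    z_min: int
    z_max: int
    sort_order_id: int


# Hypothetical table metadata: sort order id -> (function name, columns).
SORT_ORDERS = {
    1: ("hilbert", ("x", "y", "z")),
    2: ("zorder", ("x", "y")),
}


def may_contain(f, lo, hi, bits=16):
    """Return False only if the file provably holds no row in the box [lo, hi]."""
    name, _columns = SORT_ORDERS[f.sort_order_id]
    if name != "zorder":
        # This sketch only knows how to build a filter for Z-order files.
        return True
    # Any point in the box has a Z-value within [z(lo), z(hi)].
    q_min, q_max = z_value(lo, bits), z_value(hi, bits)
    return f.z_max >= q_min and f.z_min <= q_max
```

The check is conservative: a file whose Z-range overlaps the query range may still contain no matching rows, but a file that is skipped is guaranteed not to. Files written with an ordering the planner does not understand are simply kept.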
> On Jul 22, 2021, at 8:37 AM, Piotr Findeisen <pi...@starburstdata.com> wrote:
>
> Hi Bhavyam,
>
> Has this been discussed on the sync?
> Ryan, will it be making it into the table metadata spec?
>
> Best,
> PF
>
> On Wed, Jul 21, 2021 at 1:50 PM Bhavyam Kamal <bhavyam.kamal@dremio.com> wrote:
> Hi Everyone,
>
> I would like to discuss and get feedback on the following proposal for Z-Ordering in the Iceberg Sync today:
>
> https://docs.google.com/document/d/1UfGxaB7qlrGzzMk9pBm03oKPOkm-jk-NQVQQvHP-0Bc/edit?usp=sharing
>
> Please let me know if you have any thoughts or suggestions by adding comments in the doc.
>
> Thanks and regards,
> Bhavyam
>
Re: Proposal: Z-Ordering in Iceberg
Posted by Piotr Findeisen <pi...@starburstdata.com>.
Hi Bhavyam,
Has this been discussed on the sync?
Ryan, will it be making it into the table metadata spec?
Best,
PF
On Wed, Jul 21, 2021 at 1:50 PM Bhavyam Kamal <bh...@dremio.com>
wrote:
> Hi Everyone,
>
> I would like to discuss and get feedback on the following proposal for
> Z-Ordering in the Iceberg Sync today:
>
>
> https://docs.google.com/document/d/1UfGxaB7qlrGzzMk9pBm03oKPOkm-jk-NQVQQvHP-0Bc/edit?usp=sharing
>
> Please let me know if you have any thoughts or suggestions by adding
> comments in the doc.
>
> Thanks and regards,
> Bhavyam
>
>