Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2019/07/13 19:27:56 UTC

[GitHub] [incubator-iceberg] rdblue opened a new issue #280: Add persistent IDs to partition fields

URL: https://github.com/apache/incubator-iceberg/issues/280
 
 
   Partition fields are assigned IDs when they are stored in manifest files. The assignment happens in [`PartitionSpec#partitionType()`](https://github.com/apache/incubator-iceberg/blob/master/api/src/main/java/org/apache/iceberg/PartitionSpec.java#L104-L117), which numbers the fields of each spec sequentially, starting at 1000.
   
   Because the numbering restarts for every spec, this scheme reuses IDs across partition specs. A manifest file is written for a single partition spec, so the reuse is harmless within any one manifest, even when a table has multiple specs. It does cause problems in the `entries` and `files` metadata tables: the data file partition struct can have a different schema from one manifest file to the next, yet its fields carry the same IDs.
   
   For example, if part of a table is partitioned by `days(ts)` and another part is partitioned by `hours(ts)`, both of these will show up in the `entries` table's `partition` struct with ID 1000.
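
The collision can be reproduced with a minimal sketch. This is not the code in `PartitionSpec#partitionType()`; the class `PerSpecIdAssignment` and the method `assignIds` are hypothetical names, and the only behavior taken from the description above is that every spec numbers its fields from 1000.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of the Iceberg API.
class PerSpecIdAssignment {
  // Starting ID named in the issue text.
  static final int FIRST_PARTITION_FIELD_ID = 1000;

  // Assumed stand-in for per-spec numbering: every call restarts at 1000.
  static List<Integer> assignIds(List<String> partitionFields) {
    List<Integer> ids = new ArrayList<>();
    for (int i = 0; i < partitionFields.size(); i++) {
      ids.add(FIRST_PARTITION_FIELD_ID + i);
    }
    return ids;
  }

  public static void main(String[] args) {
    // Two specs for the same table, each with a single partition field.
    List<Integer> daySpecIds = assignIds(Arrays.asList("days(ts)"));
    List<Integer> hourSpecIds = assignIds(Arrays.asList("hours(ts)"));

    // Both fields receive the same ID although they hold different values.
    System.out.println("days(ts)  -> " + daySpecIds.get(0));
    System.out.println("hours(ts) -> " + hourSpecIds.get(0));
  }
}
```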
   
   A simple solution is to assign partition field IDs from a single sequence, starting at 1000, that is shared by all of the table's specs, and to keep the last assigned ID in table metadata. This would ensure that partition tuples are read correctly in metadata tables when a table has multiple partition specs. In the example above, `days(ts)` would be assigned ID 1000, and when the second partition spec is added, `hours(ts)` would be assigned ID 1001.
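
One way the proposal might look is sketched below. The class `PartitionFieldIdAllocator` and its methods are hypothetical, not an existing Iceberg API. Two assumptions go beyond the text above: the counter is seeded with 999 so that the first assigned ID is 1000, and a field that reappears in a later spec keeps the ID it was first given.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of the Iceberg API.
class PartitionFieldIdAllocator {
  private final Map<String, Integer> idsByField = new LinkedHashMap<>();

  // Would be persisted in table metadata so that later specs continue the sequence.
  private int lastAssignedId;

  PartitionFieldIdAllocator(int lastAssignedId) {
    this.lastAssignedId = lastAssignedId;
  }

  // Returns the ID for a partition field, allocating the next one if the field is new.
  // Assumption: a field seen in an earlier spec keeps its original ID.
  int idFor(String partitionField) {
    Integer existing = idsByField.get(partitionField);
    if (existing != null) {
      return existing;
    }
    lastAssignedId += 1;
    idsByField.put(partitionField, lastAssignedId);
    return lastAssignedId;
  }

  int lastAssignedId() {
    return lastAssignedId;
  }

  public static void main(String[] args) {
    // Seed with 999 so that the first assigned ID is 1000.
    PartitionFieldIdAllocator allocator = new PartitionFieldIdAllocator(999);
    System.out.println("days(ts)  -> " + allocator.idFor("days(ts)"));  // first spec
    System.out.println("hours(ts) -> " + allocator.idFor("hours(ts)")); // second spec
    System.out.println("last assigned ID: " + allocator.lastAssignedId());
  }
}
```

Because the counter lives with the table rather than with a spec, the `partition` struct in the `entries` table could carry both fields side by side without ambiguity.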
