Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/06/02 15:32:35 UTC

[GitHub] [iceberg] rdblue commented on a change in pull request #2654: Update spec for v2 changes

rdblue commented on a change in pull request #2654:
URL: https://github.com/apache/iceberg/pull/2654#discussion_r644081533



##########
File path: site/docs/spec.md
##########
@@ -495,10 +528,12 @@ Table metadata consists of the following fields:
 |            | _required_ | **`last-sequence-number`**| The table's highest assigned sequence number, a monotonically increasing long that tracks the order of snapshots in a table. |
 | _required_ | _required_ | **`last-updated-ms`**| Timestamp in milliseconds from the unix epoch when the table was last updated. Each table metadata file should update this field just before writing. |
 | _required_ | _required_ | **`last-column-id`**| An integer; the highest assigned column ID for the table. This is used to ensure columns are always assigned an unused ID when evolving schemas. |
-| _required_ | _required_ | **`schema`**| The table’s current schema. |
+| _required_ | _required_ | **`schema`**| The table’s current schema. In v2, this must be the schema identified by the `current-schema-id`. |
+| _optional_ | _required_ | **`schemas`**| A list of schemas, stored as objects with `schema-id`. |
+| _optional_ | _required_ | **`current-schema-id`**| ID of the table's current schema. |
 | _required_ |            | **`partition-spec`**| The table’s current partition spec, stored as only fields. Note that this is used by writers to partition data, but is not used when reading because reads use the specs stored in manifest files. (**Deprecated**: use `partition-specs` and `default-spec-id`instead ) |
 | _optional_ | _required_ | **`partition-specs`**| A list of partition specs, stored as full partition spec objects. |
-| _optional_ | _required_ | **`default-spec-id`**| ID of the “current” spec that writers should use by default. |
+| _optional_ | _required_ | **`default-spec-id`**| ID of the "current" spec that writers should use by default. |

Review comment:
       This distinction is on purpose. A table has only one valid schema at a time, and that is the current one. We track old schemas for older snapshots. A table can, however, have multiple valid partition specs, and it is fine to write new data into any of them. That's why we track a "default" spec for writers to use unless something overrides it, such as migrating data from one spec to another.
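
A minimal sketch of that distinction, not taken from the Iceberg codebase. The metadata dict is a hypothetical, trimmed-down stand-in for a table metadata file: the keys `schemas`, `schema-id`, `current-schema-id`, `partition-specs`, and `default-spec-id` come from the table above, while `spec-id`, the string-valued `fields` lists, and both helper functions are assumptions made for illustration.

```python
# Hypothetical, trimmed-down table metadata. Real schema and spec
# fields are structured objects; plain strings are used here for brevity.
metadata = {
    "current-schema-id": 1,
    "schemas": [
        {"schema-id": 0, "fields": ["id"]},
        {"schema-id": 1, "fields": ["id", "data"]},
    ],
    "default-spec-id": 0,
    "partition-specs": [
        {"spec-id": 0, "fields": ["day(ts)"]},
        {"spec-id": 1, "fields": ["hour(ts)"]},
    ],
}


def current_schema(meta):
    """Return the one schema that is current for the table.

    There is no override parameter: older schemas are kept only
    for older snapshots.
    """
    wanted = meta["current-schema-id"]
    for schema in meta["schemas"]:
        if schema["schema-id"] == wanted:
            return schema
    raise ValueError(f"no schema with id {wanted}")


def spec_for_write(meta, spec_id=None):
    """Return the partition spec a write should use.

    Any spec in the list is valid for new data, so a caller may pick
    one explicitly; otherwise the default spec is used.
    """
    wanted = meta["default-spec-id"] if spec_id is None else spec_id
    for spec in meta["partition-specs"]:
        if spec["spec-id"] == wanted:
            return spec
    raise ValueError(f"no partition spec with id {wanted}")
```

The asymmetry in the two signatures is the point: `current_schema` takes no override, whereas `spec_for_write(metadata, spec_id=1)` models a job that deliberately writes into a non-default spec.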



