Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/03/13 02:39:21 UTC

[GitHub] [iceberg] yyanyy commented on pull request #2275: Core: add schema id to snapshot and history entry

yyanyy commented on pull request #2275:
URL: https://github.com/apache/iceberg/pull/2275#issuecomment-797852639


   > I have some questions:
   > When we switch from v1 format to v2, and a new metadata file is written for an existing table, what schemas are written to the `schemas` list? And in the `snapshot-log`, what `schema-id` is written for the previous snapshots? (Is it not written, i.e., is null? or is it 0?)
   > In general, if we see a schema id of 0, does that ever represent a specific schema, or does that always represent some undetermined schema? Let me elaborate: (1) Will we ever see a `schema-id` of 0 in a metadata file and if so, does that refer to a unique schema? (2) In code, if we have an instance of a schema and its schemaId is 0, what are the semantics of that schemaId?
   
   Thank you for the review, and sorry for the delay in responding!
   
   I think this change applies to v1 tables as well. When the engine starts to use a release with this change, the new `schemas` list will be written with the current schema, using 0 as its default schema id. In `snapshot-log`, previous snapshots will have a null `schema-id`, since schema ids were not available when those snapshots were written.
   
   0 is a valid `schema-id`, and it refers to a unique schema in the metadata file; if there's no schema evolution after the table starts to write `schemas`, 0 will be assigned to the current schema. In the code, we only care about the id when interacting with table metadata. Everywhere else, where the schema class is used in various classes for projection etc., schemaId will always be 0; that is just a default value and shouldn't be relied on. #2096 has some conversation around this, and this behavior is mentioned in the [schema class](https://github.com/apache/iceberg/blob/master/api/src/main/java/org/apache/iceberg/Schema.java#L44-L45).
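
To make the null-versus-0 distinction concrete, here is a minimal Java sketch. It is not Iceberg's implementation: `SchemaIdSketch`, `LogEntry`, `legacyEntry`, `newEntry`, and `describe` are hypothetical names invented for illustration. It only models the behavior described above: entries written before the change carry a null schema id, while 0 is a real id assigned to the current schema.

```java
// Hypothetical model for illustration only; these are not Iceberg's classes.
public class SchemaIdSketch {
  // Default id given to the current schema when `schemas` is first written.
  static final int DEFAULT_SCHEMA_ID = 0;

  // A snapshot-log entry; schemaId is a boxed Integer so that it can be null.
  static final class LogEntry {
    final long snapshotId;
    final Integer schemaId;

    LogEntry(long snapshotId, Integer schemaId) {
      this.snapshotId = snapshotId;
      this.schemaId = schemaId;
    }
  }

  // Entries written before schema ids existed have no id to record.
  static LogEntry legacyEntry(long snapshotId) {
    return new LogEntry(snapshotId, null);
  }

  // Entries written after the change record the id of the current schema.
  static LogEntry newEntry(long snapshotId, int currentSchemaId) {
    return new LogEntry(snapshotId, currentSchemaId);
  }

  // null means "unknown schema"; 0 is a valid id that names a specific schema.
  static String describe(LogEntry entry) {
    if (entry.schemaId == null) {
      return "snapshot " + entry.snapshotId + ": schema unknown";
    }
    return "snapshot " + entry.snapshotId + ": schema-id " + entry.schemaId;
  }

  public static void main(String[] args) {
    System.out.println(describe(legacyEntry(1L)));
    System.out.println(describe(newEntry(2L, DEFAULT_SCHEMA_ID)));
  }
}
```

The boxed `Integer` is the point of the sketch: with a primitive `int`, a missing id and the default id 0 would be indistinguishable.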

