You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/04/27 12:27:56 UTC

[GitHub] [iceberg] dixingxing0 opened a new issue #2528: Add `freshIdentifierFieldIds` in TableMetadata

dixingxing0 opened a new issue #2528:
URL: https://github.com/apache/iceberg/issues/2528


   Follow-up for https://github.com/apache/iceberg/pull/2465
   I've tested PR-2465, i found that we should add `freshIdentifierFieldIds` like what `freshSortOrder` did.
   @jackye1995  @openinx 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jackye1995 commented on issue #2528: Should fresh `IdentifierFieldIds` (Follow-up for PR2465)

Posted by GitBox <gi...@apache.org>.
jackye1995 commented on issue #2528:
URL: https://github.com/apache/iceberg/issues/2528#issuecomment-828086527


   Sorry I think I misunderstood what you meant a bit. So I agree that the identifier should be recreated after a new struct is created in `TypeUtil#assignFreshIds`. I thought you were talking about refreshing the identifier at the same place as sort order in `TableMetadat#buildReplacement`, sorry for the misunderstanidng.
   
   The new `Schema` created in that method should instead use the constructor `new Schema(struct, identfierFieldIds)` instead of the current `new Schema(struct)`, and the identifier will be refreshed at that point of time. That will be done as a part of the subsequent PR, because there are many places using the schema constructors, I plan to change them in a single pass.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jackye1995 commented on issue #2528: Should fresh `IdentifierFieldIds` (Follow-up for PR2465)

Posted by GitBox <gi...@apache.org>.
jackye1995 commented on issue #2528:
URL: https://github.com/apache/iceberg/issues/2528#issuecomment-827756309


   I thought about that, I think the situations are a bit different between the 2 cases.
   
   For sort order and partition spec, it has an internal reference of the updated schema which is defined only locally by the caller. So new columns can be assigned any ID. When building the replacement based on the last schema, the `assignFreshIds` is called to assign the correct ID to each new column. Then the spec and order needs to be updated with those information.
   
   For identifier, it is always referencing the schema it sits in. The identifier field ID is always resolved after the actual schema is built, so the field ID is already fresh.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] openinx closed issue #2528: Should fresh `IdentifierFieldIds` (Follow-up for PR2465)

Posted by GitBox <gi...@apache.org>.
openinx closed issue #2528:
URL: https://github.com/apache/iceberg/issues/2528


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] dixingxing0 commented on issue #2528: Should fresh `IdentifierFieldIds` (Follow-up for PR2465)

Posted by GitBox <gi...@apache.org>.
dixingxing0 commented on issue #2528:
URL: https://github.com/apache/iceberg/issues/2528#issuecomment-828088687


   Yes, i've changed the comment to avoid misleading, i'm looking forward to see the subsequent PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jackye1995 edited a comment on issue #2528: Should fresh `IdentifierFieldIds` (Follow-up for PR2465)

Posted by GitBox <gi...@apache.org>.
jackye1995 edited a comment on issue #2528:
URL: https://github.com/apache/iceberg/issues/2528#issuecomment-828086527


   Sorry I think I misunderstood what you meant a bit. So I agree that the identifier should be recreated after a new struct is created in `TypeUtil#assignFreshIds`. I thought you were talking about refreshing the identifier at the same place as sort order in `TableMetadat#buildReplacement`, sorry for the misunderstanding.
   
   The new `Schema` created in that method should use the constructor `new Schema(struct, identfierFieldIds)` instead of the current `new Schema(struct)`, and the identifier will be refreshed at that point of time. That will be done as a part of the subsequent PR, because there are many places using the schema constructors, I plan to change them in a single pass.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org