Posted to issues@iceberg.apache.org by "JonasJ-ap (via GitHub)" <gi...@apache.org> on 2023/05/04 04:53:59 UTC

[GitHub] [iceberg] JonasJ-ap commented on a diff in pull request #7523: Python: Support Inferring Iceberg UUID type from parquet files

JonasJ-ap commented on code in PR #7523:
URL: https://github.com/apache/iceberg/pull/7523#discussion_r1184538670


##########
python/pyiceberg/io/pyarrow.py:
##########
@@ -612,14 +612,40 @@ def _get_field_doc(field: pa.Field) -> Optional[str]:
 
 
 class _ConvertToIceberg(PyArrowSchemaVisitor[Union[IcebergType, Schema]]):
+    def __init__(self, expected_schema: Optional[Schema] = None):
+        self.expected_schema = expected_schema
+
+    def cast_if_needed(self, field_id: int, field_type: IcebergType) -> IcebergType:

Review Comment:
   A simpler alternative might be to allow promotion from `FixedType(16)` to `UUIDType` here: https://github.com/apache/iceberg/blob/3db4e896587d95318a690979e97bab55db250e23/python/pyiceberg/schema.py#L1308-L1322
   
   by adding:
   ```python
   @promote.register(FixedType)
   def _(file_type: FixedType, read_type: IcebergType) -> IcebergType:
       if isinstance(read_type, UUIDType) and len(file_type) == 16:
           return read_type
       else:
           raise ResolveError(f"Cannot promote an {file_type} to {read_type}")
   ```
   
   My concern is that this promotion is not listed as a valid type promotion in the [spec](https://iceberg.apache.org/spec/#schema-evolution), and its effect may be too broad. So for now I chose to narrow the fix to the visitor. Looking forward to hearing more thoughts on this.
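
To make the behavior of the proposed promotion concrete, here is a minimal, self-contained sketch of `functools.singledispatch`-based promotion. The classes below (`IcebergType`, `FixedType`, `UUIDType`, `StringType`, `ResolveError`) are simplified stand-ins written for this illustration, not the real pyiceberg definitions, and the fallback `promote` is an assumption about how a default case could look.

```python
from dataclasses import dataclass
from functools import singledispatch


class ResolveError(Exception):
    """Stand-in for pyiceberg's ResolveError (assumption for this sketch)."""


@dataclass(frozen=True)
class IcebergType:
    """Stand-in base type; not the real pyiceberg class."""


@dataclass(frozen=True)
class FixedType(IcebergType):
    length: int

    def __len__(self) -> int:
        # Lets len(fixed_type) report the byte width, as the snippet above expects.
        return self.length


@dataclass(frozen=True)
class UUIDType(IcebergType):
    pass


@dataclass(frozen=True)
class StringType(IcebergType):
    pass


@singledispatch
def promote(file_type: IcebergType, read_type: IcebergType) -> IcebergType:
    # Hypothetical fallback: only identical types are accepted.
    if file_type == read_type:
        return file_type
    raise ResolveError(f"Cannot promote {file_type} to {read_type}")


@promote.register(FixedType)
def _(file_type: FixedType, read_type: IcebergType) -> IcebergType:
    # Dispatch is on the file type; only a 16-byte fixed may be read as a UUID.
    if isinstance(read_type, UUIDType) and len(file_type) == 16:
        return read_type
    else:
        raise ResolveError(f"Cannot promote an {file_type} to {read_type}")
```

With these stand-ins, `promote(FixedType(16), UUIDType())` returns the UUID type, while `promote(FixedType(8), UUIDType())` and `promote(FixedType(16), StringType())` both raise `ResolveError`. Because the handler is registered on the file type, it would apply wherever `promote` is called for a fixed-width column, which is the breadth of effect the concern above refers to.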



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

