You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/02/01 00:52:13 UTC

[GitHub] [iceberg] rdblue commented on a change in pull request #4016: types use new for generics. no metaprogramming

rdblue commented on a change in pull request #4016:
URL: https://github.com/apache/iceberg/pull/4016#discussion_r796187007



##########
File path: python/src/iceberg/types.py
##########
@@ -15,61 +15,84 @@
 # specific language governing permissions and limitations
 # under the License.
 """Data types used in describing Iceberg schemas
-
 This module implements the data types described in the Iceberg specification for Iceberg schemas. To
 describe an Iceberg table schema, these classes can be used in the construction of a StructType instance.
-
 Example:
     >>> StructType(
         [
             NestedField(True, 1, "required_field", StringType()),
             NestedField(False, 2, "optional_field", IntegerType()),
         ]
     )
-
 Notes:
   - https://iceberg.apache.org/#spec/#primitive-types
 """
 
-from typing import Optional
+from typing import Any, Dict, List, Optional, Tuple
 
 
-class Type:
-    def __init__(self, type_string: str, repr_string: str, is_primitive=False):
-        self._type_string = type_string
-        self._repr_string = repr_string
-        self._is_primitive = is_primitive
+class IcebergType:
+
+    _implemented: Dict[Tuple[str, Tuple[Any]], "IcebergType"] = {}
+
+    def __new__(cls, *args, **kwargs):

Review comment:
       I can understand wanting to have one implementation for all classes, but I think that it is likely that the code would be more maintainable if it were split out because you wouldn't need to handle all possible values passed to `__init__`.
   
   For example, adding this works for all of the non-parameterized primitive types:
   
   ```python
   # not modified other than this addition
   class Type:
       _instance = None
       def __new__(cls, *args, **kwargs):
           if not getattr(cls, "_instance"):
               cls._instance = super().__new__(cls, *args, **kwargs)
           return cls._instance
   ```
   ```python
   In [4]: i = test_types.IntegerType()
   
   In [5]: i == test_types.IntegerType()
   Out[5]: True
   
   In [6]: i == test_types.FloatType()
   Out[6]: False
   
   In [7]: test_types.FloatType()
   Out[7]: FloatType()
   
   In [8]: f = test_types.FloatType()
   
   In [9]: f == test_types.FloatType()
   Out[9]: True
   
   In [10]: i == f
   Out[10]: False
   ```
   
   I think it would be much more straightforward to override `__new__` separately. Then because you control each `__init__` method, you know that you don't have to handle default args or kwargs or things that make this so complex.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org