Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/12/10 21:04:30 UTC

[GitHub] [iceberg] CircArgs opened a new pull request #3714: Types literals all in one primitives pr

CircArgs opened a new pull request #3714:
URL: https://github.com/apache/iceberg/pull/3714


   This is based on PR [#3601](https://github.com/apache/iceberg/pull/3601). Please reference that PR for some of the existing discussion. It was requested that I split the PR to make it more digestible. This is part 1, which includes the simpler types: integral types, floating types, and String; essentially, the types that are not generic.
   
   This is meant to be followed by a second PR (TBD, I will update this with the sister PR link), which includes the generic types and all tests (this PR includes only the limited set of tests relevant to the included types).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #3714: Types literals all in one primitives pr

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #3714:
URL: https://github.com/apache/iceberg/pull/3714#issuecomment-991957594


   @CircArgs, I think the request to break down these changes wasn't just that too many types were done in a single PR; it was that too many features were done in a single PR. That's what makes it hard to review. Could you do just one feature per PR?
   
   For example, I know you'd like to clean up type classes so that you can use `isinstance` the same way for all of them. Could you do that in a single PR? I don't think that we want to add literals and change types in the same PR.
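
As a rough illustration of the `isinstance` point: once every type is its own class, one check works the same way for all of them. The class bodies and the `describe` helper below are hypothetical stand-ins, not code from the PR.

```python
class Type:
    """Hypothetical base class, loosely modelled on the existing `Type`."""

    def __init__(self, type_string: str):
        self._type_string = type_string

    def __str__(self) -> str:
        return self._type_string


class IntegerType(Type):
    def __init__(self):
        super().__init__("int")


class FixedType(Type):
    def __init__(self, length: int):
        super().__init__(f"fixed[{length}]")
        self.length = length


def describe(t: Type) -> str:
    """Dispatch on the class of `t`; the same check works for every type."""
    if isinstance(t, FixedType):
        return f"fixed-width binary of {t.length} bytes"
    if isinstance(t, IntegerType):
        return "32-bit integer"
    return str(t)
```

With this layout a parameterized type such as `FixedType(16)` and a parameterless one such as `IntegerType()` are checked with the same `isinstance` call.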




[GitHub] [iceberg] samredai commented on pull request #3714: Types literals all in one primitives pr

Posted by GitBox <gi...@apache.org>.
samredai commented on pull request #3714:
URL: https://github.com/apache/iceberg/pull/3714#issuecomment-991962318


   @rdblue that's my bad. I suggested splitting out the primitives and generics since the primitives don't use any of the metaclass logic in the original PR. I'm still learning the right chunk size. Would something like the breakdown below work better, where each line item is a single PR and they're opened in this sequence?
   
   1. Converting instances to classes in types.py
   2. Primitives:
     a. Allowing types to be instantiated with a value (combining literals with types)
     b. Byte casting and hashing
     c. Conversions between types
     d. Comparisons between types
   3. Generics:
     a. Allowing types to be instantiated with a value (this would include some of the metaclass logic)
     b. Byte casting and hashing
     c. Conversions between types
     d. Comparisons between types
   
   @CircArgs let me know if I've missed any features in the original PR
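
To make the four primitive features in item 2 concrete, here is a minimal sketch of how they might fit together on one small class hierarchy. The `Primitive` and `Integral` classes are hypothetical stand-ins, the little-endian byte layout is an assumption of this sketch, and Python's built-in `hash` replaces the `mmh3` hashing used in the PR so that the example has no third-party dependency.

```python
import struct
from typing import Type


class Primitive:
    """Hypothetical stand-in for a primitive type that carries a value."""

    def __init__(self, value):
        self.value = value

    @classmethod
    def can_cast(cls, target: Type["Primitive"]) -> bool:
        # By default a type only converts to itself.
        return target is cls

    def to(self, target: Type["Primitive"]) -> "Primitive":
        # 2c. Conversion between types, guarded by can_cast.
        if type(self).can_cast(target):
            return target(self.value)
        raise TypeError(f"Cannot cast {type(self).__name__} to {target.__name__}")

    def __eq__(self, other) -> bool:
        # 2d. Comparison: equal only when both type and value match.
        return type(other) is type(self) and self.value == other.value

    def __hash__(self) -> int:
        # Stand-in only; the PR hashes the byte encoding with mmh3 instead.
        return hash((type(self).__name__, self.value))


class Integral(Primitive):
    def __init__(self, value):
        # 2a. Instantiating the type with a value (literal and type combined).
        super().__init__(int(value))

    def __bytes__(self) -> bytes:
        # 2b. Byte casting: 8-byte little-endian, an assumption of this sketch.
        return struct.pack("<q", self.value)


class Long(Integral):
    pass


class Integer(Integral):
    @classmethod
    def can_cast(cls, target) -> bool:
        # An Integer may widen to a Long, but not the other way round.
        return target in (cls, Long)
```

Each numbered comment maps to one line item above, which is also roughly how the work could be sliced into separate PRs.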




[GitHub] [iceberg] CircArgs commented on pull request #3714: Types literals all in one primitives pr

Posted by GitBox <gi...@apache.org>.
CircArgs commented on pull request #3714:
URL: https://github.com/apache/iceberg/pull/3714#issuecomment-1004398192


   Opening small incremental PRs




[GitHub] [iceberg] samredai commented on a change in pull request #3714: Types literals all in one primitives pr

Posted by GitBox <gi...@apache.org>.
samredai commented on a change in pull request #3714:
URL: https://github.com/apache/iceberg/pull/3714#discussion_r767231697



##########
File path: python/src/iceberg/types.py
##########
@@ -15,157 +15,665 @@
 # specific language governing permissions and limitations
 # under the License.
 
-from typing import Optional
+import struct
+from base64 import b64encode
+from datetime import date, datetime, time
+from decimal import Decimal as PythonDecimal
+from typing import Dict, Optional, Tuple, Type, Union
+from uuid import UUID as PythonUUID
+
+import mmh3
+from numpy import float32, float64, isinf, isnan
+
+
+class IcebergType:
+    """Base type for all Iceberg Types"""
+
+    def __setattr__(self, key, value):
+        if key in getattr(self, "_frozen_attrs", set()) or key in getattr(
+            type(self), "_always_frozen", set()
+        ):
+            raise AttributeError(f"{key} may not be altered on instance {self}")
+        object.__setattr__(self, key, value)
+
+    # freeze deleting of generic attributes
+    def __delattr__(self, key):
+        if key in getattr(self, "_frozen_attrs", set()) or key in getattr(
+            type(self), "_always_frozen", set()
+        ):
+            raise AttributeError(f"{key} may not be deleted on instance {self}")
+        object.__delattr__(self, key)
+
+    @classmethod
+    def can_cast(cls, _type: Type["IcebergType"]):

Review comment:
       Do you mind adding a one-liner docstring here?
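
One possible wording, sketched on a stripped-down stand-in for the classes in the diff. The docstring text is only a suggestion, and the `Integer` override mirrors the one that appears later in the same file.

```python
from typing import Type


class IcebergType:
    """Stand-in for the base class in the diff."""

    @classmethod
    def can_cast(cls, _type: Type["IcebergType"]) -> bool:
        """Return True if this type can be cast to `_type` without coercion."""
        return cls == _type


class Long(IcebergType):
    pass


class Integer(IcebergType):
    @classmethod
    def can_cast(cls, _type: Type["IcebergType"]) -> bool:
        """Return True if `_type` is Integer or the wider Long."""
        return _type in (cls, Long)
```

A subclass that allows a wider target, as `Integer` does here, can then document the exception in its own one-liner.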

##########
File path: python/src/iceberg/types.py
##########
@@ -15,157 +15,665 @@
 # specific language governing permissions and limitations
 # under the License.
 
-from typing import Optional
+import struct
+from base64 import b64encode
+from datetime import date, datetime, time
+from decimal import Decimal as PythonDecimal
+from typing import Dict, Optional, Tuple, Type, Union
+from uuid import UUID as PythonUUID
+
+import mmh3
+from numpy import float32, float64, isinf, isnan
+
+
+class IcebergType:
+    """Base type for all Iceberg Types"""
+
+    def __setattr__(self, key, value):
+        if key in getattr(self, "_frozen_attrs", set()) or key in getattr(
+            type(self), "_always_frozen", set()
+        ):
+            raise AttributeError(f"{key} may not be altered on instance {self}")
+        object.__setattr__(self, key, value)
+
+    # freeze deleting of generic attributes
+    def __delattr__(self, key):
+        if key in getattr(self, "_frozen_attrs", set()) or key in getattr(
+            type(self), "_always_frozen", set()
+        ):
+            raise AttributeError(f"{key} may not be deleted on instance {self}")
+        object.__delattr__(self, key)
+
+    @classmethod
+    def can_cast(cls, _type: Type["IcebergType"]):
+        return cls == _type
+
+
+class PrimitiveType(IcebergType):
+    """
+    base type for primitives `IcebergType`s
+    Primitives include an instance attribute `value` which is used as the underlying value to work with the type
+    a `PrimitiveType` should type the instance `value` most specific to that type
+    """
+
+    value: Union[
+        bytes,
+        bool,
+        int,
+        float32,
+        float64,
+        PythonDecimal,
+        str,
+        dict,
+        PythonUUID,
+        date,
+        time,
+        datetime,
+    ]
+
+    def __init__(self, value):
+
+        if issubclass(type(value), PrimitiveType):
+            try:
+                self.value = value.to(type(self)).value
+            except AttributeError:
+                raise TypeError(f"Cannot convert {value} to type {type(self)}")
+        else:
+            self.value = value
+
+    def __repr__(self) -> str:
+        return f"{repr(type(self))}(value={self.value})"  # type: ignore
+
+    def __str__(self) -> str:
+        return f"{str(type(self))}({self.value})"  # type: ignore
+
+    def __bool__(self) -> bool:
+        return bool(self.value)
+
+    def __eq__(self, other) -> bool:
+        return type(other) == type(self) and self.value == other.value
+
+    def __bytes__(self) -> bytes:
+        return type(self).to_bytes(self.value)
+
+    @classmethod
+    def to_bytes(cls, value):
+        return bytes(value)
+
+    def __hash__(self) -> int:
+        """https://iceberg.apache.org/#spec/#appendix-b-32-bit-hash-requirements"""
+        return type(self).hash(self.value)
+
+    @classmethod
+    def hash(cls, value):
+        return mmh3.hash(cls.to_bytes(value))
+
+    def to(self, _type: Type["PrimitiveType"], coerce: bool = False):
+        if type(self).can_cast(_type) or coerce:
+            return _type(self.value)
+        raise TypeError(f"Cannot cast {type(self)} to {_type}.")
+
+
+class Number(PrimitiveType):
+    """
+    base `PrimitiveType` for `IcebergType`s for numeric types
+    per https://iceberg.apache.org/#spec/#primitive-types these include int, long, float, double, decimal
+    """
+
+    value: Union[int, float32, float64, PythonDecimal]
+
+    def __float__(self) -> float:
+        return float(self.value)
+
+    def __int__(self) -> int:
+        return int(self.value)
+
+    def __math(self, op, other=None):
+        op_f = getattr(self.value, op)
+        try:
+            if op in (
+                "__add__",
+                "__sub__",
+                "__div__",
+                "__mul__",
+            ):
+                other = other.to(type(self))
+                return type(self)(op_f(other.value))
+            if op in (
+                "__pow__",
+                "__mod__",
+            ):
+                other = type(self)(other)
+                return type(self)(op_f(other.value))
+            if op in ("__lt__", "__eq__"):
+                other = other.to(type(self))
+                return op_f(other.value)
+        except TypeError:
+            raise TypeError(
+                f"Cannot compare {self} with {other}. Perhaps try coercing to the appropriate type as {other}.to({type(self)}, coerce=True)."
+            )
+        except AttributeError:
+            raise TypeError(
+                f"Cannot compare {self} with {other}. Try creating an appropriate type {type(self)}({other})."
+            )
+
+        if op in ("__neg__", "__abs__"):
+            return type(self)(op_f())
+
+    def __add__(self, other: "Number") -> "Number":
+        return self.__math("__add__", other)
+
+    def __sub__(self, other: "Number") -> "Number":
+        return self.__math("__sub__", other)
+
+    def __mul__(self, other: "Number") -> "Number":
+        return self.__math("__mul__", other)
 
+    def __div__(self, other: "Number") -> "Number":
+        return self.__math("__div__", other)
 
-class Type(object):
-    def __init__(self, type_string: str, repr_string: str, is_primitive=False):
-        self._type_string = type_string
-        self._repr_string = repr_string
-        self._is_primitive = is_primitive
+    def __neg__(self) -> "Number":
+        return self.__math("__neg__")
+
+    def __abs__(self) -> "Number":
+        return self.__math("__abs__")
 
-    def __repr__(self):
-        return self._repr_string
+    def __pow__(self, other: "Number", mod: Optional["Number"] = None) -> "Number":
+        return self.__math("__pow__", other)
 
-    def __str__(self):
-        return self._type_string
+    def __mod__(self, other: "Number") -> "Number":
+        return self.__math("__mod__", other)
 
-    @property
-    def is_primitive(self) -> bool:
-        return self._is_primitive
+    def __lt__(self, other) -> bool:
+        return self.__math("__lt__", other)
 
+    def __eq__(self, other) -> bool:
+        return self.__math("__eq__", other) and self._neg == other._neg  # type: ignore
+
+    def __gt__(self, other) -> bool:
+        return not self.__le__(other)
+
+    def __le__(self, other) -> bool:
+        return self.__lt__(other) or self.__eq__(other)
+
+    def __ge__(self, other) -> bool:
+        return self.__gt__(other) or self.__eq__(other)
+
+    def __hash__(self) -> int:
+        return super().__hash__()
+
+
+class Integral(Number):
+    """base class for integral types Integer, Long
+
+    Note:
+       for internal iceberg use only
+
+    Examples:
+        Can be used in place of typing for Integer and Long
+    """
+
+    value: int
+    _neg: bool
+    _frozen_attrs = {"min", "max", "_neg"}
+
+    def __init__(self, value: Union[str, float, int]):
+        super().__init__(value)
+
+        if isinstance(self.value, Number):
+            self.value = int(self.value.value)
+        else:
+            self.value = int(self.value)
+        self._check()
+        object.__setattr__(self, "_neg", self.value < 0)
+
+    def _check(self) -> "Integral":
+        """
+        helper method for `Integral` specific `_check` to ensure value is within spec
+        """
+        if self.value > self.max:  # type: ignore
+            raise ValueError(f"{type(self)} must be less than or equal to {self.max}")  # type: ignore
+
+        if self.value < self.min:  # type: ignore
+            raise ValueError(
+                f"{type(self)} must be greater than or equal to {self.min}"  # type: ignore
+            )
+
+        return self
+
+    @classmethod
+    def to_bytes(cls, value) -> bytes:
+        return struct.pack("q", value)
+
+
+class Floating(Number):
+    """base class for floating types Float, Double
+
+    Note:
+       for internal iceberg use only
+
+    Examples:
+        Can be used in place of typing for Float and Double
+    """
+
+    _neg: bool
+    _frozen_attrs = {"_neg"}
+
+    def __init__(self, float_t, value: Union[float, str, int]):
+        super().__init__(value)
+        object.__setattr__(self, "_neg", str(self.value).strip().startswith("-"))
+        if isinstance(self.value, Number):
+            self.value = float_t(self.value.value)
+        else:
+            self.value = float_t(self.value)
+
+    @classmethod
+    def to_bytes(cls, value) -> bytes:
+        return struct.pack("d", value)
+
+    def __repr__(self) -> str:
+        ret = super().__repr__()
+        if self._neg and isnan(self.value):  # type: ignore
+            return ret.replace("nan", "-nan")
+        return ret
+
+    def is_nan(self) -> bool:
+        return isnan(self.value)  # type: ignore
+
+    def is_inf(self) -> bool:
+        return isinf(self.value)  # type: ignore
+
+    def __str__(self) -> str:
+        ret = super().__str__()
+        if self._neg and self.is_nan():
+            return ret.replace("nan", "-nan")
+        if self._neg and self.value == 0.0:
+            return ret.replace("0.0", "-0.0")
+        return ret
+
+    def __lt__(self, other: "Floating") -> bool:
+        try:
+            other = other.to(type(self))
+        except TypeError:
+            raise TypeError(
+                f"Cannot compare {self} with {other}. Perhaps try coercing to the appropriate type as {other}.to({type(self)}, coerce=True)."
+            )
+        except AttributeError:
+            raise TypeError(
+                f"Cannot compare {self} with {other}. Try creating an appropriate type {type(self)}({other})."
+            )
+
+        def get_key(x) -> str:

Review comment:
       It took me a little while to understand this. It makes sense to me but can you add a short docstring explaining it? You can also explain the `ret_dict` lookup here too.
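
For readers trying to follow the same logic, here is a minimal standalone sketch of the idea: classify each operand into a category, then compare categories. It assumes the float ordering given in the Iceberg spec (`-NaN < -Infinity < value < Infinity < NaN`) and uses a rank list rather than reproducing the PR's lookup table entry by entry. `ORDER` and `less_than` are hypothetical names, and the `-0.0` versus `0.0` case is left out.

```python
import math

# Assumed total order, lowest to highest, following the Iceberg spec's
# description of float sorting: -NaN < -Infinity < value < Infinity < NaN.
ORDER = ["-nan", "-inf", "value", "inf", "nan"]


def get_key(value: float, neg: bool) -> str:
    """Classify a float into one of the five categories in ORDER.

    `neg` carries the sign separately because comparisons against NaN
    ignore it, so it cannot be recovered from `<` alone.
    """
    if math.isnan(value):
        kind = "nan"
    elif math.isinf(value):
        kind = "inf"
    else:
        return "value"
    return ("-" if neg else "") + kind


def less_than(a: float, a_neg: bool, b: float, b_neg: bool) -> bool:
    """Compare categories by rank; use plain `<` for two ordinary values."""
    key_a, key_b = get_key(a, a_neg), get_key(b, b_neg)
    if key_a == key_b == "value":
        return a < b
    return ORDER.index(key_a) < ORDER.index(key_b)
```

A docstring along these lines on `get_key`, plus one sentence saying that the dictionary maps each pair of categories to the result of `self < other`, would probably cover the request.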

##########
File path: python/tests/test_types.py
##########
@@ -15,149 +15,196 @@
 # specific language governing permissions and limitations
 # under the License.
 
+import math
+
 import pytest
 
 from iceberg.types import (
-    BinaryType,
-    BooleanType,
-    DateType,
-    DecimalType,
-    DoubleType,
-    FixedType,
-    FloatType,
-    IntegerType,
-    ListType,
-    LongType,
-    MapType,
-    NestedField,
-    StringType,
-    StructType,
-    TimestampType,
-    TimestamptzType,
-    TimeType,
-    UUIDType,
+    UUID,
+    Binary,
+    Boolean,
+    Date,
+    Double,
+    Float,
+    Integer,
+    Long,
+    String,
+    Time,
+    Timestamp,
+    Timestamptz,
+)
+
+
+# https://iceberg.apache.org/#spec/#appendix-b-32-bit-hash-requirements

Review comment:
       Instead of a standalone comment, would this be better as a docstring to the `test_hashing` function?

##########
File path: python/src/iceberg/types.py
##########
@@ -15,157 +15,665 @@
 # specific language governing permissions and limitations
 # under the License.
 
-from typing import Optional
+import struct
+from base64 import b64encode
+from datetime import date, datetime, time
+from decimal import Decimal as PythonDecimal
+from typing import Dict, Optional, Tuple, Type, Union
+from uuid import UUID as PythonUUID
+
+import mmh3
+from numpy import float32, float64, isinf, isnan
+
+
+class IcebergType:
+    """Base type for all Iceberg Types"""
+
+    def __setattr__(self, key, value):
+        if key in getattr(self, "_frozen_attrs", set()) or key in getattr(
+            type(self), "_always_frozen", set()
+        ):
+            raise AttributeError(f"{key} may not be altered on instance {self}")
+        object.__setattr__(self, key, value)
+
+    # freeze deleting of generic attributes
+    def __delattr__(self, key):
+        if key in getattr(self, "_frozen_attrs", set()) or key in getattr(
+            type(self), "_always_frozen", set()
+        ):
+            raise AttributeError(f"{key} may not be deleted on instance {self}")
+        object.__delattr__(self, key)
+
+    @classmethod
+    def can_cast(cls, _type: Type["IcebergType"]):
+        return cls == _type
+
+
+class PrimitiveType(IcebergType):
+    """
+    base type for primitives `IcebergType`s
+    Primitives include an instance attribute `value` which is used as the underlying value to work with the type
+    a `PrimitiveType` should type the instance `value` most specific to that type
+    """
+
+    value: Union[
+        bytes,
+        bool,
+        int,
+        float32,
+        float64,
+        PythonDecimal,
+        str,
+        dict,
+        PythonUUID,
+        date,
+        time,
+        datetime,
+    ]
+
+    def __init__(self, value):
+
+        if issubclass(type(value), PrimitiveType):
+            try:
+                self.value = value.to(type(self)).value
+            except AttributeError:
+                raise TypeError(f"Cannot convert {value} to type {type(self)}")
+        else:
+            self.value = value
+
+    def __repr__(self) -> str:
+        return f"{repr(type(self))}(value={self.value})"  # type: ignore
+
+    def __str__(self) -> str:
+        return f"{str(type(self))}({self.value})"  # type: ignore
+
+    def __bool__(self) -> bool:
+        return bool(self.value)
+
+    def __eq__(self, other) -> bool:
+        return type(other) == type(self) and self.value == other.value
+
+    def __bytes__(self) -> bytes:
+        return type(self).to_bytes(self.value)
+
+    @classmethod
+    def to_bytes(cls, value):
+        return bytes(value)
+
+    def __hash__(self) -> int:
+        """https://iceberg.apache.org/#spec/#appendix-b-32-bit-hash-requirements"""
+        return type(self).hash(self.value)
+
+    @classmethod
+    def hash(cls, value):
+        return mmh3.hash(cls.to_bytes(value))
+
+    def to(self, _type: Type["PrimitiveType"], coerce: bool = False):
+        if type(self).can_cast(_type) or coerce:
+            return _type(self.value)
+        raise TypeError(f"Cannot cast {type(self)} to {_type}.")
+
+
+class Number(PrimitiveType):
+    """
+    base `PrimitiveType` for `IcebergType`s for numeric types
+    per https://iceberg.apache.org/#spec/#primitive-types these include int, long, float, double, decimal
+    """
+
+    value: Union[int, float32, float64, PythonDecimal]
+
+    def __float__(self) -> float:
+        return float(self.value)
+
+    def __int__(self) -> int:
+        return int(self.value)
+
+    def __math(self, op, other=None):
+        op_f = getattr(self.value, op)
+        try:
+            if op in (
+                "__add__",
+                "__sub__",
+                "__div__",
+                "__mul__",
+            ):
+                other = other.to(type(self))
+                return type(self)(op_f(other.value))
+            if op in (
+                "__pow__",
+                "__mod__",
+            ):
+                other = type(self)(other)
+                return type(self)(op_f(other.value))
+            if op in ("__lt__", "__eq__"):
+                other = other.to(type(self))
+                return op_f(other.value)
+        except TypeError:
+            raise TypeError(
+                f"Cannot compare {self} with {other}. Perhaps try coercing to the appropriate type as {other}.to({type(self)}, coerce=True)."
+            )
+        except AttributeError:
+            raise TypeError(
+                f"Cannot compare {self} with {other}. Try creating an appropriate type {type(self)}({other})."
+            )
+
+        if op in ("__neg__", "__abs__"):
+            return type(self)(op_f())
+
+    def __add__(self, other: "Number") -> "Number":
+        return self.__math("__add__", other)
+
+    def __sub__(self, other: "Number") -> "Number":
+        return self.__math("__sub__", other)
+
+    def __mul__(self, other: "Number") -> "Number":
+        return self.__math("__mul__", other)
 
+    def __div__(self, other: "Number") -> "Number":
+        return self.__math("__div__", other)
 
-class Type(object):
-    def __init__(self, type_string: str, repr_string: str, is_primitive=False):
-        self._type_string = type_string
-        self._repr_string = repr_string
-        self._is_primitive = is_primitive
+    def __neg__(self) -> "Number":
+        return self.__math("__neg__")
+
+    def __abs__(self) -> "Number":
+        return self.__math("__abs__")
 
-    def __repr__(self):
-        return self._repr_string
+    def __pow__(self, other: "Number", mod: Optional["Number"] = None) -> "Number":
+        return self.__math("__pow__", other)
 
-    def __str__(self):
-        return self._type_string
+    def __mod__(self, other: "Number") -> "Number":
+        return self.__math("__mod__", other)
 
-    @property
-    def is_primitive(self) -> bool:
-        return self._is_primitive
+    def __lt__(self, other) -> bool:
+        return self.__math("__lt__", other)
 
+    def __eq__(self, other) -> bool:
+        return self.__math("__eq__", other) and self._neg == other._neg  # type: ignore
+
+    def __gt__(self, other) -> bool:
+        return not self.__le__(other)
+
+    def __le__(self, other) -> bool:
+        return self.__lt__(other) or self.__eq__(other)
+
+    def __ge__(self, other) -> bool:
+        return self.__gt__(other) or self.__eq__(other)
+
+    def __hash__(self) -> int:
+        return super().__hash__()
+
+
+class Integral(Number):
+    """base class for integral types Integer, Long
+
+    Note:
+       for internal iceberg use only
+
+    Examples:
+        Can be used in place of typing for Integer and Long
+    """
+
+    value: int
+    _neg: bool
+    _frozen_attrs = {"min", "max", "_neg"}
+
+    def __init__(self, value: Union[str, float, int]):
+        super().__init__(value)
+
+        if isinstance(self.value, Number):
+            self.value = int(self.value.value)
+        else:
+            self.value = int(self.value)
+        self._check()
+        object.__setattr__(self, "_neg", self.value < 0)
+
+    def _check(self) -> "Integral":
+        """
+        helper method for `Integral` specific `_check` to ensure value is within spec
+        """
+        if self.value > self.max:  # type: ignore
+            raise ValueError(f"{type(self)} must be less than or equal to {self.max}")  # type: ignore
+
+        if self.value < self.min:  # type: ignore
+            raise ValueError(
+                f"{type(self)} must be greater than or equal to {self.min}"  # type: ignore
+            )
+
+        return self
+
+    @classmethod
+    def to_bytes(cls, value) -> bytes:
+        return struct.pack("q", value)
+
+
+class Floating(Number):
+    """base class for floating types Float, Double
+
+    Note:
+       for internal iceberg use only
+
+    Examples:
+        Can be used in place of typing for Float and Double
+    """
+
+    _neg: bool
+    _frozen_attrs = {"_neg"}
+
+    def __init__(self, float_t, value: Union[float, str, int]):
+        super().__init__(value)
+        object.__setattr__(self, "_neg", str(self.value).strip().startswith("-"))
+        if isinstance(self.value, Number):
+            self.value = float_t(self.value.value)
+        else:
+            self.value = float_t(self.value)
+
+    @classmethod
+    def to_bytes(cls, value) -> bytes:
+        return struct.pack("d", value)
+
+    def __repr__(self) -> str:
+        ret = super().__repr__()
+        if self._neg and isnan(self.value):  # type: ignore
+            return ret.replace("nan", "-nan")
+        return ret
+
+    def is_nan(self) -> bool:
+        return isnan(self.value)  # type: ignore
+
+    def is_inf(self) -> bool:
+        return isinf(self.value)  # type: ignore
+
+    def __str__(self) -> str:
+        ret = super().__str__()
+        if self._neg and self.is_nan():
+            return ret.replace("nan", "-nan")
+        if self._neg and self.value == 0.0:
+            return ret.replace("0.0", "-0.0")
+        return ret
+
+    def __lt__(self, other: "Floating") -> bool:
+        try:
+            other = other.to(type(self))
+        except TypeError:
+            raise TypeError(
+                f"Cannot compare {self} with {other}. Perhaps try coercing to the appropriate type as {other}.to({type(self)}, coerce=True)."
+            )
+        except AttributeError:
+            raise TypeError(
+                f"Cannot compare {self} with {other}. Try creating an appropriate type {type(self)}({other})."
+            )
+
+        def get_key(x) -> str:
+            if x.is_nan():
+                ret = "nan"
+            elif x.is_inf():
+                ret = "inf"
+            else:
+                return "value"
+            return ("-" if x._neg else "") + ret
+
+        ret_dict: Dict[Tuple[str, str], bool] = {
+            ("inf", "value"): False,
+            ("nan", "nan"): False,
+            ("-inf", "-inf"): False,
+            ("value", "inf"): True,
+            ("-inf", "-nan"): False,
+            ("-nan", "-nan"): False,
+            ("value", "-nan"): False,
+            ("-nan", "-inf"): True,
+            ("-inf", "inf"): True,
+            ("-nan", "nan"): True,
+            ("nan", "value"): False,
+            ("nan", "-nan"): False,
+            ("inf", "nan"): False,
+            ("-nan", "inf"): True,
+            ("inf", "inf"): False,
+            ("nan", "-inf"): False,
+            ("value", "value"): (self._neg and not other._neg) or (self.value < other.value),  # type: ignore
+            ("-nan", "value"): True,
+            ("value", "nan"): True,
+            ("-inf", "value"): True,
+            ("-inf", "nan"): True,
+            ("inf", "-inf"): False,
+            ("nan", "inf"): True,
+            ("value", "-inf"): False,
+            ("inf", "-nan"): False,
+        }
+        return ret_dict[(get_key(self), get_key(other))]
 
-class FixedType(Type):
-    def __init__(self, length: int):
-        super().__init__(
-            f"fixed[{length}]", f"FixedType(length={length})", is_primitive=True
-        )
-        self._length = length
 
-    @property
-    def length(self) -> int:
-        return self._length
+class Integer(Integral):
+    """32-bit signed integers: `int` from https://iceberg.apache.org/#spec/#primitive-types
+
+
+    Args:
+        value: value for which the integer will represent
 
+    Attributes:
+        value (int): the literal value contained by the `Integer`
+        max (int): the maximum value `Integer` may take on
+        min (int): the minimum value `Integer` may take on
 
-class DecimalType(Type):
-    def __init__(self, precision: int, scale: int):
-        super().__init__(
-            f"decimal({precision}, {scale})",
-            f"DecimalType(precision={precision}, scale={scale})",
-            is_primitive=True,
-        )
-        self._precision = precision
-        self._scale = scale
-
-    @property
-    def precision(self) -> int:
-        return self._precision
-
-    @property
-    def scale(self) -> int:
-        return self._scale
-
-
-class NestedField(object):
-    def __init__(
-        self,
-        is_optional: bool,
-        field_id: int,
-        name: str,
-        field_type: Type,
-        doc: Optional[str] = None,
-    ):
-        self._is_optional = is_optional
-        self._id = field_id
-        self._name = name
-        self._type = field_type
-        self._doc = doc
-
-    @property
-    def is_optional(self) -> bool:
-        return self._is_optional
-
-    @property
-    def is_required(self) -> bool:
-        return not self._is_optional
-
-    @property
-    def field_id(self) -> int:
-        return self._id
-
-    @property
-    def name(self) -> str:
-        return self._name
-
-    @property
-    def type(self) -> Type:
-        return self._type
-
-    def __repr__(self):
-        return (
-            f"NestedField(is_optional={self._is_optional}, field_id={self._id}, "
-            f"name={repr(self._name)}, field_type={repr(self._type)}, doc={repr(self._doc)})"
+    Examples:
+        >>> Integer(5)
+        Integer(value=5)
+
+        >>> Integer('3.14')
+        Integer(value=3)
+
+        >>> Integer(3.14)
+        Integer(value=3)
+
+    """
+
+    max: int = 2147483647
+    min: int = -2147483648
+
+    @classmethod
+    def can_cast(cls, _type):
+        return _type in (cls, Long)
+
+
+class Long(Integral):
+    """64-bit signed integers: `long` from https://iceberg.apache.org/#spec/#primitive-types
+
+
+    Args:
+        value: value for which the long will represent
+
+    Attributes:
+        value (int): the literal value contained by the `Long`
+        max (int): the maximum value `Long` may take on
+        min (int): the minimum value `Long` may take on
+
+    Examples:
+        >>> Long(5)
+        Long(value=5)
+
+        >>> Long('3.14')
+        Long(value=3)
+
+        >>> Long(3.14)
+        Long(value=3)
+    """
+
+    max: int = 9223372036854775807
+    min: int = -9223372036854775808
+
+
+class Float(Floating):
+    """32-bit IEEE 754 floating point: `float` from https://iceberg.apache.org/#spec/#primitive-types
+
+    Args:
+        value: value for which the float will represent
+
+    Examples:
+        >>> Float(5)
+        Float(value=5.0)
+
+        >>> Float(3.14)
+        Float(value=3.14)
+    """
+
+    # float32 ensures spec
+    value: float32
+
+    def __init__(self, value):
+        super().__init__(float32, value)
+
+    @classmethod
+    def can_cast(cls, _type):
+        return _type in (cls, Double)
+
+
+class Double(Floating):
+    """64-bit IEEE 754 floating point: `double` from https://iceberg.apache.org/#spec/#primitive-types
+
+    Args:
+        value: value for which the double will represent
+
+    Examples:
+        >>> Double(5)
+        Double(value=5.0)
+
+        >>> Double(3.14)
+        Double(value=3)
+
+    """
+
+    # float64 ensures spec
+    value: float64
+
+    def __init__(self, value):
+        super().__init__(float64, value)
+
+
+class Boolean(PrimitiveType):
+    """`boolean` from https://iceberg.apache.org/#spec/#primitive-types
+
+    Args:
+            value (bool): value the boolean will represent
+
+    Examples:
+            >>>Boolean(True)
+            Boolean(value=True)
+    """
+
+    value: bool
+
+    def __bool__(self):
+        return self.value
+
+    @classmethod
+    def to_bytes(self, value) -> bytes:
+        return Integer.to_bytes(value)
+
+    def __hash__(self) -> int:
+        return super().__hash__()
+
+    def __eq__(self, other) -> bool:
+        return isinstance(other, Boolean) and self.value == other.value
+
+
+class String(PrimitiveType):
+    """Arbitrary-length character sequences Encoded with UTF-8: `string` from https://iceberg.apache.org/#spec/#primitive-types
+
+    Args:
+        value (str): value the string will represent
+
+    Attributes:
+        value (str): the literal value contained by the `String`
+
+    Examples:
+        >>> String("Hello")
+        String(value='Hello')
+    """
+
+    value: str
+
+    @classmethod
+    def hash(cls, value):
+        return mmh3.hash(value)
+
+
+class UUID(PrimitiveType):
+    """Universally unique identifiers: `uuid` from https://iceberg.apache.org/#spec/#primitive-types
+
+    Args:
+        value: value the uuid will represent
+
+    Attributes:
+        value (uuid.UUID): literal value contained by the `UUID`
+
+    Examples:
+        >>> UUID("f79c3e09-677c-4bbd-a479-3f349cb785e7")
+        UUID(value=f79c3e09-677c-4bbd-a479-3f349cb785e7)
+    """
+
+    value: PythonUUID
+
+    def __init__(self, value: Union[str, PythonUUID]):
+        super().__init__(value)
+        if not isinstance(self.value, PythonUUID):
+            self.value = PythonUUID(self.value)
+
+    def __int__(self) -> int:
+        return self.value.int
+
+    @classmethod
+    def to_bytes(cls, value) -> bytes:
+        v = int(value.int)

Review comment:
       Shouldn't this cast the value to a UUID when it's a string, as `__init__` does?
   ```py
       @classmethod
       def to_bytes(cls, value) -> bytes:
           uuid = PythonUUID(value) if not isinstance(value, PythonUUID) else value
           v = int(uuid.int)
           return struct.pack(
               ">QQ",
               (v >> 64) & 0xFFFFFFFFFFFFFFFF,
               v & 0xFFFFFFFFFFFFFFFF,
           )
   ```
   Alternatively, the class could be instantiated here by changing that one line to `v = int(cls(value))`, which works because `UUID.__int__` returns `self.value.int`.
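
A minimal sketch of the first option, written as a standalone function rather than the PR's `to_bytes` classmethod (the name `uuid_to_bytes` is hypothetical, not part of the PR). It assumes any string input is a valid UUID string and uses only the standard-library `uuid` and `struct` modules:

```python
import struct
from uuid import UUID as PythonUUID


def uuid_to_bytes(value) -> bytes:
    """Serialize a UUID, given as a string or a uuid.UUID, to 16 big-endian bytes."""
    # Coerce strings the same way the PR's `UUID.__init__` does.
    uuid = value if isinstance(value, PythonUUID) else PythonUUID(value)
    v = uuid.int
    # Split the 128-bit integer into two unsigned 64-bit halves.
    return struct.pack(
        ">QQ",
        (v >> 64) & 0xFFFFFFFFFFFFFFFF,
        v & 0xFFFFFFFFFFFFFFFF,
    )
```

Packing the two halves big-endian should give the same 16 bytes as the standard library's `UUID.bytes` attribute, so returning `uuid.bytes` directly may be a simpler equivalent.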

##########
File path: python/src/iceberg/types.py
##########
@@ -15,157 +15,665 @@
 # specific language governing permissions and limitations
 # under the License.
 
-from typing import Optional
+import struct
+from base64 import b64encode
+from datetime import date, datetime, time
+from decimal import Decimal as PythonDecimal
+from typing import Dict, Optional, Tuple, Type, Union
+from uuid import UUID as PythonUUID
+
+import mmh3
+from numpy import float32, float64, isinf, isnan
+
+
+class IcebergType:
+    """Base type for all Iceberg Types"""
+
+    def __setattr__(self, key, value):
+        if key in getattr(self, "_frozen_attrs", set()) or key in getattr(
+            type(self), "_always_frozen", set()
+        ):
+            raise AttributeError(f"{key} may not be altered on instance {self}")
+        object.__setattr__(self, key, value)
+
+    # freeze deleting of generic attributes
+    def __delattr__(self, key):
+        if key in getattr(self, "_frozen_attrs", set()) or key in getattr(
+            type(self), "_always_frozen", set()
+        ):
+            raise AttributeError(f"{key} may not be deleted on instance {self}")
+        object.__delattr__(self, key)
+
+    @classmethod
+    def can_cast(cls, _type: Type["IcebergType"]):
+        return cls == _type
+
+
+class PrimitiveType(IcebergType):
+    """
+    base type for primitives `IcebergType`s
+    Primitives include an instance attribute `value` which is used as the underlying value to work with the type
+    a `PrimitiveType` should type the instance `value` most specific to that type
+    """
+
+    value: Union[
+        bytes,
+        bool,
+        int,
+        float32,
+        float64,
+        PythonDecimal,
+        str,
+        dict,
+        PythonUUID,
+        date,
+        time,
+        datetime,
+    ]
+
+    def __init__(self, value):
+
+        if issubclass(type(value), PrimitiveType):
+            try:
+                self.value = value.to(type(self)).value
+            except AttributeError:
+                raise TypeError(f"Cannot convert {value} to type {type(self)}")
+        else:
+            self.value = value
+
+    def __repr__(self) -> str:
+        return f"{repr(type(self))}(value={self.value})"  # type: ignore
+
+    def __str__(self) -> str:
+        return f"{str(type(self))}({self.value})"  # type: ignore
+
+    def __bool__(self) -> bool:
+        return bool(self.value)
+
+    def __eq__(self, other) -> bool:
+        return type(other) == type(self) and self.value == other.value
+
+    def __bytes__(self) -> bytes:
+        return type(self).to_bytes(self.value)
+
+    @classmethod
+    def to_bytes(cls, value):
+        return bytes(value)
+
+    def __hash__(self) -> int:
+        """https://iceberg.apache.org/#spec/#appendix-b-32-bit-hash-requirements"""
+        return type(self).hash(self.value)
+
+    @classmethod
+    def hash(cls, value):
+        return mmh3.hash(cls.to_bytes(value))
+
+    def to(self, _type: Type["PrimitiveType"], coerce: bool = False):
+        if type(self).can_cast(_type) or coerce:
+            return _type(self.value)
+        raise TypeError(f"Cannot cast {type(self)} to {_type}.")
+
+
+class Number(PrimitiveType):
+    """
+    base `PrimitiveType` for `IcebergType`s for numeric types
+    per https://iceberg.apache.org/#spec/#primitive-types these include int, long, float, double, decimal
+    """
+
+    value: Union[int, float32, float64, PythonDecimal]
+
+    def __float__(self) -> float:
+        return float(self.value)
+
+    def __int__(self) -> int:
+        return int(self.value)
+
+    def __math(self, op, other=None):

Review comment:
       I think this could benefit from a descriptive docstring. One thing it should mention is that all of the math operation magic methods for this class defer to this single method.
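
To illustrate the kind of docstring meant here, a rough sketch on a stand-in class follows. `SketchNumber` and `_math` are hypothetical names, and unlike the PR's `__math`, which receives dunder names as strings, this version takes callables from the standard `operator` module:

```python
import operator


class SketchNumber:
    """Hypothetical stand-in for the PR's `Number`; it only wraps `value`."""

    def __init__(self, value):
        self.value = value

    def _math(self, op, other=None):
        """Apply `op` to the wrapped value and return a new instance.

        Every arithmetic magic method on this class (`__add__`, `__sub__`,
        `__mul__`, `__neg__`, `__abs__`) defers to this single method, so
        operand checks and result wrapping live in one place.

        Args:
            op: callable from the `operator` module, e.g. `operator.add`
            other: right-hand operand for binary operators; None for unary ones

        Raises:
            TypeError: if `other` is given but is not a `SketchNumber`
        """
        if other is None:
            # Unary operators such as negation take only the wrapped value.
            return type(self)(op(self.value))
        if not isinstance(other, SketchNumber):
            raise TypeError(f"Cannot combine {type(self).__name__} with {other!r}")
        return type(self)(op(self.value, other.value))

    def __add__(self, other):
        return self._math(operator.add, other)

    def __sub__(self, other):
        return self._math(operator.sub, other)

    def __mul__(self, other):
        return self._math(operator.mul, other)

    def __neg__(self):
        return self._math(operator.neg)

    def __abs__(self):
        return self._math(operator.abs)
```

With the docstring in place, a reader who lands on `__add__` can follow it to one method and see the shared behaviour described there.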

##########
File path: python/tests/test_types.py
##########
@@ -15,149 +15,196 @@
 # specific language governing permissions and limitations
 # under the License.
 
+import math
+
 import pytest
 
 from iceberg.types import (
-    BinaryType,
-    BooleanType,
-    DateType,
-    DecimalType,
-    DoubleType,
-    FixedType,
-    FloatType,
-    IntegerType,
-    ListType,
-    LongType,
-    MapType,
-    NestedField,
-    StringType,
-    StructType,
-    TimestampType,
-    TimestamptzType,
-    TimeType,
-    UUIDType,
+    UUID,
+    Binary,
+    Boolean,
+    Date,
+    Double,
+    Float,
+    Integer,
+    Long,
+    String,
+    Time,
+    Timestamp,
+    Timestamptz,
+)
+
+
+# https://iceberg.apache.org/#spec/#appendix-b-32-bit-hash-requirements
+@pytest.mark.parametrize(
+    "instance,expected",
+    [
+        (UUID("f79c3e09-677c-4bbd-a479-3f349cb785e7"), 1488055340),
+        (Boolean(True), 1392991556),
+        (Integer(34), 2017239379),
+        (Long(34), 2017239379),
+        (Float(1), -142385009),
+        (Double(1), -142385009),
+        (String("iceberg"), 1210000089),
+        (Binary(b"\x00\x01\x02\x03"), -188683207),
+        (Date("2017-11-16"), -653330422),
+        (Time(22, 31, 8), -662762989),
+        (Timestamp("2017-11-16T14:31:08-08:00"), -2047944441),
+        (Timestamptz("2017-11-16T14:31:08-08:00"), -2047944441),
+    ],
+)
+def test_hashing(instance, expected):
+    assert hash(instance) == expected
+
+
+def test_integer_under_overflows():
+    with pytest.raises(ValueError):
+        Integer(2 ** 31)
+    with pytest.raises(ValueError):
+        Integer(-(2 ** 31) - 1)
+
+
+@pytest.mark.parametrize(
+    "_from, to, coerce",
+    [
+        (Integer(5), Long, False),
+        (Integer(5), Double, True),
+        (Integer(5), Float, True),
+        (Float(3.14), Double, True),
+    ],
 )
+def test_number_casting_succeeds(_from, to, coerce):
+    assert _from.to(to, coerce)
 
 
 @pytest.mark.parametrize(
-    "input_type",
+    "_from, to",
     [
-        BooleanType,
-        IntegerType,
-        LongType,
-        FloatType,
-        DoubleType,
-        DateType,
-        TimeType,
-        TimestampType,
-        TimestamptzType,
-        StringType,
-        UUIDType,
-        BinaryType,
+        (Integer(5), Double),
+        (Integer(5), Float),
+        (Long(5), Double),
+        (Long(5), Float),

Review comment:
       Should we check a few more failing casts here? For example:
   ```
   (String("5"), Integer),
   (String("5"), Long),
   (String("5"), Double),
   (Long(5), Integer),
   (Integer(5), String),
   (UUID("f79c3e09-677c-4bbd-a479-3f349cb785e7"), Integer)
   ```
   I also noticed that `Time().to(Integer)` fails. I'm wondering whether it should instead cast to a Unix epoch value (and vice versa).
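
On the `Time` question, one possible direction: the Iceberg spec stores `time` values as microseconds from midnight rather than as an epoch offset, so a cast to an integral type could be based on that representation. A minimal sketch, assuming naive `datetime.time` values (the helper names `time_to_micros` and `micros_to_time` are hypothetical and not part of the PR):

```python
from datetime import time


def time_to_micros(t: time) -> int:
    """Microseconds elapsed since midnight for a naive `datetime.time`."""
    seconds = t.hour * 3600 + t.minute * 60 + t.second
    return seconds * 1_000_000 + t.microsecond


def micros_to_time(micros: int) -> time:
    """Inverse of `time_to_micros` for values within a single day."""
    seconds, microsecond = divmod(micros, 1_000_000)
    minutes, second = divmod(seconds, 60)
    hour, minute = divmod(minutes, 60)
    return time(hour, minute, second, microsecond)
```

For example, `time(22, 31, 8)` maps to `81068000000` microseconds, and `micros_to_time` recovers the original value.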

##########
File path: python/tox.ini
##########
@@ -23,7 +23,10 @@ usedevelop = true
 deps =
     coverage
     mock
+    numpy
     pytest
+    mmh3
+extras = dev

Review comment:
       This just installs the `[dev]` extras listed in `setup.py`, right? I don't think we need it here, since the test dependencies are already defined right above.

##########
File path: python/tests/test_types.py
##########
@@ -15,149 +15,196 @@
 # specific language governing permissions and limitations
 # under the License.
 
+import math
+
 import pytest
 
 from iceberg.types import (
-    BinaryType,
-    BooleanType,
-    DateType,
-    DecimalType,
-    DoubleType,
-    FixedType,
-    FloatType,
-    IntegerType,
-    ListType,
-    LongType,
-    MapType,
-    NestedField,
-    StringType,
-    StructType,
-    TimestampType,
-    TimestamptzType,
-    TimeType,
-    UUIDType,
+    UUID,
+    Binary,
+    Boolean,
+    Date,
+    Double,
+    Float,
+    Integer,
+    Long,
+    String,
+    Time,
+    Timestamp,
+    Timestamptz,
+)
+
+
+# https://iceberg.apache.org/#spec/#appendix-b-32-bit-hash-requirements
+@pytest.mark.parametrize(
+    "instance,expected",
+    [
+        (UUID("f79c3e09-677c-4bbd-a479-3f349cb785e7"), 1488055340),
+        (Boolean(True), 1392991556),
+        (Integer(34), 2017239379),
+        (Long(34), 2017239379),
+        (Float(1), -142385009),
+        (Double(1), -142385009),
+        (String("iceberg"), 1210000089),
+        (Binary(b"\x00\x01\x02\x03"), -188683207),
+        (Date("2017-11-16"), -653330422),
+        (Time(22, 31, 8), -662762989),
+        (Timestamp("2017-11-16T14:31:08-08:00"), -2047944441),
+        (Timestamptz("2017-11-16T14:31:08-08:00"), -2047944441),
+    ],
+)
+def test_hashing(instance, expected):

Review comment:
       Should we also have a similar test but using the hash class methods?
   ```py
   @pytest.mark.parametrize(
       "hashed,expected",
       [
           (UUID.hash("f79c3e09-677c-4bbd-a479-3f349cb785e7"), 1488055340),
           (Boolean.hash(True), 1392991556),
           (Integer.hash(34), 2017239379),
           (Long.hash(34), 2017239379),
           (Float.hash(1), -142385009),
           (Double.hash(1), -142385009),
           (String.hash("iceberg"), 1210000089),
           (Binary.hash(b"\x00\x01\x02\x03"), -188683207),
           (Date.hash("2017-11-16"), -653330422),
           (Time.hash(22, 31, 8), -662762989),
           (Timestamp.hash("2017-11-16T14:31:08-08:00"), -2047944441),
           (Timestamptz.hash("2017-11-16T14:31:08-08:00"), -2047944441),
       ],
   )
   def test_hashing_using_type_class_method(hashed, expected):
       # Each case already holds the value returned by the class-level `hash`.
       assert hashed == expected
   ```
   The types could all expose a class-level `hash` function (the diff already defines one as a `@classmethod`) that can be passed around and used on raw values without instantiating the class first, although the cost of instantiation may well be negligible. Thoughts?
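
A rough sketch of that pattern on a stand-in class, to show the class-level function and the instance hash sharing one code path. `SketchLong` and `stand_in_hash` are hypothetical names, and `zlib.crc32` is only a placeholder so the example runs without third-party packages; it is not the 32-bit Murmur3 hash that the spec requires and `mmh3.hash` provides:

```python
import struct
import zlib


def stand_in_hash(data: bytes) -> int:
    """Placeholder for `mmh3.hash`: CRC-32 folded into the signed 32-bit range."""
    unsigned = zlib.crc32(data)
    return unsigned - (1 << 32) if unsigned >= (1 << 31) else unsigned


class SketchLong:
    """Hypothetical stand-in for the PR's `Long`."""

    def __init__(self, value: int):
        self.value = value

    @classmethod
    def to_bytes(cls, value: int) -> bytes:
        # "<q" pins the layout to 8-byte little-endian on every platform.
        return struct.pack("<q", value)

    @classmethod
    def hash(cls, value: int) -> int:
        # Works on a raw value; no instance is needed.
        return stand_in_hash(cls.to_bytes(value))

    def __hash__(self) -> int:
        # The instance hash reuses the class-level function.
        return type(self).hash(self.value)
```

Here `SketchLong.hash` can be handed to `map` or stored in a lookup table keyed by type, and it agrees with the instance hash by construction. The explicit `"<q"` format is a deliberate difference from the `"q"` in the diff, which uses the platform's native byte order, whereas the spec's hash requirements are defined over little-endian bytes.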

##########
File path: python/src/iceberg/types.py
##########
@@ -15,157 +15,665 @@
 # specific language governing permissions and limitations
 # under the License.
 
-from typing import Optional
+import struct
+from base64 import b64encode
+from datetime import date, datetime, time
+from decimal import Decimal as PythonDecimal
+from typing import Dict, Optional, Tuple, Type, Union
+from uuid import UUID as PythonUUID
+
+import mmh3
+from numpy import float32, float64, isinf, isnan
+
+
+class IcebergType:
+    """Base type for all Iceberg Types"""
+
+    def __setattr__(self, key, value):
+        if key in getattr(self, "_frozen_attrs", set()) or key in getattr(
+            type(self), "_always_frozen", set()
+        ):
+            raise AttributeError(f"{key} may not be altered on instance {self}")
+        object.__setattr__(self, key, value)
+
+    # freeze deleting of generic attributes
+    def __delattr__(self, key):
+        if key in getattr(self, "_frozen_attrs", set()) or key in getattr(
+            type(self), "_always_frozen", set()
+        ):
+            raise AttributeError(f"{key} may not be deleted on instance {self}")
+        object.__delattr__(self, key)
+
+    @classmethod
+    def can_cast(cls, _type: Type["IcebergType"]):
+        return cls == _type
+
+
+class PrimitiveType(IcebergType):
+    """
+    base type for primitives `IcebergType`s
+    Primitives include an instance attribute `value` which is used as the underlying value to work with the type
+    a `PrimitiveType` should type the instance `value` most specific to that type
+    """
+
+    value: Union[
+        bytes,
+        bool,
+        int,
+        float32,
+        float64,
+        PythonDecimal,
+        str,
+        dict,
+        PythonUUID,
+        date,
+        time,
+        datetime,
+    ]
+
+    def __init__(self, value):
+
+        if issubclass(type(value), PrimitiveType):
+            try:
+                self.value = value.to(type(self)).value
+            except AttributeError:
+                raise TypeError(f"Cannot convert {value} to type {type(self)}")
+        else:
+            self.value = value
+
+    def __repr__(self) -> str:
+        return f"{repr(type(self))}(value={self.value})"  # type: ignore
+
+    def __str__(self) -> str:
+        return f"{str(type(self))}({self.value})"  # type: ignore
+
+    def __bool__(self) -> bool:
+        return bool(self.value)
+
+    def __eq__(self, other) -> bool:
+        return type(other) == type(self) and self.value == other.value
+
+    def __bytes__(self) -> bytes:
+        return type(self).to_bytes(self.value)
+
+    @classmethod
+    def to_bytes(cls, value):
+        return bytes(value)
+
+    def __hash__(self) -> int:
+        """https://iceberg.apache.org/#spec/#appendix-b-32-bit-hash-requirements"""
+        return type(self).hash(self.value)
+
+    @classmethod
+    def hash(cls, value):
+        return mmh3.hash(cls.to_bytes(value))
+
+    def to(self, _type: Type["PrimitiveType"], coerce: bool = False):
+        if type(self).can_cast(_type) or coerce:
+            return _type(self.value)
+        raise TypeError(f"Cannot cast {type(self)} to {_type}.")
+
+
+class Number(PrimitiveType):
+    """
+    base `PrimitiveType` for `IcebergType`s for numeric types
+    per https://iceberg.apache.org/#spec/#primitive-types these include int, long, float, double, decimal
+    """
+
+    value: Union[int, float32, float64, PythonDecimal]
+
+    def __float__(self) -> float:
+        return float(self.value)
+
+    def __int__(self) -> int:
+        return int(self.value)
+
+    def __math(self, op, other=None):
+        op_f = getattr(self.value, op)
+        try:
+            if op in (
+                "__add__",
+                "__sub__",
+                "__div__",
+                "__mul__",
+            ):
+                other = other.to(type(self))
+                return type(self)(op_f(other.value))
+            if op in (
+                "__pow__",
+                "__mod__",
+            ):
+                other = type(self)(other)
+                return type(self)(op_f(other.value))
+            if op in ("__lt__", "__eq__"):
+                other = other.to(type(self))
+                return op_f(other.value)
+        except TypeError:
+            raise TypeError(
+                f"Cannot compare {self} with {other}. Perhaps try coercing to the appropriate type as {other}.to({type(self)}, coerce=True)."
+            )
+        except AttributeError:
+            raise TypeError(
+                f"Cannot compare {self} with {other}. Try creating an appropriate type with {type(self)}({other})."
+            )
+
+        if op in ("__neg__", "__abs__"):
+            return type(self)(op_f())
+
+    def __add__(self, other: "Number") -> "Number":
+        return self.__math("__add__", other)
+
+    def __sub__(self, other: "Number") -> "Number":
+        return self.__math("__sub__", other)
+
+    def __mul__(self, other: "Number") -> "Number":
+        return self.__math("__mul__", other)
 
+    def __div__(self, other: "Number") -> "Number":
+        return self.__math("__div__", other)
 
-class Type(object):
-    def __init__(self, type_string: str, repr_string: str, is_primitive=False):
-        self._type_string = type_string
-        self._repr_string = repr_string
-        self._is_primitive = is_primitive
+    def __neg__(self) -> "Number":
+        return self.__math("__neg__")
+
+    def __abs__(self) -> "Number":
+        return self.__math("__abs__")
 
-    def __repr__(self):
-        return self._repr_string
+    def __pow__(self, other: "Number", mod: Optional["Number"] = None) -> "Number":
+        return self.__math("__pow__", other)
 
-    def __str__(self):
-        return self._type_string
+    def __mod__(self, other: "Number") -> "Number":
+        return self.__math("__mod__", other)
 
-    @property
-    def is_primitive(self) -> bool:
-        return self._is_primitive
+    def __lt__(self, other) -> bool:
+        return self.__math("__lt__", other)
 
+    def __eq__(self, other) -> bool:
+        return self.__math("__eq__", other) and self._neg == other._neg  # type: ignore
+
+    def __gt__(self, other) -> bool:
+        return not self.__le__(other)
+
+    def __le__(self, other) -> bool:
+        return self.__lt__(other) or self.__eq__(other)
+
+    def __ge__(self, other) -> bool:
+        return self.__gt__(other) or self.__eq__(other)
+
+    def __hash__(self) -> int:
+        return super().__hash__()
+
+
+class Integral(Number):
+    """base class for integral types Integer, Long
+
+    Note:
+       for internal iceberg use only
+
+    Examples:
+        Can be used in place of typing for Integer and Long
+    """
+
+    value: int
+    _neg: bool
+    _frozen_attrs = {"min", "max", "_neg"}
+
+    def __init__(self, value: Union[str, float, int]):
+        super().__init__(value)
+
+        if isinstance(self.value, Number):
+            self.value = int(self.value.value)
+        else:
+            self.value = int(self.value)
+        self._check()
+        object.__setattr__(self, "_neg", self.value < 0)
+
+    def _check(self) -> "Integral":
+        """
+        helper method for `Integral` specific `_check` to ensure value is within spec
+        """
+        if self.value > self.max:  # type: ignore
+            raise ValueError(f"{type(self)} must be less than or equal to {self.max}")  # type: ignore
+
+        if self.value < self.min:  # type: ignore
+            raise ValueError(
+                f"{type(self)} must be greater than or equal to {self.min}"  # type: ignore
+            )
+
+        return self
+
+    @classmethod
+    def to_bytes(cls, value) -> bytes:
+        return struct.pack("q", value)
+
+
+class Floating(Number):
+    """base class for floating types Float, Double
+
+    Note:
+       for internal iceberg use only
+
+    Examples:
+        Can be used in place of typing for Float and Double
+    """
+
+    _neg: bool
+    _frozen_attrs = {"_neg"}
+
+    def __init__(self, float_t, value: Union[float, str, int]):
+        super().__init__(value)
+        object.__setattr__(self, "_neg", str(self.value).strip().startswith("-"))
+        if isinstance(self.value, Number):
+            self.value = float_t(self.value.value)
+        else:
+            self.value = float_t(self.value)
+
+    @classmethod
+    def to_bytes(cls, value) -> bytes:
+        return struct.pack("d", value)
+
+    def __repr__(self) -> str:
+        ret = super().__repr__()
+        if self._neg and isnan(self.value):  # type: ignore
+            return ret.replace("nan", "-nan")
+        return ret
+
+    def is_nan(self) -> bool:
+        return isnan(self.value)  # type: ignore
+
+    def is_inf(self) -> bool:
+        return isinf(self.value)  # type: ignore
+
+    def __str__(self) -> str:
+        ret = super().__str__()
+        if self._neg and self.is_nan():
+            return ret.replace("nan", "-nan")
+        if self._neg and self.value == 0.0:
+            return ret.replace("0.0", "-0.0")
+        return ret
+
+    def __lt__(self, other: "Floating") -> bool:
+        try:
+            other = other.to(type(self))
+        except TypeError:
+            raise TypeError(
+                f"Cannot compare {self} with {other}. Perhaps try coercing to the appropriate type as {other}.to({type(self)}, coerce=True)."
+            )
+        except AttributeError:
+            raise TypeError(
+                f"Cannot compare {self} with {other}. Try creating an appropriate type with {type(self)}({other})."
+            )
+
+        def get_key(x) -> str:
+            if x.is_nan():
+                ret = "nan"
+            elif x.is_inf():
+                ret = "inf"
+            else:
+                return "value"
+            return ("-" if x._neg else "") + ret
+
+        ret_dict: Dict[Tuple[str, str], bool] = {
+            ("inf", "value"): False,
+            ("nan", "nan"): False,
+            ("-inf", "-inf"): False,
+            ("value", "inf"): True,
+            ("-inf", "-nan"): False,
+            ("-nan", "-nan"): False,
+            ("value", "-nan"): False,
+            ("-nan", "-inf"): True,
+            ("-inf", "inf"): True,
+            ("-nan", "nan"): True,
+            ("nan", "value"): False,
+            ("nan", "-nan"): False,
+            ("inf", "nan"): False,
+            ("-nan", "inf"): True,
+            ("inf", "inf"): False,
+            ("nan", "-inf"): False,
+            ("value", "value"): (self._neg and not other._neg) or (self.value < other.value),  # type: ignore
+            ("-nan", "value"): True,
+            ("value", "nan"): True,
+            ("-inf", "value"): True,
+            ("-inf", "nan"): True,
+            ("inf", "-inf"): False,
+            ("nan", "inf"): True,
+            ("value", "-inf"): False,
+            ("inf", "-nan"): False,
+        }
+        return ret_dict[(get_key(self), get_key(other))]
 
-class FixedType(Type):
-    def __init__(self, length: int):
-        super().__init__(
-            f"fixed[{length}]", f"FixedType(length={length})", is_primitive=True
-        )
-        self._length = length
 
-    @property
-    def length(self) -> int:
-        return self._length
+class Integer(Integral):
+    """32-bit signed integers: `int` from https://iceberg.apache.org/#spec/#primitive-types
+
+
+    Args:
+        value: value for which the integer will represent
 
+    Attributes:
+        value (int): the literal value contained by the `Integer`
+        max (int): the maximum value `Integer` may take on
+        min (int): the minimum value `Integer` may take on
 
-class DecimalType(Type):
-    def __init__(self, precision: int, scale: int):
-        super().__init__(
-            f"decimal({precision}, {scale})",
-            f"DecimalType(precision={precision}, scale={scale})",
-            is_primitive=True,
-        )
-        self._precision = precision
-        self._scale = scale
-
-    @property
-    def precision(self) -> int:
-        return self._precision
-
-    @property
-    def scale(self) -> int:
-        return self._scale
-
-
-class NestedField(object):
-    def __init__(
-        self,
-        is_optional: bool,
-        field_id: int,
-        name: str,
-        field_type: Type,
-        doc: Optional[str] = None,
-    ):
-        self._is_optional = is_optional
-        self._id = field_id
-        self._name = name
-        self._type = field_type
-        self._doc = doc
-
-    @property
-    def is_optional(self) -> bool:
-        return self._is_optional
-
-    @property
-    def is_required(self) -> bool:
-        return not self._is_optional
-
-    @property
-    def field_id(self) -> int:
-        return self._id
-
-    @property
-    def name(self) -> str:
-        return self._name
-
-    @property
-    def type(self) -> Type:
-        return self._type
-
-    def __repr__(self):
-        return (
-            f"NestedField(is_optional={self._is_optional}, field_id={self._id}, "
-            f"name={repr(self._name)}, field_type={repr(self._type)}, doc={repr(self._doc)})"
+    Examples:
+        >>> Integer(5)
+        Integer(value=5)
+
+        >>> Integer('3.14')
+        Integer(value=3)
+
+        >>> Integer(3.14)
+        Integer(value=3)
+
+    """
+
+    max: int = 2147483647
+    min: int = -2147483648
+
+    @classmethod
+    def can_cast(cls, _type):
+        return _type in (cls, Long)
+
+
+class Long(Integral):
+    """64-bit signed integers: `long` from https://iceberg.apache.org/#spec/#primitive-types
+
+
+    Args:
+        value: value for which the long will represent
+
+    Attributes:
+        value (int): the literal value contained by the `Long`
+        max (int): the maximum value `Long` may take on
+        min (int): the minimum value `Long` may take on
+
+    Examples:
+        >>> Long(5)
+        Long(value=5)
+
+        >>> Long('3.14')
+        Long(value=3)
+
+        >>> Long(3.14)
+        Long(value=3)
+    """
+
+    max: int = 9223372036854775807
+    min: int = -9223372036854775808
+
+
+class Float(Floating):
+    """32-bit IEEE 754 floating point: `float` from https://iceberg.apache.org/#spec/#primitive-types
+
+    Args:
+        value: value for which the float will represent
+
+     Examples:
+        >>> Float(5)
+        Float(value=5.0)
+
+        >>> Float(3.14)
+        Float(value=3)
+    """
+
+    # float32 ensures spec
+    value: float32
+
+    def __init__(self, value):
+        super().__init__(float32, value)
+
+    @classmethod
+    def can_cast(cls, _type):
+        return _type in (cls, Double)
+
+
+class Double(Floating):
+    """64-bit IEEE 754 floating point: `double` from https://iceberg.apache.org/#spec/#primitive-types
+
+    Args:
+        value: value for which the double will represent
+
+    Examples:
+        >>> Double(5)
+        Double(value=5.0)
+
+        >>> Double(3.14)
+        Double(value=3)
+
+    """
+
+    # float64 ensures spec
+    value: float64
+
+    def __init__(self, value):
+        super().__init__(float64, value)
+
+
+class Boolean(PrimitiveType):
+    """`boolean` from https://iceberg.apache.org/#spec/#primitive-types
+
+    Args:
+            value (bool): value the boolean will represent
+
+    Examples:
+            >>>Boolean(True)
+            Boolean(value=True)
+    """
+
+    value: bool
+
+    def __bool__(self):
+        return self.value
+
+    @classmethod
+    def to_bytes(self, value) -> bytes:
+        return Integer.to_bytes(value)
+
+    def __hash__(self) -> int:
+        return super().__hash__()
+
+    def __eq__(self, other) -> bool:
+        return isinstance(other, Boolean) and self.value == other.value
+
+
+class String(PrimitiveType):
+    """Arbitrary-length character sequences Encoded with UTF-8: `string` from https://iceberg.apache.org/#spec/#primitive-types
+
+    Args:
+        value (str): value the string will represent
+
+    Attributes:
+        value (str): the literal value contained by the `String`
+
+    Examples:
+        >>> String("Hello")
+        String(value='Hello')
+    """
+
+    value: str
+
+    @classmethod
+    def hash(cls, value):
+        return mmh3.hash(value)
+
+
+class UUID(PrimitiveType):

Review comment:
       Does this class need a `hash` method?:






[GitHub] [iceberg] CircArgs closed pull request #3714: Types literals all in one primitives pr

Posted by GitBox <gi...@apache.org>.
CircArgs closed pull request #3714:
URL: https://github.com/apache/iceberg/pull/3714


   

