You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/11/07 18:59:25 UTC

[GitHub] [iceberg] Fokko opened a new pull request, #6141: Python: Make invalid Literal conversions explicit

Fokko opened a new pull request, #6141:
URL: https://github.com/apache/iceberg/pull/6141

   Currently, we silently turn Literals into None if we can't convert them, instead I prefer to raise an exception. This can cause silent bugs like: `EqualTo(Reference("id"), StringLiteral("123a"))` will turn into a `IsNull` predicate since `123a` cannot be casted to a `Long` (assuming that the `id` column is a Long). This PR also adds support of casting a `StringLiteral` to a `LongLiteral`.
   
   Blocked on https://github.com/apache/iceberg/pull/6140


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1022106371


##########
python/tests/expressions/test_literals.py:
##########
@@ -312,12 +320,16 @@ def test_string_to_timestamp_literal():
 
 def test_timestamp_with_zone_without_zone_in_literal():
     timestamp_str = literal("2017-08-18T14:21:01.919234")
-    assert timestamp_str.to(TimestamptzType()) is None
+    with pytest.raises(ValueError) as e:
+        _ = timestamp_str.to(timestamp_str.to(TimestamptzType()))
+    assert "Invalid timestamp with zone: 2017-08-18T14:21:01.919234 (must be ISO-8601)" in str(e.value)

Review Comment:
   Would be nice to state that it is missing the zone offset.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Fokko commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
Fokko commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1018963278


##########
python/pyiceberg/expressions/literals.py:
##########
@@ -58,25 +60,25 @@
     timestamp_to_micros,
     timestamptz_to_micros,
 )
-from pyiceberg.utils.singleton import Singleton
 
 T = TypeVar("T")
 
 
 class Literal(Generic[T], ABC):
     """Literal which has a value and can be converted between types"""
 
-    def __init__(self, value: T, value_type: type):
+    def __init__(self, value: T, value_type: Type):
         if value is None or not isinstance(value, value_type):
             raise TypeError(f"Invalid literal value: {value} (not a {value_type})")
         self._value = value
 
     @property
     def value(self) -> T:
-        return self._value  # type: ignore
+        return self._value
 
+    @singledispatchmethod
     @abstractmethod
-    def to(self, type_var) -> Literal:
+    def to(self, type_var: IcebergType) -> Literal:

Review Comment:
   I added types to revive the TypeChecker, and it failed on this one. `to` returns a literal, therefore `AboveMax` and `BelowMin` also need to be literal. We can also turn this into a `Union[Literal, AboveMin, AboveMax]`. @rdblue thoughts?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Fokko commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
Fokko commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1019453639


##########
python/pyiceberg/expressions/literals.py:
##########
@@ -125,81 +127,73 @@ def literal(value) -> Literal:
 
 
 @literal.register(bool)
-def _(value: bool) -> Literal[bool]:
+def _(value: bool) -> BooleanLiteral:
     return BooleanLiteral(value)
 
 
 @literal.register(int)
-def _(value: int) -> Literal[int]:
+def _(value: int) -> LongLiteral:
     return LongLiteral(value)
 
 
 @literal.register(float)
-def _(value: float) -> Literal[float]:
+def _(value: float) -> DoubleLiteral:
     # expression binding can convert to FloatLiteral if needed
     return DoubleLiteral(value)
 
 
 @literal.register(str)
-def _(value: str) -> Literal[str]:
+def _(value: str) -> StringLiteral:
     return StringLiteral(value)
 
 
 @literal.register(UUID)
-def _(value: UUID) -> Literal[UUID]:
+def _(value: UUID) -> UUIDLiteral:
     return UUIDLiteral(value)
 
 
 @literal.register(bytes)
-def _(value: bytes) -> Literal[bytes]:
+def _(value: bytes) -> BinaryLiteral:
     # expression binding can convert to FixedLiteral if needed
     return BinaryLiteral(value)
 
 
 @literal.register(bytearray)
-def _(value: bytearray) -> Literal[bytes]:
+def _(value: bytearray) -> BinaryLiteral:
     return BinaryLiteral(bytes(value))
 
 
 @literal.register(Decimal)
-def _(value: Decimal) -> Literal[Decimal]:
+def _(value: Decimal) -> DecimalLiteral:
     return DecimalLiteral(value)
 
 
 @literal.register(date)
-def _(value: date) -> Literal[int]:
+def _(value: date) -> DateLiteral:
     return DateLiteral(date_to_days(value))
 
 
-class AboveMax(Singleton):
-    @property
-    def value(self):
-        raise ValueError("AboveMax has no value")
-
-    def to(self, type_var):
+class AboveMax(Literal):
+    @singledispatchmethod

Review Comment:
   Added!



##########
python/pyiceberg/expressions/literals.py:
##########
@@ -309,23 +303,23 @@ def __init__(self, value: float):
         super().__init__(value, float)
 
     @singledispatchmethod
-    def to(self, type_var):
-        return None
+    def to(self, type_var: IcebergType) -> Literal:
+        raise TypeError(f"Cannot convert DoubleLiteral into {type_var}")
 
     @to.register(DoubleType)
-    def _(self, type_var: DoubleType) -> Literal[float]:
+    def _(self, _: DoubleType) -> Literal[float]:
         return self
 
     @to.register(FloatType)
-    def _(self, _: FloatType) -> Union[AboveMax, BelowMin, Literal[float]]:
+    def _(self, _: FloatType) -> Union[AboveMax, BelowMin, FloatLiteral]:
         if FloatType.max < self.value:
-            return AboveMax()
+            return AboveMax(FloatType.max, float)

Review Comment:
   Added!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Fokko commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
Fokko commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1019454610


##########
python/pyiceberg/expressions/literals.py:
##########
@@ -125,81 +127,71 @@ def literal(value) -> Literal:
 
 
 @literal.register(bool)
-def _(value: bool) -> Literal[bool]:
+def _(value: bool) -> BooleanLiteral:
     return BooleanLiteral(value)
 
 
 @literal.register(int)
-def _(value: int) -> Literal[int]:
+def _(value: int) -> LongLiteral:
     return LongLiteral(value)
 
 
 @literal.register(float)
-def _(value: float) -> Literal[float]:
+def _(value: float) -> DoubleLiteral:
     # expression binding can convert to FloatLiteral if needed
     return DoubleLiteral(value)
 
 
 @literal.register(str)
-def _(value: str) -> Literal[str]:
+def _(value: str) -> StringLiteral:
     return StringLiteral(value)
 
 
 @literal.register(UUID)
-def _(value: UUID) -> Literal[UUID]:
+def _(value: UUID) -> UUIDLiteral:
     return UUIDLiteral(value)
 
 
 @literal.register(bytes)
-def _(value: bytes) -> Literal[bytes]:
+def _(value: bytes) -> BinaryLiteral:
     # expression binding can convert to FixedLiteral if needed
     return BinaryLiteral(value)
 
 
 @literal.register(bytearray)
-def _(value: bytearray) -> Literal[bytes]:
+def _(value: bytearray) -> BinaryLiteral:
     return BinaryLiteral(bytes(value))
 
 
 @literal.register(Decimal)
-def _(value: Decimal) -> Literal[Decimal]:
+def _(value: Decimal) -> DecimalLiteral:
     return DecimalLiteral(value)
 
 
 @literal.register(date)
-def _(value: date) -> Literal[int]:
+def _(value: date) -> DateLiteral:
     return DateLiteral(date_to_days(value))
 
 
-class AboveMax(Singleton):
-    @property
-    def value(self):
-        raise ValueError("AboveMax has no value")
-
-    def to(self, type_var):
+class AboveMax(Literal):

Review Comment:
   Yes, after splitting them out that works. They are a Singleton again 👍🏻 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1022106643


##########
python/tests/expressions/test_literals.py:
##########
@@ -312,12 +320,16 @@ def test_string_to_timestamp_literal():
 
 def test_timestamp_with_zone_without_zone_in_literal():
     timestamp_str = literal("2017-08-18T14:21:01.919234")
-    assert timestamp_str.to(TimestamptzType()) is None
+    with pytest.raises(ValueError) as e:
+        _ = timestamp_str.to(timestamp_str.to(TimestamptzType()))
+    assert "Invalid timestamp with zone: 2017-08-18T14:21:01.919234 (must be ISO-8601)" in str(e.value)
 
 
 def test_timestamp_without_zone_with_zone_in_literal():
     timestamp_str = literal("2017-08-18T14:21:01.919234+07:00")
-    assert timestamp_str.to(TimestampType()) is None
+    with pytest.raises(ValueError) as e:
+        _ = timestamp_str.to(TimestampType())
+    assert "Could not convert 2017-08-18T14:21:01.919234+07:00 into a timestamp" in str(e.value)

Review Comment:
   Similar, nice to state that it had a zone offset but should not have.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1019379366


##########
python/pyiceberg/expressions/literals.py:
##########
@@ -58,25 +60,25 @@
     timestamp_to_micros,
     timestamptz_to_micros,
 )
-from pyiceberg.utils.singleton import Singleton
 
 T = TypeVar("T")
 
 
 class Literal(Generic[T], ABC):
     """Literal which has a value and can be converted between types"""
 
-    def __init__(self, value: T, value_type: type):
+    def __init__(self, value: T, value_type: Type):
         if value is None or not isinstance(value, value_type):
             raise TypeError(f"Invalid literal value: {value} (not a {value_type})")
         self._value = value
 
     @property
     def value(self) -> T:
-        return self._value  # type: ignore
+        return self._value
 
+    @singledispatchmethod
     @abstractmethod
-    def to(self, type_var) -> Literal:
+    def to(self, type_var: IcebergType) -> Literal:

Review Comment:
   I think AboveMax and BelowMin should be special literals. That works.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Fokko commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
Fokko commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1019523027


##########
python/pyiceberg/expressions/literals.py:
##########
@@ -58,25 +60,25 @@
     timestamp_to_micros,
     timestamptz_to_micros,
 )
-from pyiceberg.utils.singleton import Singleton
 
 T = TypeVar("T")
 
 
 class Literal(Generic[T], ABC):
     """Literal which has a value and can be converted between types"""
 
-    def __init__(self, value: T, value_type: type):
+    def __init__(self, value: T, value_type: Type):
         if value is None or not isinstance(value, value_type):
             raise TypeError(f"Invalid literal value: {value} (not a {value_type})")
         self._value = value
 
     @property
     def value(self) -> T:
-        return self._value  # type: ignore
+        return self._value
 
+    @singledispatchmethod
     @abstractmethod
-    def to(self, type_var) -> Literal:
+    def to(self, type_var: IcebergType) -> Literal:

Review Comment:
   I think this PR improves the typing since we return a `Literal` in all the cases (or raise an exception).
   
   To get this fully working, I think we need to add type information to the IcebergType, and then we can do the following:
   ```suggestion
       def to(self, type_var: IcebergType[T]) -> Literal[T]:
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Fokko commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
Fokko commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1017702146


##########
python/pyiceberg/expressions/literals.py:
##########
@@ -125,81 +127,71 @@ def literal(value) -> Literal:
 
 
 @literal.register(bool)
-def _(value: bool) -> Literal[bool]:
+def _(value: bool) -> BooleanLiteral:
     return BooleanLiteral(value)
 
 
 @literal.register(int)
-def _(value: int) -> Literal[int]:
+def _(value: int) -> LongLiteral:
     return LongLiteral(value)
 
 
 @literal.register(float)
-def _(value: float) -> Literal[float]:
+def _(value: float) -> DoubleLiteral:
     # expression binding can convert to FloatLiteral if needed
     return DoubleLiteral(value)
 
 
 @literal.register(str)
-def _(value: str) -> Literal[str]:
+def _(value: str) -> StringLiteral:
     return StringLiteral(value)
 
 
 @literal.register(UUID)
-def _(value: UUID) -> Literal[UUID]:
+def _(value: UUID) -> UUIDLiteral:
     return UUIDLiteral(value)
 
 
 @literal.register(bytes)
-def _(value: bytes) -> Literal[bytes]:
+def _(value: bytes) -> BinaryLiteral:
     # expression binding can convert to FixedLiteral if needed
     return BinaryLiteral(value)
 
 
 @literal.register(bytearray)
-def _(value: bytearray) -> Literal[bytes]:
+def _(value: bytearray) -> BinaryLiteral:
     return BinaryLiteral(bytes(value))
 
 
 @literal.register(Decimal)
-def _(value: Decimal) -> Literal[Decimal]:
+def _(value: Decimal) -> DecimalLiteral:
     return DecimalLiteral(value)
 
 
 @literal.register(date)
-def _(value: date) -> Literal[int]:
+def _(value: date) -> DateLiteral:
     return DateLiteral(date_to_days(value))
 
 
-class AboveMax(Singleton):
-    @property
-    def value(self):
-        raise ValueError("AboveMax has no value")
-
-    def to(self, type_var):
+class AboveMax(Literal):

Review Comment:
   Had to turn the `AboveMax` into a literal, since the `to()` always returns a literal.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Fokko commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
Fokko commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1018963278


##########
python/pyiceberg/expressions/literals.py:
##########
@@ -58,25 +60,25 @@
     timestamp_to_micros,
     timestamptz_to_micros,
 )
-from pyiceberg.utils.singleton import Singleton
 
 T = TypeVar("T")
 
 
 class Literal(Generic[T], ABC):
     """Literal which has a value and can be converted between types"""
 
-    def __init__(self, value: T, value_type: type):
+    def __init__(self, value: T, value_type: Type):
         if value is None or not isinstance(value, value_type):
             raise TypeError(f"Invalid literal value: {value} (not a {value_type})")
         self._value = value
 
     @property
     def value(self) -> T:
-        return self._value  # type: ignore
+        return self._value
 
+    @singledispatchmethod
     @abstractmethod
-    def to(self, type_var) -> Literal:
+    def to(self, type_var: IcebergType) -> Literal:

Review Comment:
   I added types to revive the TypeChecker, and it failed on this one. `to` returns a literal, therefore `AboveMax` and `BelowMin` also need to be literal. We can also turn this into a `Union[Literal, AboveMin, AboveMax]`. @rdblue thoughts? I think changing the AboveMax into a literal is the cleanest



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue merged pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
rdblue merged PR #6141:
URL: https://github.com/apache/iceberg/pull/6141


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Fokko commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
Fokko commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1019434140


##########
python/pyiceberg/expressions/literals.py:
##########
@@ -125,81 +127,73 @@ def literal(value) -> Literal:
 
 
 @literal.register(bool)
-def _(value: bool) -> Literal[bool]:
+def _(value: bool) -> BooleanLiteral:
     return BooleanLiteral(value)
 
 
 @literal.register(int)
-def _(value: int) -> Literal[int]:
+def _(value: int) -> LongLiteral:
     return LongLiteral(value)
 
 
 @literal.register(float)
-def _(value: float) -> Literal[float]:
+def _(value: float) -> DoubleLiteral:
     # expression binding can convert to FloatLiteral if needed
     return DoubleLiteral(value)
 
 
 @literal.register(str)
-def _(value: str) -> Literal[str]:
+def _(value: str) -> StringLiteral:
     return StringLiteral(value)
 
 
 @literal.register(UUID)
-def _(value: UUID) -> Literal[UUID]:
+def _(value: UUID) -> UUIDLiteral:
     return UUIDLiteral(value)
 
 
 @literal.register(bytes)
-def _(value: bytes) -> Literal[bytes]:
+def _(value: bytes) -> BinaryLiteral:
     # expression binding can convert to FixedLiteral if needed
     return BinaryLiteral(value)
 
 
 @literal.register(bytearray)
-def _(value: bytearray) -> Literal[bytes]:
+def _(value: bytearray) -> BinaryLiteral:
     return BinaryLiteral(bytes(value))
 
 
 @literal.register(Decimal)
-def _(value: Decimal) -> Literal[Decimal]:
+def _(value: Decimal) -> DecimalLiteral:
     return DecimalLiteral(value)
 
 
 @literal.register(date)
-def _(value: date) -> Literal[int]:
+def _(value: date) -> DateLiteral:
     return DateLiteral(date_to_days(value))
 
 
-class AboveMax(Singleton):
-    @property
-    def value(self):
-        raise ValueError("AboveMax has no value")
-
-    def to(self, type_var):
+class AboveMax(Literal):
+    @singledispatchmethod

Review Comment:
   We have to add a `type: ignore` since mypy will complain that the signature is different, but that's a fair thing to do.



##########
python/pyiceberg/expressions/literals.py:
##########
@@ -309,23 +303,23 @@ def __init__(self, value: float):
         super().__init__(value, float)
 
     @singledispatchmethod
-    def to(self, type_var):
-        return None
+    def to(self, type_var: IcebergType) -> Literal:
+        raise TypeError(f"Cannot convert DoubleLiteral into {type_var}")
 
     @to.register(DoubleType)
-    def _(self, type_var: DoubleType) -> Literal[float]:
+    def _(self, _: DoubleType) -> Literal[float]:
         return self
 
     @to.register(FloatType)
-    def _(self, _: FloatType) -> Union[AboveMax, BelowMin, Literal[float]]:
+    def _(self, _: FloatType) -> Union[AboveMax, BelowMin, FloatLiteral]:
         if FloatType.max < self.value:
-            return AboveMax()
+            return AboveMax(FloatType.max, float)

Review Comment:
   Ah, I like that a lot!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
rdblue commented on PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#issuecomment-1314467124

   For follow up:
   1. Fix types returned by `to` when AboveMax/BelowMin are returned
   2. Add tests for binding with invalid conversions
   3. Update binding to rewrite expressions when AboveMax and BelowMin are returned
   4. Check whether we need AboveMax/BelowMin for longs, since Python uses arbitrary precision numbers
   5. Improve some of the error messages


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Fokko commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
Fokko commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1018963278


##########
python/pyiceberg/expressions/literals.py:
##########
@@ -58,25 +60,25 @@
     timestamp_to_micros,
     timestamptz_to_micros,
 )
-from pyiceberg.utils.singleton import Singleton
 
 T = TypeVar("T")
 
 
 class Literal(Generic[T], ABC):
     """Literal which has a value and can be converted between types"""
 
-    def __init__(self, value: T, value_type: type):
+    def __init__(self, value: T, value_type: Type):
         if value is None or not isinstance(value, value_type):
             raise TypeError(f"Invalid literal value: {value} (not a {value_type})")
         self._value = value
 
     @property
     def value(self) -> T:
-        return self._value  # type: ignore
+        return self._value
 
+    @singledispatchmethod
     @abstractmethod
-    def to(self, type_var) -> Literal:
+    def to(self, type_var: IcebergType) -> Literal:

Review Comment:
   I added types to revive the TypeChecker, and it failed on this one. `to` returns a literal, therefore `AboveMax` and `BelowMin` also need to be literal. We can also turn this into a `Union[Literal, AboveMin, AboveMax]`. @rdblue thoughts? I like the signature of the function as it is right now



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1019384042


##########
python/pyiceberg/expressions/literals.py:
##########
@@ -125,81 +127,73 @@ def literal(value) -> Literal:
 
 
 @literal.register(bool)
-def _(value: bool) -> Literal[bool]:
+def _(value: bool) -> BooleanLiteral:
     return BooleanLiteral(value)
 
 
 @literal.register(int)
-def _(value: int) -> Literal[int]:
+def _(value: int) -> LongLiteral:
     return LongLiteral(value)
 
 
 @literal.register(float)
-def _(value: float) -> Literal[float]:
+def _(value: float) -> DoubleLiteral:
     # expression binding can convert to FloatLiteral if needed
     return DoubleLiteral(value)
 
 
 @literal.register(str)
-def _(value: str) -> Literal[str]:
+def _(value: str) -> StringLiteral:
     return StringLiteral(value)
 
 
 @literal.register(UUID)
-def _(value: UUID) -> Literal[UUID]:
+def _(value: UUID) -> UUIDLiteral:
     return UUIDLiteral(value)
 
 
 @literal.register(bytes)
-def _(value: bytes) -> Literal[bytes]:
+def _(value: bytes) -> BinaryLiteral:
     # expression binding can convert to FixedLiteral if needed
     return BinaryLiteral(value)
 
 
 @literal.register(bytearray)
-def _(value: bytearray) -> Literal[bytes]:
+def _(value: bytearray) -> BinaryLiteral:
     return BinaryLiteral(bytes(value))
 
 
 @literal.register(Decimal)
-def _(value: Decimal) -> Literal[Decimal]:
+def _(value: Decimal) -> DecimalLiteral:
     return DecimalLiteral(value)
 
 
 @literal.register(date)
-def _(value: date) -> Literal[int]:
+def _(value: date) -> DateLiteral:
     return DateLiteral(date_to_days(value))
 
 
-class AboveMax(Singleton):
-    @property
-    def value(self):
-        raise ValueError("AboveMax has no value")
-
-    def to(self, type_var):
+class AboveMax(Literal):
+    @singledispatchmethod

Review Comment:
   Since there's only one implementation of `to`, can we remove this annotation?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1022102882


##########
python/pyiceberg/expressions/literals.py:
##########
@@ -309,19 +336,19 @@ def __init__(self, value: float):
         super().__init__(value, float)
 
     @singledispatchmethod
-    def to(self, type_var):
-        return None
+    def to(self, type_var: IcebergType) -> Literal:
+        raise TypeError(f"Cannot convert DoubleLiteral into {type_var}")
 
     @to.register(DoubleType)
-    def _(self, type_var: DoubleType) -> Literal[float]:
+    def _(self, _: DoubleType) -> Literal[float]:
         return self
 
     @to.register(FloatType)
-    def _(self, _: FloatType) -> Union[AboveMax, BelowMin, Literal[float]]:
+    def _(self, _: FloatType) -> Union[FloatAboveMax, FloatBelowMin, FloatLiteral]:

Review Comment:
   This can return `Literal[float]` now, right?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1022106845


##########
python/tests/expressions/test_literals.py:
##########
@@ -448,35 +466,61 @@ def test_fixed_to_binary():
 
 def test_fixed_to_smaller_fixed_none():
     lit = literal(bytearray([0x00, 0x01, 0x02])).to(FixedType(3))
-    assert lit.to(FixedType(2)) is None
+    with pytest.raises(ValueError) as e:
+        lit.to(lit.to(FixedType(2)))
+    assert "Could not convert b'\\x00\\x01\\x02' into a fixed[2]" in str(e.value)

Review Comment:
   Different length?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1022105799


##########
python/tests/expressions/test_literals.py:
##########
@@ -251,15 +259,15 @@ def test_timestamp_to_date():
     assert date_lit.value == 0
 
 
-# STRING

Review Comment:
   +1



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Fokko commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
Fokko commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1017702146


##########
python/pyiceberg/expressions/literals.py:
##########
@@ -125,81 +127,71 @@ def literal(value) -> Literal:
 
 
 @literal.register(bool)
-def _(value: bool) -> Literal[bool]:
+def _(value: bool) -> BooleanLiteral:
     return BooleanLiteral(value)
 
 
 @literal.register(int)
-def _(value: int) -> Literal[int]:
+def _(value: int) -> LongLiteral:
     return LongLiteral(value)
 
 
 @literal.register(float)
-def _(value: float) -> Literal[float]:
+def _(value: float) -> DoubleLiteral:
     # expression binding can convert to FloatLiteral if needed
     return DoubleLiteral(value)
 
 
 @literal.register(str)
-def _(value: str) -> Literal[str]:
+def _(value: str) -> StringLiteral:
     return StringLiteral(value)
 
 
 @literal.register(UUID)
-def _(value: UUID) -> Literal[UUID]:
+def _(value: UUID) -> UUIDLiteral:
     return UUIDLiteral(value)
 
 
 @literal.register(bytes)
-def _(value: bytes) -> Literal[bytes]:
+def _(value: bytes) -> BinaryLiteral:
     # expression binding can convert to FixedLiteral if needed
     return BinaryLiteral(value)
 
 
 @literal.register(bytearray)
-def _(value: bytearray) -> Literal[bytes]:
+def _(value: bytearray) -> BinaryLiteral:
     return BinaryLiteral(bytes(value))
 
 
 @literal.register(Decimal)
-def _(value: Decimal) -> Literal[Decimal]:
+def _(value: Decimal) -> DecimalLiteral:
     return DecimalLiteral(value)
 
 
 @literal.register(date)
-def _(value: date) -> Literal[int]:
+def _(value: date) -> DateLiteral:
     return DateLiteral(date_to_days(value))
 
 
-class AboveMax(Singleton):
-    @property
-    def value(self):
-        raise ValueError("AboveMax has no value")
-
-    def to(self, type_var):
+class AboveMax(Literal):

Review Comment:
   Had to turn the `AboveMax` into a literal, since the `to()` always returns a literal. We pass in the Max value of the `{int,long}` that it exceeds.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1019385197


##########
python/pyiceberg/expressions/literals.py:
##########
@@ -309,23 +303,23 @@ def __init__(self, value: float):
         super().__init__(value, float)
 
     @singledispatchmethod
-    def to(self, type_var):
-        return None
+    def to(self, type_var: IcebergType) -> Literal:
+        raise TypeError(f"Cannot convert DoubleLiteral into {type_var}")
 
     @to.register(DoubleType)
-    def _(self, type_var: DoubleType) -> Literal[float]:
+    def _(self, _: DoubleType) -> Literal[float]:
         return self
 
     @to.register(FloatType)
-    def _(self, _: FloatType) -> Union[AboveMax, BelowMin, Literal[float]]:
+    def _(self, _: FloatType) -> Union[AboveMax, BelowMin, FloatLiteral]:
         if FloatType.max < self.value:
-            return AboveMax()
+            return AboveMax(FloatType.max, float)

Review Comment:
   I see, it looks like this is why these are no longer singletons. What about `FloatAboveMax` and `IntegerAboveMax` instead? We will only have 2.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1019381524


##########
python/pyiceberg/expressions/literals.py:
##########
@@ -108,7 +110,7 @@ def __ge__(self, other):
 
 
 @singledispatch
-def literal(value) -> Literal:
+def literal(value: Any) -> Literal:

Review Comment:
   Shouldn't this be `value: T` and `Literal[T]`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Fokko commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
Fokko commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1019389663


##########
python/pyiceberg/expressions/literals.py:
##########
@@ -108,7 +110,7 @@ def __ge__(self, other):
 
 
 @singledispatch
-def literal(value) -> Literal:
+def literal(value: Any) -> Literal:

Review Comment:
   Yes, but this opens up many more type errors which I'd like to do in a separate PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1019382620


##########
python/pyiceberg/expressions/literals.py:
##########
@@ -125,81 +127,71 @@ def literal(value) -> Literal:
 
 
 @literal.register(bool)
-def _(value: bool) -> Literal[bool]:
+def _(value: bool) -> BooleanLiteral:
     return BooleanLiteral(value)
 
 
 @literal.register(int)
-def _(value: int) -> Literal[int]:
+def _(value: int) -> LongLiteral:
     return LongLiteral(value)
 
 
 @literal.register(float)
-def _(value: float) -> Literal[float]:
+def _(value: float) -> DoubleLiteral:
     # expression binding can convert to FloatLiteral if needed
     return DoubleLiteral(value)
 
 
 @literal.register(str)
-def _(value: str) -> Literal[str]:
+def _(value: str) -> StringLiteral:
     return StringLiteral(value)
 
 
 @literal.register(UUID)
-def _(value: UUID) -> Literal[UUID]:
+def _(value: UUID) -> UUIDLiteral:
     return UUIDLiteral(value)
 
 
 @literal.register(bytes)
-def _(value: bytes) -> Literal[bytes]:
+def _(value: bytes) -> BinaryLiteral:
     # expression binding can convert to FixedLiteral if needed
     return BinaryLiteral(value)
 
 
 @literal.register(bytearray)
-def _(value: bytearray) -> Literal[bytes]:
+def _(value: bytearray) -> BinaryLiteral:
     return BinaryLiteral(bytes(value))
 
 
 @literal.register(Decimal)
-def _(value: Decimal) -> Literal[Decimal]:
+def _(value: Decimal) -> DecimalLiteral:
     return DecimalLiteral(value)
 
 
 @literal.register(date)
-def _(value: date) -> Literal[int]:
+def _(value: date) -> DateLiteral:
     return DateLiteral(date_to_days(value))
 
 
-class AboveMax(Singleton):
-    @property
-    def value(self):
-        raise ValueError("AboveMax has no value")
-
-    def to(self, type_var):
+class AboveMax(Literal):

Review Comment:
   I think this is correct. Can we still make it a `Singleton`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1019382186


##########
python/pyiceberg/expressions/literals.py:
##########
@@ -125,81 +127,73 @@ def literal(value) -> Literal:
 
 
 @literal.register(bool)
-def _(value: bool) -> Literal[bool]:
+def _(value: bool) -> BooleanLiteral:

Review Comment:
   Why require these are a specific literal rather than using the generic `Literal[bool]`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Fokko commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
Fokko commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1019433513


##########
python/pyiceberg/expressions/literals.py:
##########
@@ -125,81 +127,73 @@ def literal(value) -> Literal:
 
 
 @literal.register(bool)
-def _(value: bool) -> Literal[bool]:
+def _(value: bool) -> BooleanLiteral:

Review Comment:
   `BooleanLiteral` extends from `Literal[bool]`, so we get the `Literal[bool]` for free. I think it is nice to be specific here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #6141:
URL: https://github.com/apache/iceberg/pull/6141#discussion_r1022104689


##########
python/pyiceberg/expressions/literals.py:
##########
@@ -377,79 +404,96 @@ def __init__(self, value: Decimal):
         super().__init__(value, Decimal)
 
     @singledispatchmethod
-    def to(self, type_var):
-        return None
+    def to(self, type_var: IcebergType) -> Literal:
+        raise TypeError(f"Cannot convert DecimalLiteral into {type_var}")
 
     @to.register(DecimalType)
-    def _(self, type_var: DecimalType) -> Optional[Literal[Decimal]]:
+    def _(self, type_var: DecimalType) -> Literal[Decimal]:
         if type_var.scale == abs(self.value.as_tuple().exponent):
             return self
-        return None
+        raise ValueError(f"Could not convert {self.value} into a {type_var}")
 
 
 class StringLiteral(Literal[str]):
     def __init__(self, value: str):
         super().__init__(value, str)
 
     @singledispatchmethod
-    def to(self, type_var):
-        return None
+    def to(self, type_var: IcebergType) -> Literal:
+        raise TypeError(f"Cannot convert StringLiteral into {type_var}")
 
     @to.register(StringType)
-    def _(self, type_var: StringType) -> Literal[str]:
+    def _(self, _: StringType) -> Literal[str]:
         return self
 
+    @to.register(IntegerType)
+    def _(self, type_var: IntegerType) -> Union[IntAboveMax, IntBelowMin, LongLiteral]:

Review Comment:
   `Literal[int]`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org