You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/01/24 22:13:48 UTC

[GitHub] [iceberg] jun-he opened a new pull request #3965: implement __eq__ for iceberg type classes.

jun-he opened a new pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965


   Implement __eq__ method for type classes to check the equality.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] emkornfield commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
emkornfield commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r793242929



##########
File path: python/tests/test_types.py
##########
@@ -161,3 +187,25 @@ def test_nested_field():
     assert field_var.field_id == 1
     assert isinstance(field_var.type, StructType)
     assert str(field_var) == str(eval(repr(field_var)))
+
+
+@pytest.mark.parametrize(
+    "input_type,equal_object,not_equal_object",
+    [
+        (BooleanType, BooleanType(), IntegerType()),
+        (IntegerType, IntegerType(), BooleanType()),
+        (LongType, LongType(), DoubleType()),
+        (FloatType, FloatType(), IntegerType()),
+        (DoubleType, DoubleType(), FloatType()),
+        (DateType, DateType(), TimeType()),
+        (TimeType, TimeType(), TimestampType()),
+        (TimestampType, TimestampType(), TimestamptzType()),
+        (TimestamptzType, TimestamptzType(), TimestampType()),
+        (StringType, StringType(), UUIDType()),
+        (UUIDType, UUIDType(), StringType()),
+        (BinaryType, BinaryType(), FixedType(6)),
+    ],
+)
+def test_primitive_equality(input_type, equal_object, not_equal_object):

Review comment:
       ```suggestion
   def test_non_parameterized_type_equality(input_type, equal_object, not_equal_object):
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r794838456



##########
File path: python/src/iceberg/types.py
##########
@@ -46,6 +46,9 @@ def __repr__(self):
     def __str__(self):
         return self._type_string
 
+    def __eq__(self, other):

Review comment:
       Yes, these should be immutable. Is it possible to make them singletons, maybe by hijacking `__new__`? I'm not sure what is normal here. @CircArgs may have an idea.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] emkornfield commented on pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
emkornfield commented on pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#issuecomment-1024605152


   In python objects can be considered unhashable (this is the default) which could be a design choice.  This wouldn't lead to correctness issues like have


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r796841774



##########
File path: python/src/iceberg/types.py
##########
@@ -46,6 +46,9 @@ def __repr__(self):
     def __str__(self):
         return self._type_string
 
+    def __eq__(self, other):

Review comment:
       @CircArgs wondering if it is better to keep Singleton out so any classes can extend it to be singleton? l am also OK to move it into the `__new__` as well. What do you think?
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue merged pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
rdblue merged pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r794840781



##########
File path: python/tests/test_types.py
##########
@@ -161,3 +187,25 @@ def test_nested_field():
     assert field_var.field_id == 1
     assert isinstance(field_var.type, StructType)
     assert str(field_var) == str(eval(repr(field_var)))
+
+
+@pytest.mark.parametrize(
+    "input_type,equal_object,not_equal_object",
+    [
+        (BooleanType, BooleanType(), IntegerType()),
+        (IntegerType, IntegerType(), BooleanType()),
+        (LongType, LongType(), DoubleType()),
+        (FloatType, FloatType(), IntegerType()),
+        (DoubleType, DoubleType(), FloatType()),
+        (DateType, DateType(), TimeType()),
+        (TimeType, TimeType(), TimestampType()),
+        (TimestampType, TimestampType(), TimestamptzType()),
+        (TimestamptzType, TimestamptzType(), TimestampType()),
+        (StringType, StringType(), UUIDType()),
+        (UUIDType, UUIDType(), StringType()),
+        (BinaryType, BinaryType(), FixedType(6)),
+    ],
+)
+def test_primitive_equality(input_type, equal_object, not_equal_object):
+    assert equal_object == input_type()
+    assert not_equal_object != input_type()

Review comment:
       I think the problem of doing it that way is that you end up using equality to build the test cases for inequality. So by filtering the list you're testing that objects are unequal if they were not filtered because they were equal.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] emkornfield commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
emkornfield commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r794862364



##########
File path: python/tests/test_types.py
##########
@@ -161,3 +187,25 @@ def test_nested_field():
     assert field_var.field_id == 1
     assert isinstance(field_var.type, StructType)
     assert str(field_var) == str(eval(repr(field_var)))
+
+
+@pytest.mark.parametrize(
+    "input_type,equal_object,not_equal_object",
+    [
+        (BooleanType, BooleanType(), IntegerType()),
+        (IntegerType, IntegerType(), BooleanType()),
+        (LongType, LongType(), DoubleType()),
+        (FloatType, FloatType(), IntegerType()),
+        (DoubleType, DoubleType(), FloatType()),
+        (DateType, DateType(), TimeType()),
+        (TimeType, TimeType(), TimestampType()),
+        (TimestampType, TimestampType(), TimestamptzType()),
+        (TimestamptzType, TimestamptzType(), TimestampType()),
+        (StringType, StringType(), UUIDType()),
+        (UUIDType, UUIDType(), StringType()),
+        (BinaryType, BinaryType(), FixedType(6)),
+    ],
+)
+def test_primitive_equality(input_type, equal_object, not_equal_object):
+    assert equal_object == input_type()
+    assert not_equal_object != input_type()

Review comment:
       One could use index in the list equality check to avoid this problem




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r796329557



##########
File path: python/src/iceberg/types.py
##########
@@ -46,6 +46,9 @@ def __repr__(self):
     def __str__(self):
         return self._type_string
 
+    def __eq__(self, other):

Review comment:
       @CircArgs Yeah, definitely. If you plan to make separate new instead of a single one, we can split the work into 2 parts, one for non parameterized one (this PR) and the other for the parameterized one (#4016). What do you think?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r795145103



##########
File path: python/tests/test_types.py
##########
@@ -161,3 +187,25 @@ def test_nested_field():
     assert field_var.field_id == 1
     assert isinstance(field_var.type, StructType)
     assert str(field_var) == str(eval(repr(field_var)))
+
+
+@pytest.mark.parametrize(
+    "input_type,equal_object,not_equal_object",
+    [
+        (BooleanType, BooleanType(), IntegerType()),
+        (IntegerType, IntegerType(), BooleanType()),
+        (LongType, LongType(), DoubleType()),
+        (FloatType, FloatType(), IntegerType()),
+        (DoubleType, DoubleType(), FloatType()),
+        (DateType, DateType(), TimeType()),
+        (TimeType, TimeType(), TimestampType()),
+        (TimestampType, TimestampType(), TimestamptzType()),
+        (TimestamptzType, TimestamptzType(), TimestampType()),
+        (StringType, StringType(), UUIDType()),
+        (UUIDType, UUIDType(), StringType()),
+        (BinaryType, BinaryType(), FixedType(6)),
+    ],
+)
+def test_primitive_equality(input_type, equal_object, not_equal_object):
+    assert equal_object == input_type()
+    assert not_equal_object != input_type()

Review comment:
       Let me give it a try. Will update it very soon.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r794838456



##########
File path: python/src/iceberg/types.py
##########
@@ -46,6 +46,9 @@ def __repr__(self):
     def __str__(self):
         return self._type_string
 
+    def __eq__(self, other):

Review comment:
       Yes, these should be immutable. Is it possible to make them singletons, maybe by hijacking `__new__`? I'm not sure what is normal here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r795135703



##########
File path: python/src/iceberg/types.py
##########
@@ -46,6 +46,9 @@ def __repr__(self):
     def __str__(self):
         return self._type_string
 
+    def __eq__(self, other):

Review comment:
       We can do that for some but not for types, e.g. FixedType.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] CircArgs commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
CircArgs commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r797036687



##########
File path: python/src/iceberg/types.py
##########
@@ -46,6 +46,9 @@ def __repr__(self):
     def __str__(self):
         return self._type_string
 
+    def __eq__(self, other):

Review comment:
       @jun-he the Singleton way would make sense to me if it had use beyond types I guess. If the functionality is limited to types I don't know if I can see the difference in functionality since you'll be inheriting from it and rather than hitting the `__new__` on the first parent it will hit it on the second in the case of `class ...(Type, Singleton):...` I'll defer to your judgement on it. Maybe Singleton will make it more clear what's going on




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r796852405



##########
File path: python/tests/test_types.py
##########
@@ -161,3 +187,25 @@ def test_nested_field():
     assert field_var.field_id == 1
     assert isinstance(field_var.type, StructType)
     assert str(field_var) == str(eval(repr(field_var)))
+
+
+@pytest.mark.parametrize(
+    "input_type,equal_object,not_equal_object",
+    [
+        (BooleanType, BooleanType(), IntegerType()),
+        (IntegerType, IntegerType(), BooleanType()),
+        (LongType, LongType(), DoubleType()),
+        (FloatType, FloatType(), IntegerType()),
+        (DoubleType, DoubleType(), FloatType()),
+        (DateType, DateType(), TimeType()),
+        (TimeType, TimeType(), TimestampType()),
+        (TimestampType, TimestampType(), TimestamptzType()),
+        (TimestamptzType, TimestamptzType(), TimestampType()),
+        (StringType, StringType(), UUIDType()),
+        (UUIDType, UUIDType(), StringType()),
+        (BinaryType, BinaryType(), FixedType(6)),
+    ],
+)
+def test_primitive_equality(input_type, equal_object, not_equal_object):
+    assert equal_object == input_type()
+    assert not_equal_object != input_type()

Review comment:
       Updated by using pytest features.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] CircArgs commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
CircArgs commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r797036687



##########
File path: python/src/iceberg/types.py
##########
@@ -46,6 +46,9 @@ def __repr__(self):
     def __str__(self):
         return self._type_string
 
+    def __eq__(self, other):

Review comment:
       @jun-he the Singleton way would make sense to me if it had use beyond types I guess. If the functionality is limited to types I don't know if I can see the difference in functionality since you'll be inheriting from it and rather than hitting the `__new__` on the first parent it will hit it on the second in the case of `class ...(Type, Singleton):...`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] CircArgs commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
CircArgs commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r796788099



##########
File path: python/src/iceberg/types.py
##########
@@ -46,6 +46,9 @@ def __repr__(self):
     def __str__(self):
         return self._type_string
 
+    def __eq__(self, other):

Review comment:
       @jun-he I did end up doing separate `__new__` for all parameterized types. I put a `__new__` on my `PrimitiveType` but putting it on `Type` would work equally well I think. Do you intend to keep your `Singleton` if you can put the `__new__` on `Type`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
jun-he commented on pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#issuecomment-1026551586


   > Not sure if this behavior may be undesirable
   > 
   > ```python
   > >>> FixedType(length=8) is FixedType(length=8)
   > False
   > 
   > >>> FixedType(length=8) == FixedType(length=8)
   > True
   > ```
   > 
   > is that the desired behavior?
   
   Thanks @CircArgs for pointing it out. `is` is used to check the reference equality. The implementation here seems matching what is implemented in Java, i.e. 
   ```
       Assert.assertTrue(Types.FixedType.ofLength(10) != Types.FixedType.ofLength(10));
       Assert.assertTrue(Types.FixedType.ofLength(10).equals(Types.FixedType.ofLength(10)));
   ```
   
   I think the issue here is related to the consistency. For non-parameterized types, `is` and `==` return the same results but for non non parameterized types, they are not.
   
   If we want to make `is` to be the same as `==`, one way is that we can cache created non parameterized types, which seems to be the approach in https://github.com/apache/iceberg/pull/4016.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] emkornfield commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
emkornfield commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r796839065



##########
File path: python/tests/test_types.py
##########
@@ -161,3 +191,46 @@ def test_nested_field():
     assert field_var.field_id == 1
     assert isinstance(field_var.type, StructType)
     assert str(field_var) == str(eval(repr(field_var)))
+
+
+@pytest.mark.parametrize(
+    "input_index,input_type",
+    [
+        (1, BooleanType),
+        (2, IntegerType),
+        (3, LongType),
+        (4, FloatType),
+        (5, DoubleType),
+        (6, DateType),
+        (7, TimeType),
+        (8, TimestampType),
+        (9, TimestamptzType),
+        (10, StringType),
+        (11, UUIDType),
+        (12, BinaryType),
+    ],

Review comment:
       can you use iteratools.permutations and one list to avoid the duplication?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r796851951



##########
File path: python/tests/test_types.py
##########
@@ -161,3 +191,46 @@ def test_nested_field():
     assert field_var.field_id == 1
     assert isinstance(field_var.type, StructType)
     assert str(field_var) == str(eval(repr(field_var)))
+
+
+@pytest.mark.parametrize(
+    "input_index,input_type",
+    [
+        (1, BooleanType),
+        (2, IntegerType),
+        (3, LongType),
+        (4, FloatType),
+        (5, DoubleType),
+        (6, DateType),
+        (7, TimeType),
+        (8, TimestampType),
+        (9, TimestamptzType),
+        (10, StringType),
+        (11, UUIDType),
+        (12, BinaryType),
+    ],

Review comment:
       @emkornfield  Instead of using `iteratools`, I defined a non_parameterized_types variable and then removed all duplications in this test and also another test as well.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r795135703



##########
File path: python/src/iceberg/types.py
##########
@@ -46,6 +46,9 @@ def __repr__(self):
     def __str__(self):
         return self._type_string
 
+    def __eq__(self, other):

Review comment:
       We can do that. For some types, e.g. FixedType, it then will cache all created Fixed types.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r795268616



##########
File path: python/src/iceberg/types.py
##########
@@ -46,6 +46,9 @@ def __repr__(self):
     def __str__(self):
         return self._type_string
 
+    def __eq__(self, other):

Review comment:
       Yes, I think we should do that for most cases rather than overriding `__eq__`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#issuecomment-1028407625


   Looks good!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r795135703



##########
File path: python/src/iceberg/types.py
##########
@@ -46,6 +46,9 @@ def __repr__(self):
     def __str__(self):
         return self._type_string
 
+    def __eq__(self, other):

Review comment:
       We can do that as describe at https://python-patterns.guide/gang-of-four/singleton/. For some types, e.g. FixedType, it then will cache all created Fixed types. So maybe we just make singletons for simple ones.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r795138081



##########
File path: python/tests/test_types.py
##########
@@ -129,6 +150,11 @@ def test_map_type():
     assert isinstance(type_var.value.type, UUIDType)
     assert type_var.value.field_id == 2
     assert str(type_var) == str(eval(repr(type_var)))
+    assert type_var == eval(repr(type_var))
+    assert type_var != MapType(
+        NestedField(True, 1, "optional_field", LongType()),
+        NestedField(False, 2, "required_field", UUIDType()),
+    )

Review comment:
       Yeah, added it. Thanks for the comment.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] CircArgs commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
CircArgs commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r796048880



##########
File path: python/src/iceberg/types.py
##########
@@ -46,6 +46,9 @@ def __repr__(self):
     def __str__(self):
         return self._type_string
 
+    def __eq__(self, other):

Review comment:
       @rdblue Thanks for drawing me to this. I think using `__new__` is a great idea you've mentioned a few times elsewhere. It also solved issues with the generics I was trying to make because the syntax is very vague especially for nested fields without any named arguments. I've been working on something since we spoke I've had sitting this morning. @jun-he this seems quite similar https://github.com/apache/iceberg/pull/4016. would definitely be interested in collaborating on this




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] CircArgs commented on pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
CircArgs commented on pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#issuecomment-1026193882


   Not sure if this behavior may be undesirable
   
   ```python
   >>> FixedType(length=8) is FixedType(length=8)
   False
   
   >>> FixedType(length=8) == FixedType(length=8)
   True
   ```
   
   is that the desired behavior?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] emkornfield commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
emkornfield commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r796962564



##########
File path: python/tests/test_types.py
##########
@@ -161,3 +191,46 @@ def test_nested_field():
     assert field_var.field_id == 1
     assert isinstance(field_var.type, StructType)
     assert str(field_var) == str(eval(repr(field_var)))
+
+
+@pytest.mark.parametrize(
+    "input_index,input_type",
+    [
+        (1, BooleanType),
+        (2, IntegerType),
+        (3, LongType),
+        (4, FloatType),
+        (5, DoubleType),
+        (6, DateType),
+        (7, TimeType),
+        (8, TimestampType),
+        (9, TimestamptzType),
+        (10, StringType),
+        (11, UUIDType),
+        (12, BinaryType),
+    ],

Review comment:
       Looks great.  Thanks!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r794839625



##########
File path: python/tests/test_types.py
##########
@@ -129,6 +150,11 @@ def test_map_type():
     assert isinstance(type_var.value.type, UUIDType)
     assert type_var.value.field_id == 2
     assert str(type_var) == str(eval(repr(type_var)))
+    assert type_var == eval(repr(type_var))
+    assert type_var != MapType(
+        NestedField(True, 1, "optional_field", LongType()),
+        NestedField(False, 2, "required_field", UUIDType()),
+    )

Review comment:
       Agreed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#issuecomment-1025261159


   > In python objects can be considered unhashable (this is the default) which could be a design choice. This wouldn't lead to correctness issues like have
   
   Isn't it reasonable to use type classes in maps though?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#issuecomment-1024584042


   @jun-he, the `__eq__` implementations look good to me. It would be good to fix some of the test suggestions that @emkornfield pointed out, along with fixing hash functions. In Java, you always override hash functions if you override equality and I don't see why you wouldn't in Python as well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] emkornfield commented on pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
emkornfield commented on pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#issuecomment-1025363239


   > Isn't it reasonable to use type classes in maps though?
   
   Yes.  Just pointing out that it isn't critical for correctness (i.e. can be done in a separate PR).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] emkornfield commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
emkornfield commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r793241570



##########
File path: python/src/iceberg/types.py
##########
@@ -46,6 +46,9 @@ def __repr__(self):
     def __str__(self):
         return self._type_string
 
+    def __eq__(self, other):

Review comment:
       Are these objects intended to be immutable?  is there any reason to implement __hash__ for them?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r796842369



##########
File path: python/tests/test_types.py
##########
@@ -161,3 +191,46 @@ def test_nested_field():
     assert field_var.field_id == 1
     assert isinstance(field_var.type, StructType)
     assert str(field_var) == str(eval(repr(field_var)))
+
+
+@pytest.mark.parametrize(
+    "input_index,input_type",
+    [
+        (1, BooleanType),
+        (2, IntegerType),
+        (3, LongType),
+        (4, FloatType),
+        (5, DoubleType),
+        (6, DateType),
+        (7, TimeType),
+        (8, TimestampType),
+        (9, TimestamptzType),
+        (10, StringType),
+        (11, UUIDType),
+        (12, BinaryType),
+    ],

Review comment:
       Let me check it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r795138122



##########
File path: python/tests/test_types.py
##########
@@ -161,3 +187,25 @@ def test_nested_field():
     assert field_var.field_id == 1
     assert isinstance(field_var.type, StructType)
     assert str(field_var) == str(eval(repr(field_var)))
+
+
+@pytest.mark.parametrize(
+    "input_type,equal_object,not_equal_object",
+    [
+        (BooleanType, BooleanType(), IntegerType()),
+        (IntegerType, IntegerType(), BooleanType()),
+        (LongType, LongType(), DoubleType()),
+        (FloatType, FloatType(), IntegerType()),
+        (DoubleType, DoubleType(), FloatType()),
+        (DateType, DateType(), TimeType()),
+        (TimeType, TimeType(), TimestampType()),
+        (TimestampType, TimestampType(), TimestamptzType()),
+        (TimestamptzType, TimestamptzType(), TimestampType()),
+        (StringType, StringType(), UUIDType()),
+        (UUIDType, UUIDType(), StringType()),
+        (BinaryType, BinaryType(), FixedType(6)),
+    ],
+)
+def test_primitive_equality(input_type, equal_object, not_equal_object):

Review comment:
       👍  Updated the PR accordingly.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
jun-he commented on pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#issuecomment-1025444415


   @rdblue @emkornfield  Thanks for the comments. I improved the type classes using singleton pattern for non parameterized types. No equal or hash overriding is needed for them and default worked well.
   
   For parameterized ones, I still keep the equal method in this PR as currently type objects are not used as key in dict or set. It makes sense to use as key in the future and I will open a new PR to add hash overrides for them.
   
   I also updated the tests according to the comments. Can you please take another look? Thanks.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r795135703



##########
File path: python/src/iceberg/types.py
##########
@@ -46,6 +46,9 @@ def __repr__(self):
     def __str__(self):
         return self._type_string
 
+    def __eq__(self, other):

Review comment:
       We can do that. For some types, e.g. FixedType, it then will cache all created Fixed types. So maybe we just make singletons for simple ones.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] emkornfield commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
emkornfield commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r793242369



##########
File path: python/tests/test_types.py
##########
@@ -129,6 +150,11 @@ def test_map_type():
     assert isinstance(type_var.value.type, UUIDType)
     assert type_var.value.field_id == 2
     assert str(type_var) == str(eval(repr(type_var)))
+    assert type_var == eval(repr(type_var))
+    assert type_var != MapType(
+        NestedField(True, 1, "optional_field", LongType()),
+        NestedField(False, 2, "required_field", UUIDType()),
+    )

Review comment:
       should there be another test case to test inequality of the Value NestedField as well?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] emkornfield commented on a change in pull request #3965: implement __eq__ for iceberg type classes.

Posted by GitBox <gi...@apache.org>.
emkornfield commented on a change in pull request #3965:
URL: https://github.com/apache/iceberg/pull/3965#discussion_r793244485



##########
File path: python/tests/test_types.py
##########
@@ -161,3 +187,25 @@ def test_nested_field():
     assert field_var.field_id == 1
     assert isinstance(field_var.type, StructType)
     assert str(field_var) == str(eval(repr(field_var)))
+
+
+@pytest.mark.parametrize(
+    "input_type,equal_object,not_equal_object",
+    [
+        (BooleanType, BooleanType(), IntegerType()),
+        (IntegerType, IntegerType(), BooleanType()),
+        (LongType, LongType(), DoubleType()),
+        (FloatType, FloatType(), IntegerType()),
+        (DoubleType, DoubleType(), FloatType()),
+        (DateType, DateType(), TimeType()),
+        (TimeType, TimeType(), TimestampType()),
+        (TimestampType, TimestampType(), TimestamptzType()),
+        (TimestamptzType, TimestamptzType(), TimestampType()),
+        (StringType, StringType(), UUIDType()),
+        (UUIDType, UUIDType(), StringType()),
+        (BinaryType, BinaryType(), FixedType(6)),
+    ],
+)
+def test_primitive_equality(input_type, equal_object, not_equal_object):
+    assert equal_object == input_type()
+    assert not_equal_object != input_type()

Review comment:
       I doubt this will change much but for the negative case you can use a [product](https://docs.python.org/3/library/itertools.html#itertools.product) and filter out objects of the same type to do an exhaustive test.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org