You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/11/03 01:17:36 UTC

[GitHub] [iceberg] fb913bf0de288ba84fe98f7a23d35edfdb22381 opened a new pull request, #6110: API: Hash floats -0.0 and 0.0 to the same bucket

fb913bf0de288ba84fe98f7a23d35edfdb22381 opened a new pull request, #6110:
URL: https://github.com/apache/iceberg/pull/6110

   This commit changes the hashing of float/double for negative zero (-0.0) to produce the same hash as for positive zero (0.0). This is needed because positive/ negative zero are considered equal according to ==, but have distinct bit patterns according to
   Double#doubleToLongBits. Therefore, using these bit patterns directly would violate the assumption "x = y implies bucket(x) = bucket(y)". With this commit, the bit pattern for negative zero is replaced with the bit pattern for positive zero before hashing, to ensure both produce the same hash.
   
   Double negative zero has the bit pattern 0x8000000000000000, (which matches the bit pattern of Long.MIN_VALUE), whereas the bit pattern for positive zero is 0x0000000000000000.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue merged pull request #6110: API: Hash floats -0.0 and 0.0 to the same bucket

Posted by GitBox <gi...@apache.org>.
rdblue merged PR #6110:
URL: https://github.com/apache/iceberg/pull/6110


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #6110: API: Hash floats -0.0 and 0.0 to the same bucket

Posted by GitBox <gi...@apache.org>.
rdblue commented on PR #6110:
URL: https://github.com/apache/iceberg/pull/6110#issuecomment-1304892916

   Thanks, @fb913bf0de288ba84fe98f7a23d35edfdb22381! Looks great.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #6110: API: Hash floats -0.0 and 0.0 to the same bucket

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #6110:
URL: https://github.com/apache/iceberg/pull/6110#discussion_r1013224969


##########
api/src/test/java/org/apache/iceberg/transforms/TestBucketing.java:
##########
@@ -65,10 +65,17 @@ public void testSpecValues() {
     Assert.assertEquals("Spec example: hash(true) = 1392991556", 1392991556, BucketUtil.hash(1));
     Assert.assertEquals("Spec example: hash(34) = 2017239379", 2017239379, BucketUtil.hash(34));
     Assert.assertEquals("Spec example: hash(34L) = 2017239379", 2017239379, BucketUtil.hash(34L));
+
     Assert.assertEquals(
         "Spec example: hash(17.11F) = -142385009", -142385009, BucketUtil.hash(1.0F));
     Assert.assertEquals(
         "Spec example: hash(17.11D) = -142385009", -142385009, BucketUtil.hash(1.0D));
+    Assert.assertEquals("Spec example: hash(0.0F) = 1669671676", 1669671676, BucketUtil.hash(0.0F));
+    Assert.assertEquals(
+        "Spec example: hash(-0.0F) = 1669671676", 1669671676, BucketUtil.hash(-0.0F));
+    Assert.assertEquals("Spec example: hash(0.0) = 1669671676", 1669671676, BucketUtil.hash(0.0));
+    Assert.assertEquals("Spec example: hash(-0.0) = 1669671676", 1669671676, BucketUtil.hash(-0.0));

Review Comment:
   Can you add NaN tests as well since this updates the spec for NaN, too?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] fb913bf0de288ba84fe98f7a23d35edfdb22381 commented on a diff in pull request #6110: API: Hash floats -0.0 and 0.0 to the same bucket

Posted by GitBox <gi...@apache.org>.
fb913bf0de288ba84fe98f7a23d35edfdb22381 commented on code in PR #6110:
URL: https://github.com/apache/iceberg/pull/6110#discussion_r1013619657


##########
api/src/test/java/org/apache/iceberg/transforms/TestBucketing.java:
##########
@@ -65,10 +65,17 @@ public void testSpecValues() {
     Assert.assertEquals("Spec example: hash(true) = 1392991556", 1392991556, BucketUtil.hash(1));
     Assert.assertEquals("Spec example: hash(34) = 2017239379", 2017239379, BucketUtil.hash(34));
     Assert.assertEquals("Spec example: hash(34L) = 2017239379", 2017239379, BucketUtil.hash(34L));
+
     Assert.assertEquals(
         "Spec example: hash(17.11F) = -142385009", -142385009, BucketUtil.hash(1.0F));
     Assert.assertEquals(
         "Spec example: hash(17.11D) = -142385009", -142385009, BucketUtil.hash(1.0D));
+    Assert.assertEquals("Spec example: hash(0.0F) = 1669671676", 1669671676, BucketUtil.hash(0.0F));
+    Assert.assertEquals(
+        "Spec example: hash(-0.0F) = 1669671676", 1669671676, BucketUtil.hash(-0.0F));
+    Assert.assertEquals("Spec example: hash(0.0) = 1669671676", 1669671676, BucketUtil.hash(0.0));
+    Assert.assertEquals("Spec example: hash(-0.0) = 1669671676", 1669671676, BucketUtil.hash(-0.0));

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org