You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/12/28 13:01:52 UTC

[GitHub] [iceberg] hililiwei opened a new pull request #3815: API:Use murmur3_32_fixed instead of murmur3_32

hililiwei opened a new pull request #3815:
URL: https://github.com/apache/iceberg/pull/3815


   Since  https://github.com/google/guava/issues/5648 has been resolved, submit this PR to remove unnecessary loops.
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #3815: API: Upgrade Guava to use murmur3_32_fixed

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #3815:
URL: https://github.com/apache/iceberg/pull/3815#discussion_r776513735



##########
File path: versions.props
##########
@@ -11,7 +11,7 @@ org.apache.spark:spark-avro_2.11 = 2.4.8
 org.apache.pig:pig = 0.14.0
 com.fasterxml.jackson.*:* = 2.11.4
 com.google.errorprone:error_prone_annotations = 2.3.3
-com.google.guava:guava = 28.0-jre
+com.google.guava:guava = 31.0.1-jre

Review comment:
       We shade it so I have no problems with the upgrade. May want to do it after release though just in case




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #3815: API: Upgrade Guava to use murmur3_32_fixed

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #3815:
URL: https://github.com/apache/iceberg/pull/3815#issuecomment-1002799300


   @findepi, can you take a look at this also?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue merged pull request #3815: API: Upgrade Guava to use murmur3_32_fixed

Posted by GitBox <gi...@apache.org>.
rdblue merged pull request #3815:
URL: https://github.com/apache/iceberg/pull/3815


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3815: API: Upgrade Guava to use murmur3_32_fixed

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3815:
URL: https://github.com/apache/iceberg/pull/3815#discussion_r776877804



##########
File path: api/src/main/java/org/apache/iceberg/transforms/Bucket.java
##########
@@ -41,7 +41,7 @@
 import static org.apache.iceberg.types.Type.TypeID;
 
 abstract class Bucket<T> implements Transform<T, Integer> {
-  private static final HashFunction MURMUR3 = Hashing.murmur3_32();
+  private static final HashFunction MURMUR3 = Hashing.murmur3_32_fixed();

Review comment:
       Tests are in `TestBucketing`: https://github.com/apache/iceberg/blob/master/api/src/test/java/org/apache/iceberg/transforms/TestBucketing.java
   
   This tests all of the examples from the spec, as well as type promotion and some edge cases like UTF-8 with surrogate pairs.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3815: API: Upgrade Guava to use murmur3_32_fixed

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3815:
URL: https://github.com/apache/iceberg/pull/3815#discussion_r776513464



##########
File path: versions.props
##########
@@ -11,7 +11,7 @@ org.apache.spark:spark-avro_2.11 = 2.4.8
 org.apache.pig:pig = 0.14.0
 com.fasterxml.jackson.*:* = 2.11.4
 com.google.errorprone:error_prone_annotations = 2.3.3
-com.google.guava:guava = 28.0-jre
+com.google.guava:guava = 31.0.1-jre

Review comment:
       @aokolnychyi, @RussellSpitzer, @jackye1995, any thoughts on upgrading Guava? It should be safe because we shade and relocate. Not sure if there have been behavior changes, though.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] findepi commented on a change in pull request #3815: API: Upgrade Guava to use murmur3_32_fixed

Posted by GitBox <gi...@apache.org>.
findepi commented on a change in pull request #3815:
URL: https://github.com/apache/iceberg/pull/3815#discussion_r776712584



##########
File path: api/src/main/java/org/apache/iceberg/transforms/Bucket.java
##########
@@ -41,7 +41,7 @@
 import static org.apache.iceberg.types.Type.TypeID;
 
 abstract class Bucket<T> implements Transform<T, Integer> {
-  private static final HashFunction MURMUR3 = Hashing.murmur3_32();
+  private static final HashFunction MURMUR3 = Hashing.murmur3_32_fixed();

Review comment:
       We should have hash stability tests to ensure hash values don't change. See https://github.com/trinodb/trino/blob/863da2b2d3f92dd858e028d30ceb63184dfb515d/plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestIcebergBucketing.java#L122 for example.
   
   This can be followed up upon, for now we can rely on @wendigo's test in Trino.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] findepi commented on a change in pull request #3815: API: Upgrade Guava to use murmur3_32_fixed

Posted by GitBox <gi...@apache.org>.
findepi commented on a change in pull request #3815:
URL: https://github.com/apache/iceberg/pull/3815#discussion_r778877211



##########
File path: api/src/main/java/org/apache/iceberg/transforms/Bucket.java
##########
@@ -41,7 +41,7 @@
 import static org.apache.iceberg.types.Type.TypeID;
 
 abstract class Bucket<T> implements Transform<T, Integer> {
-  private static final HashFunction MURMUR3 = Hashing.murmur3_32();
+  private static final HashFunction MURMUR3 = Hashing.murmur3_32_fixed();

Review comment:
       @RussellSpitzer yes, we use different murmur3 implementation, that's how https://github.com/google/guava/issues/5648 was found.
   Yes, it matches the fixed version.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #3815: API: Upgrade Guava to use murmur3_32_fixed

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #3815:
URL: https://github.com/apache/iceberg/pull/3815#discussion_r778416930



##########
File path: api/src/main/java/org/apache/iceberg/transforms/Bucket.java
##########
@@ -41,7 +41,7 @@
 import static org.apache.iceberg.types.Type.TypeID;
 
 abstract class Bucket<T> implements Transform<T, Integer> {
-  private static final HashFunction MURMUR3 = Hashing.murmur3_32();
+  private static final HashFunction MURMUR3 = Hashing.murmur3_32_fixed();

Review comment:
       @findepi I was just going through the trino code and noticed it uses a different [Murmur3 Implementation](https://github.com/trinodb/trino/blob/fb88c0bf589db55d77223fc31aab29acbb2d0698/plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/PartitionTransforms.java#L422), does this match the fixed version here?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3815: API: Upgrade Guava to use murmur3_32_fixed

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3815:
URL: https://github.com/apache/iceberg/pull/3815#discussion_r776877865



##########
File path: versions.props
##########
@@ -11,7 +11,7 @@ org.apache.spark:spark-avro_2.11 = 2.4.8
 org.apache.pig:pig = 0.14.0
 com.fasterxml.jackson.*:* = 2.11.4
 com.google.errorprone:error_prone_annotations = 2.3.3
-com.google.guava:guava = 28.0-jre
+com.google.guava:guava = 31.0.1-jre

Review comment:
       Agreed. We should do this after the release.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a change in pull request #3815: API: Upgrade Guava to use murmur3_32_fixed

Posted by GitBox <gi...@apache.org>.
hililiwei commented on a change in pull request #3815:
URL: https://github.com/apache/iceberg/pull/3815#discussion_r776987613



##########
File path: versions.props
##########
@@ -11,7 +11,7 @@ org.apache.spark:spark-avro_2.11 = 2.4.8
 org.apache.pig:pig = 0.14.0
 com.fasterxml.jackson.*:* = 2.11.4
 com.google.errorprone:error_prone_annotations = 2.3.3
-com.google.guava:guava = 28.0-jre
+com.google.guava:guava = 31.0.1-jre

Review comment:
       +1




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org