Posted to issues@commons.apache.org by GitBox <gi...@apache.org> on 2019/12/03 10:34:42 UTC

[GitHub] [commons-codec] aherbert commented on issue #33: Added string helper method to MurmurHash3.hash64

aherbert commented on issue #33: Added string helper method to MurmurHash3.hash64
URL: https://github.com/apache/commons-codec/pull/33#issuecomment-561106482
 
 
   I am against this:
   
   - It adds more convenience methods which do not save very much typing
   - It promotes the hash64 method which is not part of MurmurHash3
   - It uses String::getBytes without a character encoding, so the bytes that are hashed depend on the JVM's default charset
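   
   To illustrate the last point, here is a minimal sketch using only the JDK. The class `CharsetDemo` and the method `sameBytes` are hypothetical names made up for this example; they are not part of commons-codec. It shows that the same string can encode to different bytes under different charsets, in which case any hash computed over those bytes would be computed over different input:
   
   ```java
   import java.nio.charset.Charset;
   import java.nio.charset.StandardCharsets;
   import java.util.Arrays;
   
   // Hypothetical demo class, for illustration only.
   class CharsetDemo {
       /** Returns true if both charsets encode the string to identical bytes. */
       static boolean sameBytes(String s, Charset a, Charset b) {
           return Arrays.equals(s.getBytes(a), s.getBytes(b));
       }
   
       public static void main(String[] args) {
           // Pure ASCII text encodes identically in UTF-8 and ISO-8859-1.
           System.out.println(sameBytes("abc",
               StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));
           // A non-ASCII character (e with acute accent) does not.
           System.out.println(sameBytes("caf\u00e9",
               StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));
           // String.getBytes() with no argument uses this charset.
           System.out.println(Charset.defaultCharset());
       }
   }
   ```
   
   A caller who wants a reproducible hash can pass an explicit charset, e.g. `s.getBytes(StandardCharsets.UTF_8)`, to the existing byte[] method.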
   
   This method saves a few characters over the alternative:
   
   ```
   String s = "example";
   // Helper proposed in this PR
   long h1 = MurmurHash3.hash64(s);
   // Existing alternative
   long h2 = MurmurHash3.hash64(s.getBytes());
   ```
   
   The hash64 methods are not part of the original MurmurHash3 implementation. We do not know the statistical properties of these methods, e.g. the hash collision rate. The methods were ported from Apache Hive, but the original reason for them and their intended usage are unknown.
   
   The javadoc should be clearer that these methods do not return the same value as either the upper or lower 64 bits of the hash128 method:
   
   ```
   String s = "example";
   long[] h = MurmurHash3.hash128(s.getBytes());
   long h2 = MurmurHash3.hash64(s.getBytes());
   // Barring a chance hash collision, both inequalities hold:
   // h[0] != h2 && h[1] != h2;
   ```
   
   The main javadoc for hash64 states this but the other helper methods do not.
   
