Posted to commits@kudu.apache.org by ad...@apache.org on 2019/11/07 04:00:10 UTC

[kudu] 01/02: [cfile] Improve the hash function of pair<DataType, EncodingType>

This is an automated email from the ASF dual-hosted git repository.

adar pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git

commit 6133878108256174e580115b4738f29da3fa58d7
Author: lingbin <li...@gmail.com>
AuthorDate: Wed Nov 6 20:36:01 2019 +0800

    [cfile] Improve the hash function of pair<DataType, EncodingType>
    
    In the original implementation, the `DataType` part and the
    `EncodingType` part were combined in the same low-order bits,
    so hash collisions were easy to produce.
    
    For example, <UINT32=4, BIT_SHUFFLE=6> and
    <INT32=5, PLAIN_ENCODING=1> both hash to binary `100101`
    (decimal 37).
    
    This change also removes an unused member, `kType`, from
    `TypeEncodingTraits`.
    
    Change-Id: Id141c51147ae674b9bee3016026a0b91cb76b5aa
    Reviewed-on: http://gerrit.cloudera.org:8080/14644
    Tested-by: Kudu Jenkins
    Reviewed-by: Adar Dembo <ad...@cloudera.com>
---
 src/kudu/cfile/type_encodings.cc | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/kudu/cfile/type_encodings.cc b/src/kudu/cfile/type_encodings.cc
index 9087ff3..55c258d 100644
--- a/src/kudu/cfile/type_encodings.cc
+++ b/src/kudu/cfile/type_encodings.cc
@@ -49,7 +49,6 @@ struct DataTypeEncodingTraits {};
 template<DataType Type, EncodingType Encoding> struct TypeEncodingTraits
   : public DataTypeEncodingTraits<Type, Encoding> {
 
-  static const DataType kType = Type;
   static const EncodingType kEncodingType = Encoding;
 };
 
@@ -206,7 +205,7 @@ Status TypeEncodingInfo::CreateBlockBuilder(
 
 struct EncodingMapHash {
   size_t operator()(pair<DataType, EncodingType> pair) const {
-    return (pair.first + 31) ^ pair.second;
+    return (pair.first << 5) + pair.second;
   }
 };
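
The collision described in the commit message can be reproduced outside of Kudu with a small standalone sketch. The enums below are hypothetical stand-ins that carry only the four values quoted in the message, and `Key`, `OldHash`, `NewHash` and `CheckCommitExample` are illustrative names that do not appear in type_encodings.cc. Only the two hash expressions are taken from the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-ins for Kudu's enums; only the values quoted in the
// commit message are reproduced here.
enum DataType { UINT32 = 4, INT32 = 5 };
enum EncodingType { PLAIN_ENCODING = 1, BIT_SHUFFLE = 6 };

using Key = std::pair<DataType, EncodingType>;

// Expression removed by the diff: the type and the encoding are mixed in
// the same low-order bits.
inline std::size_t OldHash(Key k) {
  return (k.first + 31) ^ k.second;
}

// Expression added by the diff: the type is moved to bits 5 and up, leaving
// the low five bits to the encoding.
inline std::size_t NewHash(Key k) {
  return (k.first << 5) + k.second;
}

// Checks the example from the commit message.
inline void CheckCommitExample() {
  const Key a(UINT32, BIT_SHUFFLE);     // <4, 6>
  const Key b(INT32, PLAIN_ENCODING);   // <5, 1>

  // (4 + 31) ^ 6 = 0b100011 ^ 0b000110 = 0b100101
  // (5 + 31) ^ 1 = 0b100100 ^ 0b000001 = 0b100101
  assert(OldHash(a) == OldHash(b));

  // (4 << 5) + 6 = 134, (5 << 5) + 1 = 161
  assert(NewHash(a) != NewHash(b));
}
```

Under the assumption that every `EncodingType` value is smaller than 32, the shifted type and the encoding occupy disjoint bits, so two different pairs cannot produce the same value with the new expression.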