Posted to commits@spark.apache.org by do...@apache.org on 2019/09/18 17:41:09 UTC

[spark] branch branch-2.4 updated: [SPARK-29124][CORE] Use MurmurHash3 `bytesHash(data, seed)` instead of `bytesHash(data)`

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new cc0f659  [SPARK-29124][CORE] Use MurmurHash3 `bytesHash(data, seed)` instead of `bytesHash(data)`
cc0f659 is described below

commit cc0f659cb95639c68288debec35aca1205b043c7
Author: Dongjoon Hyun <dh...@apple.com>
AuthorDate: Wed Sep 18 10:33:03 2019 +0900

    [SPARK-29124][CORE] Use MurmurHash3 `bytesHash(data, seed)` instead of `bytesHash(data)`
    
    This PR replaces the `bytesHash(data)` API invocation with the underlying `bytesHash(data, arraySeed)` invocation.
    ```scala
    def bytesHash(data: Array[Byte]): Int = bytesHash(data, arraySeed)
    ```
    
    The behavior of the one-argument API differs between Scala versions because of the following commit: starting with Scala 2.12.9, the semantics of the function changed. Calling the underlying two-argument form directly keeps us safe during Scala version migration.
    - https://github.com/scala/scala/commit/846ee2b1a47014c69ebd2352d91d467be74918b5#diff-ac889f851e109fc4387cd738d52ce177
    
    No user-facing change.
    
    This is a kind of refactoring.
    
    Passes Jenkins with the existing tests.
    
    Closes #25821 from dongjoon-hyun/SPARK-SCALA-HASH.
    
    Authored-by: Dongjoon Hyun <dh...@apple.com>
    Signed-off-by: HyukjinKwon <gu...@apache.org>
    (cherry picked from commit 3ece8ee15775307bded572ac391aeed10be3c9aa)
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 core/src/main/scala/org/apache/spark/util/random/XORShiftRandom.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/core/src/main/scala/org/apache/spark/util/random/XORShiftRandom.scala b/core/src/main/scala/org/apache/spark/util/random/XORShiftRandom.scala
index e8cdb6e..67ad513 100644
--- a/core/src/main/scala/org/apache/spark/util/random/XORShiftRandom.scala
+++ b/core/src/main/scala/org/apache/spark/util/random/XORShiftRandom.scala
@@ -62,7 +62,7 @@ private[spark] object XORShiftRandom {
   /** Hash seeds to have 0/1 bits throughout. */
   private[random] def hashSeed(seed: Long): Long = {
     val bytes = ByteBuffer.allocate(java.lang.Long.SIZE).putLong(seed).array()
-    val lowBits = MurmurHash3.bytesHash(bytes)
+    val lowBits = MurmurHash3.bytesHash(bytes, MurmurHash3.arraySeed)
     val highBits = MurmurHash3.bytesHash(bytes, lowBits)
     (highBits.toLong << 32) | (lowBits.toLong & 0xFFFFFFFFL)
   }
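
As a minimal sketch (not part of the commit), the patched `hashSeed` can be lifted into a standalone object to check that it depends only on the two-argument `bytesHash(data, seed)` overload and the public `MurmurHash3.arraySeed` constant. The object name `HashSeedSketch` and the sample seed `42L` are assumptions for illustration; the method body follows the diff above.

```scala
import java.nio.ByteBuffer
import scala.util.hashing.MurmurHash3

// Hypothetical standalone wrapper; in Spark this logic lives in XORShiftRandom.
object HashSeedSketch {
  def hashSeed(seed: Long): Long = {
    // Same buffer construction as in the diff.
    val bytes = ByteBuffer.allocate(java.lang.Long.SIZE).putLong(seed).array()
    // Pass the seed explicitly instead of relying on the one-argument overload.
    val lowBits = MurmurHash3.bytesHash(bytes, MurmurHash3.arraySeed)
    // Chain the first hash in as the seed of the second one.
    val highBits = MurmurHash3.bytesHash(bytes, lowBits)
    (highBits.toLong << 32) | (lowBits.toLong & 0xFFFFFFFFL)
  }

  def main(args: Array[String]): Unit = {
    val first = hashSeed(42L)
    val second = hashSeed(42L)
    // The function is pure, so repeated calls must agree.
    assert(first == second)
    println(f"hashSeed(42) = 0x$first%016x")
  }
}
```

Note that `java.lang.Long.SIZE` is the bit width (64), so the buffer is 64 bytes long with the seed written into the first 8; the sketch keeps this exactly as in the diff so that the hashed input is unchanged.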

