Posted to issues@spark.apache.org by "Wenchen Fan (Jira)" <ji...@apache.org> on 2022/10/05 06:02:00 UTC

[jira] [Resolved] (SPARK-40660) Switch to XORShiftRandom to distribute elements

     [ https://issues.apache.org/jira/browse/SPARK-40660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-40660.
---------------------------------
    Fix Version/s: 3.4.0
       Resolution: Fixed

Issue resolved by pull request 38106
[https://github.com/apache/spark/pull/38106]

> Switch to XORShiftRandom to distribute elements
> -----------------------------------------------
>
>                 Key: SPARK-40660
>                 URL: https://issues.apache.org/jira/browse/SPARK-40660
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Yuming Wang
>            Assignee: Yuming Wang
>            Priority: Major
>             Fix For: 3.4.0
>
>
> {code:scala}
> import java.util.Random
> import org.apache.spark.util.random.XORShiftRandom
> import scala.util.hashing
> // Print how `count` sequential ids spread across `partition` buckets for
> // three seeding strategies: plain java.util.Random, java.util.Random with a
> // byteswap32-scrambled seed, and Spark's XORShiftRandom.
> def distribution(count: Int, partition: Int): Unit = {
>   // java.util.Random seeded directly with the sequential id.
>   println((1 to count).map(partitionId => new Random(partitionId).nextInt(partition))
>     .groupBy(f => f)
>     .map(_._2.size).mkString(". "))
>   // java.util.Random seeded with a bit-scrambled id.
>   println((1 to count).map(partitionId => new Random(hashing.byteswap32(partitionId)).nextInt(partition))
>     .groupBy(f => f)
>     .map(_._2.size).mkString(". "))
>   // XORShiftRandom seeded directly with the sequential id.
>   println((1 to count).map(partitionId => new XORShiftRandom(partitionId).nextInt(partition))
>     .groupBy(f => f)
>     .map(_._2.size).mkString(". "))
> }
> distribution(200, 4)
> {code}
> {noformat}
> 200
> 50. 60. 46. 44
> 55. 48. 43. 54
> {noformat}
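
The first line of the output ("200") shows the problem: seeding java.util.Random directly with sequential ids sends all 200 elements to the same partition. For a power-of-two bound, nextInt takes only the top bits of a single LCG step, and those barely change across small consecutive seeds. Scrambling the seed with byteswap32, or switching to XORShiftRandom, spreads the elements roughly evenly.

For reference, the core of org.apache.spark.util.random.XORShiftRandom is a Marsaglia xorshift step over a 64-bit seed. The sketch below is a minimal standalone approximation, assuming the 21/35/4 shift triple used in the Spark source; the byteswap64 seed scrambling and the modulo reduction are stand-ins for Spark's MurmurHash3 seed hashing and bit-masked next(bits), not the actual implementation.

{code:scala}
import scala.util.hashing

// Minimal xorshift sketch modeled on Spark's XORShiftRandom.
// Assumptions: shift triple 21/35/4 as in the Spark source; byteswap64
// stands in for Spark's MurmurHash3 seed hashing; the modulo reduction
// stands in for Spark's bit-masked next(bits).
class SimpleXorShift(initSeed: Long) {
  // Scramble the seed so that nearby seeds diverge immediately.
  private var seed: Long = hashing.byteswap64(initSeed)

  def nextInt(bound: Int): Int = {
    var x = seed
    x ^= x << 21
    x ^= x >>> 35
    x ^= x << 4
    seed = x
    // Map into [0, bound); good enough for a demo, not bias-free.
    (((x % bound) + bound) % bound).toInt
  }
}

// Sequential seeds now spread roughly evenly, unlike plain java.util.Random:
println((1 to 200).map(id => new SimpleXorShift(id).nextInt(4))
  .groupBy(identity).map(_._2.size).mkString(". "))
{code}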



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org