Posted to issues@spark.apache.org by "Mridul Muralidharan (Jira)" <ji...@apache.org> on 2021/08/06 14:51:00 UTC

[jira] [Resolved] (SPARK-36423) Randomize blocks within a push request before pushing to improve block merge ratio

     [ https://issues.apache.org/jira/browse/SPARK-36423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mridul Muralidharan resolved SPARK-36423.
-----------------------------------------
    Fix Version/s: 3.2.0
       Resolution: Fixed

Issue resolved by pull request 33649
[https://github.com/apache/spark/pull/33649]

> Randomize blocks within a push request before pushing to improve block merge ratio
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-36423
>                 URL: https://issues.apache.org/jira/browse/SPARK-36423
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Shuffle, Spark Core
>    Affects Versions: 3.2.0
>            Reporter: Min Shen
>            Assignee: Min Shen
>            Priority: Major
>             Fix For: 3.2.0
>
>
> On the client side, we currently randomize the order of push requests before processing each request. In addition, we can further randomize the order of blocks within each push request before pushing them.
> In our benchmark, this resulted in a 60%-70% reduction in blocks that fail to be merged due to block collision (the existing block merge ratio is already pretty good in general, and this further improves it).
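
The two levels of randomization described above can be sketched roughly as follows. This is only an illustration in Python, not the code from the pull request: the PushRequest class, the randomize_push_order function, and the merger and block names are all hypothetical stand-ins introduced here.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class PushRequest:
    """Hypothetical stand-in for one push request: a target merger and its blocks."""
    merger: str
    block_ids: List[str]


def randomize_push_order(requests: List[PushRequest],
                         rng: random.Random) -> List[PushRequest]:
    """Return new requests in random order, each with its blocks in random order."""
    # Level 1 (already done, per the description): shuffle the request order.
    shuffled_requests = list(requests)
    rng.shuffle(shuffled_requests)

    # Level 2 (the change described in this issue): shuffle the blocks
    # inside each request as well, without mutating the caller's lists.
    result = []
    for request in shuffled_requests:
        blocks = list(request.block_ids)
        rng.shuffle(blocks)
        result.append(PushRequest(request.merger, blocks))
    return result


if __name__ == "__main__":
    # Made-up block names; real block ids look different.
    requests = [
        PushRequest("merger-a", ["block-r0", "block-r1", "block-r2"]),
        PushRequest("merger-b", ["block-r3", "block-r4"]),
    ]
    for request in randomize_push_order(requests, random.Random()):
        print(request.merger, request.block_ids)
```

The intuition, under these assumptions, is that if every mapper pushed its blocks in the same fixed order, concurrent pushes would tend to reach the same target partition at about the same time and collide; shuffling within each request spreads those arrivals out.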



--
This message was sent by Atlassian Jira
(v8.3.4#803005)