Posted to issues@spark.apache.org by "Sujith (JIRA)" <ji...@apache.org> on 2019/01/25 08:54:00 UTC

[jira] [Comment Edited] (SPARK-22229) SPIP: RDMA Accelerated Shuffle Engine

    [ https://issues.apache.org/jira/browse/SPARK-22229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16752044#comment-16752044 ] 

Sujith edited comment on SPARK-22229 at 1/25/19 8:53 AM:
---------------------------------------------------------

@[~yuvaldeg] May I know where I can find the PR related to the new [SparkRDMA|https://github.com/Mellanox/SparkRDMA] implementation? I just wanted to evaluate it further; it is quite an interesting feature.


was (Author: s71955):
@[~yuvaldeg] May I know where I can find the PR related to the new [SparkRDMA|https://github.com/Mellanox/SparkRDMA] implementation? We want to evaluate it further; it is quite an interesting feature.

> SPIP: RDMA Accelerated Shuffle Engine
> -------------------------------------
>
>                 Key: SPARK-22229
>                 URL: https://issues.apache.org/jira/browse/SPARK-22229
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.3.0
>            Reporter: Yuval Degani
>            Priority: Major
>         Attachments: SPARK-22229_SPIP_RDMA_Accelerated_Shuffle_Engine_Rev_1.0.pdf
>
>
> An RDMA-accelerated shuffle engine can provide enormous performance benefits to shuffle-intensive Spark jobs, as demonstrated in the “SparkRDMA” plugin open-source project ([https://github.com/Mellanox/SparkRDMA]).
> Using RDMA for shuffle improves CPU utilization significantly and reduces I/O processing overhead by bypassing the kernel and networking stack as well as avoiding memory copies entirely. Those valuable CPU cycles are then consumed directly by the actual Spark workloads and help reduce the job runtime significantly.
> This performance gain is demonstrated with both the industry-standard HiBench TeraSort benchmark (a 1.5x speedup in sorting) and shuffle-intensive customer applications.
> SparkRDMA will be presented at Spark Summit 2017 in Dublin ([https://spark-summit.org/eu-2017/events/accelerating-shuffle-a-tailor-made-rdma-solution-for-apache-spark/]).
> Please see attached proposal document for more information.


