Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 05:37:24 UTC

[jira] [Resolved] (SPARK-4853) Automatically adjust the number of connections between two peers to achieve good performance

     [ https://issues.apache.org/jira/browse/SPARK-4853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-4853.
---------------------------------
    Resolution: Incomplete

> Automatically adjust the number of connections between two peers to achieve good performance
> --------------------------------------------------------------------------------------------
>
>                 Key: SPARK-4853
>                 URL: https://issues.apache.org/jira/browse/SPARK-4853
>             Project: Spark
>          Issue Type: New Feature
>          Components: Shuffle
>    Affects Versions: 1.2.0
>            Reporter: Reynold Xin
>            Priority: Major
>              Labels: bulk-closed
>
> As discovered in SPARK-4740, performance of the new Netty transport can be impacted by the total number of active connections. This manifests itself when the following 3 conditions are true:
> (1) the number of spinning disks per node is large (SSDs are not affected)
> (2) the number of cores per node is large
> (3) the number of nodes is small
> In 1.2, we created a new config variable, spark.shuffle.io.numConnectionsPerPeer, that allows users to explicitly increase the number of connections between any two nodes. Ideally, Spark should automatically figure out what the optimal (or near-optimal) setting is, so that users don't have to worry about this config option.
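
For illustration only, here is a rough sketch of what such an automatic choice might look like. It is not Spark code, and this issue was closed without an implementation. The function names, the cap of 8, and the rule itself (open enough connections so that the connections from all peers roughly cover the node's disks or cores, whichever is smaller) are assumptions made for this example. Only the config key spark.shuffle.io.numConnectionsPerPeer comes from the description above.

```python
import math

# Real Spark config key named in the issue description.
CONF_KEY = "spark.shuffle.io.numConnectionsPerPeer"


def suggest_connections_per_peer(num_nodes, cores_per_node,
                                 spinning_disks_per_node, max_per_peer=8):
    """Hypothetical heuristic, NOT part of Spark.

    Assumption: a node can usefully serve about min(cores, disks)
    concurrent reads, and each peer connection carries one of them.
    """
    # Single-node clusters and SSD-only nodes keep the default of 1.
    if num_nodes < 2 or spinning_disks_per_node <= 0:
        return 1
    peers = num_nodes - 1
    target_concurrency = min(cores_per_node, spinning_disks_per_node)
    needed = math.ceil(target_concurrency / peers)
    # Clamp to [1, max_per_peer]; the cap is an arbitrary safety limit.
    return max(1, min(max_per_peer, needed))


def as_submit_conf(value):
    """Render the chosen value as spark-submit arguments."""
    return ["--conf", f"{CONF_KEY}={value}"]


if __name__ == "__main__":
    # Small cluster, many cores and many spinning disks per node.
    n = suggest_connections_per_peer(num_nodes=4, cores_per_node=32,
                                     spinning_disks_per_node=12)
    print(" ".join(as_submit_conf(n)))
```

Under these assumptions, a large cluster (many peers) or an SSD-only node stays at 1 connection per peer, while a small cluster with many disks and cores per node gets a higher value. Any real policy would need to be validated against measurements like those in SPARK-4740.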



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)