Posted to issues@spark.apache.org by "Dongjoon Hyun (Jira)" <ji...@apache.org> on 2019/08/24 01:35:00 UTC

[jira] [Resolved] (SPARK-28778) Shuffle jobs fail due to incorrect advertised address when running in virtual network

     [ https://issues.apache.org/jira/browse/SPARK-28778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun resolved SPARK-28778.
-----------------------------------
    Fix Version/s: 3.0.0
       Resolution: Fixed

This is resolved via https://github.com/apache/spark/pull/25500

> Shuffle jobs fail due to incorrect advertised address when running in virtual network
> -------------------------------------------------------------------------------------
>
>                 Key: SPARK-28778
>                 URL: https://issues.apache.org/jira/browse/SPARK-28778
>             Project: Spark
>          Issue Type: Bug
>          Components: Mesos
>    Affects Versions: 2.2.3, 2.3.0, 2.4.3
>            Reporter: Anton Kirillov
>            Priority: Major
>              Labels: Mesos
>             Fix For: 3.0.0
>
>
> When shuffle jobs are launched by Mesos in a virtual network, the Mesos scheduler sets the executor {{--hostname}} parameter to {{0.0.0.0}} whenever {{spark.mesos.network.name}} is provided. Executors then use {{0.0.0.0}} as their advertised address, and in jobs that shuffle they fail to fetch shuffle blocks from each other because {{0.0.0.0}} is given as the origin. In a virtual network the hostname or IP address is not known up front; it is assigned to a container when the container starts. The executor process therefore needs to advertise the correct, dynamically assigned address to be reachable by other executors.
>
> The bug described above prevents Mesos users from running any job that involves a shuffle in a virtual network, because executors cannot fetch shuffle blocks from peers that advertise an incorrect address.
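
The failure mode in the description can be illustrated with a minimal Python sketch. It is not the code from the pull request and does not reflect how Spark or Mesos implement address selection. The function names (is_wildcard, advertised_address) and the resolve_local callback are hypothetical and exist only for this illustration. The idea shown is that a wildcard value such as 0.0.0.0 is usable as a bind address but not as an advertised one, so a process should fall back to an address resolved at start time.

```python
import ipaddress
import socket


def is_wildcard(host):
    """Return True if host is an unspecified ("any") address such as 0.0.0.0 or ::."""
    try:
        return ipaddress.ip_address(host).is_unspecified
    except ValueError:
        # Not an IP literal (for example a DNS name), so it is not a wildcard.
        return False


def default_resolver():
    """Resolve this machine's own hostname to an IPv4 address at call time."""
    return socket.gethostbyname(socket.gethostname())


def advertised_address(configured_host, resolve_local=default_resolver):
    """Pick the address to advertise to peers (hypothetical helper).

    A concrete configured host is used as-is. A missing or wildcard value
    cannot be reached by peers, so the address is resolved when the
    process starts instead.
    """
    if configured_host and not is_wildcard(configured_host):
        return configured_host
    return resolve_local()
```

For example, advertised_address("0.0.0.0") returns whatever the local resolver reports for the container instead of the wildcard, while advertised_address("10.0.0.5") returns the configured value unchanged.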



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
