Posted to issues@spark.apache.org by "Arseniy Tashoyan (JIRA)" <ji...@apache.org> on 2017/08/08 15:30:00 UTC

[jira] [Commented] (SPARK-21668) Ability to run driver programs within a container

    [ https://issues.apache.org/jira/browse/SPARK-21668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16118468#comment-16118468 ] 

Arseniy Tashoyan commented on SPARK-21668:
------------------------------------------

I don't think it is a duplicate of [SPARK-6680]. The referenced issue covers a very specific environment: all Docker containers run on the same machine and hence share the same bridged network.
This issue covers a more generic setup: a container with a driver program talking to a real Spark cluster.
The solution proposed in [SPARK-6680] (specifying --conf spark.driver.host=${SPARK_LOCAL_IP}) does not work for this case: the process inside the container cannot bind to the IP address of the host machine.
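
For illustration, here is a minimal sketch of the failing configuration (the master address and application jar are placeholders, and the exact exception text may vary with the Spark version):

    # Host machine: 192.168.216.10; inside the container the only local
    # addresses are the bridged 172.17.0.2 and 127.0.0.1.
    export SPARK_LOCAL_IP=192.168.216.10     # the host address
    spark-submit \
      --master spark://192.168.216.11:7077 \
      --conf spark.driver.host=${SPARK_LOCAL_IP} \
      my-app.jar
    # The driver tries to bind its RPC endpoint to 192.168.216.10, which is
    # not a local address inside the container, and fails with something
    # like: java.net.BindException: Cannot assign requested address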

> Ability to run driver programs within a container
> -------------------------------------------------
>
>                 Key: SPARK-21668
>                 URL: https://issues.apache.org/jira/browse/SPARK-21668
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.1.1, 2.2.0
>            Reporter: Arseniy Tashoyan
>            Priority: Minor
>              Labels: containers, docker, driver, spark-submit, standalone
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> When a driver program in client mode runs in a Docker container, it binds to the IP address of the container, not of the host machine. The container IP address is reachable only from the host machine; it is unreachable for the master and worker nodes.
> For example, suppose the host machine has the IP address 192.168.216.10. When Docker starts a container, it places it in a special bridged network and assigns it an IP address like 172.17.0.2. Spark nodes on the 192.168.216.0 network cannot reach the bridged network of the container, so the driver program cannot communicate with the Spark cluster.
> Spark already provides the SPARK_PUBLIC_DNS environment variable for this purpose; however, in this scenario setting SPARK_PUBLIC_DNS to the host machine's IP address does not work (see the sketch below).
> Topic on StackOverflow: [https://stackoverflow.com/questions/45489248/running-spark-driver-program-in-docker-container-no-connection-back-from-execu]
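>
> A minimal sketch reproducing the problem (the image name, master address and application jar are placeholders; the IP addresses follow the example above):
>
>     # Host 192.168.216.10; Spark master and workers on the 192.168.216.0 network.
>     docker run -e SPARK_PUBLIC_DNS=192.168.216.10 my-driver-image \
>       spark-submit --master spark://192.168.216.11:7077 my-app.jar
>     # The driver binds to the bridged address 172.17.0.2 and advertises it
>     # to the cluster; the executors then try to connect back to 172.17.0.2
>     # and fail, because that address is not routable from 192.168.216.0.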



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org