Posted to reviews@spark.apache.org by smola <gi...@git.apache.org> on 2014/12/09 18:45:25 UTC

[GitHub] spark pull request: [SPARK-4799] Use IP address instead of local h...

GitHub user smola opened a pull request:

    https://github.com/apache/spark/pull/3645

    [SPARK-4799] Use IP address instead of local hostname in ConnectionManager

    See https://issues.apache.org/jira/browse/SPARK-4799
    
    
    
    Spark fails when a node hostname is not resolvable by other nodes.
    
    See an example trace:
    
    ```
    14/12/09 17:02:41 ERROR SendingConnection: Error connecting to 27e434cf36ac:35093
    java.nio.channels.UnresolvedAddressException
    	at sun.nio.ch.Net.checkAddress(Net.java:127)
    	at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:644)
    	at org.apache.spark.network.SendingConnection.connect(Connection.scala:299)
    	at org.apache.spark.network.ConnectionManager.run(ConnectionManager.scala:278)
    	at org.apache.spark.network.ConnectionManager$$anon$4.run(ConnectionManager.scala:139)
    ```
    
    The relevant code is here:
    https://github.com/apache/spark/blob/bcb5cdad614d4fce43725dfec3ce88172d2f8c11/core/src/main/scala/org/apache/spark/network/nio/ConnectionManager.scala#L170
    
    ```
    val id = new ConnectionManagerId(Utils.localHostName, serverChannel.socket.getLocalPort)
    ```
    
    This line should advertise the host IP instead, either via `Utils.localIpAddress` or via a method that acknowledges user settings (e.g. `SPARK_LOCAL_IP`). Since I cannot think of a use case for using the hostname here, I'm creating a PR with the former solution, but if you think the latter is better, I'm willing to create a new PR with a more elaborate fix.
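
For readers who want to see the failure mode in isolation, here is a minimal sketch using only the JDK, not Spark code. It assumes that `InetSocketAddress.createUnresolved` is a fair stand-in for a hostname the connecting node cannot resolve; the object name, the helper `connectRejectsUnresolved`, and the IP `172.17.0.11` are illustrative choices, while the hostname and port come from the trace above.

```scala
import java.net.InetSocketAddress
import java.nio.channels.{SocketChannel, UnresolvedAddressException}

object UnresolvedAddressDemo {
  // createUnresolved skips DNS entirely, which stands in for a peer
  // hostname that this node has no way to resolve.
  val byHostname: InetSocketAddress =
    InetSocketAddress.createUnresolved("27e434cf36ac", 35093)

  // An IP literal is parsed directly and never needs a lookup.
  // The address itself is illustrative.
  val byIp: InetSocketAddress = new InetSocketAddress("172.17.0.11", 35093)

  /** Returns true if connect() rejects `addr` because it is unresolved. */
  def connectRejectsUnresolved(addr: InetSocketAddress): Boolean = {
    val channel = SocketChannel.open()
    try {
      channel.connect(addr) // the address check happens before any network I/O
      false
    } catch {
      case _: UnresolvedAddressException => true
    } finally channel.close()
  }

  def main(args: Array[String]): Unit = {
    println(s"hostname address unresolved: ${byHostname.isUnresolved}")
    println(s"IP address unresolved:       ${byIp.isUnresolved}")
    println(s"connect rejects hostname:    ${connectRejectsUnresolved(byHostname)}")
  }
}
```

The sketch only attempts a connection for the unresolved address, so it does not depend on any peer being reachable.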


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/smola/spark SPARK-4799

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3645.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3645
    
----
commit 45d0356241b852c575fb3b47a20c072c6ef3d7bf
Author: Santiago M. Mola <sm...@stratio.com>
Date:   2014-11-25T15:02:14Z

    [SPARK-4799] Use IP address instead of local hostname in ConnectionManager.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-4799] Use IP address instead of local h...

Posted by smola <gi...@git.apache.org>.
Github user smola commented on the pull request:

    https://github.com/apache/spark/pull/3645#issuecomment-69654075
  
    @pwendell Thanks! #3893 is good for me. I'm closing this PR.




[GitHub] spark pull request: [SPARK-4799] Use IP address instead of local h...

Posted by nikonyrh <gi...@git.apache.org>.
Github user nikonyrh commented on the pull request:

    https://github.com/apache/spark/pull/3645#issuecomment-142319031
  
    Hi, I would like to understand why slaves use SPARK_LOCAL_IP instead of SPARK_PUBLIC_DNS when talking to other slaves. It is also shown in the "Address" column of the Workers table on the Spark Master UI.
    
    Earlier I thought that I could set SPARK_LOCAL_IP to the Docker container's IP and SPARK_PUBLIC_DNS to the host's IP, so that with port forwarding other nodes and the driver would be able to talk to slaves running inside containers. In this setup the internal address isn't accessible except from the container's host OS.




[GitHub] spark pull request: [SPARK-4799] Use IP address instead of local h...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3645#issuecomment-68030262
  
      [Test build #24766 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24766/consoleFull) for   PR 3645 at commit [`45d0356`](https://github.com/apache/spark/commit/45d0356241b852c575fb3b47a20c072c6ef3d7bf).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-4799] Use IP address instead of local h...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/3645#issuecomment-68933786
  
    @smola hey - currently `Utils.localHostName` should respect SPARK_LOCAL_IP if it is set (it will try to find the associated interface). It will do a reverse lookup and find the associated hostname. Could you describe the network set-up on your machines? 




[GitHub] spark pull request: [SPARK-4799] Use IP address instead of local h...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3645#issuecomment-68034273
  
      [Test build #24766 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24766/consoleFull) for   PR 3645 at commit [`45d0356`](https://github.com/apache/spark/commit/45d0356241b852c575fb3b47a20c072c6ef3d7bf).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-4799] Use IP address instead of local h...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3645#issuecomment-68030188
  
    Jenkins, this is ok to test.




[GitHub] spark pull request: [SPARK-4799] Use IP address instead of local h...

Posted by smola <gi...@git.apache.org>.
Github user smola closed the pull request at:

    https://github.com/apache/spark/pull/3645




[GitHub] spark pull request: [SPARK-4799] Use IP address instead of local h...

Posted by smola <gi...@git.apache.org>.
Github user smola commented on the pull request:

    https://github.com/apache/spark/pull/3645#issuecomment-69569760
  
    @pwendell Right. The problem is that there is no way to force the use of a given IP (ignoring reverse lookups or any other hostname/ip detection mechanisms).
    
    I get this on Docker, where the default setup is something like this:
    
    Spark worker:
    - IP: 172.17.0.11
    - Hostname: hashone
    
    Spark driver:
    - IP: 172.17.0.12
    - Hostname: hashtwo
    
    Spark worker cannot resolve `hashtwo` and Spark driver cannot resolve `hashone`. At some point, Spark worker throws an exception because it's trying to resolve `hashtwo` instead of just contacting `172.17.0.12`.
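
The setup described above can be modelled with a small, self-contained sketch. It is a toy model, not Spark or Docker code: each node's name resolution is represented as a plain `Map`, and `dialTarget` is a hypothetical helper that returns the IP a node would end up dialing for whatever identity its peer advertised. The hostnames and IPs are the ones from the example.

```scala
object DockerResolutionSketch {
  // A node's resolver, modelled as hostname -> IP. In the default setup
  // described above, each container only knows its own hostname.
  type Resolver = Map[String, String]

  // Simplified IPv4 literal check; good enough for this illustration.
  private val Ipv4 = """\d{1,3}(\.\d{1,3}){3}""".r

  /** Returns the IP a node would dial for `advertised`, or None if resolution fails. */
  def dialTarget(resolver: Resolver, advertised: String): Option[String] =
    if (Ipv4.pattern.matcher(advertised).matches()) Some(advertised) // IP literal: no lookup
    else resolver.get(advertised)                                    // hostname: needs a lookup

  val workerResolver: Resolver = Map("hashone" -> "172.17.0.11")
  val driverResolver: Resolver = Map("hashtwo" -> "172.17.0.12")

  def main(args: Array[String]): Unit = {
    // The driver advertises its hostname: the worker cannot resolve it.
    println(dialTarget(workerResolver, "hashtwo"))
    // The driver advertises its IP: the worker can dial it directly.
    println(dialTarget(workerResolver, "172.17.0.12"))
  }
}
```

In this model the failure disappears as soon as the advertised identity is an IP, which is the behaviour this PR is after.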





[GitHub] spark pull request: [SPARK-4799] Use IP address instead of local h...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/3645#issuecomment-66324983
  
    Can one of the admins verify this patch?




[GitHub] spark pull request: [SPARK-4799] Use IP address instead of local h...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/3645#issuecomment-68034277
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24766/
    Test PASSed.




[GitHub] spark pull request: [SPARK-4799] Use IP address instead of local h...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3645#issuecomment-68030283
  
    /cc @rxin @aarondav, since this is NIO connection manager related.




[GitHub] spark pull request: [SPARK-4799] Use IP address instead of local h...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/3645#issuecomment-69635199
  
    BTW - I also created this JIRA to try and clean up the way we deal with binding and advertised hostnames in Spark:
    
    https://issues.apache.org/jira/browse/SPARK-5078





[GitHub] spark pull request: [SPARK-4799] Use IP address instead of local h...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/3645#issuecomment-69635091
  
    Yeah, we've also seen this issue in Docker environments. We just merged an alternative solution that allows overriding the reverse DNS lookup; in our deployment we set it directly to the IP.
    
    https://github.com/apache/spark/pull/3893
    
    Is that sufficient for your use case? The benefit with #3893 is that it doesn't change default behavior in the way this patch does.
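
As a rough illustration of the override approach described above, the sketch below picks an advertised host by preferring an explicit override and otherwise falling back to the reverse lookup. It is not the code from #3893: the object, the function `advertisedHost`, and the variable name `HYPOTHETICAL_HOSTNAME_OVERRIDE` are all assumptions made for this example, and the reverse lookup is passed in as a function so the sketch stays self-contained.

```scala
object AdvertisedHostSketch {
  /**
   * Picks the host a node advertises to its peers.
   * `overrideVar` is a hypothetical variable name, not necessarily the one #3893 reads.
   * When the override is unset or empty, the result of `reverseLookup` is used,
   * so the default behaviour stays unchanged.
   */
  def advertisedHost(
      env: Map[String, String],
      reverseLookup: () => String,
      overrideVar: String = "HYPOTHETICAL_HOSTNAME_OVERRIDE"): String =
    env.get(overrideVar).filter(_.nonEmpty).getOrElse(reverseLookup())

  def main(args: Array[String]): Unit = {
    val lookup = () => "hashone" // stands in for the reverse DNS result
    // Override set directly to the IP, as in the deployment described above.
    println(advertisedHost(Map("HYPOTHETICAL_HOSTNAME_OVERRIDE" -> "172.17.0.11"), lookup))
    // No override: falls back to the reverse lookup.
    println(advertisedHost(Map.empty, lookup))
  }
}
```

Because the fallback path is untouched, a design along these lines leaves existing deployments alone unless they opt in.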

