Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:20:13 UTC

[jira] [Updated] (SPARK-13960) JAR/File HTTP Server doesn't respect "spark.driver.host" and there is no "spark.fileserver.host" option

     [ https://issues.apache.org/jira/browse/SPARK-13960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-13960:
---------------------------------
    Labels: bulk-closed  (was: )

> JAR/File HTTP Server doesn't respect "spark.driver.host" and there is no "spark.fileserver.host" option
> -------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-13960
>                 URL: https://issues.apache.org/jira/browse/SPARK-13960
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, Spark Submit
>    Affects Versions: 1.6.1
>         Environment: Any system with more than one IP address
>            Reporter: Ilya Ostrovskiy
>            Priority: Major
>              Labels: bulk-closed
>
> There is no option to specify which hostname/IP address the jar/file server listens on. Rather than honoring "spark.driver.host" when it is set, the jar/file server listens on the system's primary IP address. This is a problem when submitting an application in client mode from a machine with two NICs connected to two different networks.
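>
> A minimal sketch (not Spark's actual code) of the behavior this report asks for: resolve the file server's bind address from a proposed "spark.fileserver.host" option, fall back to "spark.driver.host", and only then to the system's primary address. The option name and the resolveBindAddress helper are hypothetical:
> {noformat}
> import java.net.InetAddress
> import org.apache.spark.SparkConf
>
> // Hypothetical helper: pick the file server's bind address in priority order.
> // "spark.fileserver.host" does not exist in Spark; it is the option this
> // report proposes. "spark.driver.host" is an existing Spark setting.
> def resolveBindAddress(conf: SparkConf): String = {
>   conf.getOption("spark.fileserver.host")               // proposed new option
>     .orElse(conf.getOption("spark.driver.host"))        // existing driver setting
>     .getOrElse(InetAddress.getLocalHost.getHostAddress) // current effective behavior
> }
> {noformat}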
> Steps to reproduce:
> 1) Have a cluster in a remote network, whose master is on 192.168.255.10
> 2) Have a machine at another location with a "primary" IP address of 192.168.1.2, also connected to the "remote network" with the IP address 192.168.255.250. Let's call this the "client machine".
> 3) Ensure every machine in the Spark cluster at the remote location can ping 192.168.255.250 and reach the client machine via that address.
> 4) On the client: 
> {noformat}
> spark-submit --deploy-mode client --conf "spark.driver.host=192.168.255.250" --master spark://192.168.255.10:7077 --class <any valid spark application> <local jar with spark application> <whatever args you want>
> {noformat}
> 5) Navigate to http://192.168.255.250:4040/ and ensure that executors from the remote cluster have found the driver on the client machine
> 6) Navigate to http://192.168.255.250:4040/environment/, and scroll to the bottom
> 7) Observe that the JAR you specified in Step 4 will be listed under http://192.168.1.2:<random port>/jars/<your jar here>.jar
> 8) Enjoy this stack trace periodically appearing on the client machine whenever nodes in the remote cluster can't connect to 192.168.1.2 to fetch your JAR (a minimal illustration follows the trace)
> {noformat}
> 16/03/17 03:25:55 WARN TaskSetManager: Lost task 1.2 in stage 0.0 (TID 5, 192.168.255.11): java.net.SocketTimeoutException: connect timed out
>         at java.net.PlainSocketImpl.socketConnect(Native Method)
>         at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>         at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>         at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>         at java.net.Socket.connect(Socket.java:589)
>         at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
>         at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
>         at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
>         at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
>         at sun.net.www.http.HttpClient.New(HttpClient.java:308)
>         at sun.net.www.http.HttpClient.New(HttpClient.java:326)
>         at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1169)
>         at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1105)
>         at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:999)
>         at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:933)
>         at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:588)
>         at org.apache.spark.util.Utils$.fetchFile(Utils.scala:381)
>         at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:405)
>         at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:397)
>         at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
>         at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>         at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>         at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
>         at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
>         at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
>         at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
>         at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:397)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:193)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
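>
> For illustration only: the timeout above can be reproduced outside Spark with a plain HttpURLConnection against the advertised-but-unreachable address. The port, jar name, and timeout below are placeholder values, not taken from this report:
> {noformat}
> import java.net.{HttpURLConnection, URL}
>
> // From the remote cluster's network, 192.168.1.2 is unreachable, so
> // connect() blocks until the timeout fires, matching the
> // java.net.SocketTimeoutException in the trace above.
> val url = new URL("http://192.168.1.2:8080/jars/app.jar") // placeholder port/jar
> val conn = url.openConnection().asInstanceOf[HttpURLConnection]
> conn.setConnectTimeout(10000) // 10 seconds, arbitrary for the demo
> try {
>   conn.connect() // throws SocketTimeoutException: connect timed out
> } finally {
>   conn.disconnect()
> }
> {noformat}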



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org