Posted to user@spark.apache.org by Andrew Ash <an...@andrewash.com> on 2014/06/09 08:26:25 UTC
Re: Comprehensive Port Configuration reference?
Hi Jacob,
The port configuration docs that we worked on together are now available
at:
http://spark.apache.org/docs/latest/spark-standalone.html#configuring-ports-for-network-security
Thanks for the help!
Andrew
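For anyone writing firewall rules against a standalone cluster, the page linked above includes a table of which ports are configurable. As a hedged sketch only (property names and availability should be verified against the port table for your Spark version; the values below are arbitrary), a spark-defaults.conf pinning otherwise-ephemeral ports might look like:

```
# spark-defaults.conf -- pin otherwise-random ports to fixed values so
# firewall rules can reference them. Check the linked docs table for the
# exact property names supported by your Spark release.
spark.driver.port        7100
spark.history.ui.port    18080
```

Any dynamic port left unset is chosen randomly at startup, which is exactly what makes static firewall rules hard to write.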
On Wed, May 28, 2014 at 3:21 PM, Jacob Eisinger <je...@us.ibm.com> wrote:
> Howdy Andrew,
>
> This is a standalone cluster. And, yes, if my understanding of Spark
> terminology is correct, you are correct about the port ownerships.
>
>
> Jacob
>
> Jacob D. Eisinger
> IBM Emerging Technologies
> jeising@us.ibm.com - (512) 286-6075
>
>
>
> From: Andrew Ash <an...@andrewash.com>
> To: user@spark.apache.org
> Date: 05/28/2014 05:18 PM
>
> Subject: Re: Comprehensive Port Configuration reference?
> ------------------------------
>
>
>
> Hmm, those do look like 4 listening ports to me. PID 3404 is an executor
> and PID 4762 is a worker? This is a standalone cluster?
>
>
> On Wed, May 28, 2014 at 8:22 AM, Jacob Eisinger <jeising@us.ibm.com> wrote:
>
> Howdy Andrew,
>
> Here is what I ran before an application context was created (other
> services have been deleted):
>
> # netstat -l -t tcp -p --numeric-ports
> Active Internet connections (only servers)
> Proto Recv-Q Send-Q Local Address          Foreign Address  State   PID/Program name
> tcp6       0      0 10.90.17.100:8888      :::*             LISTEN  4762/java
> tcp6       0      0 :::8081                :::*             LISTEN  4762/java
>
> And, then while the application context is up:
> # netstat -l -t tcp -p --numeric-ports
> Active Internet connections (only servers)
> Proto Recv-Q Send-Q Local Address          Foreign Address  State   PID/Program name
> tcp6       0      0 10.90.17.100:8888      :::*             LISTEN  4762/java
> tcp6       0      0 :::57286               :::*             LISTEN  3404/java
> tcp6       0      0 10.90.17.100:38118     :::*             LISTEN  3404/java
> tcp6       0      0 10.90.17.100:35530     :::*             LISTEN  3404/java
> tcp6       0      0 :::60235               :::*             LISTEN  3404/java
> tcp6       0      0 :::8081                :::*             LISTEN  4762/java
>
> My understanding is that this says four ports are open. Are 57286 and
> 60235 not being used?
>
>
> Jacob
>
> Jacob D. Eisinger
> IBM Emerging Technologies
> jeising@us.ibm.com - (512) 286-6075
>
>
>
> From: Andrew Ash <andrew@andrewash.com>
> To: user@spark.apache.org
> Date: 05/25/2014 06:25 PM
>
> Subject: Re: Comprehensive Port Configuration reference?
> ------------------------------
>
>
>
> Hi Jacob,
>
> The config option spark.history.ui.port is new for 1.0. The problem
> that the History Server solves is that in non-standalone cluster deployment
> modes (Mesos and YARN) there is no long-lived Spark Master that can store
> logs and statistics about an application after it finishes. The History
> Server is the UI that renders logged data from applications after they complete.
>
> Read more here: https://issues.apache.org/jira/browse/SPARK-1276 and
> https://github.com/apache/spark/pull/204
>
> As for the two vs. four dynamic ports: are those all listening
> ports? I did observe four ports in use, but only two of them were
> listening. The other two were the random ephemeral ports used for
> outbound connections, i.e. the source port of the (srcIP, srcPort,
> dstIP, dstPort) tuple that uniquely identifies a TCP socket.
>
>
> http://unix.stackexchange.com/questions/75011/how-does-the-server-find-out-what-client-port-to-send-to
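[The ephemeral source-port behavior Andrew describes can be seen with a small, self-contained Python sketch, unrelated to Spark itself: the client never binds a port explicitly, yet a netstat on the machine would show a random source port in use for its connection.]

```python
import socket

# A server socket listening on an OS-assigned port on localhost.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
server.listen(1)
server_port = server.getsockname()[1]

# The client connects without binding a port itself; the OS assigns an
# ephemeral source port to complete the (srcIP, srcPort, dstIP, dstPort)
# tuple that uniquely identifies the TCP connection.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", server_port))
client_port = client.getsockname()[1]

print(f"server listens on {server_port}; client's ephemeral source port is {client_port}")

client.close()
server.close()
```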
>
> Thanks for taking a look through!
>
> I also realized that I had made a couple of mistakes in the 0.9 to 1.0
> transition, so I have documented those as well in the updated PR.
>
> Cheers!
> Andrew
>
>
>
> On Fri, May 23, 2014 at 2:43 PM, Jacob Eisinger <jeising@us.ibm.com> wrote:
> Howdy Andrew,
>
> I noticed you have a configuration item that we were not aware of:
> spark.history.ui.port. Is that new for 1.0?
>
> Also, we noticed that the Workers and the Drivers were opening up
> four dynamic ports per application context. It looks like you were seeing
> two.
>
> Everything else looks like it aligns!
> Jacob
>
>
>
> Jacob D. Eisinger
> IBM Emerging Technologies
> jeising@us.ibm.com - (512) 286-6075
>
>
> From: Andrew Ash <andrew@andrewash.com>
> To: user@spark.apache.org
> Date: 05/23/2014 10:30 AM
> Subject: Re: Comprehensive Port Configuration reference?
>
> ------------------------------
>
>
>
> Hi everyone,
>
> I've also been interested in better understanding which ports are
> used where and the direction the network connections go. I've observed
> a running cluster and read through the code, and came up with the
> documentation addition below.
>
> https://github.com/apache/spark/pull/856
>
> Scott and Jacob -- it sounds like you two have pulled together some
> of this yourselves for writing firewall rules. Would you mind taking a
> look at this pull request and confirming that it matches your observations?
> Wrong documentation is worse than no documentation, so I'd like to make
> sure this is right.
>
> Cheers,
> Andrew
>
>
> On Wed, May 7, 2014 at 10:19 AM, Mark Baker <distobj@acm.org> wrote:
> On Tue, May 6, 2014 at 9:09 AM, Jacob Eisinger <jeising@us.ibm.com> wrote:
> > In a nutshell, Spark opens up a couple of well-known ports. And then
> > the workers and the shell open up dynamic ports for each job. These
> > dynamic ports make securing the Spark network difficult.
>
> Indeed.
>
> Judging by the frequency with which this topic arises, this is a
> concern for many (myself included).
>
> I couldn't find anything in JIRA about it, but I'm curious to know
> whether the Spark team considers this a problem in need of a fix.
>
> Mark.
>