Posted to user@spark.apache.org by "☼ R Nair (रविशंकर नायर)" <ra...@gmail.com> on 2018/03/09 03:09:58 UTC
Spark production scenario
Hi all,
We are going to move to production with an 8 node Spark cluster. Request
some help for below
We are running with YARN as the cluster manager. That means YARN is installed with
SSH between the nodes. When we run a standalone Spark program with
spark-submit, YARN initializes a ResourceManager, followed by an ApplicationMaster
per application. The ApplicationMaster is allocated randomly, on an arbitrary port.
So, in a production implementation, would we have to open all ports between the
nodes?
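(For anyone searching the archive later: rather than opening all ports, Spark can be told to listen on fixed base ports, so the firewall only needs a known range. A minimal spark-defaults.conf sketch; the port numbers below are arbitrary example choices, not defaults:

```properties
# spark-defaults.conf -- pin Spark's otherwise-random ports
# (40000/40010 are example values chosen for illustration)
spark.driver.port         40000
spark.blockManager.port   40010
spark.port.maxRetries     32
```

With spark.port.maxRetries=32, each service may fall back to up to 32 ports above its base if the base port is taken, so the firewall would need roughly 40000-40032 and 40010-40042 open between the nodes.)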
Best,
Passion
Re: Spark production scenario
Posted by yncxcw <yn...@gmail.com>.
hi, Passion
I don't know of an exact solution. But yes, the port each executor chooses to
communicate with the driver is random. I am wondering if you could give each
node two Ethernet cards, configure one card for an intranet for Spark and the
other for the WAN, and then connect the rest of the nodes over the intranet.
Also, I think you should avoid using the WAN for Spark data transfer, since the
amount of data moved during a shuffle is huge. You really want a high-speed
switch for your cluster.
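(If you do go the dual-NIC route, Spark can be told which interface to bind to via spark-env.sh. A sketch, where 10.0.0.5 is a hypothetical intranet address for one node:

```shell
# conf/spark-env.sh -- bind this node's Spark services to the intranet interface
# 10.0.0.5 is a made-up example address; use each node's own intranet IP
SPARK_LOCAL_IP=10.0.0.5
```

On Spark 2.1+ there is also the spark.driver.bindAddress property for controlling which address the driver binds to, separately from the hostname it advertises.)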
Hope this answer helps!
Wei
--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org