Posted to user@spark.apache.org by bhushansc007 <bh...@gmail.com> on 2015/03/31 18:56:01 UTC

How to set up a Spark Cluster?

Hi All,

I am quite new to Spark, so please pardon me if this is a very basic question.

I have set up a Hadoop cluster using Hortonworks' Ambari. It has 1 master and
3 worker nodes. Currently, it has the HDFS, YARN, MapReduce2, HBase and
ZooKeeper services installed.

Now I want to install Spark on it. How do I do that? I have searched a lot
online, but there is no clear step-by-step installation guide for this setup.
All I can find are standalone setup guides. Can someone provide the steps?
What needs to be copied to each machine, and which config changes should be
made where?

Thanks.



Re: How to set up a Spark Cluster?

Posted by amghost <zh...@gmail.com>.
First of all, you should download a pre-built version of Spark for Hadoop
that matches the version of Hadoop you are using.

Next, you may find useful information in the Spark docs:
http://spark.apache.org/docs/latest/running-on-yarn.html

There is also some content about deploying in cluster mode in Spark's
programming guide, which may be useful too.
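
For example, assuming HADOOP_CONF_DIR points at your cluster's Hadoop
configuration (Ambari typically keeps it under /etc/hadoop/conf), a test
submission to YARN could look roughly like the sketch below; the examples
jar name depends on the exact Spark build you downloaded:

export HADOOP_CONF_DIR=/etc/hadoop/conf
# run the bundled SparkPi example in yarn-cluster mode (Spark 1.3 syntax)
./bin/spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  --num-executors 3 \
  --executor-memory 1g \
  lib/spark-examples-1.3.0-hadoop2.4.0.jar 10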




Re: How to set up a Spark Cluster?

Posted by Akhil Das <ak...@sigmoidanalytics.com>.
It's pretty simple: pick one machine as the master (say machine A), and let's
call the workers B, C, and D.

*Log in to A:*

- Enable passwordless SSH authentication (ssh-keygen), as sketched below
   - Add A's ~/.ssh/id_rsa.pub to B, C, and D's ~/.ssh/authorized_keys file
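
A minimal sketch of that step, assuming you can already reach each worker
over SSH as the same user:

# on A: generate a key pair with no passphrase (skip if ~/.ssh/id_rsa exists)
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
# append A's public key to each worker's authorized_keys
for host in B C D; do
  ssh-copy-id "$host"
done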

- Download a Spark binary that supports your Hadoop version from
https://spark.apache.org/downloads.html (e.g. wget
http://d3kbcqa49mib13.cloudfront.net/spark-1.3.0-bin-hadoop2.4.tgz)
- Extract it (tar xf spark*tgz)
- cd spark-1.3.0-bin-hadoop2.4; cp conf/spark-env.sh.template
conf/spark-env.sh
- vi conf/spark-env.sh : Configure SPARK_MASTER_IP, SPARK_WORKER_CORES and
SPARK_WORKER_MEMORY according to the resources you have (see the example
just below)
- vi conf/slaves : Add B, C, and D's hostnames/IP addresses, one per line
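
For illustration, on workers with 4 cores and 8 GB of RAM the two files
might look like this (hostnames are placeholders; leave headroom for the OS
and the other Hadoop services):

# conf/spark-env.sh
SPARK_MASTER_IP=A        # hostname or IP of the master
SPARK_WORKER_CORES=4     # cores each worker may give to Spark
SPARK_WORKER_MEMORY=6g   # memory each worker may give to Spark

# conf/slaves -- one worker per line
B
C
D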

- cd ..
- rsync -za spark-1.3.0-bin-hadoop2.4 B:
- rsync -za spark-1.3.0-bin-hadoop2.4 C:
- rsync -za spark-1.3.0-bin-hadoop2.4 D:
- cd spark-1.3.0-bin-hadoop2.4; sbin/start-all.sh
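
To verify the cluster came up, the master's web UI (http://A:8080 by
default) should list all three workers, and a quick smoke test along these
lines should run (spark://A:7077 is the default standalone master URL; the
examples jar name depends on your build):

./bin/spark-submit --class org.apache.spark.examples.SparkPi \
  --master spark://A:7077 \
  lib/spark-examples-1.3.0-hadoop2.4.0.jar 10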

Now your cluster is up and running; just be careful with your firewall
entries. If you open up all ports, then anyone can take over your cluster.
:) Read more: https://www.sigmoid.com/securing-apache-spark-cluster/
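
As a rough illustration (iptables syntax; 7077 and 8080 are Spark's default
master and web UI ports, and 10.0.0.0/24 stands in for your cluster's
subnet), you would allow only trusted hosts:

# allow the cluster subnet to reach the master port and its web UI
iptables -A INPUT -p tcp -s 10.0.0.0/24 --dport 7077 -j ACCEPT
iptables -A INPUT -p tcp -s 10.0.0.0/24 --dport 8080 -j ACCEPT
# drop everyone else on those ports
iptables -A INPUT -p tcp --dport 7077 -j DROP
iptables -A INPUT -p tcp --dport 8080 -j DROP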

Thanks
Best Regards
