Posted to user@spark.apache.org by tdelacour <td...@seas.upenn.edu> on 2016/02/23 20:28:27 UTC

Spark standalone peer2peer network

Some teammates and I are trying to create a Spark cluster across ordinary
MacBooks. We were wondering if there is any precedent or guide for doing
this, as our internet searches have not been conclusive. So far, all of our
attempts to use standalone mode have failed. We suspect that this has
something to do with the difficulty of working with IPv4 and NAT.
Apologies for the lack of concrete questions, but this is a little out of
our depth.

Additional questions:
- Can anyone confirm that we need passwordless SSH set up between nodes in
the standalone cluster?
- Is IPv6 an option for this endeavor?

Any general direction would be very helpful!

Thanks in advance,
Thomas



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-standalone-peer2peer-network-tp26308.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Spark standalone peer2peer network

Posted by tdelacour <td...@seas.upenn.edu>.
Thank you for your quick reply!

We have been able to get the master running, log in to the web UI, and
start a slave on the same machine as the master. That slave appears in the
master UI. However, when we start a slave on a different machine and point
it at the master's spark://<host>:7077 URL, it does not show up in the
master UI. So we tried setting SPARK_MASTER_IP to the IP address of the
master, which gave us a new master URL.
log.txt
<http://apache-spark-user-list.1001560.n3.nabble.com/file/n26311/log.txt>

My teammates also sent me the attached logfile. Looking at it myself, the
hostname looks suspicious. It looks like this is a url that references a
locally running master, not one that can be accessed over the web. I might
have to try this out myself and get back to you. Is there a setting that can
dictate whether or not the cluster is run locally? My teammates say no, but,
again, I find this logfile to be sort of suspicious. 
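
One way to test this suspicion is to inspect the master URL from the log and
try a plain TCP connection to it from the second machine. The following is a
minimal Python sketch, not part of Spark: the helper names (parse_master_url,
is_loopback, can_connect) are made up for illustration, and the address in
the usage comment is a placeholder. If the host in the URL is a loopback
address, or the port cannot be reached from the other machine, a slave
running there cannot register through that URL.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def parse_master_url(master_url):
    """Split a spark://<host>:<port> URL into (host, port)."""
    parsed = urlparse(master_url)
    if parsed.scheme != "spark" or not parsed.hostname or parsed.port is None:
        raise ValueError("expected spark://<host>:<port>, got %r" % master_url)
    return parsed.hostname, parsed.port


def is_loopback(host):
    """True if host is 'localhost' or a loopback IP literal.

    Other hostnames are not resolved here, so they return False.
    """
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False


def can_connect(host, port, timeout=3.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Hypothetical usage, run on the machine where the slave should start:
#   host, port = parse_master_url("spark://192.168.1.10:7077")
#   print(is_loopback(host), can_connect(host, port))
```

Running can_connect from the second MacBook against the master's LAN address
helps separate a network problem (NAT, firewall) from a Spark configuration
problem.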





Re: Spark standalone peer2peer network

Posted by Gourav Sengupta <go...@gmail.com>.
Hi,

Setting up passwordless SSH access to your laptop may be a personal security
risk. I would suggest installing Ubuntu in VirtualBox and setting the
networking option to Bridged, so that each VM is directly reachable on the
local network.

To set up passwordless SSH, see the following steps (source:
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/):

user@ubuntu:~$ su - hduser
hduser@ubuntu:~$ ssh-keygen -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hduser/.ssh/id_rsa):
Created directory '/home/hduser/.ssh'.
Your identification has been saved in /home/hduser/.ssh/id_rsa.
Your public key has been saved in /home/hduser/.ssh/id_rsa.pub.
The key fingerprint is:
9b:82:ea:58:b4:e0:35:d7:ff:19:66:a6:ef:ae:0e:d2 hduser@ubuntu
The key's randomart image is:
[...snipp...]
hduser@ubuntu:~$

After this, follow the instructions at
http://spark.apache.org/docs/latest/spark-standalone.html to set up a Spark
cluster with a single master node and several slaves connecting to it.

One word of caution, though: you will not be using all the nodes in the
cluster if the file path that you reference exists, or is available, on only
one machine.


Regards,
Gourav Sengupta

On Tue, Feb 23, 2016 at 8:39 PM, Robineast <Ro...@xense.co.uk> wrote:

> Hi Thomas
>
> I can confirm that I have had this working in the past. I'm pretty sure you
> don't need passwordless SSH for running a standalone cluster manually. Try
> following the instructions at
> http://spark.apache.org/docs/latest/spark-standalone.html under "Starting a
> Cluster Manually".
>
> Do you get the master running, and are you able to log in to the web UI?
> Get the spark://<host>:7077 URL and start a slave on the same machine as
> the master. Do you see the slave appear in the master web UI? If so, can
> you run spark-shell by connecting to the master?
>
> Now start a slave on another machine. Do you see the new slave in the
> master web UI?
>
>
>
>
>
> -----
> Robin East
> Spark GraphX in Action Michael Malak and Robin East
> Manning Publications Co.
> http://www.manning.com/books/spark-graphx-in-action
>

Re: Spark standalone peer2peer network

Posted by Robineast <Ro...@xense.co.uk>.
Hi Thomas

I can confirm that I have had this working in the past. I'm pretty sure you
don't need passwordless SSH for running a standalone cluster manually. Try
following the instructions at
http://spark.apache.org/docs/latest/spark-standalone.html under "Starting a
Cluster Manually".

Do you get the master running, and are you able to log in to the web UI?
Get the spark://<host>:7077 URL and start a slave on the same machine as the
master. Do you see the slave appear in the master web UI? If so, can you run
spark-shell by connecting to the master?

Now start a slave on another machine. Do you see the new slave in the master
web UI?
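
To answer those last two questions without opening a browser, the list of
registered slaves can also be read programmatically. The sketch below assumes
that the standalone master's web UI serves a JSON summary at
<web-ui-url>/json containing a "workers" list whose entries have "host" and
"state" fields. That layout is an assumption to verify against your Spark
version, and the function names and any sample addresses are hypothetical.

```python
import json
from urllib.request import urlopen


def alive_worker_hosts(status_json):
    """Return the sorted hosts of workers reported as ALIVE.

    status_json is the text of the master's JSON summary (assumed layout:
    {"workers": [{"host": ..., "state": ...}, ...]}).
    """
    status = json.loads(status_json)
    workers = status.get("workers", [])
    return sorted({w["host"] for w in workers if w.get("state") == "ALIVE"})


def fetch_master_status(master_ui_url, timeout=5.0):
    """Download the JSON summary from the master web UI (assumed /json path)."""
    with urlopen(master_ui_url.rstrip("/") + "/json", timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Hypothetical usage, with a placeholder address for the master web UI:
#   hosts = alive_worker_hosts(fetch_master_status("http://192.168.1.10:8080"))
#   print(hosts)
```

If the slave on the second machine registered successfully, its host should
appear in the returned list alongside the master's own machine.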





-----
Robin East 
Spark GraphX in Action Michael Malak and Robin East 
Manning Publications Co. 
http://www.manning.com/books/spark-graphx-in-action
