Posted to user@spark.apache.org by Manoj Samel <ma...@gmail.com> on 2014/01/21 06:02:27 UTC

RDD action hangs on a standalone mode cluster

Hi,

I configured a Spark 0.8.1 cluster on AWS with one master node and 3 worker
nodes. The cluster was set up in standalone mode following
http://spark.incubator.apache.org/docs/latest/spark-standalone.html

The distribution was generated, and the master was started on the master host
with ./bin/start-master.sh. Then on each of the worker nodes, I cd-ed into the
spark-distro directory and ran
./spark-class org.apache.spark.deploy.worker.Worker spark://IPxxxx:7077

In the browser, on the master's port 8080, I can see the 3 worker nodes as ALIVE.

Next I start a Spark shell on the master node with
MASTER=spark://IPxxx:7077 ./spark-shell
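
As a sanity check, sc.master in the REPL shows the master URL the SparkContext
was created with; on a correctly connected shell it should print something like:

scala> sc.master
res0: String = spark://IPxxx:7077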

In the shell I create a simple RDD on a local text file with a few lines and
call countByKey(). The shell hangs; hitting Ctrl-C gives:

scala> credit.countByKey()
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:485)
at org.apache.spark.scheduler.JobWaiter.awaitResult(JobWaiter.scala:73)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:318)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:840)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:909)
at org.apache.spark.rdd.RDD.reduce(RDD.scala:654)
at org.apache.spark.rdd.RDD.countByValue(RDD.scala:752)
at org.apache.spark.rdd.PairRDDFunctions.countByKey(PairRDDFunctions.scala:198)
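
For reference, the RDD was built roughly along the lines of the sketch below;
the file path and the comma-split parsing here are simplified placeholders,
and all the trace really shows is countByKey() being called on a pair RDD.

// Minimal sketch, assuming `credit` comes from a small local text file
// of comma-separated key,value lines; the parsing is illustrative only.
val lines = sc.textFile("/path/to/credit.txt")  // placeholder path
val credit = lines.map { line =>
  val fields = line.split(",")
  (fields(0), fields(1))                        // (key, value) pair RDD
}
credit.countByKey()                             // hangs on the cluster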

Note - the same code works in a local shell (without a master).

Any pointers? Do I have to set up any other network or login configuration?
Note that I am *** NOT *** starting the slaves from the master node (via
bin/start-slaves.sh) and thus have not set up passwordless ssh logins etc.

Re: RDD action hangs on a standalone mode cluster

Posted by Manoj Samel <ma...@gmail.com>.
I missed mentioning that the distribution was generated on the master by
running make-distribution.sh, and the dist directory was then scp-ed to all
worker nodes. Thus the worker nodes only have the dist directory.

