Posted to user@spark.apache.org by Soumya Simanta <so...@gmail.com> on 2014/02/05 16:19:42 UTC

Problem connecting to Spark Cluster from a standalone Scala program

I'm running a Spark cluster (Spark-0.9.0_SNAPSHOT).

I connect to the Spark cluster from the spark-shell. I can see the Spark
web UI on n001:8080 and it shows that the master is running on
spark://n001:7077.


However, when I try to connect to it from a standalone Scala program, I get
an exception which says that I cannot connect to the master.


This is how I'm creating the Spark context in my Scala program.

                val sc = new SparkContext("spark://n001:7077", "Simple Twitter Analysis",
                  "/home/myuserid/incubator-spark",
                  List("target/scala-2.10/simple-project_2.10-1.0.jar"))
And this is the exception I'm getting.

14/02/05 10:05:20 INFO scheduler.DAGScheduler: Submitting Stage 1 (MapPartitionsRDD[4] at reduceByKey at SimpleApp.scala:14), which has no missing parents

14/02/05 10:05:20 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from Stage 1 (MapPartitionsRDD[4] at reduceByKey at SimpleApp.scala:14)

14/02/05 10:05:20 INFO scheduler.TaskSchedulerImpl: Adding task set 1.0 with 2 tasks

14/02/05 10:05:35 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

14/02/05 10:05:39 INFO client.AppClient$ClientActor: Connecting to master spark://n001:7077...

14/02/05 10:05:50 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

14/02/05 10:05:59 INFO client.AppClient$ClientActor: Connecting to master spark://n001:7077...

14/02/05 10:06:05 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

14/02/05 10:06:19 ERROR client.AppClient$ClientActor: All masters are unresponsive! Giving up.

14/02/05 10:06:19 ERROR cluster.SparkDeploySchedulerBackend: Spark cluster looks dead, giving up.

14/02/05 10:06:19 INFO scheduler.TaskSchedulerImpl: Remove TaskSet 1.0 from pool

14/02/05 10:06:19 INFO scheduler.DAGScheduler: Failed to run count at SimpleApp.scala:16

[error] (run-main) org.apache.spark.SparkException: Job aborted: Spark cluster looks down

org.apache.spark.SparkException: Job aborted: Spark cluster looks down

at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1028)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1026)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1026)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:619)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:619)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:619)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:207)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:262)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:975)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1478)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)

[trace] Stack trace suppressed: run last compile:run for the full output.

14/02/05 10:06:19 INFO network.ConnectionManager: Selector thread was interrupted!

java.lang.RuntimeException: Nonzero exit code: 1

at scala.sys.package$.error(package.scala:27)

[trace] Stack trace suppressed: run last compile:run for the full output.

[error] (compile:run) Nonzero exit code: 1

[error] Total time: 63 s, completed Feb 5, 2014 10:06:19 AM

Re: Problem connecting to Spark Cluster from a standalone Scala program

Posted by Soumya Simanta <so...@gmail.com>.
Any idea why I'm getting this error in my

logs/spark-ssimanta-org.apache.spark.deploy.master.Master-1-xxx.xxx.xxx.xxx.out

java.io.OptionalDataException

        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1369)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
        at scala.collection.immutable.$colon$colon.readObject(List.scala:366)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:622)
        at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1001)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1892)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1797)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1349)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
        at scala.collection.immutable.$colon$colon.readObject(List.scala:366)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:622)
        at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1001)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1892)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1797)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1349)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
        at scala.collection.immutable.$colon$colon.readObject(List.scala:366)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:622)
        at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1001)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1892)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1797)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1349)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
        at scala.collection.immutable.$colon$colon.readObject(List.scala:366)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:622)
        at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1001)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1892)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1797)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1349)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
        at scala.collection.immutable.$colon$colon.readObject(List.scala:366)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:622)
        at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1001)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1892)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1797)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1349)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1914)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1797)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1349)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1914)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1797)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1349)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1914)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1797)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1349)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
        at akka.serialization.JavaSerializer$anonfun$1.apply(Serializer.scala:136)
        at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
        at akka.serialization.JavaSerializer.fromBinary(Serializer.scala:136)
        at akka.serialization.Serialization$anonfun$deserialize$1.apply(Serialization.scala:104)
        at scala.util.Try$.apply(Try.scala:161)
        at akka.serialization.Serialization.deserialize(Serialization.scala:98)
        at akka.remote.serialization.MessageContainerSerializer.fromBinary(MessageContainerSerializer.scala:58)
        at akka.serialization.Serialization$anonfun$deserialize$1.apply(Serialization.scala:104)
        at scala.util.Try$.apply(Try.scala:161)
        at akka.serialization.Serialization.deserialize(Serialization.scala:98)
        at akka.remote.MessageSerializer$.deserialize(MessageSerializer.scala:23)
        at akka.remote.DefaultMessageDispatcher.payload$lzycompute$1(Endpoint.scala:55)
        at akka.remote.DefaultMessageDispatcher.payload$1(Endpoint.scala:55)
        at akka.remote.DefaultMessageDispatcher.dispatch(Endpoint.scala:73)
        at akka.remote.EndpointReader$anonfun$receive$2.applyOrElse(Endpoint.scala:764)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
        at akka.actor.ActorCell.invoke(ActorCell.scala:456)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
        at akka.dispatch.Mailbox.run(Mailbox.scala:219)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
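(A java.io.OptionalDataException while Akka deserializes a message on the
master is often a sign that the application was compiled against a different
Spark build than the one the cluster is running, so it may be worth
confirming the app's build points at the same 0.9.0 snapshot. A sketch of
such a build.sbt follows; the exact artifact coordinates are an assumption
for a locally published snapshot build:)

    name := "simple-project"

    version := "1.0"

    scalaVersion := "2.10.3"

    // Must match the Spark build deployed on the cluster; a mismatch can
    // surface as serialization errors like OptionalDataException.
    libraryDependencies += "org.apache.spark" %% "spark-core" % "0.9.0-incubating-SNAPSHOT"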

and then I get a bunch of the following in my master log as well.

akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@n001.x.y.z:45329]

Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: n001.x.y.z/xx.xx.xx.xx:45329
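(The "Connection refused" here is the master failing to connect back to the
driver's listening port, so basic TCP reachability in both directions is
worth ruling out, firewalls included. Below is a minimal sketch of such a
probe; the hostname is a placeholder for the real master, and the same check
should also be run from the master toward the driver machine:)

    import java.net.{InetSocketAddress, Socket}

    object PortCheck {
      // True if a TCP connection to host:port succeeds within timeoutMs.
      def reachable(host: String, port: Int, timeoutMs: Int = 5000): Boolean = {
        val socket = new Socket()
        try {
          socket.connect(new InetSocketAddress(host, port), timeoutMs)
          true
        } catch {
          case _: java.io.IOException => false
        } finally {
          socket.close()
        }
      }

      def main(args: Array[String]) {
        println("master reachable: " + reachable("n001.etc.sei.cmu.edu", 7077))
      }
    }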


On Wed, Feb 5, 2014 at 3:56 PM, Soumya Simanta <so...@gmail.com> wrote:

> I've tried doing both in my SparkContext and it still doesn't work.
>
>
> On Wed, Feb 5, 2014 at 3:49 PM, Andrew Ash <an...@andrewash.com> wrote:
>
>> The URL in your screenshot is a full hostname, and doesn't match what you
>> typed earlier.  Try using spark://n0001.etc.sei.cmu.edu:7077 rather than
>> just spark://n0001:7077 -- the underlying akka library is really sensitive
>> to exact matches on this spark URL.
>>
>>
>> On Wed, Feb 5, 2014 at 12:39 PM, Soumya Simanta <soumya.simanta@gmail.com> wrote:
>>
>>>
>>>
>>>
>>> On Wed, Feb 5, 2014 at 3:08 PM, Andrew Ash <an...@andrewash.com> wrote:
>>>
>>>> When you look in the web UI (port 8080) for the master, does it list at
>>>> least one connected worker?
>>>>
>>>>
>>> Yeah, I see all four of my workers and their state is active. Please see
>>> the attached screenshot as well.
>>>
>>> [image: Inline image 1]
>>>
>>>
>>
>

Re: Problem connecting to Spark Cluster from a standalone Scala program

Posted by Soumya Simanta <so...@gmail.com>.
I've tried doing both in my SparkContext and it still doesn't work.


On Wed, Feb 5, 2014 at 3:49 PM, Andrew Ash <an...@andrewash.com> wrote:

> The URL in your screenshot is a full hostname, and doesn't match what you
> typed earlier.  Try using spark://n0001.etc.sei.cmu.edu:7077 rather than
> just spark://n0001:7077 -- the underlying akka library is really sensitive
> to exact matches on this spark URL.
>
>
> On Wed, Feb 5, 2014 at 12:39 PM, Soumya Simanta <so...@gmail.com> wrote:
>
>>
>>
>>
>> On Wed, Feb 5, 2014 at 3:08 PM, Andrew Ash <an...@andrewash.com> wrote:
>>
>>> When you look in the web UI (port 8080) for the master, does it list at
>>> least one connected worker?
>>>
>>>
>> Yeah, I see all four of my workers and their state is active. Please see
>> the attached screenshot as well.
>>
>> [image: Inline image 1]
>>
>>
>

Re: Problem connecting to Spark Cluster from a standalone Scala program

Posted by Andrew Ash <an...@andrewash.com>.
The URL in your screenshot is a full hostname, and doesn't match what you
typed earlier.  Try using spark://n0001.etc.sei.cmu.edu:7077 rather than
just spark://n0001:7077 -- the underlying akka library is really sensitive
to exact matches on this spark URL.
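In other words, something like this, keeping the rest of your constructor
arguments as they are (a sketch based on the snippet from your first
message):

    val sc = new SparkContext("spark://n0001.etc.sei.cmu.edu:7077", "Simple Twitter Analysis",
      "/home/myuserid/incubator-spark",
      List("target/scala-2.10/simple-project_2.10-1.0.jar"))

The safest thing is to copy the spark:// URL character for character from
the top of the master's web UI.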


On Wed, Feb 5, 2014 at 12:39 PM, Soumya Simanta <so...@gmail.com> wrote:

>
>
>
> On Wed, Feb 5, 2014 at 3:08 PM, Andrew Ash <an...@andrewash.com> wrote:
>
>> When you look in the web UI (port 8080) for the master, does it list at
>> least one connected worker?
>>
>>
> Yeah, I see all four of my workers and their state is active. Please see
> the attached screenshot as well.
>
> [image: Inline image 1]
>
>

Re: Problem connecting to Spark Cluster from a standalone Scala program

Posted by Soumya Simanta <so...@gmail.com>.
On Wed, Feb 5, 2014 at 3:08 PM, Andrew Ash <an...@andrewash.com> wrote:

> When you look in the web UI (port 8080) for the master, does it list at
> least one connected worker?
>
>
Yeah, I see all four of my workers and their state is active. Please see the
attached screenshot as well.

[image: Inline image 1]

Re: Problem connecting to Spark Cluster from a standalone Scala program

Posted by Andrew Ash <an...@andrewash.com>.
When you look in the web UI (port 8080) for the master, does it list at
least one connected worker?


On Wed, Feb 5, 2014 at 7:19 AM, Soumya Simanta <so...@gmail.com> wrote:

> I'm running a Spark cluster (Spark-0.9.0_SNAPSHOT).
>
> I connect to the Spark cluster from the spark-shell. I can see the Spark
> web UI on n001:8080 and it shows that the master is running on
> spark://n001:7077.
>
> [...]