Posted to user@spark.apache.org by Jeff Higgens <je...@gmail.com> on 2014/01/02 00:36:23 UTC

Re: Running Spark jar on EC2

Thanks for the suggestions.

Unfortunately I am still unable to run my fat jar on EC2 (even when using
runExample, and with SPARK_CLASSPATH left blank). Here is the full output:

[root@ip-172-31-21-60 ~]$ java -jar Crunch-assembly-0.0.1.jar
14/01/01 22:34:40 INFO slf4j.Slf4jEventHandler: Slf4jEventHandler started
14/01/01 22:34:40 INFO spark.SparkEnv: Registering BlockManagerMaster
14/01/01 22:34:40 INFO storage.MemoryStore: MemoryStore started with
capacity 1093.6 MB.
14/01/01 22:34:41 INFO storage.DiskStore: Created local directory at
/tmp/spark-local-20140101223440-a6bb
14/01/01 22:34:41 INFO network.ConnectionManager: Bound socket to port
56274 with id = ConnectionManagerId(ip-172-31-21-60,56274)
14/01/01 22:34:41 INFO storage.BlockManagerMaster: Trying to register
BlockManager
14/01/01 22:34:41 INFO storage.BlockManagerMaster: Registered BlockManager
14/01/01 22:34:41 INFO server.Server: jetty-7.x.y-SNAPSHOT
14/01/01 22:34:41 INFO server.AbstractConnector: Started
SocketConnector@0.0.0.0:46111
14/01/01 22:34:41 INFO broadcast.HttpBroadcast: Broadcast server started at
http://172.31.21.60:46111
14/01/01 22:34:41 INFO spark.SparkEnv: Registering MapOutputTracker
14/01/01 22:34:41 INFO spark.HttpFileServer: HTTP File server directory is
/tmp/spark-227ad744-5d0d-4e1a-aacd-9c0c73876b31
14/01/01 22:34:41 INFO server.Server: jetty-7.x.y-SNAPSHOT
14/01/01 22:34:41 INFO server.AbstractConnector: Started
SocketConnector@0.0.0.0:44012
14/01/01 22:34:41 INFO io.IoWorker: IoWorker thread 'spray-io-worker-0'
started
14/01/01 22:34:41 INFO server.HttpServer:
akka://spark/user/BlockManagerHTTPServer started on /0.0.0.0:45098
14/01/01 22:34:41 INFO storage.BlockManagerUI: Started BlockManager web UI
at http://ip-172-31-21-60:45098
14/01/01 22:34:42 INFO spark.SparkContext: Added JAR
/root/Crunch-assembly-0.0.1.jar at
http://172.31.21.60:44012/jars/Crunch-assembly-0.0.1.jar with timestamp
1388615682294
14/01/01 22:34:42 INFO client.Client$ClientActor: Connecting to master
spark://ec2-54-193-16-137.us-west-1.compute.amazonaws.com:7077
14/01/01 22:34:42 ERROR client.Client$ClientActor: Connection to master
failed; stopping client
14/01/01 22:34:42 ERROR cluster.SparkDeploySchedulerBackend: Disconnected
from Spark cluster!
14/01/01 22:34:42 ERROR cluster.ClusterScheduler: Exiting due to error from
cluster scheduler: Disconnected from Spark cluster


Interestingly, running one of the bundled examples (SparkPi) works fine. The
only difference I noticed in SparkPi's output was this line:
14/01/01 23:27:55 INFO network.ConnectionManager: Bound socket to port
41806 with id =
ConnectionManagerId(ip-172-31-29-197.us-west-1.compute.internal,41806)

Whereas my (non-working) jar logged this on the same line:
14/01/01 22:34:41 INFO network.ConnectionManager: Bound socket to port
56274 with id = ConnectionManagerId(ip-172-31-21-60,56274)


On Fri, Dec 20, 2013 at 8:54 PM, Evan Sparks <ev...@gmail.com> wrote:

> I ran into a similar issue a few months back - pay careful attention to
> the order in which Spark looks for your jars. The root of my problem was a
> stale jar in SPARK_CLASSPATH on the worker nodes, which took precedence
> (IIRC) over jars passed in via the SparkContext constructor.
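>
> A quick way to rule that out (assuming the standard spark-ec2 layout; the
> paths here are from memory, so they may differ on your cluster) is to check
> the env file on every worker, e.g.:
>
>   $ while read h; do ssh $h 'grep SPARK_CLASSPATH /root/spark/conf/spark-env.sh'; done < /root/spark/conf/slaves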
>
> On Dec 20, 2013, at 8:49 PM, "K. Shankari" <sh...@eecs.berkeley.edu>
> wrote:
>
> I don't think that you need to copy the jar to the rest of the cluster -
> you should be able to call addJar() on the SparkContext and Spark will
> automatically ship the jar to the workers for you.
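>
> For example, something along these lines (Spark 0.8-style constructor; the
> paths and app name are placeholders, and I haven't tested this exact
> snippet):
>
>   import org.apache.spark.SparkContext
>
>   // Listing the assembly jar here makes Spark serve it to the workers
>   // over its HTTP file server, so there is no need to copy it around.
>   val sc = new SparkContext(
>     "spark://<master-hostname>:7077", // the URL shown on the master web UI
>     "MyApp",
>     "/root/spark",                    // SPARK_HOME on the cluster
>     Seq("/root/myapp-assembly.jar"))
>
>   // or equivalently, after construction:
>   // sc.addJar("/root/myapp-assembly.jar")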
>
> I don't know how set you are on checking out and compiling on the cluster,
> but here's what I do instead to get my own application running:
> - compile my code on my desktop and generate a jar
> - scp the jar to the master
> - modify runExample to include the jar in the classpath. I think that you
> can also just modify SPARK_CLASSPATH
> - run using something like:
>
> $ runExample my.class.name arg1 arg2 arg3
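>
> or, with the SPARK_CLASSPATH variant (again untested, and the jar path is a
> placeholder):
>
> $ export SPARK_CLASSPATH=/root/myapp-assembly.jar
> $ runExample my.class.name arg1 arg2 arg3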
>
> Hope this helps!
> Shankari
>
>
> On Tue, Dec 10, 2013 at 12:15 PM, Jeff Higgens <je...@gmail.com> wrote:
>
>> I'm having trouble running my Spark program as a "fat jar" on EC2.
>>
>> This is the process I'm using:
>> (1) spark-ec2 script to launch cluster
>> (2) ssh to master, install sbt and git clone my project's source code
>> (3) update the source to reference the correct master URL and jar path
>> (4) sbt assembly
>> (5) copy-dir to copy the jar to the rest of the cluster
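>>
>> (For step (4): my project pulls in sbt-assembly via project/plugins.sbt
>> with roughly the line below - the version is from memory, so treat it as
>> approximate:)
>>
>>   addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.9.2")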
>>
>> I tried both running the jar (java -jar ...) and using sbt run, but I
>> always end up with this error:
>>
>> 18:58:59.556 [spark-akka.actor.default-dispatcher-4] INFO
>>  o.a.s.d.client.Client$ClientActor - Connecting to master spark://
>> ec2-50-16-80-0.compute-1.amazonaws.com:7077
>> 18:58:59.838 [spark-akka.actor.default-dispatcher-4] ERROR
>> o.a.s.d.client.Client$ClientActor - Connection to master failed; stopping
>> client
>> 18:58:59.839 [spark-akka.actor.default-dispatcher-4] ERROR
>> o.a.s.s.c.SparkDeploySchedulerBackend - Disconnected from Spark cluster!
>> 18:58:59.840 [spark-akka.actor.default-dispatcher-4] ERROR
>> o.a.s.s.cluster.ClusterScheduler - Exiting due to error from cluster
>> scheduler: Disconnected from Spark cluster
>> 18:58:59.844 [delete Spark local dirs] DEBUG
>> org.apache.spark.storage.DiskStore - Shutdown hook called
>>
>>
>> But when I use spark-shell, it has no problem connecting to the master
>> using the exact same URL:
>>
>> 13/12/10 18:59:40 INFO client.Client$ClientActor: Connecting to master
>> spark://ec2-50-16-80-0.compute-1.amazonaws.com:7077
>> Spark context available as sc.
>>
>> I'm probably missing something obvious, so any tips are much appreciated.
>>
>
>

Re: Running Spark jar on EC2

Posted by Berkeley Malagon <be...@firestickgames.com>.
Thanks for sharing the explanation. 

> On Jan 1, 2014, at 4:19 PM, Jeff Higgens <je...@gmail.com> wrote:
> 
> Ok, the problem was a very silly mistake.
> 
> I launched my EC2 instances using spark-0.8.1-incubating, but my fat jar was still being compiled with spark-0.7.3. Oops!

Re: Running Spark jar on EC2

Posted by Jeff Higgens <je...@gmail.com>.
Ok, the problem was a very silly mistake.

I launched my EC2 instances using spark-0.8.1-incubating, but my fat jar
was still being compiled with spark-0.7.3. Oops! Presumably the 0.7.3 client
and the 0.8.1 master speak incompatible wire protocols, so the master simply
drops the connection, which matches the "Connection to master failed" error.
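
For anyone who finds this thread later: the fix was just pinning my build to
the same Spark version as the cluster. Roughly, in build.sbt (coordinates as
I remember them from the 0.8.1 docs, so double-check against the docs):

  libraryDependencies += "org.apache.spark" %% "spark-core" % "0.8.1-incubating"

  resolvers += "Akka Repository" at "http://repo.akka.io/releases/"

(The 0.7.x artifacts lived under the old "org.spark-project" group ID, which
is part of why the stale dependency slipped through unnoticed.)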

