Posted to user@spark.apache.org by Anders Bennehag <an...@tajitsu.com> on 2014/02/24 15:55:55 UTC

Nothing happens when executing on cluster

Hello there,

I'm having some trouble with my spark-cluster consisting of

master.censored.dev and
spark-worker-0

Judging by the output of pyspark, the master, and the worker node, the
cluster forms correctly and pyspark connects to it. But for some reason,
nothing happens after "TaskSchedulerImpl: Adding task set". Why is this,
and how can I investigate further?

I haven't really seen any clues in the web-ui.

The program output is as follows:
pyspark:
https://gist.githubusercontent.com/PureW/ebe1b95b9b4814fc2533/raw/e2d08b7b6288afad3cb03238acc3d172291166d8/pyspark+log
master:
https://gist.githubusercontent.com/PureW/9889bc9b57a8406599df/raw/4b1faeda8bacff06b5c3a32d75e74ef114933504/Spark-master
worker:
https://gist.githubusercontent.com/PureW/7451cd5ed6780f4d1e33/raw/f45971bd1e6cba620db566998a9afd035ea8d529/spark-worker

The code I am running through pyspark can be seen at
https://gist.github.com/PureW/2c9603bdf1ef2ae772f3
Earlier, when the worker node couldn't access the data, it at least raised
an exception; now there is no output at all. Running the same code locally
finishes in about 15 seconds.
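
For context, this is roughly how I launch the job in both modes (the port
and the use of the MASTER environment variable are assumptions about my
setup; Spark's standalone master defaults to port 7077):

```shell
# Against the standalone cluster (master host as above):
MASTER=spark://master.censored.dev:7077 pyspark my_job.py

# Locally, for comparison -- this run finishes quickly:
MASTER='local[*]' pyspark my_job.py
```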


Thanks for any help!
/Anders

Re: Nothing happens when executing on cluster

Posted by Anders Bennehag <an...@tajitsu.com>.
I believe I solved my problem: the worker node didn't know which address to
return its results to. After setting SPARK_LOCAL_IP, the program runs as it
should.
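
For anyone hitting the same hang, a minimal sketch of the fix (the IP below
is a placeholder for the worker's routable address; the spark-env.sh path
assumes a standard install under $SPARK_HOME):

```shell
# Tell Spark which address to bind to and advertise on this node.
# Either export it in the shell before starting the worker...
export SPARK_LOCAL_IP=10.0.0.12

# ...or make it persistent for every Spark daemon on the node:
echo "export SPARK_LOCAL_IP=10.0.0.12" >> "$SPARK_HOME/conf/spark-env.sh"
```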

