Posted to user@spark.apache.org by Anirudha Jadhav <an...@nyu.edu> on 2015/03/23 19:46:52 UTC

newbie question - spark with mesos

I have a Mesos cluster, to which I deploy Spark using the instructions at
http://spark.apache.org/docs/0.7.2/running-on-mesos.html

After that, the Spark shell starts up fine.
Then I try the following in the shell:

val data = 1 to 10000

val distData = sc.parallelize(data)

distData.filter(_ < 10).collect()

I open the Spark web UI at host:4040 and see an active job.

Now, how do I start Spark workers on Mesos? Who completes my job?
thanks,

-- 
Ani

Re: newbie question - spark with mesos

Posted by Dean Wampler <de...@gmail.com>.
I think the problem is the use of the loopback address:

export SPARK_LOCAL_IP=127.0.0.1

In the stack trace from the slave, you see this:

...  Reason: Connection refused: localhost/127.0.0.1:51849
akka.actor.ActorNotFound: Actor not found for:
ActorSelection[Anchor(akka.tcp://sparkDriver@localhost:51849/),
Path(/user/MapOutputTracker)]

It's trying to connect to an Akka actor on itself, using the loopback
address.

Try changing SPARK_LOCAL_IP to the publicly routable IP address.
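For reference, a sketch of what the corrected spark-env.sh might look like. This assumes the driver host's routable address is the same 100.125.5.93 used for SPARK_EXECUTOR_URI; substitute whatever address the slaves can actually reach:

```shell
# Sketch of a corrected spark-env.sh (the 100.125.5.93 address is an
# assumption taken from SPARK_EXECUTOR_URI; use your driver's routable IP).
export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so
export SPARK_EXECUTOR_URI=http://100.125.5.93/sparkx.tgz
# Bind to a routable IP, not 127.0.0.1, so executors on the slaves
# can connect back to the driver's Akka endpoints:
export SPARK_LOCAL_IP=100.125.5.93
```

With the loopback address, the driver advertises itself as localhost:PORT, and each executor then tries to connect to *its own* loopback interface, which is exactly the "Connection refused: localhost/127.0.0.1:51849" in the trace.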

dean


Dean Wampler, Ph.D.
Author: Programming Scala, 2nd Edition
<http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
Typesafe <http://typesafe.com>
@deanwampler <http://twitter.com/deanwampler>
http://polyglotprogramming.com

On Mon, Mar 23, 2015 at 7:37 PM, Anirudha Jadhav <an...@gmail.com>
wrote:

> My bad there, I was using the correct link for docs. The spark shell runs
> correctly, the framework is registered fine on mesos.
>
> is there some setting i am missing:
> this is my spark-env.sh>>>
>
> export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so
> export SPARK_EXECUTOR_URI=http://100.125.5.93/sparkx.tgz
> export SPARK_LOCAL_IP=127.0.0.1
>
>
>
> here is what i see on the slave node.
> ----------------
> less
> 20150226-160708-788888932-5050-8971-S0/frameworks/20150323-205508-788888932-5050-29804-0012/executors/20150226-160708-788888932-5050-8971-S0/runs/cceea834-c4d9-49d6-a579-8352f1889b56/stderr
> >>>>>
>
> WARNING: Logging before InitGoogleLogging() is written to STDERR
> I0324 02:30:29.389225 27755 fetcher.cpp:76] Fetching URI '
> http://100.125.5.93/sparkx.tgz'
> I0324 02:30:29.389361 27755 fetcher.cpp:126] Downloading '
> http://100.125.5.93/sparkx.tgz' to
> '/tmp/mesos/slaves/20150226-160708-788888932-5050-8971-S0/frameworks/20150323-205508-788888932-5050-29804-0012/executors/20150226-160708-788888932-5050-8971-S0/runs/cceea834-c4d9-49d6-a579-8352f1889b56/sparkx.tgz'
> I0324 02:30:35.353446 27755 fetcher.cpp:64] Extracted resource
> '/tmp/mesos/slaves/20150226-160708-788888932-5050-8971-S0/frameworks/20150323-205508-788888932-5050-29804-0012/executors/20150226-160708-788888932-5050-8971-S0/runs/cceea834-c4d9-49d6-a579-8352f1889b56/sparkx.tgz'
> into
> '/tmp/mesos/slaves/20150226-160708-788888932-5050-8971-S0/frameworks/20150323-205508-788888932-5050-29804-0012/executors/20150226-160708-788888932-5050-8971-S0/runs/cceea834-c4d9-49d6-a579-8352f1889b56'
> Spark assembly has been built with Hive, including Datanucleus jars on
> classpath
> Using Spark's default log4j profile:
> org/apache/spark/log4j-defaults.properties
> 15/03/24 02:30:37 INFO MesosExecutorBackend: Registered signal handlers
> for [TERM, HUP, INT]
> I0324 02:30:37.071077 27863 exec.cpp:132] Version: 0.21.1
> I0324 02:30:37.080971 27885 exec.cpp:206] Executor registered on slave
> 20150226-160708-788888932-5050-8971-S0
> 15/03/24 02:30:37 INFO MesosExecutorBackend: Registered with Mesos as
> executor ID 20150226-160708-788888932-5050-8971-S0 with 1 cpus
> 15/03/24 02:30:37 INFO SecurityManager: Changing view acls to: ubuntu
> 15/03/24 02:30:37 INFO SecurityManager: Changing modify acls to: ubuntu
> 15/03/24 02:30:37 INFO SecurityManager: SecurityManager: authentication
> disabled; ui acls disabled; users with view permissions: Set(ubuntu); users
> with modify permissions: Set(ubuntu)
> 15/03/24 02:30:37 INFO Slf4jLogger: Slf4jLogger started
> 15/03/24 02:30:37 INFO Remoting: Starting remoting
> 15/03/24 02:30:38 INFO Remoting: Remoting started; listening on addresses
> :[akka.tcp://sparkExecutor@mesos-si2.dny1.bcpc.bloomberg.com:50542]
> 15/03/24 02:30:38 INFO Utils: Successfully started service 'sparkExecutor'
> on port 50542.
> 15/03/24 02:30:38 INFO AkkaUtils: Connecting to MapOutputTracker:
> akka.tcp://sparkDriver@localhost:51849/user/MapOutputTracker
> 15/03/24 02:30:38 WARN Remoting: Tried to associate with unreachable
> remote address [akka.tcp://sparkDriver@localhost:51849]. Address is now
> gated for 5000 ms, all messages to this address will be delivered to dead
> letters. Reason: Connection refused: localhost/127.0.0.1:51849
> akka.actor.ActorNotFound: Actor not found for:
> ActorSelection[Anchor(akka.tcp://sparkDriver@localhost:51849/),
> Path(/user/MapOutputTracker)]
>         at
> akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:65)
>         at
> akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:63)
>         at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
>         at
> akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.processBatch$1(BatchingExecutor.scala:67)
>         at
> akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:82)
>         at
> akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
>         at
> akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
>         at
> scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
>         at
> akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:58)
>         at
> akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.unbatchedExecute(Future.scala:74)
>         at
> akka.dispatch.BatchingExecutor$class.execute(BatchingExecutor.scala:110)
>         at
> akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.execute(Future.scala:73)
>         at
> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:40)
>         at
> scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:248)
>         at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:267)
>         at akka.actor.EmptyLocalActorRef.specialHandle(ActorRef.scala:508)
>         at akka.actor.DeadLetterActorRef.specialHandle(ActorRef.scala:541)
>         at akka.actor.DeadLetterActorRef.$bang(ActorRef.scala:531)
>         at
> akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef.$bang(RemoteActorRefProvider.scala:87)
>
>
>
>
>
> On Mar 23, 2015, at 3:02 PM, Dean Wampler <de...@gmail.com> wrote:
>
> That's a very old page, try this instead:
>
> http://spark.apache.org/docs/latest/running-on-mesos.html
>
> When you run your Spark job on Mesos, tasks will be started on the slave
> nodes as needed, since "fine-grained" mode is the default.
>
> For a job like your example, very few tasks will be needed. Actually only
> one would be enough, but the default number of partitions will be used. I
> believe 8 is the default for Mesos. For local mode ("local[*]"), it's the
> number of cores. You can also set the property "spark.default.parallelism".
>
> HTH,
>
> Dean
>
> Dean Wampler, Ph.D.
> Author: Programming Scala, 2nd Edition
> <http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
> Typesafe <http://typesafe.com>
> @deanwampler <http://twitter.com/deanwampler>
> http://polyglotprogramming.com
>
> On Mon, Mar 23, 2015 at 11:46 AM, Anirudha Jadhav <an...@nyu.edu>
> wrote:
>
>> i have a mesos cluster, which i deploy spark to by using instructions on
>> http://spark.apache.org/docs/0.7.2/running-on-mesos.html
>>
>> after that the spark shell starts up fine.
>> then i try the following on the shell:
>>
>> val data = 1 to 10000
>>
>> val distData = sc.parallelize(data)
>>
>> distData.filter(_ < 10).collect()
>>
>> open spark web ui at host:4040 and see an active job.
>>
>> NOW, how do i start workers or spark workers on mesos ? who completes my
>> job?
>> thanks,
>>
>> --
>> Ani
>>
>
>

Re: newbie question - spark with mesos

Posted by Anirudha Jadhav <an...@gmail.com>.
My bad there; I was using the correct link for the docs. The Spark shell runs
correctly, and the framework is registered fine on Mesos.

Is there some setting I am missing? This is my spark-env.sh:

export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so
export SPARK_EXECUTOR_URI=http://100.125.5.93/sparkx.tgz
export SPARK_LOCAL_IP=127.0.0.1



Here is what I see on the slave node.
----------------
less
20150226-160708-788888932-5050-8971-S0/frameworks/20150323-205508-788888932-5050-29804-0012/executors/20150226-160708-788888932-5050-8971-S0/runs/cceea834-c4d9-49d6-a579-8352f1889b56/stderr
>>>>>

WARNING: Logging before InitGoogleLogging() is written to STDERR
I0324 02:30:29.389225 27755 fetcher.cpp:76] Fetching URI '
http://100.125.5.93/sparkx.tgz'
I0324 02:30:29.389361 27755 fetcher.cpp:126] Downloading '
http://100.125.5.93/sparkx.tgz' to
'/tmp/mesos/slaves/20150226-160708-788888932-5050-8971-S0/frameworks/20150323-205508-788888932-5050-29804-0012/executors/20150226-160708-788888932-5050-8971-S0/runs/cceea834-c4d9-49d6-a579-8352f1889b56/sparkx.tgz'
I0324 02:30:35.353446 27755 fetcher.cpp:64] Extracted resource
'/tmp/mesos/slaves/20150226-160708-788888932-5050-8971-S0/frameworks/20150323-205508-788888932-5050-29804-0012/executors/20150226-160708-788888932-5050-8971-S0/runs/cceea834-c4d9-49d6-a579-8352f1889b56/sparkx.tgz'
into
'/tmp/mesos/slaves/20150226-160708-788888932-5050-8971-S0/frameworks/20150323-205508-788888932-5050-29804-0012/executors/20150226-160708-788888932-5050-8971-S0/runs/cceea834-c4d9-49d6-a579-8352f1889b56'
Spark assembly has been built with Hive, including Datanucleus jars on
classpath
Using Spark's default log4j profile:
org/apache/spark/log4j-defaults.properties
15/03/24 02:30:37 INFO MesosExecutorBackend: Registered signal handlers for
[TERM, HUP, INT]
I0324 02:30:37.071077 27863 exec.cpp:132] Version: 0.21.1
I0324 02:30:37.080971 27885 exec.cpp:206] Executor registered on slave
20150226-160708-788888932-5050-8971-S0
15/03/24 02:30:37 INFO MesosExecutorBackend: Registered with Mesos as
executor ID 20150226-160708-788888932-5050-8971-S0 with 1 cpus
15/03/24 02:30:37 INFO SecurityManager: Changing view acls to: ubuntu
15/03/24 02:30:37 INFO SecurityManager: Changing modify acls to: ubuntu
15/03/24 02:30:37 INFO SecurityManager: SecurityManager: authentication
disabled; ui acls disabled; users with view permissions: Set(ubuntu); users
with modify permissions: Set(ubuntu)
15/03/24 02:30:37 INFO Slf4jLogger: Slf4jLogger started
15/03/24 02:30:37 INFO Remoting: Starting remoting
15/03/24 02:30:38 INFO Remoting: Remoting started; listening on addresses
:[akka.tcp://sparkExecutor@mesos-si2.dny1.bcpc.bloomberg.com:50542]
15/03/24 02:30:38 INFO Utils: Successfully started service 'sparkExecutor'
on port 50542.
15/03/24 02:30:38 INFO AkkaUtils: Connecting to MapOutputTracker:
akka.tcp://sparkDriver@localhost:51849/user/MapOutputTracker
15/03/24 02:30:38 WARN Remoting: Tried to associate with unreachable remote
address [akka.tcp://sparkDriver@localhost:51849]. Address is now gated for
5000 ms, all messages to this address will be delivered to dead letters.
Reason: Connection refused: localhost/127.0.0.1:51849
akka.actor.ActorNotFound: Actor not found for:
ActorSelection[Anchor(akka.tcp://sparkDriver@localhost:51849/),
Path(/user/MapOutputTracker)]
        at
akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:65)
        at
akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:63)
        at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
        at
akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.processBatch$1(BatchingExecutor.scala:67)
        at
akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:82)
        at
akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
        at
akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
        at
scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
        at
akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:58)
        at
akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.unbatchedExecute(Future.scala:74)
        at
akka.dispatch.BatchingExecutor$class.execute(BatchingExecutor.scala:110)
        at
akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.execute(Future.scala:73)
        at
scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:40)
        at
scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:248)
        at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:267)
        at akka.actor.EmptyLocalActorRef.specialHandle(ActorRef.scala:508)
        at akka.actor.DeadLetterActorRef.specialHandle(ActorRef.scala:541)
        at akka.actor.DeadLetterActorRef.$bang(ActorRef.scala:531)
        at
akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef.$bang(RemoteActorRefProvider.scala:87)





On Mar 23, 2015, at 3:02 PM, Dean Wampler <de...@gmail.com> wrote:

That's a very old page, try this instead:

http://spark.apache.org/docs/latest/running-on-mesos.html

When you run your Spark job on Mesos, tasks will be started on the slave
nodes as needed, since "fine-grained" mode is the default.

For a job like your example, very few tasks will be needed. Actually only
one would be enough, but the default number of partitions will be used. I
believe 8 is the default for Mesos. For local mode ("local[*]"), it's the
> number of cores. You can also set the property "spark.default.parallelism".

HTH,

Dean

Dean Wampler, Ph.D.
Author: Programming Scala, 2nd Edition
<http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
Typesafe <http://typesafe.com>
@deanwampler <http://twitter.com/deanwampler>
http://polyglotprogramming.com

On Mon, Mar 23, 2015 at 11:46 AM, Anirudha Jadhav <an...@nyu.edu> wrote:

> i have a mesos cluster, which i deploy spark to by using instructions on
> http://spark.apache.org/docs/0.7.2/running-on-mesos.html
>
> after that the spark shell starts up fine.
> then i try the following on the shell:
>
> val data = 1 to 10000
>
> val distData = sc.parallelize(data)
>
> distData.filter(_ < 10).collect()
>
> open spark web ui at host:4040 and see an active job.
>
> NOW, how do i start workers or spark workers on mesos ? who completes my
> job?
> thanks,
>
> --
> Ani
>

Re: newbie question - spark with mesos

Posted by Dean Wampler <de...@gmail.com>.
That's a very old page, try this instead:

http://spark.apache.org/docs/latest/running-on-mesos.html

When you run your Spark job on Mesos, tasks will be started on the slave
nodes as needed, since "fine-grained" mode is the default.

For a job like your example, very few tasks will be needed. Actually only
one would be enough, but the default number of partitions will be used. I
believe 8 is the default for Mesos. For local mode ("local[*]"), it's the
number of cores. You can also set the property "spark.default.parallelism".
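To make the partition count concrete, here is a plain-Scala sketch (no Spark needed) of how a 1-to-10000 range would be carved into 8 slices when the default parallelism is 8. The slicing formula is illustrative, not Spark's exact internal code; each slice corresponds to one task handed to a Mesos executor:

```scala
// Illustrative: splitting the range the way sc.parallelize(data) would
// with spark.default.parallelism = 8. Each partition becomes one task.
val data = 1 to 10000
val numPartitions = 8
val partitions = (0 until numPartitions).map { i =>
  val start = (i.toLong * data.length / numPartitions).toInt
  val end = ((i + 1).toLong * data.length / numPartitions).toInt
  data.slice(start, end)
}
// 10000 / 8 divides evenly, so every partition holds 1250 elements here.
```

This is why even a trivial filter-and-collect job launches several tasks: one per partition, regardless of how little work each one does.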

HTH,

Dean

Dean Wampler, Ph.D.
Author: Programming Scala, 2nd Edition
<http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
Typesafe <http://typesafe.com>
@deanwampler <http://twitter.com/deanwampler>
http://polyglotprogramming.com

On Mon, Mar 23, 2015 at 11:46 AM, Anirudha Jadhav <an...@nyu.edu> wrote:

> i have a mesos cluster, which i deploy spark to by using instructions on
> http://spark.apache.org/docs/0.7.2/running-on-mesos.html
>
> after that the spark shell starts up fine.
> then i try the following on the shell:
>
> val data = 1 to 10000
>
> val distData = sc.parallelize(data)
>
> distData.filter(_ < 10).collect()
>
> open spark web ui at host:4040 and see an active job.
>
> NOW, how do i start workers or spark workers on mesos ? who completes my
> job?
> thanks,
>
> --
> Ani
>