Posted to user@spark.apache.org by Jianmin Wu <ji...@optaim.com> on 2013/11/01 10:16:44 UTC

facing problem when launching WordCount on mesos

Hi all, 

We are facing a similar problem to the one described in
http://comments.gmane.org/gmane.comp.lang.scala.spark.user/2505 . The Mesos
environment looks healthy, with one master and four slaves, according to the
web UI at http://master-ip:5050 .

That discussion suggested that the environment variables SPARK_HOME and
SCALA_HOME might be the cause, so we export them in sbin/mesos-daemon.sh
like this:
export SPARK_HOME=/data/hadoop/spark/spark-0.8.0-incubating
export SCALA_HOME=/data/hadoop/scala/scala-2.9.3

The modified sbin/mesos-daemon.sh was then replicated to all the Mesos
slaves. We hoped that the mesos-slave process would pick up the environment
variables this way (sbin/mesos-daemon.sh is used to start the mesos-slave
service, according to sbin/mesos-start-slaves.sh). However, launching the
worker on Mesos still fails, and the message reported on stdout is shown
below. Could anyone suggest the proper way to set these environment
variables? Are there any other logs we should investigate?
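One thing we could check on a slave node is whether the exports actually
reach the mesos-slave process: /proc/<pid>/environ shows only the
environment a process was started with, so exports added to
mesos-daemon.sh take effect only after the slave is restarted, and only if
that script is the one that actually launches the slave. A minimal sketch
of such a check (the paths are just our cluster's values):

```shell
# Export the variables as done in mesos-daemon.sh.
export SPARK_HOME=/data/hadoop/spark/spark-0.8.0-incubating
export SCALA_HOME=/data/hadoop/scala/scala-2.9.3

# A child process started from this shell inherits the exports, just as
# mesos-slave would when launched from the modified mesos-daemon.sh:
sh -c 'echo "SPARK_HOME=$SPARK_HOME"; echo "SCALA_HOME=$SCALA_HOME"'

# On a slave node, one could also inspect an already-running slave
# (it will NOT show variables exported after the process started):
#   tr '\0' '\n' < /proc/$(pgrep -f mesos-slave)/environ | grep -E 'SPARK_HOME|SCALA_HOME'
```

If the running mesos-slave's environ lacks the variables even after a
restart, the slave is probably being launched by a different script than
the one we edited.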

13/11/01 15:53:31 WARN util.Utils: Your hostname, zyz-1 resolves to a
loopback address: 127.0.0.1; using 10.4.1.140 instead (on interface eth0)
13/11/01 15:53:31 WARN util.Utils: Set SPARK_LOCAL_IP if you need to bind to
another address
13/11/01 15:53:33 INFO slf4j.Slf4jEventHandler: Slf4jEventHandler started
13/11/01 15:53:33 INFO spark.SparkEnv: Registering BlockManagerMaster
13/11/01 15:53:33 INFO storage.MemoryStore: MemoryStore started with
capacity 562.0 MB.
13/11/01 15:53:34 INFO storage.DiskStore: Created local directory at
/tmp/spark-local-20131101155334-87b1
13/11/01 15:53:34 INFO network.ConnectionManager: Bound socket to port 54842
with id = ConnectionManagerId(proxy.optaim.com,54842)
13/11/01 15:53:34 INFO storage.BlockManagerMaster: Trying to register
BlockManager
13/11/01 15:53:34 INFO storage.BlockManagerMaster: Registered BlockManager
13/11/01 15:53:34 INFO server.Server: jetty-7.x.y-SNAPSHOT
13/11/01 15:53:34 INFO server.AbstractConnector: Started
SocketConnector@0.0.0.0:47914
13/11/01 15:53:34 INFO broadcast.HttpBroadcast: Broadcast server started at
http://10.4.1.140:47914
13/11/01 15:53:34 INFO spark.SparkEnv: Registering MapOutputTracker
13/11/01 15:53:34 INFO spark.HttpFileServer: HTTP File server directory is
/tmp/spark-0c5d5a09-d719-4d3b-b68a-3694cd80d11e
13/11/01 15:53:34 INFO server.Server: jetty-7.x.y-SNAPSHOT
13/11/01 15:53:34 INFO server.AbstractConnector: Started
SocketConnector@0.0.0.0:57023
13/11/01 15:53:34 INFO server.Server: jetty-7.x.y-SNAPSHOT
13/11/01 15:53:34 INFO handler.ContextHandler: started
o.e.j.s.h.ContextHandler{/storage/rdd,null}
13/11/01 15:53:34 INFO handler.ContextHandler: started
o.e.j.s.h.ContextHandler{/storage,null}
13/11/01 15:53:34 INFO handler.ContextHandler: started
o.e.j.s.h.ContextHandler{/stages/stage,null}
13/11/01 15:53:34 INFO handler.ContextHandler: started
o.e.j.s.h.ContextHandler{/stages/pool,null}
13/11/01 15:53:34 INFO handler.ContextHandler: started
o.e.j.s.h.ContextHandler{/stages,null}
13/11/01 15:53:34 INFO handler.ContextHandler: started
o.e.j.s.h.ContextHandler{/environment,null}
13/11/01 15:53:34 INFO handler.ContextHandler: started
o.e.j.s.h.ContextHandler{/executors,null}
13/11/01 15:53:34 INFO handler.ContextHandler: started
o.e.j.s.h.ContextHandler{/metrics/json,null}
13/11/01 15:53:34 INFO handler.ContextHandler: started
o.e.j.s.h.ContextHandler{/static,null}
13/11/01 15:53:34 INFO handler.ContextHandler: started
o.e.j.s.h.ContextHandler{/,null}
13/11/01 15:53:34 INFO server.AbstractConnector: Started
SelectChannelConnector@0.0.0.0:4040
13/11/01 15:53:34 INFO ui.SparkUI: Started Spark Web UI at
http://proxy.optaim.com:4040
13/11/01 15:53:34 INFO spark.SparkContext: Added JAR
/home/jianminwu/dev/spark/spark-0.8.0-incubating/trial/counter/target/simple
-project-1.0.jar at http://10.4.1.140:57023/jars/simple-project-1.0.jar with
timestamp 1383292414868
13/11/01 15:53:35 INFO mesos.MesosSchedulerBackend: Registered as framework
ID 201310312305-2348876810-5050-3555-0001
13/11/01 15:53:36 INFO storage.MemoryStore: ensureFreeSpace(61352) called
with curMem=0, maxMem=589332480
13/11/01 15:53:36 INFO storage.MemoryStore: Block broadcast_0 stored as
values to memory (estimated size 59.9 KB, free 562.0 MB)
13/11/01 15:53:36 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
13/11/01 15:53:36 WARN snappy.LoadSnappy: Snappy native library not loaded
13/11/01 15:53:36 INFO mapred.FileInputFormat: Total input paths to process
: 3
13/11/01 15:53:37 INFO spark.SparkContext: Starting job: collect at
JavaHdfsWordCount.java:54
13/11/01 15:53:37 INFO scheduler.DAGScheduler: Registering RDD 4
(reduceByKey at JavaHdfsWordCount.java:41)
13/11/01 15:53:37 INFO scheduler.DAGScheduler: Got job 0 (collect at
JavaHdfsWordCount.java:54) with 4 output partitions (allowLocal=false)
13/11/01 15:53:37 INFO scheduler.DAGScheduler: Final stage: Stage 0 (collect
at JavaHdfsWordCount.java:54)
13/11/01 15:53:37 INFO scheduler.DAGScheduler: Parents of final stage:
List(Stage 1)
13/11/01 15:53:37 INFO scheduler.DAGScheduler: Missing parents: List(Stage
1)
13/11/01 15:53:37 INFO scheduler.DAGScheduler: Submitting Stage 1
(MapPartitionsRDD[4] at reduceByKey at JavaHdfsWordCount.java:41), which has
no missing parents
13/11/01 15:53:37 INFO scheduler.DAGScheduler: Submitting 4 missing tasks
from Stage 1 (MapPartitionsRDD[4] at reduceByKey at
JavaHdfsWordCount.java:41)
13/11/01 15:53:37 INFO cluster.ClusterScheduler: Adding task set 1.0 with 4
tasks
13/11/01 15:53:37 INFO cluster.ClusterTaskSetManager: Starting task 1.0:0 as
TID 0 on executor 201310312305-2348876810-5050-3555-7: datanode-05
(PROCESS_LOCAL)
13/11/01 15:53:37 INFO cluster.ClusterTaskSetManager: Serialized task 1.0:0
as 2212 bytes in 10 ms
13/11/01 15:53:37 INFO cluster.ClusterTaskSetManager: Starting task 1.0:1 as
TID 1 on executor 201310312305-2348876810-5050-3555-5: datanode-02
(PROCESS_LOCAL)


Thanks,
Jianmin