Posted to user@spark.apache.org by Samarth Mailinglist <ma...@gmail.com> on 2014/11/10 11:55:48 UTC

Spark Web UI is not showing Running / Completed / Active Applications

There are no applications being shown in the dashboard (I am attaching a
screenshot):

[image: Inline image 1]

This is my spark-env.sh:

SPARK_MASTER_WEBUI_PORT=8888

SPARK_WORKER_INSTANCES=8 # to set the number of worker processes per node

SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=/usr/local/spark/history-logs/" # to set config properties only for the history server (e.g. "-Dx=y")

I have started the history server too.
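
A note in case it helps: the master web UI lists applications that register with the master, while the history server only lists applications that wrote event logs. A minimal sketch of the event-log settings, assuming the same /usr/local/spark/history-logs/ directory as above and a default standalone layout (these lines are not from the thread itself):

# spark-defaults.conf: have applications write event logs the history server can read
spark.eventLog.enabled   true
spark.eventLog.dir       file:///usr/local/spark/history-logs/

# then start the history server against the same directory
./sbin/start-history-server.sh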
​

Re: Spark Web UI is not showing Running / Completed / Active Applications

Posted by Samarth Mailinglist <ma...@gmail.com>.
Perfect, thanks. I was using the local IP address, not the master URL displayed
in the web UI. Working fine now!
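
For anyone hitting the same symptom: with the Akka transport in these Spark versions, the driver has to use the master URL exactly as the master advertises it, hostname versus IP included. One hedged way to keep the advertised URL predictable is to pin the master's bind address in spark-env.sh (the IP below is the one from this thread; SPARK_MASTER_IP is the Spark 1.x name, later renamed SPARK_MASTER_HOST):

# pin the master so the URL shown in the web UI matches what you pass to --master
SPARK_MASTER_IP=192.168.1.222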

On Tue, Nov 11, 2014 at 2:02 PM, Akhil Das <ak...@sigmoidanalytics.com>
wrote:

> It says
>
> Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077:
> akka.remote.EndpointAssociationException:
> Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
>
> Which means your master is down or unreachable for some reason. Make sure
> you are using the same version of Spark in your application. Also make sure
> the Spark master URL you pass is exactly the one shown in the image below.
>
> [image: Inline image 1]
>
>
>
> Thanks
> Best Regards
>
> On Tue, Nov 11, 2014 at 1:35 PM, Samarth Mailinglist <
> mailinglistsamarth@gmail.com> wrote:
>
>> This does not work, for some reason:
>>
>> ...
>> 14/11/11 13:30:54 INFO cluster.SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
>> 14/11/11 13:30:54 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
>> 14/11/11 13:30:54 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
>> 14/11/11 13:30:54 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
>> 14/11/11 13:30:54 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
>> 14/11/11 13:30:54 INFO storage.MemoryStore: ensureFreeSpace(175305) called with curMem=0, maxMem=277842493
>> 14/11/11 13:30:54 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 171.2 KB, free 264.8 MB)
>> 14/11/11 13:30:55 INFO storage.MemoryStore: ensureFreeSpace(12937) called with curMem=175305, maxMem=277842493
>> 14/11/11 13:30:55 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 12.6 KB, free 264.8 MB)
>> 14/11/11 13:30:55 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on terajoin.local:39540 (size: 12.6 KB, free: 265.0 MB)
>> 14/11/11 13:30:55 INFO storage.BlockManagerMaster: Updated info of block broadcast_0_piece0
>> 14/11/11 13:30:55 INFO mapred.FileInputFormat: Total input paths to process : 1
>> 14/11/11 13:30:55 INFO spark.SparkContext: Starting job: runJob at PythonRDD.scala:296
>> 14/11/11 13:30:55 INFO scheduler.DAGScheduler: Got job 0 (runJob at PythonRDD.scala:296) with 1 output partitions (allowLocal=true)
>> 14/11/11 13:30:55 INFO scheduler.DAGScheduler: Final stage: Stage 0(runJob at PythonRDD.scala:296)
>> 14/11/11 13:30:55 INFO scheduler.DAGScheduler: Parents of final stage: List()
>> 14/11/11 13:30:55 INFO scheduler.DAGScheduler: Missing parents: List()
>> 14/11/11 13:30:55 INFO scheduler.DAGScheduler: Submitting Stage 0 (PythonRDD[3] at RDD at PythonRDD.scala:43), which has no missing parents
>> 14/11/11 13:30:55 INFO storage.MemoryStore: ensureFreeSpace(5800) called with curMem=188242, maxMem=277842493
>> 14/11/11 13:30:55 INFO storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 5.7 KB, free 264.8 MB)
>> 14/11/11 13:30:55 INFO storage.MemoryStore: ensureFreeSpace(3773) called with curMem=194042, maxMem=277842493
>> 14/11/11 13:30:55 INFO storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 3.7 KB, free 264.8 MB)
>> 14/11/11 13:30:55 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on terajoin.local:39540 (size: 3.7 KB, free: 265.0 MB)
>> 14/11/11 13:30:55 INFO storage.BlockManagerMaster: Updated info of block broadcast_1_piece0
>> 14/11/11 13:30:55 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from Stage 0 (PythonRDD[3] at RDD at PythonRDD.scala:43)
>> 14/11/11 13:30:55 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
>> 14/11/11 13:31:10 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
>> 14/11/11 13:31:14 INFO client.AppClient$ClientActor: Connecting to master spark://192.168.1.222:7077...
>> 14/11/11 13:31:14 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
>> 14/11/11 13:31:14 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
>> 14/11/11 13:31:14 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
>> 14/11/11 13:31:14 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
>> 14/11/11 13:31:25 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
>> 14/11/11 13:31:34 INFO client.AppClient$ClientActor: Connecting to master spark://192.168.1.222:7077...
>> 14/11/11 13:31:34 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
>> 14/11/11 13:31:34 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
>> 14/11/11 13:31:34 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
>> 14/11/11 13:31:34 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
>> 14/11/11 13:31:40 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
>> 14/11/11 13:31:54 ERROR cluster.SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
>> 14/11/11 13:31:54 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
>> 14/11/11 13:31:54 INFO scheduler.TaskSchedulerImpl: Cancelling stage 0
>> 14/11/11 13:31:54 INFO scheduler.DAGScheduler: Failed to run runJob at PythonRDD.scala:296
>> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/metrics/json,null}
>> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/kill,null}
>> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/,null}
>> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/static,null}
>> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/json,null}
>> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors,null}
>> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/environment/json,null}
>> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/environment,null}
>> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/rdd/json,null}
>> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/rdd,null}
>> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/json,null}
>> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage,null}
>> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/pool/json,null}
>> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/pool,null}
>> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/json,null}
>> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage,null}
>> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/json,null}
>> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages,null}
>> Traceback (most recent call last):
>>   File "/xxx", line 36, in <module>
>>     model = LogisticRegressionWithSGD.train(trainData)
>>   File "/usr/local/spark/python/pyspark/mllib/classification.py", line 110, in train
>>     initialWeights)
>>   File "/usr/local/spark/python/pyspark/mllib/_common.py", line 430, in _regression_train_wrapper
>>     initial_weights = _get_initial_weights(initial_weights, data)
>>   File "/usr/local/spark/python/pyspark/mllib/_common.py", line 415, in _get_initial_weights
>>     initial_weights = _convert_vector(data.first().features)
>>   File "/usr/local/spark/python/pyspark/rdd.py", line 1167, in first
>>     return self.take(1)[0]
>>   File "/usr/local/spark/python/pyspark/rdd.py", line 1153, in take
>>     res = self.context.runJob(self, takeUpToNumLeft, p, True)
>>   File "/usr/local/spark/python/pyspark/context.py", line 770, in runJob
>>     it = self._jvm.PythonRDD.runJob(self._jsc.sc(), mappedRDD._jrdd, javaPartitions, allowLocal)
>>   File "/usr/local/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
>>   File "/usr/local/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
>> py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob.
>> : org.apache.spark.SparkException: Job aborted due to stage failure: All masters are unresponsive! Giving up.
>>     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
>>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
>>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
>>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>     at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)
>>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
>>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
>>     at scala.Option.foreach(Option.scala:236)
>>     at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
>>     at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)
>>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>>     at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>>     at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>>     at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>>     at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>     at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>     at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>     at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>
>> 14/11/11 13:31:54 INFO ui.SparkUI: Stopped Spark web UI at http://xxxx:4040
>> 14/11/11 13:31:54 INFO scheduler.DAGScheduler: Stopping DAGScheduler
>> 14/11/11 13:31:54 INFO cluster.SparkDeploySchedulerBackend: Shutting down all executors
>>
>> It only works when I use local.
>>
>> On Mon, Nov 10, 2014 at 5:09 PM, Akhil Das <ak...@sigmoidanalytics.com>
>> wrote:
>>
>> Change this:
>>>
>>> spark-submit --master local[8] ~/main/py/file --py-files
>>> ~/some/other/files
>>>
>>> to this:
>>>
>>> spark-submit --master spark://blurred-part:7077 ~/main/py/file
>>> --py-files ~/some/other/files
>>>
>>>
>>> Thanks
>>> Best Regards
>>>
>>> On Mon, Nov 10, 2014 at 4:55 PM, Akhil Das <ak...@sigmoidanalytics.com>
>>> wrote:
>>>
>>>> You could be running your application in *local* mode. In the
>>>> application specify the master as spark://blurred-part:7077 and then it
>>>> will appear in the running list.
>>>>
>>>> Thanks
>>>> Best Regards
>>>>
>>>> On Mon, Nov 10, 2014 at 4:25 PM, Samarth Mailinglist <
>>>> mailinglistsamarth@gmail.com> wrote:
>>>>
>>>>> There are no applications being shown in the dashboard (I am attaching
>>>>> a screenshot):
>>>>>
>>>>> [image: Inline image 1]
>>>>>
>>>>> This is my spark-env.sh:
>>>>>
>>>>> SPARK_MASTER_WEBUI_PORT=8888
>>>>>
>>>>> SPARK_WORKER_INSTANCES=8 # to set the number of worker processes per node
>>>>>
>>>>> SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=/usr/local/spark/history-logs/" # to set config properties only for the history server (e.g. "-Dx=y")
>>>>>
>>>>> I have started the history server too.
>>>>> ​
>>>>>
>>>>
>>>>
>>>  ​
>>
>
>

Re: Spark Web UI is not showing Running / Completed / Active Applications

Posted by Akhil Das <ak...@sigmoidanalytics.com>.
It says

Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077:
akka.remote.EndpointAssociationException:
Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]

Which means your master is down or unreachable for some reason. Make sure
you are using the same version of Spark in your application. Also make sure
the Spark master URL you pass is exactly the one shown in the image below.

[image: Inline image 1]
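
Put differently, copy the spark:// URL verbatim from the banner at the top of the master web UI and pass it unchanged. A sketch, reusing the paths from the earlier command with a placeholder host of my own:

spark-submit --master spark://<host-exactly-as-shown-in-the-UI>:7077 \
  ~/main/py/file --py-files ~/some/other/files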



Thanks
Best Regards

On Tue, Nov 11, 2014 at 1:35 PM, Samarth Mailinglist <
mailinglistsamarth@gmail.com> wrote:

> This does not work, for some reason:
>
> ...
> 14/11/11 13:30:54 INFO cluster.SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
> 14/11/11 13:30:54 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
> 14/11/11 13:30:54 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
> 14/11/11 13:30:54 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
> 14/11/11 13:30:54 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
> 14/11/11 13:30:54 INFO storage.MemoryStore: ensureFreeSpace(175305) called with curMem=0, maxMem=277842493
> 14/11/11 13:30:54 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 171.2 KB, free 264.8 MB)
> 14/11/11 13:30:55 INFO storage.MemoryStore: ensureFreeSpace(12937) called with curMem=175305, maxMem=277842493
> 14/11/11 13:30:55 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 12.6 KB, free 264.8 MB)
> 14/11/11 13:30:55 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on terajoin.local:39540 (size: 12.6 KB, free: 265.0 MB)
> 14/11/11 13:30:55 INFO storage.BlockManagerMaster: Updated info of block broadcast_0_piece0
> 14/11/11 13:30:55 INFO mapred.FileInputFormat: Total input paths to process : 1
> 14/11/11 13:30:55 INFO spark.SparkContext: Starting job: runJob at PythonRDD.scala:296
> 14/11/11 13:30:55 INFO scheduler.DAGScheduler: Got job 0 (runJob at PythonRDD.scala:296) with 1 output partitions (allowLocal=true)
> 14/11/11 13:30:55 INFO scheduler.DAGScheduler: Final stage: Stage 0(runJob at PythonRDD.scala:296)
> 14/11/11 13:30:55 INFO scheduler.DAGScheduler: Parents of final stage: List()
> 14/11/11 13:30:55 INFO scheduler.DAGScheduler: Missing parents: List()
> 14/11/11 13:30:55 INFO scheduler.DAGScheduler: Submitting Stage 0 (PythonRDD[3] at RDD at PythonRDD.scala:43), which has no missing parents
> 14/11/11 13:30:55 INFO storage.MemoryStore: ensureFreeSpace(5800) called with curMem=188242, maxMem=277842493
> 14/11/11 13:30:55 INFO storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 5.7 KB, free 264.8 MB)
> 14/11/11 13:30:55 INFO storage.MemoryStore: ensureFreeSpace(3773) called with curMem=194042, maxMem=277842493
> 14/11/11 13:30:55 INFO storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 3.7 KB, free 264.8 MB)
> 14/11/11 13:30:55 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on terajoin.local:39540 (size: 3.7 KB, free: 265.0 MB)
> 14/11/11 13:30:55 INFO storage.BlockManagerMaster: Updated info of block broadcast_1_piece0
> 14/11/11 13:30:55 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from Stage 0 (PythonRDD[3] at RDD at PythonRDD.scala:43)
> 14/11/11 13:30:55 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
> 14/11/11 13:31:10 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
> 14/11/11 13:31:14 INFO client.AppClient$ClientActor: Connecting to master spark://192.168.1.222:7077...
> 14/11/11 13:31:14 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
> 14/11/11 13:31:14 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
> 14/11/11 13:31:14 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
> 14/11/11 13:31:14 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
> 14/11/11 13:31:25 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
> 14/11/11 13:31:34 INFO client.AppClient$ClientActor: Connecting to master spark://192.168.1.222:7077...
> 14/11/11 13:31:34 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
> 14/11/11 13:31:34 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
> 14/11/11 13:31:34 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
> 14/11/11 13:31:34 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@192.168.1.222:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.1.222:7077]
> 14/11/11 13:31:40 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
> 14/11/11 13:31:54 ERROR cluster.SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
> 14/11/11 13:31:54 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
> 14/11/11 13:31:54 INFO scheduler.TaskSchedulerImpl: Cancelling stage 0
> 14/11/11 13:31:54 INFO scheduler.DAGScheduler: Failed to run runJob at PythonRDD.scala:296
> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/metrics/json,null}
> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/kill,null}
> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/,null}
> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/static,null}
> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/json,null}
> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors,null}
> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/environment/json,null}
> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/environment,null}
> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/rdd/json,null}
> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/rdd,null}
> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/json,null}
> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage,null}
> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/pool/json,null}
> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/pool,null}
> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/json,null}
> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage,null}
> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/json,null}
> 14/11/11 13:31:54 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages,null}
> Traceback (most recent call last):
>   File "/xxx", line 36, in <module>
>     model = LogisticRegressionWithSGD.train(trainData)
>   File "/usr/local/spark/python/pyspark/mllib/classification.py", line 110, in train
>     initialWeights)
>   File "/usr/local/spark/python/pyspark/mllib/_common.py", line 430, in _regression_train_wrapper
>     initial_weights = _get_initial_weights(initial_weights, data)
>   File "/usr/local/spark/python/pyspark/mllib/_common.py", line 415, in _get_initial_weights
>     initial_weights = _convert_vector(data.first().features)
>   File "/usr/local/spark/python/pyspark/rdd.py", line 1167, in first
>     return self.take(1)[0]
>   File "/usr/local/spark/python/pyspark/rdd.py", line 1153, in take
>     res = self.context.runJob(self, takeUpToNumLeft, p, True)
>   File "/usr/local/spark/python/pyspark/context.py", line 770, in runJob
>     it = self._jvm.PythonRDD.runJob(self._jsc.sc(), mappedRDD._jrdd, javaPartitions, allowLocal)
>   File "/usr/local/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
>   File "/usr/local/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
> py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob.
> : org.apache.spark.SparkException: Job aborted due to stage failure: All masters are unresponsive! Giving up.
>     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>     at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
>     at scala.Option.foreach(Option.scala:236)
>     at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)
>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>     at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>     at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>     at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>     at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>     at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>     at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>     at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>
> 14/11/11 13:31:54 INFO ui.SparkUI: Stopped Spark web UI at http://xxxx:4040
> 14/11/11 13:31:54 INFO scheduler.DAGScheduler: Stopping DAGScheduler
> 14/11/11 13:31:54 INFO cluster.SparkDeploySchedulerBackend: Shutting down all executors
>
> It only works when I use local.
>
> On Mon, Nov 10, 2014 at 5:09 PM, Akhil Das <ak...@sigmoidanalytics.com>
> wrote:
>
> Change this:
>>
>> spark-submit --master local[8] ~/main/py/file --py-files
>> ~/some/other/files
>>
>> to this:
>>
>> spark-submit --master spark://blurred-part:7077 ~/main/py/file --py-files
>> ~/some/other/files
>>
>>
>> Thanks
>> Best Regards
>>
>> On Mon, Nov 10, 2014 at 4:55 PM, Akhil Das <ak...@sigmoidanalytics.com>
>> wrote:
>>
>>> You could be running your application in *local* mode. In the
>>> application specify the master as spark://blurred-part:7077 and then it
>>> will appear in the running list.
>>>
>>> Thanks
>>> Best Regards
>>>
>>> On Mon, Nov 10, 2014 at 4:25 PM, Samarth Mailinglist <
>>> mailinglistsamarth@gmail.com> wrote:
>>>
>>>> There are no applications being shown in the dashboard (I am attaching
>>>> a screenshot):
>>>>
>>>> [image: Inline image 1]
>>>>
>>>> This is my spark-env.sh:
>>>>
>>>> SPARK_MASTER_WEBUI_PORT=8888
>>>>
>>>> SPARK_WORKER_INSTANCES=8 # to set the number of worker processes per node
>>>>
>>>> SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=/usr/local/spark/history-logs/" # to set config properties only for the history server (e.g. "-Dx=y")
>>>>
>>>> I have started the history server too.
>>>> ​
>>>>
>>>
>>>
>>  ​
>

Re: Spark Web UI is not showing Running / Completed / Active Applications

Posted by Samarth Mailinglist <ma...@gmail.com>.
This does not work, for some reason:

...
14/11/11 13:30:54 INFO cluster.SparkDeploySchedulerBackend:
SchedulerBackend is ready for scheduling beginning after reached
minRegisteredResourcesRatio: 0.0
14/11/11 13:30:54 WARN client.AppClient$ClientActor: Could not connect
to akka.tcp://sparkMaster@192.168.1.222:7077:
akka.remote.EndpointAssociationException: Association failed with
[akka.tcp://sparkMaster@192.168.1.222:7077]
14/11/11 13:30:54 WARN client.AppClient$ClientActor: Could not connect
to akka.tcp://sparkMaster@192.168.1.222:7077:
akka.remote.EndpointAssociationException: Association failed with
[akka.tcp://sparkMaster@192.168.1.222:7077]
14/11/11 13:30:54 WARN client.AppClient$ClientActor: Could not connect
to akka.tcp://sparkMaster@192.168.1.222:7077:
akka.remote.EndpointAssociationException: Association failed with
[akka.tcp://sparkMaster@192.168.1.222:7077]
14/11/11 13:30:54 WARN client.AppClient$ClientActor: Could not connect
to akka.tcp://sparkMaster@192.168.1.222:7077:
akka.remote.EndpointAssociationException: Association failed with
[akka.tcp://sparkMaster@192.168.1.222:7077]
14/11/11 13:30:54 INFO storage.MemoryStore: ensureFreeSpace(175305)
called with curMem=0, maxMem=277842493
14/11/11 13:30:54 INFO storage.MemoryStore: Block broadcast_0 stored
as values in memory (estimated size 171.2 KB, free 264.8 MB)
14/11/11 13:30:55 INFO storage.MemoryStore: ensureFreeSpace(12937)
called with curMem=175305, maxMem=277842493
14/11/11 13:30:55 INFO storage.MemoryStore: Block broadcast_0_piece0
stored as bytes in memory (estimated size 12.6 KB, free 264.8 MB)
14/11/11 13:30:55 INFO storage.BlockManagerInfo: Added
broadcast_0_piece0 in memory on terajoin.local:39540 (size: 12.6 KB,
free: 265.0 MB)
14/11/11 13:30:55 INFO storage.BlockManagerMaster: Updated info of
block broadcast_0_piece0
14/11/11 13:30:55 INFO mapred.FileInputFormat: Total input paths to process : 1
14/11/11 13:30:55 INFO spark.SparkContext: Starting job: runJob at
PythonRDD.scala:296
14/11/11 13:30:55 INFO scheduler.DAGScheduler: Got job 0 (runJob at
PythonRDD.scala:296) with 1 output partitions (allowLocal=true)
14/11/11 13:30:55 INFO scheduler.DAGScheduler: Final stage: Stage
0(runJob at PythonRDD.scala:296)
14/11/11 13:30:55 INFO scheduler.DAGScheduler: Parents of final stage: List()
14/11/11 13:30:55 INFO scheduler.DAGScheduler: Missing parents: List()
14/11/11 13:30:55 INFO scheduler.DAGScheduler: Submitting Stage 0
(PythonRDD[3] at RDD at PythonRDD.scala:43), which has no missing
parents
14/11/11 13:30:55 INFO storage.MemoryStore: ensureFreeSpace(5800)
called with curMem=188242, maxMem=277842493
14/11/11 13:30:55 INFO storage.MemoryStore: Block broadcast_1 stored
as values in memory (estimated size 5.7 KB, free 264.8 MB)
14/11/11 13:30:55 INFO storage.MemoryStore: ensureFreeSpace(3773)
called with curMem=194042, maxMem=277842493
14/11/11 13:30:55 INFO storage.MemoryStore: Block broadcast_1_piece0
stored as bytes in memory (estimated size 3.7 KB, free 264.8 MB)
14/11/11 13:30:55 INFO storage.BlockManagerInfo: Added
broadcast_1_piece0 in memory on terajoin.local:39540 (size: 3.7 KB,
free: 265.0 MB)
14/11/11 13:30:55 INFO storage.BlockManagerMaster: Updated info of
block broadcast_1_piece0
14/11/11 13:30:55 INFO scheduler.DAGScheduler: Submitting 1 missing
tasks from Stage 0 (PythonRDD[3] at RDD at PythonRDD.scala:43)
14/11/11 13:30:55 INFO scheduler.TaskSchedulerImpl: Adding task set
0.0 with 1 tasks
14/11/11 13:31:10 WARN scheduler.TaskSchedulerImpl: Initial job has
not accepted any resources; check your cluster UI to ensure that
workers are registered and have sufficient memory
14/11/11 13:31:14 INFO client.AppClient$ClientActor: Connecting to
master spark://192.168.1.222:7077...
14/11/11 13:31:14 WARN client.AppClient$ClientActor: Could not connect
to akka.tcp://sparkMaster@192.168.1.222:7077:
akka.remote.EndpointAssociationException: Association failed with
[akka.tcp://sparkMaster@192.168.1.222:7077]
14/11/11 13:31:14 WARN client.AppClient$ClientActor: Could not connect
to akka.tcp://sparkMaster@192.168.1.222:7077:
akka.remote.EndpointAssociationException: Association failed with
[akka.tcp://sparkMaster@192.168.1.222:7077]
14/11/11 13:31:14 WARN client.AppClient$ClientActor: Could not connect
to akka.tcp://sparkMaster@192.168.1.222:7077:
akka.remote.EndpointAssociationException: Association failed with
[akka.tcp://sparkMaster@192.168.1.222:7077]
14/11/11 13:31:14 WARN client.AppClient$ClientActor: Could not connect
to akka.tcp://sparkMaster@192.168.1.222:7077:
akka.remote.EndpointAssociationException: Association failed with
[akka.tcp://sparkMaster@192.168.1.222:7077]
14/11/11 13:31:25 WARN scheduler.TaskSchedulerImpl: Initial job has
not accepted any resources; check your cluster UI to ensure that
workers are registered and have sufficient memory
14/11/11 13:31:34 INFO client.AppClient$ClientActor: Connecting to
master spark://192.168.1.222:7077...
14/11/11 13:31:34 WARN client.AppClient$ClientActor: Could not connect
to akka.tcp://sparkMaster@192.168.1.222:7077:
akka.remote.EndpointAssociationException: Association failed with
[akka.tcp://sparkMaster@192.168.1.222:7077]
14/11/11 13:31:34 WARN client.AppClient$ClientActor: Could not connect
to akka.tcp://sparkMaster@192.168.1.222:7077:
akka.remote.EndpointAssociationException: Association failed with
[akka.tcp://sparkMaster@192.168.1.222:7077]
14/11/11 13:31:34 WARN client.AppClient$ClientActor: Could not connect
to akka.tcp://sparkMaster@192.168.1.222:7077:
akka.remote.EndpointAssociationException: Association failed with
[akka.tcp://sparkMaster@192.168.1.222:7077]
14/11/11 13:31:34 WARN client.AppClient$ClientActor: Could not connect
to akka.tcp://sparkMaster@192.168.1.222:7077:
akka.remote.EndpointAssociationException: Association failed with
[akka.tcp://sparkMaster@192.168.1.222:7077]
14/11/11 13:31:40 WARN scheduler.TaskSchedulerImpl: Initial job has
not accepted any resources; check your cluster UI to ensure that
workers are registered and have sufficient memory
14/11/11 13:31:54 ERROR cluster.SparkDeploySchedulerBackend:
Application has been killed. Reason: All masters are unresponsive!
Giving up.
14/11/11 13:31:54 INFO scheduler.TaskSchedulerImpl: Removed TaskSet
0.0, whose tasks have all completed, from pool
14/11/11 13:31:54 INFO scheduler.TaskSchedulerImpl: Cancelling stage 0
14/11/11 13:31:54 INFO scheduler.DAGScheduler: Failed to run runJob at
PythonRDD.scala:296
14/11/11 13:31:54 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/metrics/json,null}
14/11/11 13:31:54 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/stages/stage/kill,null}
14/11/11 13:31:54 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/,null}
14/11/11 13:31:54 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/static,null}
14/11/11 13:31:54 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/executors/json,null}
14/11/11 13:31:54 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/executors,null}
14/11/11 13:31:54 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/environment/json,null}
14/11/11 13:31:54 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/environment,null}
14/11/11 13:31:54 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/storage/rdd/json,null}
14/11/11 13:31:54 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/storage/rdd,null}
14/11/11 13:31:54 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/storage/json,null}
14/11/11 13:31:54 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/storage,null}
14/11/11 13:31:54 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/stages/pool/json,null}
14/11/11 13:31:54 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/stages/pool,null}
14/11/11 13:31:54 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/stages/stage/json,null}
14/11/11 13:31:54 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/stages/stage,null}
14/11/11 13:31:54 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/stages/json,null}
14/11/11 13:31:54 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/stages,null}
Traceback (most recent call last):
  File "/xxx", line 36, in <module>
    model = LogisticRegressionWithSGD.train(trainData)
  File "/usr/local/spark/python/pyspark/mllib/classification.py", line
110, in train
    initialWeights)
  File "/usr/local/spark/python/pyspark/mllib/_common.py", line 430,
in _regression_train_wrapper
    initial_weights = _get_initial_weights(initial_weights, data)
  File "/usr/local/spark/python/pyspark/mllib/_common.py", line 415,
in _get_initial_weights
    initial_weights = _convert_vector(data.first().features)
  File "/usr/local/spark/python/pyspark/rdd.py", line 1167, in first
    return self.take(1)[0]
  File "/usr/local/spark/python/pyspark/rdd.py", line 1153, in take
    res = self.context.runJob(self, takeUpToNumLeft, p, True)
  File "/usr/local/spark/python/pyspark/context.py", line 770, in runJob
    it = self._jvm.PythonRDD.runJob(self._jsc.sc(),
mappedRDD._jrdd, javaPartitions, allowLocal)
  File "/usr/local/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py",
line 538, in __call__
  File "/usr/local/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py",
line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling
z:org.apache.spark.api.python.PythonRDD.runJob.
: org.apache.spark.SparkException: Job aborted due to stage failure:
All masters are unresponsive! Giving up.
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
    at akka.actor.ActorCell.invoke(ActorCell.scala:456)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
    at akka.dispatch.Mailbox.run(Mailbox.scala:219)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

14/11/11 13:31:54 INFO ui.SparkUI: Stopped Spark web UI at
http://xxxx:4040
14/11/11 13:31:54 INFO scheduler.DAGScheduler: Stopping DAGScheduler
14/11/11 13:31:54 INFO cluster.SparkDeploySchedulerBackend: Shutting
down all executors

It only works when I use local.
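
A quick cross-check, not something suggested in the thread: confirm which URL the master actually advertises, either from the banner of the master web UI or from the master's log, and pass that string verbatim to --master. The log path below assumes the default standalone layout, so it may need adjusting:

grep "Starting Spark master at" /usr/local/spark/logs/spark-*-org.apache.spark.deploy.master.Master-*.out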

On Mon, Nov 10, 2014 at 5:09 PM, Akhil Das <ak...@sigmoidanalytics.com>
wrote:

Change this:
>
> spark-submit --master local[8] ~/main/py/file --py-files ~/some/other/files
>
> to this:
>
> spark-submit --master spark://blurred-part:7077 ~/main/py/file --py-files
> ~/some/other/files
>
>
> Thanks
> Best Regards
>
> On Mon, Nov 10, 2014 at 4:55 PM, Akhil Das <ak...@sigmoidanalytics.com>
> wrote:
>
>> You could be running your application in *local* mode. In the
>> application specify the master as spark://blurred-part:7077 and then it
>> will appear in the running list.
>>
>> Thanks
>> Best Regards
>>
>> On Mon, Nov 10, 2014 at 4:25 PM, Samarth Mailinglist <
>> mailinglistsamarth@gmail.com> wrote:
>>
>>> There are no applications being shown in the dashboard (I am attaching a
>>> screenshot):
>>>
>>> [image: Inline image 1]
>>>
>>> This is my spark-env.sh:
>>>
>>> SPARK_MASTER_WEBUI_PORT=8888
>>>
>>> SPARK_WORKER_INSTANCES=8 # to set the number of worker processes per node
>>>
>>> SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=/usr/local/spark/history-logs/" # to set config properties only for the history server (e.g. "-Dx=y")
>>>
>>> I have started the history server too.
>>> ​
>>>
>>
>>
>  ​

Re: Spark Web UI is not showing Running / Completed / Active Applications

Posted by Akhil Das <ak...@sigmoidanalytics.com>.
Change this:

spark-submit --master local[8] ~/main/py/file --py-files ~/some/other/files

to this:

spark-submit --master spark://blurred-part:7077 ~/main/py/file --py-files
~/some/other/files
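
As an aside on the --py-files flag (general spark-submit behaviour, not specific to this thread): it takes a comma-separated list of .py, .zip or .egg files that are shipped to the executors, so dependencies can be bundled once. A hedged sketch with made-up file names:

spark-submit --master spark://blurred-part:7077 \
  --py-files deps.zip,helpers.py \
  ~/main/py/file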


Thanks
Best Regards

On Mon, Nov 10, 2014 at 4:55 PM, Akhil Das <ak...@sigmoidanalytics.com>
wrote:

> You could be running your application in *local* mode. In the application
> specify the master as spark://blurred-part:7077 and then it will appear in
> the running list.
>
> Thanks
> Best Regards
>
> On Mon, Nov 10, 2014 at 4:25 PM, Samarth Mailinglist <
> mailinglistsamarth@gmail.com> wrote:
>
>> There are no applications being shown in the dashboard (I am attaching a
>> screenshot):
>>
>> [image: Inline image 1]
>>
>> This is my spark-env.sh:
>>
>> SPARK_MASTER_WEBUI_PORT=8888
>>
>> SPARK_WORKER_INSTANCES=8 # to set the number of worker processes per node
>>
>> SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=/usr/local/spark/history-logs/" # to set config properties only for the history server (e.g. "-Dx=y")
>>
>> I have started the history server too.
>> ​
>>
>
>

Re: Spark Web UI is not showing Running / Completed / Active Applications

Posted by Akhil Das <ak...@sigmoidanalytics.com>.
You could be running your application in *local* mode. In the application
specify the master as spark://blurred-part:7077 and then it will appear in
the running list.
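
If the master is set inside the application instead of on the spark-submit command line, the same URL goes into SparkConf. A minimal PySpark sketch, where "blurred-part" stands in for whatever hostname the master web UI shows and the app name is made up:

from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setMaster("spark://blurred-part:7077")  # exactly the URL shown in the master web UI
        .setAppName("my-app"))
sc = SparkContext(conf=conf)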

Thanks
Best Regards

On Mon, Nov 10, 2014 at 4:25 PM, Samarth Mailinglist <
mailinglistsamarth@gmail.com> wrote:

> There are no applications being shown in the dashboard (I am attaching a
> screenshot):
>
> [image: Inline image 1]
>
> This is my spark-env.sh:
>
> SPARK_MASTER_WEBUI_PORT=8888
>
> SPARK_WORKER_INSTANCES=8 # to set the number of worker processes per node
>
> SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=/usr/local/spark/history-logs/" # to set config properties only for the history server (e.g. "-Dx=y")
>
> I have started the history server too.
> ​
>