Posted to user@spark.apache.org by praveshjain1991 <pr...@gmail.com> on 2014/06/03 16:43:00 UTC

Spark not working with mesos

I set up Spark 0.9.1 to run on Mesos 0.13.0 following the steps described here
<https://spark.apache.org/docs/0.9.1/running-on-mesos.html>. The Mesos UI
shows two workers registered. I want to run these commands in
spark-shell:

> scala> val data = 1 to 10000
> data: scala.collection.immutable.Range.Inclusive = Range(1, 2, 3, 4, 5, 6,
> 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
> 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
> 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
> 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75,
> 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92,
> 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107,
> 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121,
> 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135,
> 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149,
> 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163,
> 164, 165, 166, 167, 168, 169, 170...
 
 
> scala> val distData = sc.parallelize(data)
> distData: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at
> parallelize at <console>:14

Now when I run the collect method, the following error occurs:

> scala> distData.filter(_ < 10).collect()
14/06/03 19:54:55 INFO SparkContext: Starting job: collect at <console>:17
14/06/03 19:54:55 INFO DAGScheduler: Got job 0 (collect at <console>:17)
with 8 output partitions (allowLocal=false)
14/06/03 19:54:55 INFO DAGScheduler: Final stage: Stage 0 (collect at
<console>:17)
14/06/03 19:54:55 INFO DAGScheduler: Parents of final stage: List()
14/06/03 19:54:55 INFO DAGScheduler: Missing parents: List()
14/06/03 19:54:55 INFO DAGScheduler: Submitting Stage 0 (FilteredRDD[1] at
filter at <console>:17), which has no missing parents
14/06/03 19:54:55 INFO DAGScheduler: Submitting 8 missing tasks from Stage 0
(FilteredRDD[1] at filter at <console>:17)
14/06/03 19:54:55 INFO TaskSchedulerImpl: Adding task set 0.0 with 8 tasks
14/06/03 19:54:55 INFO TaskSetManager: Starting task 0.0:0 as TID 0 on
executor 201406031732-3213994176-5050-6320-11: IMPETUS-DSRV05.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:55 INFO TaskSetManager: Serialized task 0.0:0 as 1338 bytes
in 8 ms
14/06/03 19:54:55 INFO TaskSetManager: Starting task 0.0:1 as TID 1 on
executor 201406031732-3213994176-5050-6320-10: IMPETUS-DSRV04.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:55 INFO TaskSetManager: Serialized task 0.0:1 as 1338 bytes
in 0 ms
14/06/03 19:54:55 INFO TaskSetManager: Starting task 0.0:2 as TID 2 on
executor 201406031732-3213994176-5050-6320-11: IMPETUS-DSRV05.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:55 INFO TaskSetManager: Serialized task 0.0:2 as 1338 bytes
in 0 ms
14/06/03 19:54:55 INFO TaskSetManager: Starting task 0.0:3 as TID 3 on
executor 201406031732-3213994176-5050-6320-10: IMPETUS-DSRV04.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:55 INFO TaskSetManager: Serialized task 0.0:3 as 1338 bytes
in 1 ms
14/06/03 19:54:55 INFO TaskSetManager: Starting task 0.0:4 as TID 4 on
executor 201406031732-3213994176-5050-6320-11: IMPETUS-DSRV05.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:55 INFO TaskSetManager: Serialized task 0.0:4 as 1338 bytes
in 0 ms
14/06/03 19:54:55 INFO TaskSetManager: Starting task 0.0:5 as TID 5 on
executor 201406031732-3213994176-5050-6320-10: IMPETUS-DSRV04.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:55 INFO TaskSetManager: Serialized task 0.0:5 as 1338 bytes
in 0 ms
14/06/03 19:54:55 INFO TaskSetManager: Starting task 0.0:6 as TID 6 on
executor 201406031732-3213994176-5050-6320-11: IMPETUS-DSRV05.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:55 INFO TaskSetManager: Serialized task 0.0:6 as 1338 bytes
in 0 ms
14/06/03 19:54:55 INFO TaskSetManager: Starting task 0.0:7 as TID 7 on
executor 201406031732-3213994176-5050-6320-10: IMPETUS-DSRV04.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:55 INFO TaskSetManager: Serialized task 0.0:7 as 1338 bytes
in 0 ms
14/06/03 19:54:56 INFO TaskSetManager: Re-queueing tasks for
201406031732-3213994176-5050-6320-10 from TaskSet 0.0
14/06/03 19:54:56 WARN TaskSetManager: Lost TID 5 (task 0.0:5)
14/06/03 19:54:56 WARN TaskSetManager: Lost TID 7 (task 0.0:7)
14/06/03 19:54:56 WARN TaskSetManager: Lost TID 1 (task 0.0:1)
14/06/03 19:54:56 WARN TaskSetManager: Lost TID 3 (task 0.0:3)
14/06/03 19:54:56 INFO DAGScheduler: Executor lost:
201406031732-3213994176-5050-6320-10 (epoch 0)
14/06/03 19:54:56 INFO BlockManagerMasterActor: Trying to remove executor
201406031732-3213994176-5050-6320-10 from BlockManagerMaster.
14/06/03 19:54:56 INFO BlockManagerMaster: Removed
201406031732-3213994176-5050-6320-10 successfully in removeExecutor
14/06/03 19:54:56 INFO TaskSetManager: Starting task 0.0:3 as TID 8 on
executor 201406031732-3213994176-5050-6320-11: IMPETUS-DSRV05.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:56 INFO TaskSetManager: Serialized task 0.0:3 as 1338 bytes
in 0 ms
14/06/03 19:54:56 INFO DAGScheduler: Host gained which was in lost list
earlier: IMPETUS-DSRV04.impetus.co.in
14/06/03 19:54:56 INFO TaskSetManager: Starting task 0.0:1 as TID 9 on
executor 201406031732-3213994176-5050-6320-10: IMPETUS-DSRV04.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:56 INFO TaskSetManager: Serialized task 0.0:1 as 1338 bytes
in 0 ms
14/06/03 19:54:56 INFO TaskSetManager: Starting task 0.0:7 as TID 10 on
executor 201406031732-3213994176-5050-6320-11: IMPETUS-DSRV05.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:56 INFO TaskSetManager: Serialized task 0.0:7 as 1338 bytes
in 0 ms
14/06/03 19:54:56 INFO TaskSetManager: Starting task 0.0:5 as TID 11 on
executor 201406031732-3213994176-5050-6320-10: IMPETUS-DSRV04.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:56 INFO TaskSetManager: Serialized task 0.0:5 as 1338 bytes
in 0 ms
14/06/03 19:54:57 INFO TaskSetManager: Re-queueing tasks for
201406031732-3213994176-5050-6320-11 from TaskSet 0.0
14/06/03 19:54:57 WARN TaskSetManager: Lost TID 8 (task 0.0:3)
14/06/03 19:54:57 WARN TaskSetManager: Lost TID 2 (task 0.0:2)
14/06/03 19:54:57 WARN TaskSetManager: Lost TID 4 (task 0.0:4)
14/06/03 19:54:57 WARN TaskSetManager: Lost TID 10 (task 0.0:7)
14/06/03 19:54:57 WARN TaskSetManager: Lost TID 6 (task 0.0:6)
14/06/03 19:54:57 WARN TaskSetManager: Lost TID 0 (task 0.0:0)
14/06/03 19:54:57 INFO DAGScheduler: Executor lost:
201406031732-3213994176-5050-6320-11 (epoch 1)
14/06/03 19:54:57 INFO BlockManagerMasterActor: Trying to remove executor
201406031732-3213994176-5050-6320-11 from BlockManagerMaster.
14/06/03 19:54:57 INFO BlockManagerMaster: Removed
201406031732-3213994176-5050-6320-11 successfully in removeExecutor
14/06/03 19:54:57 INFO DAGScheduler: Host gained which was in lost list
earlier: IMPETUS-DSRV05.impetus.co.in
14/06/03 19:54:57 INFO TaskSetManager: Starting task 0.0:0 as TID 12 on
executor 201406031732-3213994176-5050-6320-11: IMPETUS-DSRV05.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:57 INFO TaskSetManager: Serialized task 0.0:0 as 1338 bytes
in 1 ms
14/06/03 19:54:57 INFO TaskSetManager: Starting task 0.0:6 as TID 13 on
executor 201406031732-3213994176-5050-6320-10: IMPETUS-DSRV04.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:57 INFO TaskSetManager: Serialized task 0.0:6 as 1338 bytes
in 0 ms
14/06/03 19:54:57 INFO TaskSetManager: Starting task 0.0:7 as TID 14 on
executor 201406031732-3213994176-5050-6320-11: IMPETUS-DSRV05.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:57 INFO TaskSetManager: Serialized task 0.0:7 as 1338 bytes
in 1 ms
14/06/03 19:54:57 INFO TaskSetManager: Starting task 0.0:4 as TID 15 on
executor 201406031732-3213994176-5050-6320-10: IMPETUS-DSRV04.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:57 INFO TaskSetManager: Serialized task 0.0:4 as 1338 bytes
in 0 ms
14/06/03 19:54:57 INFO TaskSetManager: Starting task 0.0:2 as TID 16 on
executor 201406031732-3213994176-5050-6320-11: IMPETUS-DSRV05.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:57 INFO TaskSetManager: Serialized task 0.0:2 as 1338 bytes
in 0 ms
14/06/03 19:54:57 INFO TaskSetManager: Starting task 0.0:3 as TID 17 on
executor 201406031732-3213994176-5050-6320-10: IMPETUS-DSRV04.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:57 INFO TaskSetManager: Serialized task 0.0:3 as 1338 bytes
in 1 ms
14/06/03 19:54:57 INFO TaskSetManager: Re-queueing tasks for
201406031732-3213994176-5050-6320-11 from TaskSet 0.0
14/06/03 19:54:57 WARN TaskSetManager: Lost TID 14 (task 0.0:7)
14/06/03 19:54:57 WARN TaskSetManager: Lost TID 16 (task 0.0:2)
14/06/03 19:54:57 WARN TaskSetManager: Lost TID 12 (task 0.0:0)
14/06/03 19:54:57 INFO DAGScheduler: Executor lost:
201406031732-3213994176-5050-6320-11 (epoch 2)
14/06/03 19:54:57 INFO BlockManagerMasterActor: Trying to remove executor
201406031732-3213994176-5050-6320-11 from BlockManagerMaster.
14/06/03 19:54:57 INFO BlockManagerMaster: Removed
201406031732-3213994176-5050-6320-11 successfully in removeExecutor
14/06/03 19:54:57 INFO DAGScheduler: Host gained which was in lost list
earlier: IMPETUS-DSRV05.impetus.co.in
14/06/03 19:54:57 INFO TaskSetManager: Starting task 0.0:0 as TID 18 on
executor 201406031732-3213994176-5050-6320-11: IMPETUS-DSRV05.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:57 INFO TaskSetManager: Serialized task 0.0:0 as 1338 bytes
in 0 ms
14/06/03 19:54:57 INFO TaskSetManager: Starting task 0.0:2 as TID 19 on
executor 201406031732-3213994176-5050-6320-11: IMPETUS-DSRV05.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:57 INFO TaskSetManager: Serialized task 0.0:2 as 1338 bytes
in 0 ms
14/06/03 19:54:57 INFO TaskSetManager: Starting task 0.0:7 as TID 20 on
executor 201406031732-3213994176-5050-6320-11: IMPETUS-DSRV05.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:57 INFO TaskSetManager: Serialized task 0.0:7 as 1338 bytes
in 0 ms
14/06/03 19:54:58 INFO TaskSetManager: Re-queueing tasks for
201406031732-3213994176-5050-6320-10 from TaskSet 0.0
14/06/03 19:54:58 WARN TaskSetManager: Lost TID 17 (task 0.0:3)
14/06/03 19:54:58 WARN TaskSetManager: Lost TID 11 (task 0.0:5)
14/06/03 19:54:58 WARN TaskSetManager: Lost TID 13 (task 0.0:6)
14/06/03 19:54:58 WARN TaskSetManager: Lost TID 9 (task 0.0:1)
14/06/03 19:54:58 WARN TaskSetManager: Lost TID 15 (task 0.0:4)
14/06/03 19:54:58 INFO DAGScheduler: Executor lost:
201406031732-3213994176-5050-6320-10 (epoch 3)
14/06/03 19:54:58 INFO BlockManagerMasterActor: Trying to remove executor
201406031732-3213994176-5050-6320-10 from BlockManagerMaster.
14/06/03 19:54:58 INFO BlockManagerMaster: Removed
201406031732-3213994176-5050-6320-10 successfully in removeExecutor
14/06/03 19:54:58 INFO DAGScheduler: Host gained which was in lost list
earlier: IMPETUS-DSRV04.impetus.co.in
14/06/03 19:54:58 INFO TaskSetManager: Starting task 0.0:4 as TID 21 on
executor 201406031732-3213994176-5050-6320-11: IMPETUS-DSRV05.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:58 INFO TaskSetManager: Serialized task 0.0:4 as 1338 bytes
in 0 ms
14/06/03 19:54:58 INFO TaskSetManager: Starting task 0.0:1 as TID 22 on
executor 201406031732-3213994176-5050-6320-10: IMPETUS-DSRV04.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:58 INFO TaskSetManager: Serialized task 0.0:1 as 1338 bytes
in 0 ms
14/06/03 19:54:58 INFO TaskSetManager: Starting task 0.0:6 as TID 23 on
executor 201406031732-3213994176-5050-6320-11: IMPETUS-DSRV05.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:58 INFO TaskSetManager: Serialized task 0.0:6 as 1338 bytes
in 0 ms
14/06/03 19:54:58 INFO TaskSetManager: Starting task 0.0:5 as TID 24 on
executor 201406031732-3213994176-5050-6320-10: IMPETUS-DSRV04.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:58 INFO TaskSetManager: Serialized task 0.0:5 as 1338 bytes
in 1 ms
14/06/03 19:54:58 INFO TaskSetManager: Starting task 0.0:3 as TID 25 on
executor 201406031732-3213994176-5050-6320-10: IMPETUS-DSRV04.impetus.co.in
(PROCESS_LOCAL)
14/06/03 19:54:58 INFO TaskSetManager: Serialized task 0.0:3 as 1338 bytes
in 0 ms
14/06/03 19:54:59 INFO TaskSetManager: Re-queueing tasks for
201406031732-3213994176-5050-6320-11 from TaskSet 0.0
14/06/03 19:54:59 WARN TaskSetManager: Lost TID 23 (task 0.0:6)
14/06/03 19:54:59 WARN TaskSetManager: Lost TID 20 (task 0.0:7)
14/06/03 19:54:59 ERROR TaskSetManager: Task 0.0:7 failed 4 times; aborting
job
14/06/03 19:54:59 INFO DAGScheduler: Failed to run collect at <console>:17
14/06/03 19:54:59 INFO DAGScheduler: Executor lost:
201406031732-3213994176-5050-6320-11 (epoch 4)
14/06/03 19:54:59 INFO BlockManagerMasterActor: Trying to remove executor
201406031732-3213994176-5050-6320-11 from BlockManagerMaster.
14/06/03 19:54:59 INFO BlockManagerMaster: Removed
201406031732-3213994176-5050-6320-11 successfully in removeExecutor
14/06/03 19:54:59 INFO DAGScheduler: Host gained which was in lost list
earlier: IMPETUS-DSRV05.impetus.co.in
org.apache.spark.SparkException: Job aborted: Task 0.0:7 failed 4 times
(most recent failure: unknown)
        at
org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1020)
        at
org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1018)
        at
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at
scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1018)
        at
org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)
        at
org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)
        at scala.Option.foreach(Option.scala:236)
        at
org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:604)
        at
org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:190)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
        at akka.actor.ActorCell.invoke(ActorCell.scala:456)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
        at akka.dispatch.Mailbox.run(Mailbox.scala:219)
        at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
        at
scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 
> 
> scala> 14/06/03 19:55:00 INFO TaskSetManager: Re-queueing tasks for
> 201406031732-3213994176-5050-6320-10 from TaskSet 0.0
> 14/06/03 19:55:00 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have
> all completed, from pool
> 14/06/03 19:55:00 INFO DAGScheduler: Executor lost:
> 201406031732-3213994176-5050-6320-10 (epoch 5)
> 14/06/03 19:55:00 INFO BlockManagerMasterActor: Trying to remove executor
> 201406031732-3213994176-5050-6320-10 from BlockManagerMaster.
> 14/06/03 19:55:00 INFO BlockManagerMaster: Removed
> 201406031732-3213994176-5050-6320-10 successfully in removeExecutor
> 14/06/03 19:55:00 INFO DAGScheduler: Host gained which was in lost list
> earlier: IMPETUS-DSRV04.impetus.co.in

I've checked my Spark configuration many times and it looks fine to me.
Any ideas what might have gone wrong?

--
Thanks



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-not-working-with-mesos-tp6806.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Spark not working with mesos

Posted by Akhil Das <ak...@sigmoidanalytics.com>.
1. Make sure the spark-*.tgz that you created with make-distribution.sh is
accessible to all the slave nodes.

2. Check the worker node logs.
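
On point 2: the driver output in the original message only reports that
executors were lost. Task 0.0:7 is lost four times (TIDs 7, 10, 14 and 20),
and since spark.task.maxFailures defaults to 4, the job is aborted with
"most recent failure: unknown". The actual cause should be in the executor
stderr on the slaves. As a rough sketch for tallying such warnings from a
saved driver log (plain Scala; LostTaskCounter and countLost are hypothetical
names, not part of Spark):

```scala
// Hypothetical helper (not part of Spark): counts "Lost TID" warnings per task
// in driver log text such as the output quoted in this thread.
object LostTaskCounter {
  // Matches e.g. "WARN TaskSetManager: Lost TID 20 (task 0.0:7)"
  private val LostLine = """Lost TID (\d+) \(task ([\d.]+:\d+)\)""".r

  // Returns a map from task id (e.g. "0.0:7") to the number of times it was lost.
  def countLost(log: String): Map[String, Int] =
    LostLine.findAllMatchIn(log)
      .map(_.group(2))
      .toSeq
      .groupBy(identity)
      .map { case (task, hits) => task -> hits.size }
}

// A few lines copied from the log in the original message.
val sample =
  """14/06/03 19:54:56 WARN TaskSetManager: Lost TID 7 (task 0.0:7)
    |14/06/03 19:54:57 WARN TaskSetManager: Lost TID 10 (task 0.0:7)
    |14/06/03 19:54:57 WARN TaskSetManager: Lost TID 14 (task 0.0:7)
    |14/06/03 19:54:59 WARN TaskSetManager: Lost TID 20 (task 0.0:7)
    |14/06/03 19:54:59 WARN TaskSetManager: Lost TID 23 (task 0.0:6)""".stripMargin

val counts = LostTaskCounter.countLost(sample)
println(counts("0.0:7")) // prints 4
```

A task that reaches the failure limit this quickly on every executor usually
points at the executors failing to start at all, which is why the slave-side
logs are the place to look.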



Thanks
Best Regards



Re: Spark not working with mesos

Posted by praveshjain1991 <pr...@gmail.com>.
Hi Ajatix.

Yes, HADOOP_HOME is set on the nodes, and I did update the bash
configuration.

As I said, adding MESOS_HADOOP_HOME did not work.

But what is causing the original error: "java.lang.Error:
java.io.IOException: failure to login"?

--

Thanks



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-not-working-with-mesos-tp6806p7048.html

Re: Spark not working with mesos

Posted by ajatix <aj...@sigmoidanalytics.com>.
I assume that you've added HADOOP_HOME to your environment variables.
Otherwise, you could fill in the actual path of Hadoop on your cluster. Also,
did you update the bash?



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-not-working-with-mesos-tp6806p7040.html

Re: Spark not working with mesos

Posted by praveshjain1991 <pr...@gmail.com>.
Thanks for the reply Ajatix.

Adding MESOS_HADOOP_HOME to my .bashrc gives an error while trying to start
mesos-master:

Failed to load unknown flag 'hadoop_home'
Usage: lt-mesos-master [...]

I couldn't find any help on this from Google. Any suggestions?
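
A possible explanation, offered as an assumption rather than a confirmed diagnosis: Mesos reads environment variables prefixed with MESOS_ as command-line flags, so exporting MESOS_HADOOP_HOME in .bashrc also passes a hadoop_home flag to mesos-master, which rejects it. A minimal sketch of scoping the variable to a single command instead of exporting it globally; the helper name run_with_hadoop_home and the commented mesos-slave invocation (including the master address placeholder) are hypothetical, and it assumes the slave binary is the one that accepts this setting:

```shell
#!/bin/sh
# Hypothetical helper: run one command with MESOS_HADOOP_HOME set,
# without exporting the variable into the surrounding shell.
run_with_hadoop_home() {
    hadoop_home=$1
    shift
    # env sets the variable only for the child process it launches.
    env MESOS_HADOOP_HOME="$hadoop_home" "$@"
}

# Example (assumed invocation; adjust the binary and master address):
# run_with_hadoop_home "$HADOOP_HOME" mesos-slave --master=<master-host>:5050
```

With this approach mesos-master never sees the variable, so it should start as it did before the .bashrc change.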

--

Thanks.



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-not-working-with-mesos-tp6806p7021.html

Re: Spark not working with mesos

Posted by ajatix <aj...@sigmoidanalytics.com>.
Since $HADOOP_HOME is deprecated, try adding it to the Mesos configuration
file.
Add `export MESOS_HADOOP_HOME=$HADOOP_HOME` to ~/.bashrc and that should
solve your error.



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-not-working-with-mesos-tp6806p6939.html

Re: Spark not working with mesos

Posted by praveshjain1991 <pr...@gmail.com>.
Thanks for the reply, Akhil.

I looked at the logs in /tmp/mesos and found that my tar.gz was not properly
created. I corrected that, but now I get another error that I can't find an
answer for on Google.
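
For anyone hitting the same packaging problem, a quick sanity check on the archive before uploading it may save a round trip. This is only a sketch: the helper name check_tarball and the example file name are hypothetical, and it checks readability, not whether the contents are a complete Spark distribution.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file is a readable gzip'd tar
# archive that contains at least one entry.
check_tarball() {
    archive=$1
    # tar -tzf lists the entries; a corrupt or non-gzip file makes it fail.
    listing=$(tar -tzf "$archive" 2>/dev/null) || return 1
    [ -n "$listing" ]
}

# Example (the file name is an assumption):
# check_tarball spark-0.9.1.tar.gz && echo "archive looks readable"
```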

The error is pretty much the same
"org.apache.spark.SparkException: Job aborted: Task 0.0:6 failed 4 times
(most recent failure: unknown)"

The logs (stderr) on the worker say:

Warning: $HADOOP_HOME is deprecated.

14/06/04 19:55:26 INFO MesosExecutorBackend: Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
14/06/04 19:55:26 INFO MesosExecutorBackend: Registered with Mesos as executor ID 201406041941-3213994176-5050-21843-0
14/06/04 19:55:26 INFO Executor: Using REPL class URI: http://192.168.145.191:48052
14/06/04 19:55:27 INFO Slf4jLogger: Slf4jLogger started
14/06/04 19:55:27 INFO Remoting: Starting remoting
14/06/04 19:55:27 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://spark@host-DSRV04.host.co.in:34024]
14/06/04 19:55:27 INFO Remoting: Remoting now listens on addresses: [akka.tcp://spark@host-DSRV04.host.co.in:34024]
14/06/04 19:55:27 INFO SparkEnv: Connecting to BlockManagerMaster: akka.tcp://spark@host-DSRV01.host.CO.IN:52146/user/BlockManagerMaster
14/06/04 19:55:27 INFO DiskBlockManager: Created local directory at /tmp/spark-local-20140604195527-72b8
14/06/04 19:55:27 INFO MemoryStore: MemoryStore started with capacity 294.9 MB.
14/06/04 19:55:27 INFO ConnectionManager: Bound socket to port 35687 with id = ConnectionManagerId(host-DSRV04.host.co.in,35687)
14/06/04 19:55:27 INFO BlockManagerMaster: Trying to register BlockManager
14/06/04 19:55:27 INFO BlockManagerMaster: Registered BlockManager
14/06/04 19:55:27 INFO SparkEnv: Connecting to MapOutputTracker: akka.tcp://spark@host-DSRV01.host.CO.IN:52146/user/MapOutputTracker
14/06/04 19:55:27 INFO HttpFileServer: HTTP File server directory is /tmp/spark-45838575-980e-44dc-8d88-5acee7bb9981
14/06/04 19:55:27 INFO HttpServer: Starting HTTP Server
14/06/04 19:55:27 ERROR Executor: Uncaught exception in thread Thread[Executor task launch worker-7,5,main]
java.lang.Error: java.io.IOException: failure to login
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1151)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.IOException: failure to login
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:490)
	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:452)
	at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:40)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	... 2 more
Caused by: javax.security.auth.login.LoginException: unable to find LoginModule class: org/apache/hadoop/security/UserGroupInformation$HadoopLoginModule
	at javax.security.auth.login.LoginContext.invoke(LoginContext.java:822)
	at javax.security.auth.login.LoginContext.access$000(LoginContext.java:203)
	at javax.security.auth.login.LoginContext$5.run(LoginContext.java:721)
	at javax.security.auth.login.LoginContext$5.run(LoginContext.java:719)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.login.LoginContext.invokeCreatorPriv(LoginContext.java:718)
	at javax.security.auth.login.LoginContext.login(LoginContext.java:590)
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:471)
	... 6 more
14/06/04 19:55:27 ERROR Executor: Uncaught exception in thread Thread[Executor task launch worker-6,5,main]
java.lang.Error: java.io.IOException: failure to login
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1151)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.IOException: failure to login
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:490)
	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:452)
	at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:40)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	... 2 more
Caused by: javax.security.auth.login.LoginException: unable to find LoginModule class: org/apache/hadoop/security/UserGroupInformation$HadoopLoginModule
	at javax.security.auth.login.LoginContext.invoke(LoginContext.java:822)
	at javax.security.auth.login.LoginContext.access$000(LoginContext.java:203)
	at javax.security.auth.login.LoginContext$5.run(LoginContext.java:721)
	at javax.security.auth.login.LoginContext$5.run(LoginContext.java:719)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.login.LoginContext.invokeCreatorPriv(LoginContext.java:718)
	at javax.security.auth.login.LoginContext.login(LoginContext.java:590)
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:471)
	... 6 more
14/06/04 19:55:27 ERROR Executor: Uncaught exception in thread Thread[Executor task launch worker-0,5,main]
java.lang.Error: java.io.IOException: failure to login
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1151)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.IOException: failure to login
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:490)
	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:452)
	at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:40)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	... 2 more
Caused by: javax.security.auth.login.LoginException: unable to find LoginModule class: org/apache/hadoop/security/UserGroupInformation$HadoopLoginModule
	at javax.security.auth.login.LoginContext.invoke(LoginContext.java:822)
	at javax.security.auth.login.LoginContext.access$000(LoginContext.java:203)
	at javax.security.auth.login.LoginContext$5.run(LoginContext.java:721)
	at javax.security.auth.login.LoginContext$5.run(LoginContext.java:719)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.login.LoginContext.invokeCreatorPriv(LoginContext.java:718)
	at javax.security.auth.login.LoginContext.login(LoginContext.java:590)
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:471)
	... 6 more
14/06/04 19:55:27 ERROR Executor: Uncaught exception in thread Thread[Executor task launch worker-2,5,main]
java.lang.Error: java.io.IOException: failure to login
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1151)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.IOException: failure to login
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:490)
	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:452)
	at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:40)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	... 2 more
Caused by: javax.security.auth.login.LoginException: unable to find LoginModule class: org/apache/hadoop/security/UserGroupInformation$HadoopLoginModule
	at javax.security.auth.login.LoginContext.invoke(LoginContext.java:822)
	at javax.security.auth.login.LoginContext.access$000(LoginContext.java:203)
	at javax.security.auth.login.LoginContext$5.run(LoginContext.java:721)
	at javax.security.auth.login.LoginContext$5.run(LoginContext.java:719)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.login.LoginContext.invokeCreatorPriv(LoginContext.java:718)
	at javax.security.auth.login.LoginContext.login(LoginContext.java:590)
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:471)
	... 6 more

If you can suggest what is causing this error that would be great.
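
Since the same trace repeats once per worker thread, it can help to collapse the log to its distinct causes; in a Java "Caused by:" chain, the last entry printed is the innermost one (here the LoginException about HadoopLoginModule). A minimal sketch, assuming the executor stderr has been saved to a file with one log record per line; the helper name summarize_causes and the example path are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: count each distinct "Caused by:" line in an
# executor stderr file.
summarize_causes() {
    # sort groups identical lines so that uniq -c can count them.
    grep '^Caused by:' "$1" | sort | uniq -c
}

# Example (the path is an assumption):
# summarize_causes path/to/stderr
```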

--
Thanks



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-not-working-with-mesos-tp6806p6927.html

Re: Spark not working with mesos

Posted by Akhil Das <ak...@sigmoidanalytics.com>.
http://spark.apache.org/docs/latest/running-on-mesos.html#troubleshooting-and-debugging

If you are not able to find the logs in /var/log/mesos, check in
/tmp/mesos/, where you can see your application IDs and so on, just
like in the $SPARK_HOME/work directory.
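
A minimal sketch of locating those logs, assuming each executor's output is kept in files named stdout and stderr somewhere under the Mesos work directory; the helper name find_executor_logs is hypothetical, and the default of /tmp/mesos simply follows the suggestion above:

```shell
#!/bin/sh
# Hypothetical helper: list stdout/stderr files under a Mesos work
# directory (default: /tmp/mesos).
find_executor_logs() {
    work_dir=${1:-/tmp/mesos}
    # Match regular files named exactly "stderr" or "stdout" at any depth.
    find "$work_dir" -type f \( -name stderr -o -name stdout \) 2>/dev/null
}

# Example:
# find_executor_logs /tmp/mesos
```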



Thanks
Best Regards


On Wed, Jun 4, 2014 at 3:18 PM, praveshjain1991 <pr...@gmail.com>
wrote:

> Thanks for the reply, Akhil.
> I created a tar.gz of the distribution created by make-distribution.sh,
> which is accessible from all the slaves (I checked it using
> hadoop fs -ls /path/). Also, there are no worker logs printed in the
> $SPARK_HOME/work/ directory on the workers (which are otherwise printed
> if I run without using Mesos).
>
> --
> Thanks
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-not-working-with-mesos-tp6806p6900.html
>

Re: Spark not working with mesos

Posted by praveshjain1991 <pr...@gmail.com>.
Thanks for the reply, Akhil.
I created a tar.gz of the distribution created by make-distribution.sh,
which is accessible from all the slaves (I checked it using
hadoop fs -ls /path/). Also, there are no worker logs printed in the
$SPARK_HOME/work/ directory on the workers (which are otherwise printed
if I run without using Mesos).

--
Thanks



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-not-working-with-mesos-tp6806p6900.html