Posted to user@spark.apache.org by Luis Ángel Vicente Sánchez <la...@gmail.com> on 2014/09/16 14:07:23 UTC

Spark Streaming: CoarseGrainedExecutorBackend: Slave registration failed: Duplicate executor ID

I have a standalone Spark cluster, and from within the same Scala
application I'm creating two different SparkContexts to run two different
Spark Streaming jobs, as the SparkConf is different for each of them.
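
Roughly, the setup looks like this (a minimal sketch; the app names,
master URL and batch interval below are placeholders, not my real values):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val confA = new SparkConf().setAppName("streaming-job-a").setMaster("spark://master:7077")
val confB = new SparkConf().setAppName("streaming-job-b").setMaster("spark://master:7077")

val sscA = new StreamingContext(confA, Seconds(10))
val sscB = new StreamingContext(confB, Seconds(10))

// each context wires up its own DStream pipeline, then:
sscA.start()
sscB.start()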

I'm getting this error that I don't really understand:

14/09/16 11:51:35 ERROR OneForOneStrategy: spark.httpBroadcast.uri
> java.util.NoSuchElementException: spark.httpBroadcast.uri
> 	at org.apache.spark.SparkConf$$anonfun$get$1.apply(SparkConf.scala:149)
> 	at org.apache.spark.SparkConf$$anonfun$get$1.apply(SparkConf.scala:149)
> 	at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
> 	at scala.collection.AbstractMap.getOrElse(Map.scala:58)
> 	at org.apache.spark.SparkConf.get(SparkConf.scala:149)
> 	at org.apache.spark.broadcast.HttpBroadcast$.initialize(HttpBroadcast.scala:130)
> 	at org.apache.spark.broadcast.HttpBroadcastFactory.initialize(HttpBroadcastFactory.scala:31)
> 	at org.apache.spark.broadcast.BroadcastManager.initialize(BroadcastManager.scala:48)
> 	at org.apache.spark.broadcast.BroadcastManager.<init>(BroadcastManager.scala:35)
> 	at org.apache.spark.SparkEnv$.create(SparkEnv.scala:218)
> 	at org.apache.spark.executor.Executor.<init>(Executor.scala:85)
> 	at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1.applyOrElse(CoarseGrainedExecutorBackend.scala:59)
> 	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
> 	at akka.actor.ActorCell.invoke(ActorCell.scala:456)
> 	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
> 	at akka.dispatch.Mailbox.run(Mailbox.scala:219)
> 	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
> 	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> 	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> 	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> 	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 14/09/16 11:51:35 ERROR CoarseGrainedExecutorBackend: Slave registration failed: Duplicate executor ID: 10

Both apps have different names, so I don't get how the executor ID is not
unique :-S

This happens on startup and most of the time only one job dies, but
sometimes both of them die.

Regards,

Luis

Re: Spark Streaming: CoarseGrainedExecutorBackend: Slave registration failed: Duplicate executor ID

Posted by Luis Ángel Vicente Sánchez <la...@gmail.com>.
I dug a bit more: the executor ID is a number, so it seems there is no
possible workaround.

Looking at the code of CoarseGrainedSchedulerBackend.scala:

https://github.com/apache/spark/blob/6324eb7b5b0ae005cb2e913e36b1508bd6f1b9b8/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala#L86

It seems there is only one DriverActor and, as the RegisterExecutor message
only contains the executorID, there is a collision.
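
If I'm reading it right, the check there is roughly this (paraphrased from
the linked source, so the exact code may differ):

case RegisterExecutor(executorId, hostPort, cores) =>
  if (executorActor.contains(executorId)) {
    sender ! RegisterExecutorFailed("Duplicate executor ID: " + executorId)
  } else {
    // ... register the executor under its numeric ID ...
  }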

I wonder if what I'm doing is wrong. Basically, from the same Scala
application, I do the following (sketched just below):

1. create one actor per job.
2. send a message to each actor with the configuration details to create a
SparkContext/StreamingContext.
3. send a message to each actor to start the job and the streaming context.
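
A minimal sketch of that orchestration (actor and message names are just
illustrative, not my real code):

import akka.actor.{Actor, ActorSystem, Props}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

case class Configure(appName: String, master: String)
case object Start

class JobActor extends Actor {
  private var ssc: StreamingContext = _

  def receive = {
    case Configure(appName, master) =>
      val conf = new SparkConf().setAppName(appName).setMaster(master)
      ssc = new StreamingContext(conf, Seconds(10))
      // each job wires up its own DStream pipeline here
    case Start =>
      ssc.start()
  }
}

object Main extends App {
  val system = ActorSystem("streaming-jobs")
  val jobA = system.actorOf(Props[JobActor], "job-a")
  val jobB = system.actorOf(Props[JobActor], "job-b")
  jobA ! Configure("job-a", "spark://master:7077")
  jobB ! Configure("job-b", "spark://master:7077")
  jobA ! Start
  jobB ! Start
}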

Re: Spark Streaming: CoarseGrainedExecutorBackend: Slave registration failed: Duplicate executor ID

Posted by Luis Ángel Vicente Sánchez <la...@gmail.com>.
When I said "scheduler" I meant the executor backend.

Re: Spark Streaming: CoarseGrainedExecutorBackend: Slave registration failed: Duplicate executor ID

Posted by Luis Ángel Vicente Sánchez <la...@gmail.com>.
It seems that, as I have a single Scala application, the scheduler is the
same and there is a collision between the executors of both Spark
contexts. Is there a way to change how the executor ID is generated
(maybe a UUID instead of a sequential number)?
