You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Hiten Patel (JIRA)" <ji...@apache.org> on 2016/03/17 02:30:33 UTC

[jira] [Commented] (SPARK-5594) SparkException: Failed to get broadcast (TorrentBroadcast)

    [ https://issues.apache.org/jira/browse/SPARK-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198556#comment-15198556 ] 

Hiten Patel commented on SPARK-5594:
------------------------------------

Yes, this was indeed the problem in my case too. 

I had a custom function all in the same Scala object (Spark Streaming job) with input (Kafka direct streaming) and output (to Cassandra). 

After moving the function to a seperate object that extends Serializable, it did work finally !!. A BIG thanks to Sal Uryasev for reporting the solution. 

It does make sense after realizing the proper solution on how indeed it should work in Spark. However, it's really vexing to nail down such issues since the stacktraces are too generic and could mean lot of things. I think this is more of a coding practice and needs to be documented in the official documentation as  one of the coding practices. It''s very easy to get trapped into such issues for anyone.

Let me know the right place where this needs to go in documentation and I can send a PR. 



> SparkException: Failed to get broadcast (TorrentBroadcast)
> ----------------------------------------------------------
>
>                 Key: SPARK-5594
>                 URL: https://issues.apache.org/jira/browse/SPARK-5594
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.2.0, 1.3.0
>            Reporter: John Sandiford
>            Priority: Critical
>
> I am uncertain whether this is a bug, however I am getting the error below when running on a cluster (works locally), and have no idea what is causing it, or where to look for more information.
> Any help is appreciated.  Others appear to experience the same issue, but I have not found any solutions online.
> Please note that this only happens with certain code and is repeatable, all my other spark jobs work fine.
> {noformat}
> ERROR TaskSetManager: Task 3 in stage 6.0 failed 4 times; aborting job
> Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 6.0 failed 4 times, most recent failure: Lost task 3.3 in stage 6.0 (TID 24, <removed>): java.io.IOException: org.apache.spark.SparkException: Failed to get broadcast_6_piece0 of broadcast_6
>         at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1011)
>         at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164)
>         at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
>         at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
>         at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87)
>         at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:58)
>         at org.apache.spark.scheduler.Task.run(Task.scala:56)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: org.apache.spark.SparkException: Failed to get broadcast_6_piece0 of broadcast_6
>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:137)
>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:137)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply$mcVI$sp(TorrentBroadcast.scala:136)
>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:119)
>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:119)
>         at scala.collection.immutable.List.foreach(List.scala:318)
>         at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$readBlocks(TorrentBroadcast.scala:119)
>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:174)
>         at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1008)
>         ... 11 more
> {noformat}
> Driver stacktrace:
> {noformat}
>         at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1214)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1203)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1202)
>         at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>         at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1202)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
>         at scala.Option.foreach(Option.scala:236)
>         at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:696)
>         at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1420)
>         at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>         at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>         at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>         at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>         at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>         at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>         at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>         at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>         at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org