Posted to issues@spark.apache.org by "EE (JIRA)" <ji...@apache.org> on 2017/01/25 14:04:26 UTC

[jira] [Commented] (SPARK-13407) TaskMetrics.fromAccumulatorUpdates can crash when trying to access garbage-collected accumulators

    [ https://issues.apache.org/jira/browse/SPARK-13407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15837777#comment-15837777 ] 

EE commented on SPARK-13407:
----------------------------

We reproduced this issue on Spark 1.6.2 as well.
In a Spark Streaming application that updates accumulators, we got the following error:

Exception in thread "dag-scheduler-event-loop" java.lang.IllegalAccessError: Attempted to access garbage collected Accumulator.
      at org.apache.spark.Accumulators$$anonfun$add$2.apply(Accumulators.scala:353)
      at org.apache.spark.Accumulators$$anonfun$add$2.apply(Accumulators.scala:346)
      at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
      at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
      at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
      at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
      at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
      at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
      at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
      at org.apache.spark.Accumulators$.add(Accumulators.scala:346)
      at org.apache.spark.scheduler.DAGScheduler.updateAccumulators(DAGScheduler.scala:1081)
      at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1153)
      at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1639)
      at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1601)
      at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1590)
      at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
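
For reference, a minimal hypothetical sketch of the kind of pattern that can hit this (the job, names, and socket endpoint below are illustrative, not our actual application): an accumulator is created per batch and nothing retains it afterwards, so the driver-side weak reference to it can be garbage-collected while task-completion updates carrying its ID are still being processed.

{code}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object AccumulatorGcRepro {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("accumulator-gc-repro")
    val ssc = new StreamingContext(conf, Seconds(1))

    val lines = ssc.socketTextStream("localhost", 9999)
    lines.foreachRDD { rdd =>
      // A fresh accumulator per batch; no reference outlives this closure,
      // so the driver-side weak reference in the Accumulators registry can
      // be collected before late task-completion events for it arrive.
      val counter = ssc.sparkContext.accumulator(0L, "records")
      rdd.foreach(_ => counter += 1L)
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
{code}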


> TaskMetrics.fromAccumulatorUpdates can crash when trying to access garbage-collected accumulators
> -------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-13407
>                 URL: https://issues.apache.org/jira/browse/SPARK-13407
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.0.0
>            Reporter: Josh Rosen
>            Assignee: Josh Rosen
>             Fix For: 2.0.0
>
>
> TaskMetrics.fromAccumulatorUpdates can fail if accumulators have been garbage-collected:
> {code}
> java.lang.IllegalAccessError: Attempted to access garbage collected accumulator 481596
> 	at org.apache.spark.Accumulators$$anonfun$get$1$$anonfun$apply$1.apply(Accumulator.scala:133)
> 	at org.apache.spark.Accumulators$$anonfun$get$1$$anonfun$apply$1.apply(Accumulator.scala:133)
> 	at scala.Option.getOrElse(Option.scala:120)
> 	at org.apache.spark.Accumulators$$anonfun$get$1.apply(Accumulator.scala:132)
> 	at org.apache.spark.Accumulators$$anonfun$get$1.apply(Accumulator.scala:130)
> 	at scala.Option.map(Option.scala:145)
> 	at org.apache.spark.Accumulators$.get(Accumulator.scala:130)
> 	at org.apache.spark.executor.TaskMetrics$$anonfun$9.apply(TaskMetrics.scala:414)
> 	at org.apache.spark.executor.TaskMetrics$$anonfun$9.apply(TaskMetrics.scala:412)
> 	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
> 	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
> 	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> 	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> 	at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
> 	at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
> 	at org.apache.spark.executor.TaskMetrics$.fromAccumulatorUpdates(TaskMetrics.scala:412)
> 	at org.apache.spark.ui.jobs.JobProgressListener$$anonfun$onExecutorMetricsUpdate$2.apply(JobProgressListener.scala:499)
> 	at org.apache.spark.ui.jobs.JobProgressListener$$anonfun$onExecutorMetricsUpdate$2.apply(JobProgressListener.scala:493)
> 	at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
> 	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
> 	at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
> 	at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
> 	at org.apache.spark.ui.jobs.JobProgressListener.onExecutorMetricsUpdate(JobProgressListener.scala:493)
> 	at org.apache.spark.scheduler.SparkListenerBus$class.doPostEvent(SparkListenerBus.scala:56)
> 	at org.apache.spark.scheduler.LiveListenerBus.doPostEvent(LiveListenerBus.scala:35)
> 	at org.apache.spark.scheduler.LiveListenerBus.doPostEvent(LiveListenerBus.scala:35)
> 	at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:63)
> 	at org.apache.spark.scheduler.LiveListenerBus.postToAll(LiveListenerBus.scala:35)
> 	at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(LiveListenerBus.scala:81)
> 	at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:66)
> 	at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:66)
> 	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
> 	at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:65)
> 	at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1178)
> 	at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:64)
> {code}
> In order to guard against this, we can eliminate the need to access driver-side accumulators when constructing TaskMetrics.
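> A hedged sketch of that direction (simplified, hypothetical shapes below; {{AccumUpdate}} and {{TaskMetricsSketch}} are illustrative, not Spark's actual classes): every value needed to rebuild the metrics is taken from the update payload shipped with the task, so no driver-side weak reference ever has to be dereferenced.
> {code}
> // Hypothetical, simplified shapes -- not Spark's actual classes.
> case class AccumUpdate(id: Long, name: Option[String], update: Option[Any])
>
> class TaskMetricsSketch {
>   private val values = scala.collection.mutable.Map.empty[String, Any]
>   def set(name: String, v: Any): Unit = values(name) = v
>   override def toString: String = values.toString
> }
>
> object TaskMetricsSketch {
>   // Rebuild metrics purely from the shipped updates. The driver-side
>   // registry of weakly-referenced Accumulator objects is never consulted,
>   // so a collected accumulator can no longer raise IllegalAccessError here.
>   def fromAccumulatorUpdates(updates: Seq[AccumUpdate]): TaskMetricsSketch = {
>     val tm = new TaskMetricsSketch
>     for (u <- updates; name <- u.name; v <- u.update) tm.set(name, v)
>     tm
>   }
> }
> {code}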



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org