Posted to issues@spark.apache.org by "EE (JIRA)" <ji...@apache.org> on 2017/01/25 14:08:26 UTC
[jira] [Comment Edited] (SPARK-13407)
TaskMetrics.fromAccumulatorUpdates can crash when trying to access
garbage-collected accumulators
[ https://issues.apache.org/jira/browse/SPARK-13407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15837777#comment-15837777 ]
EE edited comment on SPARK-13407 at 1/25/17 2:07 PM:
-----------------------------------------------------
We reproduced this issue on Spark 1.6.2 as well.
Details and the error are below:
1. We are running a Spark Streaming application.
2. When the application updates accumulators, we get the error shown below.
3. This is a critical bug for us: it crashes the streaming job, and we cannot catch or contain it in our own code.
Exception in thread "dag-scheduler-event-loop" java.lang.IllegalAccessError: Attempted to access garbage collected Accumulator.
at org.apache.spark.Accumulators$$anonfun$add$2.apply(Accumulators.scala:353)
at org.apache.spark.Accumulators$$anonfun$add$2.apply(Accumulators.scala:346)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
at org.apache.spark.Accumulators$.add(Accumulators.scala:346)
at org.apache.spark.scheduler.DAGScheduler.updateAccumulators(DAGScheduler.scala:1081)
at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1153)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1639)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1601)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1590)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
I would be grateful if the fix could be backported to 1.6.2 as well.
was (Author: ee1):
We reproduced this case on Spark 1.6.2 as well.
In a Spark Streaming application that updates accumulators, we got the same error and stack trace as shown above.
> TaskMetrics.fromAccumulatorUpdates can crash when trying to access garbage-collected accumulators
> -------------------------------------------------------------------------------------------------
>
> Key: SPARK-13407
> URL: https://issues.apache.org/jira/browse/SPARK-13407
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.0.0
> Reporter: Josh Rosen
> Assignee: Josh Rosen
> Fix For: 2.0.0
>
>
> TaskMetrics.fromAccumulatorUpdates can fail if accumulators have been garbage-collected:
> {code}
> java.lang.IllegalAccessError: Attempted to access garbage collected accumulator 481596
> at org.apache.spark.Accumulators$$anonfun$get$1$$anonfun$apply$1.apply(Accumulator.scala:133)
> at org.apache.spark.Accumulators$$anonfun$get$1$$anonfun$apply$1.apply(Accumulator.scala:133)
> at scala.Option.getOrElse(Option.scala:120)
> at org.apache.spark.Accumulators$$anonfun$get$1.apply(Accumulator.scala:132)
> at org.apache.spark.Accumulators$$anonfun$get$1.apply(Accumulator.scala:130)
> at scala.Option.map(Option.scala:145)
> at org.apache.spark.Accumulators$.get(Accumulator.scala:130)
> at org.apache.spark.executor.TaskMetrics$$anonfun$9.apply(TaskMetrics.scala:414)
> at org.apache.spark.executor.TaskMetrics$$anonfun$9.apply(TaskMetrics.scala:412)
> at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
> at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
> at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
> at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
> at org.apache.spark.executor.TaskMetrics$.fromAccumulatorUpdates(TaskMetrics.scala:412)
> at org.apache.spark.ui.jobs.JobProgressListener$$anonfun$onExecutorMetricsUpdate$2.apply(JobProgressListener.scala:499)
> at org.apache.spark.ui.jobs.JobProgressListener$$anonfun$onExecutorMetricsUpdate$2.apply(JobProgressListener.scala:493)
> at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
> at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
> at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
> at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
> at org.apache.spark.ui.jobs.JobProgressListener.onExecutorMetricsUpdate(JobProgressListener.scala:493)
> at org.apache.spark.scheduler.SparkListenerBus$class.doPostEvent(SparkListenerBus.scala:56)
> at org.apache.spark.scheduler.LiveListenerBus.doPostEvent(LiveListenerBus.scala:35)
> at org.apache.spark.scheduler.LiveListenerBus.doPostEvent(LiveListenerBus.scala:35)
> at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:63)
> at org.apache.spark.scheduler.LiveListenerBus.postToAll(LiveListenerBus.scala:35)
> at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(LiveListenerBus.scala:81)
> at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:66)
> at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:66)
> at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
> at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:65)
> at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1178)
> at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:64)
> {code}
> In order to guard against this, we can eliminate the need to access driver-side accumulators when constructing TaskMetrics.
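The guard described in the summary above can be illustrated with a minimal Java sketch. This is not Spark's code: every name in it (`AccumulatorUpdateSketch`, `Update`, `DriverAccumulator`, `registry`, `metricsViaRegistry`, `metricsFromPayload`) is hypothetical, and it assumes that the driver-side registry holds only weak references, which is what the "garbage collected" error message suggests. The first method resolves each update through the registry and fails once the referent is gone; the second builds the metrics from the update payload alone, so it never touches driver-side state.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; all names here are hypothetical, not Spark APIs.
final class AccumulatorUpdateSketch {

    // Hypothetical update payload: carries everything needed to build a metric.
    static final class Update {
        final long id;
        final String name;
        final long value;

        Update(long id, String name, long value) {
            this.id = id;
            this.name = name;
            this.value = value;
        }
    }

    // Hypothetical driver-side accumulator, reduced to the one field we look up.
    static final class DriverAccumulator {
        final String name;

        DriverAccumulator(String name) {
            this.name = name;
        }
    }

    // Assumption: the driver tracks accumulators through weak references,
    // so an entry's referent can disappear once nothing else references it.
    static final Map<Long, WeakReference<DriverAccumulator>> registry = new HashMap<>();

    // Fragile variant: resolves the metric name through the registry and
    // throws if the accumulator is unknown or has been collected.
    static Map<String, Long> metricsViaRegistry(List<Update> updates) {
        Map<String, Long> metrics = new HashMap<>();
        for (Update u : updates) {
            WeakReference<DriverAccumulator> ref = registry.get(u.id);
            DriverAccumulator acc = (ref == null) ? null : ref.get();
            if (acc == null) {
                throw new IllegalStateException(
                    "Attempted to access garbage collected accumulator " + u.id);
            }
            metrics.put(acc.name, u.value);
        }
        return metrics;
    }

    // Guarded variant: uses only the data carried by the update itself,
    // so the lifetime of the driver-side object is irrelevant.
    static Map<String, Long> metricsFromPayload(List<Update> updates) {
        Map<String, Long> metrics = new HashMap<>();
        for (Update u : updates) {
            metrics.put(u.name, u.value);
        }
        return metrics;
    }
}
```

In this sketch, clearing the weak reference for an id makes `metricsViaRegistry` throw for any update carrying that id, while `metricsFromPayload` still returns the reported value. The trade-off is that each update has to carry its own name and value instead of relying on the driver to supply them.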