Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2014/11/25 12:38:12 UTC

[jira] [Resolved] (SPARK-2002) Race condition in accessing cache locations in DAGScheduler

     [ https://issues.apache.org/jira/browse/SPARK-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-2002.
------------------------------
    Resolution: Duplicate

This appears to be identical to SPARK-4454, and that one has an open PR.

> Race condition in accessing cache locations in DAGScheduler
> -----------------------------------------------------------
>
>                 Key: SPARK-2002
>                 URL: https://issues.apache.org/jira/browse/SPARK-2002
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 0.9.1
>            Reporter: Tathagata Das
>
> DAGScheduler stores, updates, and occasionally clears the cache locations of RDD partitions in its own scheduling thread. The PartitionCoalescer of CoalescedRDD also accesses these cache locations (via getPreferredLocations) from a different thread. This leads to race conditions and sporadic "key not found" errors:
>     java.util.NoSuchElementException: key not found: 32855
>         at scala.collection.MapLike$class.default(MapLike.scala:228)
>         at scala.collection.AbstractMap.default(Map.scala:58)
>         at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
>         at org.apache.spark.scheduler.DAGScheduler.getCacheLocs(DAGScheduler.scala:211)
>         at org.apache.spark.scheduler.DAGScheduler.getPreferredLocs(DAGScheduler.scala:1072)
>         at org.apache.spark.SparkContext.getPreferredLocs(SparkContext.scala:716)
>         at org.apache.spark.rdd.PartitionCoalescer.currPrefLocs(CoalescedRDD.scala:172)
>         at org.apache.spark.rdd.PartitionCoalescer$LocationIterator$$anonfun$4$$anonfun$apply$2.apply(CoalescedRDD.scala:189)
>         at org.apache.spark.rdd.PartitionCoalescer$LocationIterator$$anonfun$4$$anonfun$apply$2.apply(CoalescedRDD.scala:188)
>         at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
>         at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:351)
>         at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:350)
>         at org.apache.spark.rdd.PartitionCoalescer$LocationIterator.<init>(CoalescedRDD.scala:183)
>         at org.apache.spark.rdd.PartitionCoalescer.setupGroups(CoalescedRDD.scala:234)
>         at org.apache.spark.rdd.PartitionCoalescer.run(CoalescedRDD.scala:333)
>         at org.apache.spark.rdd.CoalescedRDD.getPartitions(CoalescedRDD.scala:81)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:207)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:205)
>         at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:207)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:205)
>         at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:207)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:205)
>         at org.apache.spark.rdd.FlatMappedRDD.getPartitions(FlatMappedRDD.scala:30)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:207)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:205)
>         at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:31)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:207)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:205)
>         at org.apache.spark.rdd.RDD.take(RDD.scala:830)
>         at org.apache.spark.api.java.JavaRDDLike$class.take(JavaRDDLike.scala:337)
>         at org.apache.spark.api.java.JavaRDD.take(JavaRDD.scala:27)
>         at com.tellapart.manifolds.spark.ManifoldsUtil$PersistToKafkaFunction.call(ManifoldsUtil.java:87)
>         at com.tellapart.manifolds.spark.ManifoldsUtil$PersistToKafkaFunction.call(ManifoldsUtil.java:53)
>         at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$1.apply(JavaDStreamLike.scala:270)
>         at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$1.apply(JavaDStreamLike.scala:270)
>         at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1.apply(DStream.scala:520)
>         at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1.apply(DStream.scala:520)
>         at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:41)
>         at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
>         at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
>         at scala.util.Try$.apply(Try.scala:161)
>         at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32)
>         at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:155)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
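The failure mode in the report above can be sketched in isolation: an unsynchronized mutable map that one thread clears while another looks up a key with `apply` throws exactly this `NoSuchElementException`. The sketch below is a minimal illustration, not Spark's actual code; the names (`CacheLocsRace`, `unsafeGetCacheLocs`, `safeGetCacheLocs`) are invented for the example, and the synchronized variant only shows the general direction of a fix (guard the lookup and avoid bare `apply`), not the patch in the linked PR.

```scala
import scala.collection.mutable

object CacheLocsRace {
  // Stand-in for DAGScheduler's per-RDD cache-location map.
  private val cacheLocs = new mutable.HashMap[Int, Seq[String]]

  // Unsafe read: mutable.HashMap.apply throws NoSuchElementException
  // when the entry was removed between the caller's check and this
  // lookup -- the window the coalescer thread can hit while the
  // scheduling thread clears the map.
  def unsafeGetCacheLocs(rddId: Int): Seq[String] = cacheLocs(rddId)

  // Safer read: lock around the lookup and fall back to an empty
  // location list instead of calling bare apply, so a concurrent
  // clear yields a harmless default rather than an exception.
  def safeGetCacheLocs(rddId: Int): Seq[String] = cacheLocs.synchronized {
    cacheLocs.getOrElse(rddId, Seq.empty)
  }

  def clearCacheLocs(): Unit = cacheLocs.synchronized { cacheLocs.clear() }

  def main(args: Array[String]): Unit = {
    cacheLocs(32855) = Seq("host1")
    clearCacheLocs() // the scheduling thread wipes the map...

    // ...and the coalescer thread's unguarded lookup now blows up.
    val threw =
      try { unsafeGetCacheLocs(32855); false }
      catch { case _: NoSuchElementException => true }
    println(s"unsafe lookup threw: $threw")              // true
    println(s"safe lookup: ${safeGetCacheLocs(32855)}")  // List()
  }
}
```

Run sequentially as above, the sketch reproduces the exception deterministically; in the real scheduler the same interleaving only happens sporadically, which is why the error appears intermittently under streaming workloads.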



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
