Posted to user@spark.apache.org by Brad Miller <bm...@eecs.berkeley.edu> on 2014/09/16 16:59:34 UTC

java.util.NoSuchElementException: key not found

Hi All,

I suspect I am experiencing a bug. Larger jobs occasionally die with
the exception "java.util.NoSuchElementException: key not found: xyz",
where "xyz" is the ID of a particular task.  I've excerpted the log
from one job that died this way below and attached the full log for
reference.

I suspect that my bug is the same as SPARK-2002 (linked below).  Is
there any reason to suspect otherwise?  Is there any known workaround
other than not coalescing?
https://issues.apache.org/jira/browse/SPARK-2002
http://mail-archives.apache.org/mod_mbox/incubator-spark-user/201406.mbox/%3CCAMwrk0=D1Dww5fdbtpKefwokYoZLTosbBjqAmSQqjowLzNgqbQ@mail.gmail.com%3E

Note that I have been coalescing SchemaRDDs using the workaround
described in the thread linked below:
    srdd = SchemaRDD(srdd._jschema_rdd.coalesce(partitions, False, None), sqlCtx)
http://mail-archives.apache.org/mod_mbox/spark-user/201409.mbox/%3CCANR-kKciei17m43-yz5Z-pJ00ZwpW3Ka_U7ZhvE2Y7eJW1vBwg@mail.gmail.com%3E

...
14/09/15 21:43:14 INFO scheduler.TaskSetManager: Starting task 78.0 in stage 551.0 (TID 78738, bennett.research.intel-research.net, PROCESS_LOCAL, 1056 bytes)
...
14/09/15 21:43:15 INFO storage.BlockManagerInfo: Added taskresult_78738 in memory on bennett.research.intel-research.net:38074 (size: 13.0 MB, free: 1560.8 MB)
...
14/09/15 21:43:15 ERROR scheduler.TaskResultGetter: Exception while getting task result
java.util.NoSuchElementException: key not found: 78738
        at scala.collection.MapLike$class.default(MapLike.scala:228)
        at scala.collection.AbstractMap.default(Map.scala:58)
        at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
        at org.apache.spark.scheduler.TaskSetManager.handleTaskGettingResult(TaskSetManager.scala:500)
        at org.apache.spark.scheduler.TaskSchedulerImpl.handleTaskGettingResult(TaskSchedulerImpl.scala:348)
        at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:52)
        at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:47)
        at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:47)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1311)
        at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:46)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:701)


I am running the pre-compiled 1.1.0 binaries.
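
For anyone triaging a similar log, here is a minimal sketch of how the three excerpted lines can be correlated automatically. It assumes the log format shown above (other Spark versions may word these lines differently), and the names trace_missing_tids, START_RE, RESULT_RE, MISSING_RE and SAMPLE_LOG are hypothetical helpers for illustration, not part of Spark.

```python
import re

# Patterns derived only from the log excerpt above; treat them as assumptions.
START_RE = re.compile(r"Starting task (\S+) in stage (\S+) \(TID (\d+),")
RESULT_RE = re.compile(r"Added taskresult_(\d+) in memory")
MISSING_RE = re.compile(r"NoSuchElementException: key not found: (\d+)")

# Shortened copy of the excerpt, hard-wrapped the way it appears in the mail.
SAMPLE_LOG = """
14/09/15 21:43:14 INFO scheduler.TaskSetManager: Starting task 78.0 in
stage 551.0 (TID 78738, bennett.research.intel-research.net,
PROCESS_LOCAL, 1056 bytes)
...
14/09/15 21:43:15 INFO storage.BlockManagerInfo: Added
taskresult_78738 in memory on
bennett.research.intel-research.net:38074 (size: 13.0 MB, free: 1560.8
MB)
...
14/09/15 21:43:15 ERROR scheduler.TaskResultGetter: Exception while
getting task result
java.util.NoSuchElementException: key not found: 78738
"""


def trace_missing_tids(log_text):
    """Map each TID named in a 'key not found' error to what the log says about it."""
    # Collapse all whitespace so hard-wrapped log lines still match.
    flat = " ".join(log_text.split())

    # Start with every TID that appears in the exception message.
    report = {}
    for tid in MISSING_RE.findall(flat):
        report[tid] = {"started": None, "result_stored": False}

    # Record the (task, stage) pair under which that TID was launched.
    for task, stage, tid in START_RE.findall(flat):
        if tid in report:
            report[tid]["started"] = (task, stage)

    # Record whether the block manager reported storing the task result.
    for tid in RESULT_RE.findall(flat):
        if tid in report:
            report[tid]["result_stored"] = True

    return report


if __name__ == "__main__":
    for tid, info in sorted(trace_missing_tids(SAMPLE_LOG).items()):
        print(tid, info)
```

Run against the full driver log, this lists each failing TID together with its task and stage, and whether a taskresult block was stored for it before the exception was raised.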

best,
-Brad


Re: java.util.NoSuchElementException: key not found

Posted by Brad Miller <bm...@eecs.berkeley.edu>.
See attached for full log.
