Posted to user@spark.apache.org by Brad Miller <bm...@eecs.berkeley.edu> on 2014/09/16 16:59:34 UTC
java.util.NoSuchElementException: key not found
Hi All,
I suspect I am experiencing a bug. I've noticed that while running
larger jobs, they occasionally die with the exception
"java.util.NoSuchElementException: key not found: xyz", where "xyz"
denotes the ID of some particular task. I've excerpted the log from
one job that died in this way below and attached the full log for
reference.
I suspect that my bug is the same as SPARK-2002 (linked below). Is
there any reason to suspect otherwise? Is there any known workaround
other than not coalescing?
https://issues.apache.org/jira/browse/SPARK-2002
http://mail-archives.apache.org/mod_mbox/incubator-spark-user/201406.mbox/%3CCAMwrk0=D1Dww5fdbtpKefwokYoZLTosbBjqAmSQqjowLzNgqbQ@mail.gmail.com%3E
Note that I have been coalescing SchemaRDDs using "srdd =
SchemaRDD(srdd._jschema_rdd.coalesce(partitions, False, None),
sqlCtx)", the workaround described in this thread.
http://mail-archives.apache.org/mod_mbox/spark-user/201409.mbox/%3CCANR-kKciei17m43-yz5Z-pJ00ZwpW3Ka_U7ZhvE2Y7eJW1vBwg@mail.gmail.com%3E
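For reference, the coalesce workaround quoted above can be written as a small helper. This is only a sketch of the pattern from the thread, for PySpark 1.1; it assumes an existing SQLContext (`sqlCtx`) and SchemaRDD (`srdd`) and requires a running Spark cluster, so it is not runnable standalone.

```python
from pyspark.sql import SchemaRDD

def coalesce_schema_rdd(srdd, partitions, sqlCtx):
    # Drop down to the underlying Java SchemaRDD, coalesce there, and
    # re-wrap the result so the Python-side schema is preserved.
    # The Java-side coalesce arguments are: numPartitions, shuffle, ordering.
    return SchemaRDD(srdd._jschema_rdd.coalesce(partitions, False, None), sqlCtx)

# Hypothetical usage, given an existing SQLContext and SchemaRDD:
#   srdd = coalesce_schema_rdd(srdd, partitions, sqlCtx)
```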
...
14/09/15 21:43:14 INFO scheduler.TaskSetManager: Starting task 78.0 in
stage 551.0 (TID 78738, bennett.research.intel-research.net,
PROCESS_LOCAL, 1056 bytes)
...
14/09/15 21:43:15 INFO storage.BlockManagerInfo: Added
taskresult_78738 in memory on
bennett.research.intel-research.net:38074 (size: 13.0 MB, free: 1560.8
MB)
...
14/09/15 21:43:15 ERROR scheduler.TaskResultGetter: Exception while
getting task result
java.util.NoSuchElementException: key not found: 78738
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
at org.apache.spark.scheduler.TaskSetManager.handleTaskGettingResult(TaskSetManager.scala:500)
at org.apache.spark.scheduler.TaskSchedulerImpl.handleTaskGettingResult(TaskSchedulerImpl.scala:348)
at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:52)
at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:47)
at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:47)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1311)
at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:46)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:701)
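For anyone reading the trace: the "key not found" message originates in Scala's mutable.HashMap. Calling apply() on a missing key falls through to MapLike.default, which throws NoSuchElementException, so the scheduler's TaskSetManager looked up a task ID (78738) that was no longer in its internal map. A rough Python analogy of that lookup pattern (illustrative only, not Spark code; the map contents and function name are hypothetical):

```python
# Scala's HashMap.apply on a missing key throws NoSuchElementException,
# much like indexing a Python dict raises KeyError.
task_info = {78737: "RUNNING"}  # hypothetical task-ID -> state map

def handle_task_getting_result(tid):
    # Mirrors the unguarded lookup in TaskSetManager.handleTaskGettingResult:
    # a direct apply/[] with no check for a missing key.
    return task_info[tid]

try:
    handle_task_getting_result(78738)
except KeyError as e:
    print("key not found: %s" % e)  # prints "key not found: 78738"
```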
I am running the pre-compiled 1.1.0 binaries.
best,
-Brad
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org
Re: java.util.NoSuchElementException: key not found
Posted by Brad Miller <bm...@eecs.berkeley.edu>.
See attached for full log.