Posted to mapreduce-user@hadoop.apache.org by Markus Jelsma <ma...@openindex.io> on 2011/12/23 15:19:06 UTC
Nutch unstable on 0.22.0
Hi,
In the past few weeks we evaluated and partially migrated from Hadoop
0.20.203.0 to 0.22.0. Most stuff works fine locally and simple jobs do well on
the cluster. However, the most essential part of Nutch, the fetcher, seems to
be very unstable on 0.22.0. In every crawl I can now be almost certain that at
least some mappers mysteriously freeze and eventually time out. Other mappers
are killed straight away or after a few minutes because of OOM errors. Memory
consumption is also a lot higher on 0.22.0.
Right now we have three clusters: an old 0.20.203 cluster, and the unstable
0.22.0 and a 0.20.205 both running on the same new cluster. When we run
identical jobs on all three clusters, 0.22.0 almost always fails, eating RAM and
occasionally freezing a mapper. Stack traces of those mappers show all threads
are blocked and sometimes we see jstack unable to print deadlocks (null).
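
When jstack cannot report deadlocks from outside, the same information can
sometimes be collected from inside the JVM through the standard
java.lang.management API. The following is only a minimal sketch, not part of
Nutch or Hadoop: the class name BlockedThreadReport and its methods are
hypothetical, and it assumes the report can be triggered from within the frozen
task JVM (for example from a watchdog thread), which is untested here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Nutch or Hadoop.
public class BlockedThreadReport {

    /** Counts threads in the BLOCKED state (waiting to enter a monitor). */
    public static int countBlocked(ThreadInfo[] infos) {
        int blocked = 0;
        for (ThreadInfo info : infos) {
            // Entries can be null if a thread died while the dump was taken.
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                blocked++;
            }
        }
        return blocked;
    }

    /** Builds a plain-text report of deadlocked and blocked threads in this JVM. */
    public static String report() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();

        // Both calls return null when no deadlock is detected.
        long[] deadlocked = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()
                : mx.findMonitorDeadlockedThreads();
        sb.append("Deadlocked threads: ")
          .append(deadlocked == null ? 0 : deadlocked.length)
          .append('\n');

        // Only request lock details the JVM says it can provide.
        ThreadInfo[] all = mx.dumpAllThreads(
                mx.isObjectMonitorUsageSupported(),
                mx.isSynchronizerUsageSupported());
        sb.append("Blocked threads: ").append(countBlocked(all)).append('\n');

        for (ThreadInfo info : all) {
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                // Show which lock each blocked thread waits on and who holds it.
                sb.append(info.getThreadName())
                  .append(" waits on ").append(info.getLockName())
                  .append(" held by ").append(info.getLockOwnerName())
                  .append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

The "held by" column shows which thread owns the lock that each blocked thread
is waiting for, which may help narrow down the component involved.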
I tried many settings for 0.22.0 and very conservative settings for Nutch, such
as few threads to spare resources (which are actually abundant), but I cannot
seem to find the issue. The fetcher job still uses the old mapred API.
I'd like to present a better issue report but I don't know what component in
all this mess is actually responsible. It looks like the tasktracker but I'm
unsure.
If anyone can point us in the right direction so we can find the issue and
assist in fixing it that would be great.
Thanks