Posted to user@nutch.apache.org by Karsten Dello <de...@mi.fu-berlin.de> on 2006/12/06 02:24:57 UTC
Problem with fetching
Hello,
I have a strange problem generating fetchlists;
maybe someone can point me in the right direction?
I do a couple of inject/generate/fetch/update cycles to crawl a
defined subgraph.
In the last cycle approx. 600000 docs should have been fetched, but only
150000 were actually fetched.
The last thing I see in the log file is:
2006-12-04 14:24:14,221 INFO fetcher.Fetcher - fetch of http://www.microbes.info/forums/index.php?s=7179874ada709ad4d9874517f2790ef0& failed with: java.lang.NullPointerEx$
2006-12-04 14:24:14,238 FATAL fetcher.Fetcher - java.lang.NullPointerException
2006-12-04 14:24:14,241 FATAL fetcher.Fetcher - at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:198)
2006-12-04 14:24:14,241 FATAL fetcher.Fetcher - at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:189)
2006-12-04 14:24:14,241 FATAL fetcher.Fetcher - at org.apache.hadoop.mapred.MapTask$2.collect(MapTask.java:91)
2006-12-04 14:24:14,241 FATAL fetcher.Fetcher - at org.apache.nutch.fetcher.Fetcher$FetcherThread.output(Fetcher.java:314)
2006-12-04 14:24:14,241 FATAL fetcher.Fetcher - at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:232)
2006-12-04 14:24:14,241 FATAL fetcher.Fetcher - fetcher caught:java.lang.NullPointerException
Then nothing happens; approx. 10 minutes later MapReduce comes up with:
2006-12-04 14:33:06,169 INFO mapred.LocalJobRunner - reduce > sort
2006-12-04 14:33:07,586 INFO mapred.JobClient - map 100% reduce 33%
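
To see how many fetches failed before the FATAL entry, it may help to tally the log by level. The following is only a rough sketch, not part of Nutch: the function name summarize and the regular expression are hypothetical, and it assumes the on-disk log keeps each entry on a single line in the "date time,ms LEVEL logger - message" layout shown in the excerpts above.

```python
import re
import sys
from collections import Counter

# Assumed layout, taken from the excerpts: "date time,ms LEVEL logger - message".
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+)\s+(?P<logger>\S+) - (?P<msg>.*)$"
)


def summarize(lines):
    """Tally entries per level, count failed fetches, and note the first FATAL."""
    levels = Counter()
    failed_fetches = 0
    first_fatal = None
    for line in lines:
        m = LOG_LINE.match(line.rstrip("\n"))
        if m is None:
            continue  # not a log entry in the assumed layout; skip it
        levels[m["level"]] += 1
        msg = m["msg"]
        # Failure messages in the excerpt read "fetch of <url> failed with: ...".
        if msg.startswith("fetch of ") and " failed with" in msg:
            failed_fetches += 1
        if m["level"] == "FATAL" and first_fatal is None:
            first_fatal = m["ts"]
    return {
        "levels": levels,
        "failed_fetches": failed_fetches,
        "first_fatal": first_fatal,
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # The log path is supplied by the caller; no default location is assumed.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
        print(summarize(f))
```

Run with the path to the fetcher log as the only argument. Comparing the failed-fetch count and the first FATAL timestamp against the expected 600000 documents should show whether the fetcher stopped at the first NullPointerException or kept going for a while.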