Posted to user@nutch.apache.org by pranesh <pr...@hcl.in> on 2009/04/03 06:45:53 UTC

Re: Dedup: Job Failed and crawl stopped at depth 1

While crawling, I got the error below. How do I solve it?

Injector: starting
Injector: crawlDb: crawl/crawldb
Injector: urlDir: urls
Injector: Converting injected urls to crawl db entries.
Injector: Merging injected urls into crawl db.
Injector: done
Generator: Selecting best-scoring urls due for fetch.
Generator: starting
Generator: segment: crawl/segments/20090403101008
Generator: filtering: false
Generator: topN: 10
Generator: jobtracker is 'local', generating exactly one partition.
Generator: Partitioning selected urls by host, for politeness.
Generator: done.
Fetcher: starting
Fetcher: segment: crawl/segments/20090403101008
Fetcher: threads: 10
fetching http://www.hcltech.com/
Fetcher: done
CrawlDb update: starting
CrawlDb update: db: crawl/crawldb
CrawlDb update: segments: [crawl/segments/20090403101008]
CrawlDb update: additions allowed: true
CrawlDb update: URL normalizing: true
CrawlDb update: URL filtering: true
CrawlDb update: Merging segment data into db.
CrawlDb update: done
LinkDb: starting
LinkDb: linkdb: crawl/linkdb
LinkDb: URL normalize: true
LinkDb: URL filter: true
LinkDb: adding segment: crawl/segments/20090403101008
LinkDb: done
Indexer: starting
Indexer: linkdb: crawl/linkdb
Indexer: adding segment: crawl/segments/20090403101008
Exception in thread "main" java.io.IOException: Job failed!
	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
	at org.apache.nutch.indexer.Indexer.index(Indexer.java:273)
	at test.Crawl.doCrawl(Crawl.java:143)
	at test.Crawl.main(Crawl.java:50)
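
For what it's worth, the IOException from JobClient.runJob only says "Job failed!"; the exception thrown inside the indexing task itself ends up in logs/hadoop.log when Nutch runs in local mode with the default conf/log4j.properties. Here is a minimal sketch (the class name LastErrors and the log path are just illustrative assumptions) that pulls the relevant lines out of that file:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Minimal sketch, assuming the default Nutch local-mode setup where
// conf/log4j.properties writes job output to logs/hadoop.log. It prints
// the ERROR/Exception lines so the real cause behind "Job failed!"
// becomes visible.
public class LastErrors {
  public static void main(String[] args) throws IOException {
    String log = args.length > 0 ? args[0] : "logs/hadoop.log";
    BufferedReader in = new BufferedReader(new FileReader(log));
    try {
      String line;
      while ((line = in.readLine()) != null) {
        if (line.contains("ERROR") || line.contains("Exception")) {
          System.out.println(line);
        }
      }
    } finally {
      in.close();
    }
  }
}

Running that (or simply looking at the tail of logs/hadoop.log) should show the stack trace from the Indexer task itself, which is what is needed to diagnose why the job failed.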

