Posted to user@nutch.apache.org by Brian Tingle <Br...@ucop.edu> on 2009/07/24 05:16:19 UTC
adding [-numFetchers numFetchers] to crawl
How do I set the number of map tasks when I run a command like this?
hadoop jar nutch-1.0.job org.apache.nutch.crawl.Crawl
I think I'm going to try the change below. Is there any reason not
to do it? Or is Crawl meant to be more of a demo, in which case I
should write a script or my own crawler class?
> diff Crawl.java.orig Crawl.java
53c53
< ("Usage: Crawl <urlDir> [-dir d] [-threads n] [-depth i] [-topN N]");
---
> ("Usage: Crawl <urlDir> [-dir d] [-threads n] [-depth i] [-topN N] [-numFetchers numFetchers]");
65a66
> int numFetchers = -1;
78a80,82
> } else if ("-numFetchers".equals(args[i])) {
> numFetchers = Integer.parseInt(args[i+1]);
> i++;
116c120
< Path segment = generator.generate(crawlDb, segments, -1, topN, System
---
> Path segment = generator.generate(crawlDb, segments, numFetchers, topN, System
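
For anyone who wants to try the option parsing on its own, here is a minimal standalone sketch of the same idea as the diff above. The class name NumFetchersOption and its parse method are hypothetical and not part of Nutch. The sketch keeps -1 as the value when the flag is absent, as in the diff, and adds a check for a missing value that the diff does not have.

```java
// Hypothetical helper, not part of Nutch: isolates the -numFetchers parsing from the diff.
final class NumFetchersOption {
    /** Value used when -numFetchers is absent; the diff passes this through to generate(). */
    static final int DEFAULT = -1;

    /** Scans the argument list and returns the -numFetchers value, or DEFAULT if not given. */
    static int parse(String[] args) {
        int numFetchers = DEFAULT;
        for (int i = 0; i < args.length; i++) {
            if ("-numFetchers".equals(args[i])) {
                // Assumption: fail with a clear message rather than an index error
                // when the flag is the last argument.
                if (i + 1 >= args.length) {
                    throw new IllegalArgumentException("-numFetchers requires a value");
                }
                numFetchers = Integer.parseInt(args[i + 1]);
                i++; // skip the value that was just consumed
            }
        }
        return numFetchers;
    }
}
```

Calling NumFetchersOption.parse on the arguments "urls", "-numFetchers", "4" returns 4, and on an argument list without the flag it returns -1.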