Posted to user@nutch.apache.org by karthik085 <ka...@gmail.com> on 2007/11/07 21:17:46 UTC

Re: How to limit nutch to fetch, refetch and index just the injected URLs?

I had the same problem and just used depth 1 to fetch and index the injected
urls. As for refetching, you might have to use "updatedb" with
"-noAdditions", so that links discovered during the fetch are not added to
the crawl db. Another solution (not that efficient, but it works): restart
the crawl process with the same injected urls and discard the old
dir/segment.
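
To make the refetch step concrete, here is a rough sketch of one
generate/fetch/updatedb round. It assumes the usual bin/nutch launcher and a
crawl directory named "crawl" that was created earlier (for example with
"bin/nutch crawl urls -dir crawl -depth 1"). The directory names, the NUTCH
and CRAWL_DIR variables and the refetch_round function are illustrative and
not part of Nutch itself, so adjust them to your setup.

```shell
#!/bin/sh
# Sketch only: refetch URLs already in the crawldb without adding new links.
# NUTCH and CRAWL_DIR are hypothetical names with assumed default values.
NUTCH="${NUTCH:-bin/nutch}"       # path to the nutch launcher script
CRAWL_DIR="${CRAWL_DIR:-crawl}"   # assumed crawl directory layout

# One generate/fetch/updatedb round limited to known URLs.
refetch_round() {
    "$NUTCH" generate "$CRAWL_DIR/crawldb" "$CRAWL_DIR/segments" || return 1
    # Segment names are timestamps, so the last one listed is the newest.
    segment=$(ls -d "$CRAWL_DIR"/segments/* | tail -n 1)
    "$NUTCH" fetch "$segment" || return 1
    # -noAdditions updates existing entries but adds no newly found URLs.
    "$NUTCH" updatedb "$CRAWL_DIR/crawldb" "$segment" -noAdditions
}
```

Calling refetch_round after sourcing the script runs one round. Indexing the
new segment is a separate step and is left out here.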


Nicolás Lichtmaier wrote:
> 
> I'd like to limit nutch to fetch, refetch and index just the injected 
> URLs. Will setting db.max.outlinks.per.page to 0 enable me to do that? 
> If not... how could I achieve what I'm looking for?
> 
> Thanks!
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/How-to-limit-nutch-to-fetch%2C-refetch-and-index-just-the-injected-URLs--tf3125440.html#a13635185
Sent from the Nutch - User mailing list archive at Nabble.com.