Posted to user@nutch.apache.org by muraliweb <mu...@live.com> on 2009/09/03 10:29:07 UTC

Re: Nutch crawl does not capture pages of lower depth

I found the problem.
The indexer.max.tokens property in nutch-default.xml was causing the
top-level pages to be skipped.
After changing its value to something like 30000, the crawler picked up all
the pages down to the configured depth.
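
For anyone hitting the same thing, here is a rough sketch of the change. It
assumes the override goes in conf/nutch-site.xml, which takes precedence over
nutch-default.xml, instead of editing nutch-default.xml directly. The value
30000 is only what worked for this site, and the description text is my own
wording, not copied from the Nutch distribution.

```xml
<?xml version="1.0"?>
<!-- conf/nutch-site.xml: site-specific overrides (assumed location) -->
<configuration>
  <property>
    <name>indexer.max.tokens</name>
    <!-- 30000 is the value that worked in this case; tune it for your pages -->
    <value>30000</value>
    <description>Upper limit on the number of tokens indexed per field of a
    document (paraphrased, not the stock description).</description>
  </property>
</configuration>
```

The change probably only takes effect for indexing runs started after the
file is saved, so re-run the crawl or the index step before checking the
results.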



muraliweb wrote:
> 
> Nutch crawl does not pick up pages at depth 1 and 2 when it's configured
> for depth 3.
> When the crawl is configured at depth 2 it does not pick up the homepage.
> Can anyone please help
> thanks in advance
> murali
> 
