Posted to user@nutch.apache.org by Ngoc Giang Nguyen <gi...@gmail.com> on 2005/10/12 05:28:11 UTC
Unlimited access to a web server for Nutch
Hi all,
I'm running Nutch to crawl some specific websites whose web admins I know
personally. Is there any way to change the settings of the target web
servers so that they give my Nutch crawler higher priority, let's say unlimited
access, assuming they are all Apache servers? I usually observe
that Nutch reports a lot of "HTTP max delay" messages even when I set the timeout
quite large and the network connections are good (I also double-checked by
visiting those websites in a browser, and they respond well).
All suggestions are welcome. Thanks a lot.
Best regards,
Giang
Re: Unlimited access to a web server for Nutch
Posted by Doug Cutting <cu...@nutch.org>.
Ngoc Giang Nguyen wrote:
> I'm running Nutch to crawl some specific websites whose web admins I know
> personally. Is there any way to change the settings of the target web
> servers so that they give my Nutch crawler higher priority, let's say unlimited
> access, assuming they are all Apache servers? I usually observe
> that Nutch reports a lot of "HTTP max delay" messages even when I set the timeout
> quite large and the network connections are good (I also double-checked by
> visiting those websites in a browser, and they respond well).
Try something like fetcher.server.delay=0 and fetcher.threads.per.host=10.
Doug
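
For reference, the two settings suggested above are changed on the Nutch side, not on the Apache servers. A minimal sketch of how they might be written as overrides in conf/nutch-site.xml follows. It assumes the usual Nutch property-override format; the name of the enclosing root element depends on the Nutch version, so only the property entries are shown, and the description text is illustrative rather than copied from nutch-default.xml.

```xml
<!-- Place these inside the root element of conf/nutch-site.xml.
     Values are the ones suggested in the reply above; adjust as needed. -->
<property>
  <name>fetcher.server.delay</name>
  <value>0</value>
  <description>Assumed meaning: seconds the fetcher waits between
  successive requests to the same server.</description>
</property>

<property>
  <name>fetcher.threads.per.host</name>
  <value>10</value>
  <description>Assumed meaning: maximum number of fetcher threads
  allowed to access one host at the same time.</description>
</property>
```

Settings this aggressive are probably only appropriate for servers whose admins have agreed to the extra load, as in the situation described in the original question.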