Posted to user@nutch.apache.org by Ngoc Giang Nguyen <gi...@gmail.com> on 2005/10/12 05:28:11 UTC

Unlimited access to a web server for Nutch

Hi all,

I'm running Nutch to crawl some specific websites whose web admins I know
personally. Is there any way to change the settings of the target web
servers so that they give my Nutch crawler higher priority, let's say unlimited
access, assuming they are all Apache servers? I ask because I usually see
a lot of "HTTP max delay" messages from Nutch, even when I set the timeout quite
large and the network connections are good (I also double-checked by
visiting those websites in a browser, and they respond well).

All suggestions are welcome. Thanks a lot.

Best regards,
Giang

Re: Unlimited access to a web server for Nutch

Posted by Doug Cutting <cu...@nutch.org>.
Ngoc Giang Nguyen wrote:
> I'm running Nutch to crawl some specific websites whose web admins I know
> personally. Is there any way to change the settings of the target web
> servers so that they give my Nutch crawler higher priority, let's say unlimited
> access, assuming they are all Apache servers? I ask because I usually see
> a lot of "HTTP max delay" messages from Nutch, even when I set the timeout quite
> large and the network connections are good (I also double-checked by
> visiting those websites in a browser, and they respond well).

Try something like fetcher.server.delay=0 and fetcher.threads.per.host=10.
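
As a rough sketch of how those two settings might be written down, assuming
they go into a local override file such as conf/nutch-site.xml (the file
location and the root element name are assumptions here; some releases use
<nutch-conf> as the root instead of <configuration>, so check the
nutch-default.xml that ships with your version):

```xml
<!-- Hypothetical conf/nutch-site.xml override; root element may differ by release -->
<configuration>
  <property>
    <!-- Delay between successive requests to the same server; 0 disables the wait -->
    <name>fetcher.server.delay</name>
    <value>0</value>
  </property>
  <property>
    <!-- Number of fetcher threads allowed to hit one host at the same time -->
    <name>fetcher.threads.per.host</name>
    <value>10</value>
  </property>
</configuration>
```

These values switch off the usual politeness towards the target server, so
they only make sense for hosts whose admins have agreed to it, as in your case.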

Doug