Posted to user@nutch.apache.org by Sandeep Tata <sa...@gmail.com> on 2008/01/25 21:01:02 UTC

generate.max.per.host on multiple nodes

Hello

I'm running nutch-0.9 on a 3-node Hadoop cluster. The generator seems to be
ignoring the "generate.max.per.host" parameter that I set in nutch-site.xml
(I set it to 2000). When I run the generator in local mode, on a single node,
it does limit the number of URLs per host to 2000. However, in distributed
mode, when I check the generated fetchlist, there are hosts with far more
than 2000 URLs.
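
For reference, an override of this kind would look roughly like the sketch
below. It assumes the standard Hadoop-style property format used by Nutch
configuration files; only the property name and the value 2000 come from the
description above, and the exact file contents are not shown in this post.

```xml
<?xml version="1.0"?>
<!-- nutch-site.xml: site-specific overrides (illustrative sketch only) -->
<configuration>
  <property>
    <!-- Property name and value as described in the post -->
    <name>generate.max.per.host</name>
    <value>2000</value>
  </property>
</configuration>
```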

Does anyone have pointers on what might be causing this? I looked in the log
files, and there seems to be nothing relevant.

Any help is greatly appreciated.

Thanks,
Sandeep