Posted to user@nutch.apache.org by Joseph Naegele <jn...@grierforensics.com> on 2016/05/16 18:40:17 UTC
pros/cons of many nodes
Hi folks,
Would anyone be willing to share a few pros/cons of using many nodes vs. one
very powerful machine for large-scale crawling? Of course, many advantages
and disadvantages overlap with Hadoop and distributed computing in general,
but what I'm actually looking for are good reasons not to use a single
machine for Nutch.
One example could be that more machines give you more IP addresses for
fetching, and therefore less chance of being blocked by web admins,
correct?
Joe