Posted to user@nutch.apache.org by makaveli91ro <ma...@yahoo.com> on 2012/08/29 17:22:02 UTC

Distributed Fetching

Hello all,

I am looking forward to creating a distributed crawler.
I know MapReduce is very good for indexing, but how about fetching? How can
I best distribute the downloads over a number of machines?

Does Nutch (+ Hadoop) have any facilities for distributed fetching?
Please provide some ideas, or some documentation.

Thank you,
Sergiu.



--
View this message in context: http://lucene.472066.n3.nabble.com/Distributed-Fetching-tp4004066.html
Sent from the Nutch - User mailing list archive at Nabble.com.

Re: Distributed Fetching

Posted by Lewis John Mcgibbney <le...@gmail.com>.

Please see the tutorial and search the user lists (you can find
plenty of info on this via our website).

http://www.mail-archive.com/user%40nutch.apache.org/
http://wiki.apache.org/nutch/#Other_Tutorial.28s.29

-- 
Lewis