Posted to user@nutch.apache.org by Manish Verma <m_...@apple.com> on 2016/01/13 23:51:20 UTC

Nutch 1.10 Multiple Threads

Hi,

I have defined 5 fetch threads and set min.crawl.delay to 1 second. Let’s say there is only one fetch queue, and threads per queue is also set to 5.

My confusion is this:

First, all 5 threads will pull items from the queue and hit the server.

Then, after 1 second (the min crawl delay), the threads that are done with their previous fetch will pick items from the queue and hit the server.
Any thread that does not finish in time for this round will have to wait for the next round.

Now the question is: let’s say 3 threads are free and hit the server, and 400 ms later the remaining 2 become free. Do those 2 have to wait a full 1 second, or just the leftover 600 ms?

I see in the code that nextFetchTime is set whenever a fetch is done. Is nextFetchTime separate for each thread, or is it per queue?
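
To make the two possibilities concrete, here is a minimal sketch of the arithmetic I have in mind. The class and method names are hypothetical and are not taken from the Nutch source. It only models the timing for the two interpretations and does not claim which one Nutch actually implements.

```java
public class CrawlDelaySketch {

    // Interpretation A (assumption): the delay is counted from the moment
    // the thread's own fetch finished, so it always waits the full delay.
    static long waitFromOwnFinish(long delayMs) {
        return delayMs;
    }

    // Interpretation B (assumption): the delay is counted from the start of
    // the current round, so a late thread only waits the leftover time.
    static long waitFromRoundStart(long roundStartMs, long finishMs, long delayMs) {
        return Math.max(0, roundStartMs + delayMs - finishMs);
    }

    public static void main(String[] args) {
        long delayMs = 1000;    // the 1 second crawl delay
        long roundStartMs = 0;  // 3 free threads hit the server at t = 0
        long finishMs = 400;    // the other 2 threads become free at t = 400 ms

        System.out.println(waitFromOwnFinish(delayMs));
        System.out.println(waitFromRoundStart(roundStartMs, finishMs, delayMs));
    }
}
```

With the numbers from my example, the first interpretation gives a 1000 ms wait and the second gives a 600 ms wait.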



Thanks
Manish Verma
AML Search