Posted to user@nutch.apache.org by Mohammad Monirul Hoque <im...@yahoo.com> on 2008/09/09 13:18:50 UTC

Is it possible to add new urls while nutch crawler is still running?

Hi,

Is there any way to add new URLs for crawling while the Nutch crawler is still running?

Thanks in advance.

--monirul

Re: Is it possible to add new urls while nutch crawler is still running?

Posted by Dennis Kubes <ku...@apache.org>.
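No, not for a fetch that is already running. Urls are pulled from the
crawldb to create segments to crawl, and a running fetcher works from
the segment it was given. Urls can, however, be added to the crawldb
while the fetcher is running, and you can generate new segments to
crawl while the fetcher is running.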
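
To make the distinction concrete, here is a minimal sketch in Python. It
is a toy model only: the function names mirror the roles of the inject,
generate and updatedb tools, but nothing here is Nutch code or a Nutch
API, and the dict-based crawldb and the example URLs are assumptions
made for illustration.

```python
# Toy model only: these functions mimic the roles of Nutch's inject,
# generate and updatedb tools. They are NOT Nutch APIs.

def inject(crawldb, urls):
    """Add new URLs as unfetched; URLs already known keep their status."""
    for url in urls:
        crawldb.setdefault(url, "unfetched")

def generate(crawldb):
    """Snapshot the currently unfetched URLs into a new segment (fetch list)."""
    return sorted(url for url, status in crawldb.items() if status == "unfetched")

def updatedb(crawldb, segment):
    """Fold the results of a fetched segment back into the crawldb."""
    for url in segment:
        crawldb[url] = "fetched"

crawldb = {}
inject(crawldb, ["http://a.example/", "http://b.example/"])
segment1 = generate(crawldb)            # the fetcher starts on this fixed list

inject(crawldb, ["http://c.example/"])  # added while segment1 is being fetched
# segment1 was built before the inject, so the running fetch never sees the new URL.

updatedb(crawldb, segment1)             # the fetch of segment1 has finished
segment2 = generate(crawldb)            # the new URL is picked up here

print(segment1)
print(segment2)
```

In this model the new URL only shows up in the second segment, which is
the sense in which the answer above is "no" for a running fetch but
"yes" for the crawldb.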

The generate tool has options that control whether the urls in a
generated segment are held back from being generated again from the
crawldb until they have been updated in the crawldb via the updatedb
tool.
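
The effect of such an option can be sketched with another toy model in
Python. The mark_generated parameter and the "generated" status are
hypothetical names chosen for this illustration, not the actual option
or status names used by the generate tool.

```python
# Toy model only. `mark_generated` and the "generated" status are
# hypothetical stand-ins for the generate tool's real options.

def generate(crawldb, mark_generated):
    """Select unfetched URLs; optionally mark them so they are not reselected."""
    segment = sorted(u for u, s in crawldb.items() if s == "unfetched")
    if mark_generated:
        for u in segment:
            crawldb[u] = "generated"   # held back until updatedb runs
    return segment

def updatedb(crawldb, segment):
    """Record the outcome of a fetched segment, releasing any hold."""
    for u in segment:
        crawldb[u] = "fetched"

# With marking: a second generate before updatedb yields only the new URL.
marked_db = {"http://a.example/": "unfetched"}
first = generate(marked_db, mark_generated=True)
marked_db["http://b.example/"] = "unfetched"    # injected during the fetch
second = generate(marked_db, mark_generated=True)

# Without marking: the second segment repeats the URL still being fetched.
plain_db = {"http://a.example/": "unfetched"}
plain_first = generate(plain_db, mark_generated=False)
plain_db["http://b.example/"] = "unfetched"
plain_second = generate(plain_db, mark_generated=False)

print(second)
print(plain_second)
```

Without the marking, overlapping generate/fetch cycles would fetch the
same urls twice, which is why it matters if you generate new segments
while a fetch is still in progress.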

Dennis


Mohammad Monirul Hoque wrote:
> Hi,
> 
> Is there any way to add new URLs for crawling while the Nutch crawler is still running?
> 
> Thanks in advance.
> 
> --monirul