Posted to solr-user@lucene.apache.org by Dominique Bejean <do...@eolya.fr> on 2013/05/22 15:11:26 UTC

Re: Crawl Anywhere -

Hi,

I had not seen this question until now.

Yes, I confirm that Crawl-Anywhere can crawl in a distributed environment.
If you have several huge web sites to crawl, you can dispatch the crawling 
across several crawler engines. However, a single web site can only be 
crawled by one crawler engine at a time.
This limitation should be removed in a future version.
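
To illustrate the constraint, here is a minimal Python sketch. It is not
Crawl-Anywhere code: the function dispatch_by_site, the engine names and
the example URLs are all hypothetical. It only shows one way to spread
sites across engines while keeping every page of a given site on a
single engine.

```python
from collections import defaultdict
from urllib.parse import urlparse


def dispatch_by_site(urls, engines):
    """Group URLs by engine so that each site (host) maps to exactly one engine.

    Hypothetical helper for illustration only; not part of Crawl-Anywhere.
    """
    if not engines:
        raise ValueError("at least one engine is required")

    site_to_engine = {}        # host -> engine chosen for that host
    plan = defaultdict(list)   # engine -> URLs it should crawl

    for url in urls:
        site = urlparse(url).netloc.lower()
        if site not in site_to_engine:
            # Round-robin: each newly seen site goes to the next engine in turn.
            site_to_engine[site] = engines[len(site_to_engine) % len(engines)]
        plan[site_to_engine[site]].append(url)

    return dict(plan)


# Example hosts are placeholders, not real crawl targets.
urls = [
    "http://site-a.example/index.html",
    "http://site-b.example/",
    "http://site-a.example/page2.html",
    "http://site-c.example/",
]
plan = dispatch_by_site(urls, ["engine-1", "engine-2"])
```

With two engines, both site-a pages end up on the same engine, so adding
engines only helps when there are several sites to crawl.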

For your information, the new version 4.0.0 is now available as an 
open-source project hosted on GitHub: 
https://github.com/bejean/crawl-anywhere

Regards.




On 11/02/13 12:02, O. Klein wrote:
> Yes, you can run CA on different machines.
>
> In "Manage" you have to set the target and engine for this to work.
>
> I've never done this, so you will have to contact the developer for more details.
>
>
>
> SivaKarthik wrote
>> Hi All,
>>   in our project, we need to download millions of pages,
>>   so is there any support for crawling in a distributed environment
>> using the crawl-anywhere apps?
>>    Or what could be the alternatives?
>>
>>   Thanks in advance..
>
>
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/ANNOUNCE-Web-Crawler-tp2607831p4039674.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

-- 
Dominique Béjean
+33 6 08 46 12 43
skype: dbejean
www.eolya.fr
www.crawl-anywhere.com
www.mysolrserver.com