Posted to java-user@lucene.apache.org by "Zhang, Lisheng" <Li...@broadvision.com> on 2004/09/29 19:38:26 UTC

Free software to crawl internet site?

Hi,

Does anyone know of free software for crawling an internet site
(a web crawler)? I know that Lucene currently does not have this
feature, according to the official Lucene FAQ.

Thanks very much for any help,

Lisheng

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: Free software to crawl internet site?

Posted by Dawid Weiss <da...@cs.put.poznan.pl>.
Nutch has a crawler. So does Egothor (the crawler is called Capek). If 
you type "web crawler" in Google you'll get tons of projects.

Dawid





Re: Free software to crawl internet site?

Posted by Otis Gospodnetic <ot...@yahoo.com>.
LARM is no longer being maintained, unfortunately.  Use Nutch -
http://www.nutch.org/

Otis

--- Bo Gundersen <bg...@atira.dk> wrote:

> In the Lucene sandbox there is a pretty advanced crawler called LARM;
> you can check it out at this URL:
> http://jakarta.apache.org/lucene/docs/lucene-sandbox/ (right at the
> bottom).
> 
> -- 
> Bo Gundersen
> DBA/Software Developer
> M.Sc.CS.
> www.atira.dk




Re: Free software to crawl internet site?

Posted by Bo Gundersen <bg...@atira.dk>.
In the Lucene sandbox there is a pretty advanced crawler called LARM;
you can check it out at this URL:
http://jakarta.apache.org/lucene/docs/lucene-sandbox/ (right at the bottom).

-- 
Bo Gundersen
DBA/Software Developer
M.Sc.CS.
www.atira.dk
